Robots.txt and your Website

March 24, 2008

I was stumbling through Google Webmaster Tools earlier today and noticed that it has a neat tool to help you analyse your robots.txt file. To get to this tool, perform the following steps…

  1. Log in to Google Webmaster Tools
  2. Click on the website you’d like to analyse
  3. Click on “Tools” on the left-hand side of the page
  4. Click on “Analyse robots.txt”

It doesn’t matter whether your website is built using WordPress, Drupal, Salesforce.com or one of many others, because a robots.txt file can be installed on any website. It has to go in the root directory of the website (for example, http://www.example.com/robots.txt), because that is the only place the spiders from search engines look for it before they crawl through your website.
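To see where a spider would look for the file on a given site, here is a minimal Python sketch. The function name `robots_url` and the example.com addresses are my own placeholders, not anything from Webmaster Tools.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the file is requested from the root path.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://www.example.com/blog/2008/03/some-post/"))
# prints http://www.example.com/robots.txt
```

Whatever page you start from, the answer is the same file at the top of the site, which is why a robots.txt tucked away in a subdirectory has no effect.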

By modifying this file, you can tell search engines which content you want them to look at and which content you don’t.

Here are some modifications you may find helpful…

Disallow Wayback Machine archiving

User-agent: ia_archiver
Disallow: /
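If you want to check a rule like this before uploading it, Python's standard library ships a robots.txt parser. This is only a rough sketch: the page address is a made-up example, and the two rule lines are kept together as one group with no blank line between them.

```python
from urllib.robotparser import RobotFileParser

# The rule group from above, written as consecutive lines.
RULES = """\
User-agent: ia_archiver
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

page = "http://www.example.com/some-page/"  # hypothetical page address
archiver_allowed = parser.can_fetch("ia_archiver", page)
googlebot_allowed = parser.can_fetch("Googlebot", page)

print(archiver_allowed, googlebot_allowed)
# prints False True
```

The archive crawler is shut out of the whole site, while a crawler with no matching group is still allowed everywhere.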

Allow the AdSense bot on the entire website

User-agent: Mediapartners-Google
Disallow:
Allow: /*

Disallow files ending with certain extensions

User-agent: Googlebot
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Disallow: /*.gz$
Disallow: /*.wmv$
Disallow: /*.cgi$
Disallow: /*.xhtml$
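In these rules the * stands for any run of characters and the $ marks the end of the URL, which are pattern extensions that Googlebot understands. Here is a small hand-rolled Python sketch of that matching so you can try a pattern against a few paths. The function `rule_matches` is my own helper for illustration, not how any search engine actually implements it.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Check a robots.txt path pattern (with * and $) against a URL path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each * match any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    compiled = re.compile(regex)
    if anchored:
        # A trailing $ means the pattern must cover the whole path.
        return compiled.fullmatch(path) is not None
    # Without $, the pattern only has to match the start of the path.
    return compiled.match(path) is not None


print(rule_matches("/*.php$", "/index.php"))         # True
print(rule_matches("/*.php$", "/index.php?page=2"))  # False
print(rule_matches("/*.css$", "/wp-content/themes/style.css"))  # True
```

Note the second result: because of the $, a .php address with a query string on the end is not caught by the rule.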

Disallow files in certain directories

User-agent: *
Disallow: /cgi-bin/
Disallow: /z/j/
Disallow: /z/c/
Disallow: /stats/
Disallow: /dh_
Disallow: /about/
Disallow: /contact/
Disallow: /tag/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /contact
Disallow: /wp-
Disallow: /feed/
Disallow: /trackback/
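A plain Disallow line blocks every path that starts with the text you give it, so some of the entries in this list overlap. The Python sketch below finds entries already covered by a shorter one. The helper name `redundant_prefixes` is hypothetical and the list is simply copied from the rules above.

```python
def redundant_prefixes(paths):
    """Map each path to a shorter listed path that already covers it."""
    covered = {}
    for path in paths:
        for other in paths:
            # A rule is redundant if another rule is a prefix of it.
            if other != path and path.startswith(other):
                covered[path] = other
                break
    return covered


DISALLOWED = [
    "/cgi-bin/", "/z/j/", "/z/c/", "/stats/", "/dh_", "/about/",
    "/contact/", "/tag/", "/wp-admin/", "/wp-includes/", "/contact",
    "/wp-", "/feed/", "/trackback/",
]

overlaps = redundant_prefixes(DISALLOWED)
for path, shorter in overlaps.items():
    print(f"{path} is already covered by {shorter}")
```

For this list it reports /contact/, /wp-admin/ and /wp-includes/, since /contact and /wp- already block them. The extra lines do no harm, but it also means /contact blocks any other address that happens to begin with those characters.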

Popular User Agents

* – all user agents

Googlebot – crawls web pages and news pages

Googlebot-Mobile – crawls pages for the mobile index

Googlebot-Image – crawls pages for the image index

Mediapartners-Google – crawls pages to determine AdSense content

AdsBot-Google – crawls pages to measure AdWords landing page quality

Yahoo! Slurp – crawls pages for Yahoo!

So when you edit your robots.txt file, remember that you can tell the robots sent out by the search engines which parts of your website to crawl and which parts to leave alone.
