Robots.txt File




The robots.txt file can be used to exclude bots from spidering specific sections of a site, to exclude particular bots from specific areas or to disallow named bots altogether.
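
As a rough illustration of those three uses, the sketch below feeds a made-up robots.txt to Python's standard urllib.robotparser module and asks it who may fetch what. The bot names (BadBot, ImageBot) and the paths are invented for the example and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the bot names and paths are made up for illustration.
ROBOTS_TXT = """\
# Disallow one named bot altogether
User-agent: BadBot
Disallow: /

# Keep one particular bot out of one area
User-agent: ImageBot
Disallow: /photos/

# Keep every other bot out of specific sections
User-agent: *
Disallow: /admin/
Disallow: /tmp/
"""

def load_rules(text):
    """Parse robots.txt text and return a parser that can answer can_fetch()."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser

rules = load_rules(ROBOTS_TXT)

for agent, path in [
    ("BadBot", "/index.html"),
    ("ImageBot", "/photos/cat.jpg"),
    ("ImageBot", "/index.html"),
    ("Googlebot", "/admin/login"),
    ("Googlebot", "/index.html"),
]:
    verdict = "allowed" if rules.can_fetch(agent, path) else "blocked"
    print(f"{agent:10} {path:18} {verdict}")
```

Note that a bot with its own User-agent group follows that group rather than the catch-all one, so in this sketch ImageBot is only kept out of /photos/.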

Apart from privacy or bandwidth concerns, the file is useful for controlling the indexing of duplicate content, such as when a CMS or shopping cart exposes the same page under multiple file paths.
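
For instance, suppose a hypothetical shopping cart serves the same product page at /product/42, /print/product/42 and /catalog/shoes/product/42. A minimal sketch of excluding the secondary routes might look like this; every path here is an assumption made for illustration, and the result is checked with Python's standard urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser

def build_robots_txt(duplicate_prefixes):
    """Return robots.txt text that asks all bots to skip the given path prefixes."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: {prefix}" for prefix in duplicate_prefixes]
    return "\n".join(lines) + "\n"

# Hypothetical secondary routes to pages that also live under /product/.
robots_txt = build_robots_txt(["/print/", "/catalog/"])

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

print(robots_txt)
print(parser.can_fetch("Googlebot", "/product/42"))        # canonical path
print(parser.can_fetch("Googlebot", "/print/product/42"))  # duplicate path
```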

While a robots.txt file isn't required, spiders will request one anyway, so if you don't have one you'll get a lot of 404s in your logs, which can make identifying actual problems more difficult.
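
If there is nothing to exclude, a file that allows everything is enough to stop those 404s. The sketch below writes one; the `site_root` name is a placeholder, and a temporary directory stands in for the real document root.

```python
import tempfile
from pathlib import Path
from urllib.robotparser import RobotFileParser

# An empty Disallow value excludes nothing, so every bot may crawl everything.
ALLOW_ALL = "User-agent: *\nDisallow:\n"

def write_allow_all(site_root):
    """Write a permissive robots.txt into site_root and return its path."""
    target = Path(site_root) / "robots.txt"
    target.write_text(ALLOW_ALL, encoding="utf-8")
    return target

# A temporary directory stands in for the real document root in this sketch.
with tempfile.TemporaryDirectory() as site_root:
    written = write_allow_all(site_root).read_text(encoding="utf-8")

# Confirm that the file really does allow an arbitrary page.
parser = RobotFileParser()
parser.parse(written.splitlines())
print(parser.can_fetch("Googlebot", "/any/page.html"))
```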

Care should be taken in compiling the file: I've read more than once of someone accidentally excluding Googlebot and then wondering why their pages fell out of the index.

For safety, the file should be validated and its format checked to ensure it does what it's supposed to. Google Webmaster Tools has a handy checker you can experiment with before uploading the file.
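
A quick local test can also catch the Googlebot mistake before the file goes live. The sketch below is not Google's tool: it uses Python's standard urllib.robotparser, whose matching may differ in detail from Google's own parser, and the function name, paths and draft rules are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

def blocked_paths(robots_txt, user_agent, must_crawl):
    """Return the paths in must_crawl that robots_txt forbids for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in must_crawl if not parser.can_fetch(user_agent, path)]

# Hypothetical pages that must stay crawlable.
IMPORTANT = ["/", "/products/", "/contact.html"]

# A draft with the classic slip: "Disallow: /" shuts every bot out of the whole site.
bad_draft = "User-agent: *\nDisallow: /\n"
# The intended rule only excludes one directory.
good_draft = "User-agent: *\nDisallow: /cgi-bin/\n"

print(blocked_paths(bad_draft, "Googlebot", IMPORTANT))
print(blocked_paths(good_draft, "Googlebot", IMPORTANT))
```

An empty list means none of the important pages are excluded for that bot; anything else is worth fixing before upload.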









© Copyright 2001 - 2008 DIY-SEO.com - Dun Laoghaire, Co. Dublin, Ireland