The Official Google Blog has a short primer on robots.txt, the file on your server that tells search engine crawlers which parts of your site they may access. From the OGB:


However, you may have a few pages on your site you don’t want in Google’s index. For example, you might have a directory that contains internal logs, or you may have news articles that require payment to access. You can exclude pages from Google’s crawler by creating a text file called robots.txt and placing it in the root directory. The robots.txt file contains a list of the pages that search engines shouldn’t access. Creating a robots.txt is straightforward and it allows you a sophisticated level of control over how search engines can access your web site.
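
As a rough illustration (not taken from the OGB post), here is a minimal sketch of what such a file might look like and how a well-behaved crawler could interpret it. It uses Python's standard `urllib.robotparser` module. The `/logs/` and `/premium-articles/` paths, the `example.com` URLs, and the `is_allowed` helper are all hypothetical, chosen only to mirror the two examples in the quote.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. The paths are made up for illustration;
# a real file would live at the root of the site, e.g. /robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /logs/
Disallow: /premium-articles/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs on a placeholder domain.
    for url in (
        "https://example.com/index.html",
        "https://example.com/logs/access.log",
        "https://example.com/premium-articles/story.html",
    ):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, "Googlebot", url) else "blocked"
        print(f"{url}: {verdict}")
```

Keep in mind that robots.txt is a request to crawlers rather than access control: it asks compliant search engines to stay out of those paths, but it does not stop anyone from opening the URLs directly.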

The OGB article is the first installment of a detailed guide to robots.txt, with more to come. [via Lifehacker]
