Control how search engines access and index your site

by Sam Glover on January 29, 2007

The Official Google Blog has a short introduction to robots.txt, the file on your server that tells search engines which parts of your site they may access. From the OGB:

robots.txt

However, you may have a few pages on your site you don’t want in Google’s index. For example, you might have a directory that contains internal logs, or you may have news articles that require payment to access. You can exclude pages from Google’s crawler by creating a text file called robots.txt and placing it in the root directory. The robots.txt file contains a list of the pages that search engines shouldn’t access. Creating a robots.txt is straightforward and it allows you a sophisticated level of control over how search engines can access your web site.
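To make the excerpt a little more concrete, here is a minimal sketch in Python of what such a file might look like and how a well-behaved crawler would read it. It uses the standard library's urllib.robotparser module. The /logs/ and /premium/ paths and the example.com address are made up for illustration and do not come from the Google post.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. The paths are illustrative only:
# one directory of internal logs and one of paid articles.
ROBOTS_TXT = """\
User-agent: *
Disallow: /logs/
Disallow: /premium/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
SITE = "https://www.example.com"  # placeholder domain

# Ask whether a crawler identifying itself as "Googlebot" may fetch each path.
for path in ("/", "/logs/access.log", "/premium/story.html"):
    allowed = parser.can_fetch("Googlebot", SITE + path)
    print(path, "allowed" if allowed else "blocked")
```

Keep in mind that robots.txt is a request, not a lock: crawlers that follow the convention will honor it, but it does not password-protect anything.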

The OGB article is the start of a detailed instructional guide to robots.txt, with more to come. [via Lifehacker]


Sam Glover is a business and consumer rights lawyer and the creator of Lawyerist.
