
Understanding Robots.txt

Are you using robots.txt? If you are a novice website developer, you are likely already perplexed by how sci-fi the Internet sounds, what with virtual spiders crawling everywhere. In short, though, a robots.txt file is very easy to understand. It is simply a plain text file that tells the virtual robots sent out by search engines to analyse your site where they have permission to go and, as such, what information they can index and send back to search engines like Google.

But wait. Don’t website owners want search engines to analyse and index every available page on their website? Well, no, not all the time. Maybe, for instance, you are testing out a new page layout and in doing so currently have two or more pages on your site with very similar content. This can be a problem because search engines don’t like duplicate content and may rank a website lower in search results for just this reason.

On the other hand, maybe you have a killer website on which all your content is search engine optimised, so that you already rank for the search terms that matter to you. Alongside it, you might have a forum where posts are not search engine optimised and actually bear very little relevance to your website’s chief objective. In this case, you would use your robots.txt file to tell visiting spiders from Google and the like not to bother reading through or indexing the pages in question.

Where Is Your Robots.txt File & How Do You Modify It?

Usually you will find your website’s robots.txt file in the root directory of your domain or, in the case of websites built using WordPress, in the root directory of your WordPress installation. In the case of WordPress, robots.txt files are very easy to modify by using dedicated SEO plugins. Alternatively, you will be able to find and manage your website’s robots.txt file directly through your server’s cPanel or equivalent, using a plain text editor such as Notepad on Windows or gedit on Linux.
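Because crawlers always ask for robots.txt at the top level of the host, you can work out where yours should live from any page address on your site. The snippet below is only a small sketch of that idea; the function name robots_txt_url and the example.com address are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host, then point at /robots.txt at the top level.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page address, used purely as an example.
print(robots_txt_url("https://www.example.com/blog/some-post/?page=2"))
# prints https://www.example.com/robots.txt
```

Opening the printed address in a browser is a quick way to see what search engine robots currently see.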

For novice web developers, it’s probably a good idea to use a dedicated plugin to modify your robots.txt file. However, if you are editing your robots.txt file manually, you will simply need to identify specific search engine robots such as ‘Googlebot’ or ‘Bingbot’ on a ‘User-agent’ line (or use an asterisk to identify all) and then add an ‘Allow’ or ‘Disallow’ line for each of the paths which you want such robots to visit or not.
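To make the rules described above more concrete, here is a rough sketch of what a hand-edited file might look like, together with a way to check it using the robots.txt parser that ships with Python. The paths (/forum/, /test-layout/, /archive/) and the example.com address are invented for illustration, so swap in your own. Keep in mind that Python's parser applies the first rule that matches, while search engines may resolve overlapping rules differently, so treat the result as a sanity check rather than the final word.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one group for all robots, one just for Bingbot.
ROBOTS_TXT = """\
User-agent: *
Allow: /forum/rules/
Disallow: /forum/
Disallow: /test-layout/

User-agent: Bingbot
Disallow: /forum/
Disallow: /test-layout/
Disallow: /archive/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if the rules above let `agent` fetch `path`."""
    return parser.can_fetch(agent, "https://www.example.com" + path)


print(allowed("Googlebot", "/forum/thread-1"))  # False: covered by Disallow: /forum/
print(allowed("Googlebot", "/forum/rules/"))    # True: the Allow line comes first
print(allowed("Googlebot", "/archive/2019/"))   # True: no rule mentions it
print(allowed("Bingbot", "/archive/2019/"))     # False: Bingbot has its own group
```

Notice that a robot with its own group, like Bingbot here, follows that group only and ignores the asterisk group, which is why the forum and test layout rules are repeated for it.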

Whatever you do though, always act with caution. An incorrectly edited robots.txt file can actually hide your entire website from search engine robots, leading to a disaster in terms of SEO. For this reason, always make a backup of your original file and, if in doubt, use Google’s robots.txt tool to make sure your website is correctly configured after editing.
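If you are comfortable running a short script, the two precautions above can be sketched roughly as follows. The helper names backup_robots_txt and blocks_home_page are hypothetical, and testing only the home page is a crude smoke test, not a replacement for checking your site with Google's own tooling.

```python
import shutil
from pathlib import Path
from urllib.robotparser import RobotFileParser


def backup_robots_txt(path: str) -> Path:
    """Copy robots.txt to robots.txt.bak in the same folder; return the copy's path."""
    source = Path(path)
    backup = source.with_name(source.name + ".bak")
    shutil.copy2(source, backup)
    return backup


def blocks_home_page(robots_text: str, agent: str = "Googlebot") -> bool:
    """Return True if these rules stop `agent` from fetching the home page."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return not parser.can_fetch(agent, "/")


# A lone slash after Disallow shuts robots out of every page.
print(blocks_home_page("User-agent: *\nDisallow: /\n"))  # True
# An empty Disallow value blocks nothing at all.
print(blocks_home_page("User-agent: *\nDisallow:\n"))    # False
```

Run the backup before you edit, then pass the text of your edited file to the check. If it reports True, restore the backup and look for a stray ‘Disallow: /’ line.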