Robots.txt and Security

Potential security issues with using robots.txt to block indexing of content in members-only sections.


Edited: 2017-11-28 23:31

You should not rely on robots.txt to keep bots and users out of parts of your site. If there is something that you do not want search engines to discover, it is best either not to host it on your site at all, or to protect it with a proper server-side security mechanism.

We cannot assume that all robots will adhere to the rules in the robots.txt file. Some may choose to ignore the file entirely, while others might not understand the rules we specify. It is therefore best to use other security mechanisms on your site to block access to content that you do not want to be accessible.
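To illustrate why compliance is voluntary, here is a minimal sketch in Python of the check a well-behaved crawler performs on its own side before requesting a page. The robots.txt content, the `ExampleBot` user agent, and the example.com URLs are made up for illustration. Nothing on the server enforces the result; a bot that skips this step simply requests the page anyway.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for this illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /members/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt permit user_agent to fetch url.

    This is a client-side courtesy check. The server never sees it,
    and a crawler is free not to run it at all.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/about.html",
                "https://example.com/members/page.html"):
        print(url, is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

A polite crawler would skip any URL for which `is_allowed` returns `False`. An impolite one never calls the function.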

Robots.txt and Security

There are quite a few security issues with relying on robots.txt to prevent indexing of content, although none of them are critical. Robots.txt is mainly useful for controlling how the major, well-known search engines crawl your site; it is not a security mechanism.

In addition, listing secret directories in the robots.txt file could inform attackers of otherwise unknown locations on your server. It is therefore important that you have other security mechanisms in place. Simply giving members of your site a secret URL is rarely enough to keep out uninvited guests, especially if you list this URL in your robots.txt file to keep it out of the search results.
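As a rough sketch of how little effort this takes, the Python function below pulls every `Disallow` path out of a robots.txt file. The sample file and its directory names are hypothetical. Since robots.txt is publicly readable, anyone can produce the same list for a real site.

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """Return the paths named in Disallow lines, in file order."""
    paths = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow":
            value = value.strip()
            if value:  # an empty Disallow means "nothing is blocked"
                paths.append(value)
    return paths


# Hypothetical robots.txt that tries to hide two directories.
SAMPLE = """\
User-agent: *
Disallow: /members-secret-area/   # meant to stay private
Disallow: /admin/
Disallow:
"""

if __name__ == "__main__":
    print(disallowed_paths(SAMPLE))
```

Every path in the returned list is a location the site owner has effectively announced, which is why those locations need real access control behind them.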

Alternatives to robots.txt

You might opt for more robust access control instead of relying on robots.txt, such as that provided by .htaccess and .htpasswd files. Alternatively, you can serve parts of your site through a PHP script with password protection. Both should effectively prevent unauthorized access and indexing by search engines.
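The idea behind both options is the same: the server checks credentials before it sends any content. Below is a minimal sketch of that check for HTTP Basic authentication, written in Python rather than PHP or Apache configuration purely for illustration. The `USERS` store, the salt, the iteration count, and the `is_authorized` helper are assumptions made for this example and are not taken from any particular server or framework.

```python
import base64
import hashlib
import hmac

# Hypothetical values for this sketch; a real site would store a random
# salt and hash per user, outside the source code.
SALT = b"example-salt"
ITERATIONS = 100_000


def hash_password(password: str) -> bytes:
    """Derive a password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), SALT, ITERATIONS)


# Hypothetical credential store: username -> derived password hash.
USERS = {"member": hash_password("correct horse")}


def is_authorized(authorization_header) -> bool:
    """Return True only for a valid 'Basic <base64(user:password)>' header."""
    if not authorization_header or not authorization_header.startswith("Basic "):
        return False
    try:
        decoded = base64.b64decode(authorization_header[6:], validate=True).decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return False
    username, sep, password = decoded.partition(":")
    if not sep:
        return False
    expected = USERS.get(username)
    if expected is None:
        return False
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(hash_password(password), expected)
```

A request handler would call `is_authorized` with the value of the `Authorization` header and answer with a 401 status whenever it returns `False`. Unlike a robots.txt rule, this applies to every client, including crawlers that ignore robots.txt.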
