How to Block or 404 out your Index

Preventing access to index.php to avoid duplicate content.


Edited: 2017-06-15 15:19

This tutorial shows how to avoid duplicate content by returning a 404 error for your index.php file. For instance, if you own a website, the directory index is usually named index.html by default. Most people know this, and they will often type index.html into their browser's address bar, for whatever reason.

The problem arises if someone starts linking to index.html, or if search engines otherwise find out about the URL, because the same page is then reachable at two addresses. There are a number of ways to prevent this; the first one I'm going to mention uses robots.txt to disallow access to the page.

Using Robots.txt

Blocking search engines from crawling index.html from within robots.txt is fairly easy. Most major search engines recognise robots.txt and respect the rules you set inside it. Example below:

User-agent: *
Disallow: /index.html

Of course, you can also disallow access for specific search engines only.

# Applies only to Google's and Yahoo's crawlers (user-agent tokens Googlebot and Slurp)
User-agent: Googlebot
User-agent: Slurp
Disallow: /

Using PHP

If you are using PHP on your site, then I do think that this is one of the best methods. We simply check whether the requested path equals /index.php, which can be done with a simple PHP if statement. The $_SERVER['REQUEST_URI'] variable contains the requested path as a root-relative text string, so we can easily compare it with /index.php. Simply include something like the below somewhere near the top of your source, before any output is sent.

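if ($_SERVER['REQUEST_URI'] == '/index.php') {
  header('HTTP/1.1 404 Not Found'); // must be sent before any output
  include_once '404.php'; // optional custom error page
  /* mysql_close($Connection); // include if you use MySQL */
  exit; // stop here so the normal page is not rendered
}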

The 404 error page is optional, but I do suggest that you make one. You can also read the article titled Creating a Custom 404 Error Page, which includes a sample file that you can use.
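
One thing to keep in mind with the check above is that $_SERVER['REQUEST_URI'] also contains the query string, so a request for /index.php?page=2 would not match a plain comparison with /index.php. The following is only a minimal sketch of one way to handle that, assuming PHP 7 or later. The helper name is_direct_index_request is made up for this example and is not part of PHP; parse_url() and http_response_code() are standard PHP functions.

```php
<?php
// Hypothetical helper (name invented for this sketch): returns true when the
// request targets the index file directly, with or without a query string.
function is_direct_index_request(string $requestUri, string $indexPath = '/index.php'): bool
{
    // parse_url() with PHP_URL_PATH returns only the path component,
    // so "/index.php?page=2" becomes "/index.php".
    $path = parse_url($requestUri, PHP_URL_PATH);

    // parse_url() can return false or null for malformed input.
    return is_string($path) && $path === $indexPath;
}

// Guarded with isset() so the file can also be loaded from the command line,
// where REQUEST_URI is not set.
if (isset($_SERVER['REQUEST_URI']) && is_direct_index_request($_SERVER['REQUEST_URI'])) {
    http_response_code(404);  // sets the 404 status, available since PHP 5.4
    include_once '404.php';   // optional custom error page, as in the example above
    exit;                     // stop here so the normal page is not rendered
}
```

With this sketch, both /index.php and /index.php?page=2 are answered with a 404, while a request for / is left alone and served normally.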