PHP: Get The Full Requested URL

While there is no function to obtain the full request URL in PHP, we can still make our own using a combination of server variables to get what we need.

Edited: 2020-05-30 00:57

Oddly, the full request URL is not directly accessible from PHP, since there seems to be no built-in function for it. However, we can still write our own function that returns the most important parts of the URL.

We will not be able to return the part after the hash "#" character, also known as the fragment part of the URL, since the client does not send it to the server as part of the request. The fragment is primarily used for client-side navigation to subsections of a page, where it corresponds to the unique ID of an element in the HTML. So, if the fragment part is important, you will need to use JavaScript to obtain it, which is not covered in this tutorial.

To get the full requested URL, we can combine several $_SERVER variables to "guess" the URL. A quick example of how to do this is included below:

$full_request_url = (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== 'off' ? 'https' : 'http') . '://'. $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'];

If the ternary expression (the "short if") above is too hard to read, you can also write it like this:

$full_request_url = '';
if ((!empty($_SERVER['HTTPS'])) && ($_SERVER['HTTPS'] !== 'off')) {
  $full_request_url .= 'https://';
} else {
  $full_request_url .= 'http://';
}
$full_request_url .= $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'];

Function to get full request url

The HTTPS variable will be set to a non-empty value if the request was performed over the HTTPS protocol. Knowing this, we can use the empty function to check whether the variable is set and non-empty.

To keep the code portable, we also need to check that the variable is not set to "off", since IIS on Windows might set it to "off" when HTTPS is not used.
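The two checks described above can be wrapped in a small helper. This is only a sketch: the function name request_is_https is made up for this example, it takes an array shaped like $_SERVER as a parameter so that it can be tried without a web server, and the case-insensitive comparison with "off" is an extra precaution rather than something the original one-liner does.

```php
<?php
// Hypothetical helper: reports whether the request described by $server used HTTPS.
// $server is expected to look like $_SERVER.
function request_is_https(array $server): bool {
  // Missing or empty means the request was made over plain HTTP.
  if (empty($server['HTTPS'])) {
    return false;
  }
  // Some servers report "off" for plain HTTP. Comparing in lowercase is an
  // assumption made here to also catch variants such as "OFF".
  return strtolower((string) $server['HTTPS']) !== 'off';
}
```

In a real request you would call it as request_is_https($_SERVER).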

Normally, HTTP requests on the web go through port 80 (or 443 for HTTPS), so we do not usually need to include the port number. Browsers also include a non-default port in the Host header, so HTTP_HOST will often contain it already. If for some reason we need to add the port number explicitly, we can combine SERVER_PORT with SERVER_NAME, which normally holds the host name without a port. Do not use REMOTE_PORT for this, since it contains the port used on the client's machine rather than the port the server is listening on:

$full_request_url = (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== 'off' ? 'https' : 'http') . '://'. $_SERVER['SERVER_NAME'] . ':' . $_SERVER['SERVER_PORT'] . $_SERVER['REQUEST_URI'];

You can easily create a function to return the full request URL whenever it is needed:

function full_request_url() {
  return (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== 'off' ? 'https' : 'http') . '://'. $_SERVER['SERVER_NAME'] . ':' . $_SERVER['SERVER_PORT'] . $_SERVER['REQUEST_URI'];
}
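Because default ports are normally left out of URLs, one option is to append the port only when it differs from the default for the scheme. The following is a minimal sketch under a few assumptions: the function name full_request_url_with_port is made up for this example, it takes an array shaped like $_SERVER as a parameter so it can be tried outside a web server, and it assumes the web server fills in SERVER_NAME and SERVER_PORT with its own host name and listening port.

```php
<?php
// Hypothetical variant: builds the URL from an array shaped like $_SERVER and
// appends the port only when it is not the default for the scheme.
function full_request_url_with_port(array $server): string {
  $is_https = !empty($server['HTTPS']) && $server['HTTPS'] !== 'off';
  $scheme = $is_https ? 'https' : 'http';
  $default_port = $is_https ? 443 : 80;

  // Assumption: SERVER_PORT is the port the server is listening on.
  $port = (int) ($server['SERVER_PORT'] ?? $default_port);

  $url = $scheme . '://' . $server['SERVER_NAME'];
  if ($port !== $default_port) {
    $url .= ':' . $port;
  }
  return $url . $server['REQUEST_URI'];
}
```

In a real request you would call it as full_request_url_with_port($_SERVER).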

If you are working in an object-oriented context, you may want to avoid using superglobals directly. See: Avoid Direct Use of PHP Superglobals
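One way to keep superglobals out of your classes is to pass the server data in through the constructor. The class below is a rough sketch of that idea and not the approach from the linked article; the name RequestUrl and its full method are invented for this example.

```php
<?php
// Hypothetical class: receives the server data through its constructor
// instead of reading $_SERVER itself.
final class RequestUrl {
  private $server;

  public function __construct(array $server) {
    $this->server = $server;
  }

  // Returns scheme, host and request URI as one string.
  public function full(): string {
    $https = $this->server['HTTPS'] ?? '';
    $scheme = (!empty($https) && $https !== 'off') ? 'https' : 'http';
    return $scheme . '://' . $this->server['HTTP_HOST'] . $this->server['REQUEST_URI'];
  }
}

// The superglobal is then touched in one place only, e.g. the entry script:
// $url = (new RequestUrl($_SERVER))->full();
```

This also makes the code easier to test, since you can hand the class any array you like.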

Sanitizing client-controlled variables

Some of the server variables are controlled by the client and can be manipulated by malicious users. For the most part, this should not matter unless you use them in a sensitive place, such as in a database query, or even just to generate absolute paths for links on your page. Please do not use absolute paths: it is much easier to maintain your site with root-relative URLs, and you will avoid the issue of cache poisoning of your links. See also: Absolute and Relative Paths

For example, if the REQUEST_URI contains invalid characters, the request will probably just result in a 404 error response being sent to the client. Validation should only be necessary if you insert or use the data in a sensitive place.

Likewise, the HTTP_HOST variable may be manipulated, but doing so will probably just result in your web server returning the wrong website (if you use virtual hosting). However, you should still be careful, because some servers will simply serve a "default" virtual host, and it might fall back to your PHP application!

While it does not usually cause any problems if someone manipulates the host variable, it might still be a problem due to cache poisoning attacks. If a cache server is storing copies of your HTML pages, and an attacker somehow manages to inject a different host into your links, the attacker could successfully redirect all or part of your traffic to their own malicious website.

As a result of the complexity involved, it is probably best if developers always validate these variables. We cannot expect that others who might be working on the code have the same overview of the application, and indeed, we might also make a mistake ourselves.

To validate the HTTP_HOST variable in your PHP application, you should maintain an array of known hosts, and simply use the value from the array rather than from the HTTP_HOST directly. This is a small inconvenience for achieving a bit of extra security.
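A minimal sketch of the known-hosts approach could look like this. The function name validated_host and the host names in the usage comment are placeholders, and lowercasing the incoming value is an assumption made here because host names are case-insensitive.

```php
<?php
// Hypothetical helper: returns the matching entry from the list of known
// hosts, or null when the client-supplied host is not in the list.
function validated_host(string $http_host, array $known_hosts): ?string {
  // Normalise the case before comparing.
  $candidate = strtolower($http_host);
  $index = array_search($candidate, $known_hosts, true);
  // Return the value from our own array, never the client-supplied string.
  return $index === false ? null : $known_hosts[$index];
}

// Possible usage in a request (placeholder host names):
// $host = validated_host($_SERVER['HTTP_HOST'] ?? '', ['example.com', 'www.example.com']);
// if ($host === null) { http_response_code(400); exit; }
```

Note that a host sent with a port, such as "example.com:8080", will only match if it is listed in that exact form.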

Validating the REQUEST_URI is a bit more complex, and will also depend on your site's structure. You may opt to use a regular expression and then only allow certain valid patterns.
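As an illustration only, the sketch below accepts a deliberately narrow URL structure: lowercase path segments made of letters, digits and hyphens, an optional trailing slash, and an optional simple query string. The pattern, the constant name REQUEST_URI_PATTERN and the function name request_uri_is_valid are all assumptions for this example; your own site will most likely need a different pattern.

```php
<?php
// Hypothetical pattern for a site whose URLs look like /articles/php-url?page=2
// \z anchors at the very end of the string, so a trailing newline is rejected.
const REQUEST_URI_PATTERN = '~^/(?:[a-z0-9-]+(?:/[a-z0-9-]+)*/?)?(?:\?[a-z0-9_=&-]*)?\z~';

// Returns true only when the whole request URI matches the allowed pattern.
function request_uri_is_valid(string $request_uri): bool {
  return preg_match(REQUEST_URI_PATTERN, $request_uri) === 1;
}
```

A request that fails the check can then be answered with a 404 response before the value is used anywhere else.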
