Get The Full Requested URL in PHP
While there is no function to obtain the full request URL in PHP, we can still make our own using a combination of server variables to get what we need.
By. Jacob
Edited: 2021-02-27 13:29
Oddly, the full request URL is not directly accessible from PHP, since there seems to be no variable or function that reveals it. You can, however, write your own function to obtain it by combining the following $_SERVER variables:
- HTTP_HOST
- REQUEST_URI
- HTTPS
Combining these variables lets you "guess" the full requested URL. A quick example of how to do this is included below:
$full_request_url = (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== 'off' ? 'https' : 'http') . '://'. $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'];
If the ternary expression above is too hard to read, you can also write it as a regular if statement:
$full_request_url = '';
if ((!empty($_SERVER['HTTPS'])) && ($_SERVER['HTTPS'] !== 'off')) {
    $full_request_url .= 'https://';
} else {
    $full_request_url .= 'http://';
}
$full_request_url .= $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'];
Note. You will not be able to return the part after the hash "#" character, also known as the fragment part of the URL, since it is not sent to the server as part of the request by the client. The hash is primarily used for client-side navigation on subsections, and corresponds to a unique ID in the HTML on a web page. So, if the fragment part is important, you will need to use JavaScript to obtain it.
Function to get full request URL
The HTTPS variable is set to a non-empty value if the request was performed over the HTTPS protocol. Knowing this, you can use the empty function to check whether the variable is empty.
Since IIS on Windows might set the variable to "off" when HTTPS is not used, you also need to check that the variable is not set to "off".
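The two checks described above can be wrapped in a small helper. This is only a sketch: the function name is_https_request and its optional $server parameter are assumptions made for illustration (the parameter lets you pass in a fake array instead of $_SERVER when testing), and the case-insensitive comparison against "off" is a small extra precaution that the one-liner above does not take.

```php
<?php
// Hypothetical helper (not a PHP built-in): reports whether the request used HTTPS.
// $server defaults to $_SERVER, but a plain array can be passed in for testing.
function is_https_request(?array $server = null): bool
{
    $server = $server ?? $_SERVER;

    // empty() covers both "not set" and "set to an empty string".
    if (empty($server['HTTPS'])) {
        return false;
    }

    // IIS may report "off" for plain HTTP; compare case-insensitively to be safe.
    return strtolower((string) $server['HTTPS']) !== 'off';
}

var_dump(is_https_request(['HTTPS' => 'on']));  // bool(true)
var_dump(is_https_request(['HTTPS' => 'off'])); // bool(false)
var_dump(is_https_request([]));                 // bool(false)
```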
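Normally, HTTP requests on the web go through port 80 (or 443 for HTTPS), so in most cases there is no need to include the port number. If you still need it, you can obtain it through the SERVER_PORT variable. Because HTTP_HOST is taken from the request's Host header, it may already contain a port (for example example.com:8080), so strip it before appending SERVER_PORT:
$host = preg_replace('/:\d+$/', '', $_SERVER['HTTP_HOST']); // drop any port already present in the Host header
$full_request_url = (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== 'off' ? 'https' : 'http') . '://' . $host . ':' . $_SERVER['SERVER_PORT'] . $_SERVER['REQUEST_URI'];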
You can easily create a function to return the full request URL whenever it is needed:
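function full_request_url() {
    $scheme = (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== 'off') ? 'https' : 'http';
    // HTTP_HOST may already contain a port (e.g. "example.com:8080"); strip it so it is not added twice.
    $host = preg_replace('/:\d+$/', '', $_SERVER['HTTP_HOST']);
    return $scheme . '://' . $host . ':' . $_SERVER['SERVER_PORT'] . $_SERVER['REQUEST_URI'];
}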
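Since the port is rarely needed for standard requests, you may prefer a variant that only includes it when it differs from the default for the scheme. The following is a minimal sketch under a few assumptions: the function name full_request_url_with_port and the optional $server parameter are made up for this example, and it assumes that SERVER_PORT reflects the port the client actually connected to, which may not hold behind a reverse proxy.

```php
<?php
// Hypothetical variant of full_request_url(); $server defaults to $_SERVER.
function full_request_url_with_port(?array $server = null): string
{
    $server = $server ?? $_SERVER;
    $https  = !empty($server['HTTPS']) && $server['HTTPS'] !== 'off';
    $scheme = $https ? 'https' : 'http';

    // HTTP_HOST may already carry a port (e.g. "example.com:8080"); strip it.
    $host = preg_replace('/:\d+$/', '', $server['HTTP_HOST']);
    $port = (int) $server['SERVER_PORT'];

    // Only append the port when it differs from the default for the scheme.
    $default_port = $https ? 443 : 80;
    $port_part    = ($port === $default_port) ? '' : ':' . $port;

    return $scheme . '://' . $host . $port_part . $server['REQUEST_URI'];
}

echo full_request_url_with_port([
    'HTTP_HOST'   => 'example.com:8080',
    'SERVER_PORT' => '8080',
    'REQUEST_URI' => '/blog/?page=2',
]);
// http://example.com:8080/blog/?page=2
```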
If you are working in an object-oriented context, you may want to avoid using superglobals directly. See: Avoid Direct Use of PHP Superglobals
Sanitizing client-controlled variables
Some of the server variables are controlled by the client and can be manipulated by malicious users. For the most part this should not matter, unless you use them in a sensitive place, such as in a database, or even just to generate absolute URLs for links on your page. Please do not use absolute paths: it is much easier to maintain your site with root-relative URLs, and you will avoid the issue of cache poisoning of your links. See also: Absolute and Relative Paths
For example, if the REQUEST_URI contains invalid characters, it probably results in a 404 error response being sent to the user/client. Only if you insert/use the data in a sensitive place should validation be necessary.
Likewise, the HTTP_HOST variable may be manipulated, but doing so will probably just result in your web server returning the wrong website (if you use virtual hosting). However, you should still be careful, because some servers will simply serve a "default" virtual host, and it might fall back to your PHP application!
While it does not usually cause any problems if someone manipulates the host variable, it might still be a problem due to cache poisoning attacks. If a cache server is storing copies of your HTML pages, and an attacker somehow manages to inject a different host into your links, the attacker could successfully redirect all or part of your traffic to their own malicious website.
As a result of the complexity involved, it is probably best if developers always validate these variables. You cannot expect that others who might be working on the code have the same overview of the application as you do.
To validate the HTTP_HOST variable in your PHP application, you should maintain an array of known hosts, and simply use the value from the array rather than from the HTTP_HOST directly. This is a small inconvenience for achieving a bit of extra security.
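As a rough sketch of the idea, the lookup could be written as follows. The host names, the KNOWN_HOSTS constant and the validated_host function are all hypothetical, and falling back to the first known host for unrecognized values is just one possible policy; you might prefer to reject the request instead. Note that a Host header carrying a port (such as example.com:8080) will not match unless the list contains it in that form.

```php
<?php
// Hypothetical list of hosts this application is known to serve.
const KNOWN_HOSTS = ['example.com', 'www.example.com'];

// Hypothetical helper: returns a host taken from the known list, never the raw header.
function validated_host(?array $server = null, array $known_hosts = KNOWN_HOSTS): string
{
    $server    = $server ?? $_SERVER;
    $requested = strtolower($server['HTTP_HOST'] ?? '');

    // Strict search so only exact string matches count.
    $index = array_search($requested, $known_hosts, true);

    // Assumption: unknown hosts fall back to the first (canonical) entry.
    return $index === false ? $known_hosts[0] : $known_hosts[$index];
}

echo validated_host(['HTTP_HOST' => 'WWW.Example.com']), "\n"; // www.example.com
echo validated_host(['HTTP_HOST' => 'evil.test']), "\n";       // example.com
```

Because the returned string always comes from your own array, a manipulated Host header can never end up in the links you generate.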
Validating the REQUEST_URI is a bit more complex, and will also depend on your site's structure. You may opt to use a regular expression and then only allow certain valid patterns.
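To illustrate, here is a minimal sketch of such a check. It assumes a made-up site structure in which paths consist only of lowercase letters, digits and hyphens, optionally followed by a simple query string; the function name is_valid_request_uri and the pattern itself are illustrative and would need to be adapted to your own URLs.

```php
<?php
// Hypothetical validator for an assumed URL structure:
//   /segment/segment/...   (lowercase letters, digits and hyphens only)
//   optionally followed by ?query made of a restricted character set.
function is_valid_request_uri(string $uri): bool
{
    // \z anchors at the very end of the string (unlike $, it does not allow a trailing newline).
    $pattern = '~^/(?:[a-z0-9-]+/)*(?:[a-z0-9-]+)?(?:\?[A-Za-z0-9_=&%.-]*)?\z~';

    return preg_match($pattern, $uri) === 1;
}

var_dump(is_valid_request_uri('/blog/my-post'));   // bool(true)
var_dump(is_valid_request_uri('/blog/?page=2'));   // bool(true)
var_dump(is_valid_request_uri('/blog/<script>'));  // bool(false)
```

Allowing only known-good patterns, rather than trying to block known-bad ones, keeps the check simple and errs on the side of rejecting anything unexpected.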