PHP: Convert a Relative URL to an Absolute URL

How to turn a relative URL into an absolute URL while also handling dot segments in the URL.

By. Jacob

Edited: 2020-05-29 19:25

To build an absolute URL from a relative URL we need to know the request protocol and the host name, which we can then prepend to the relative URL to produce an absolute URL.

We also need to resolve ".." (double dot notation) in URLs. A double dot tells the client to look for a file in the parent segment; each occurrence of ".." moves one level up in the URL tree. I recommend avoiding double dots in URLs, since they can be confusing to some people. Nevertheless, our function should still be able to handle these segments when they are encountered.

See also: Absolute and Relative Paths

To resolve ".." segments in URLs, I have used preg_replace in a loop that repeatedly removes the relevant segments until none are left to match.

Solution:

function get_absolute_path($url) {
  // Determine request protocol
  $request_protocol = (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== 'off' ? 'https' : 'http');
  // If dealing with a Protocol Relative URL
  if (stripos($url, '//') === 0) {
    return $url;
  }
  // If dealing with a Root-Relative URL
  if (stripos($url, '/') === 0) {
    return $request_protocol . '://' . $_SERVER['HTTP_HOST'] . $url;
  }
  // If dealing with an Absolute URL, just return it as-is
  if (stripos($url, 'http') === 0) {
    return $url;
  }
  // If dealing with a relative URL,
  // attempt to resolve double dot notation ".."
  do {
    $url = preg_replace('/[^\/]+\/\.\.\//', '', $url, 1, $count);
  } while ($count);
  // Return the absolute version of a Relative URL
  return $request_protocol . '://' . $_SERVER['HTTP_HOST'] . '/' . $url;
}

This function only works for on-site URLs, and will need to be modified if you want to handle URLs found in external resources.
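As a rough sketch of one such modification, the scheme and host could be passed in as parameters instead of being read from $_SERVER. The function name get_absolute_url_for and its parameters are hypothetical and not part of the original solution. Like the original, it resolves relative URLs against the site root:

```php
<?php
// Hypothetical variant of get_absolute_path(): the scheme and host are
// passed in as arguments instead of being read from $_SERVER.
function get_absolute_url_for(string $url, string $scheme, string $host): string {
  // Protocol-relative ("//host/path") and absolute ("http...") URLs are returned as-is
  if (stripos($url, '//') === 0 || stripos($url, 'http') === 0) {
    return $url;
  }
  // Root-relative URL: only the scheme and host are missing
  if (stripos($url, '/') === 0) {
    return $scheme . '://' . $host . $url;
  }
  // Relative URL: collapse "segment/../" pairs, then anchor at the site root
  do {
    $url = preg_replace('/[^\/]+\/\.\.\//', '', $url, 1, $count);
  } while ($count);
  return $scheme . '://' . $host . '/' . $url;
}

echo get_absolute_url_for('/about', 'https', 'example.com'), "\n";
// https://example.com/about
echo get_absolute_url_for('blog/2020/../index.html', 'https', 'example.com'), "\n";
// https://example.com/blog/index.html
echo get_absolute_url_for('//cdn.example.com/app.js', 'https', 'example.com'), "\n";
// //cdn.example.com/app.js
```

Because nothing is read from $_SERVER, this version can also be tried from the command line.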

Handling URL segments

A URL consists of several different parts, but for the purpose of this tutorial, we are mostly interested in the path, which is the part after the host name (if present).

1. To construct an absolute URL, we first need to determine the request protocol:

$request_protocol = (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== 'off' ? 'https' : 'http');

The above one-liner tries to establish the protocol used for the current request. Once this is done, it should be placed first in the URL. I.e.: https://[HTTP_HOST]/[path]

The remaining parts can be inserted directly when returning the finished URL.
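To try the protocol check outside of a web request, it can be wrapped in a small helper that takes the server array as an argument. This is only a sketch; the name detect_request_protocol is hypothetical and does not appear in the solution above:

```php
<?php
// Hypothetical helper: the server array is a parameter so the check can be
// exercised without a real request. In practice you would pass $_SERVER.
function detect_request_protocol(array $server): string {
  // HTTPS is non-empty (for example "on") for secure requests;
  // some servers set it to "off" for plain HTTP instead of leaving it unset.
  $is_https = !empty($server['HTTPS']) && $server['HTTPS'] !== 'off';
  return $is_https ? 'https' : 'http';
}

echo detect_request_protocol(['HTTPS' => 'on']), "\n";  // https
echo detect_request_protocol(['HTTPS' => 'off']), "\n"; // http
echo detect_request_protocol([]), "\n";                 // http
```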

2. In the next step, we need to make sure that we are not dealing with a Protocol Relative URL. This type of URL works similarly to an absolute URL, just without specifying a protocol. If our function is given either an Absolute URL or a Protocol Relative URL, we should just return the URL as-is, without modifying it.

Note. When a protocol relative URL is opened, the client will use the same protocol that was used to request the current page.

We can easily check this with the stripos function:

// If dealing with a Protocol Relative URL
if (stripos($url, '//') === 0) {
  return $url;
}

This checks whether the first two characters are forward slashes "//" and, if so, returns the $url intact.

3. Next, we check if the URL was root relative:

if (stripos($url, '/') === 0) {
  return $request_protocol . '://' . $_SERVER['HTTP_HOST'] . $url;
}

If the URL was root-relative, we need only to add the protocol and the host.

4. Now we check if the URL was already absolute; if so, we return it. To do this, we check if the URL begins with "http":

if (stripos($url, 'http') === 0) {
  return $url;
}

Of course, this would only account for URLs using the HTTP and HTTPS protocols. If we want it to work with other protocols, we may use a regular expression instead.
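A minimal sketch of such a regular expression is shown here. It assumes the scheme syntax from RFC 3986 (a letter followed by letters, digits, "+", "-" or ".", and then a colon); the helper name has_scheme is hypothetical. Unlike the stripos check, it does not mistake a relative path that merely begins with "http" for an absolute URL:

```php
<?php
// Hypothetical helper, not part of the original function.
// Matches a URL scheme at the start of the string: a letter, then any mix of
// letters, digits, "+", "." or "-", followed by a colon.
function has_scheme(string $url): bool {
  return preg_match('/^[a-z][a-z0-9+.\-]*:/i', $url) === 1;
}

var_dump(has_scheme('https://example.com/a'));      // bool(true)
var_dump(has_scheme('ftp://example.com/file'));     // bool(true)
var_dump(has_scheme('mailto:someone@example.com')); // bool(true)
var_dump(has_scheme('httpdocs/index.html'));        // bool(false)
var_dump(has_scheme('/about'));                     // bool(false)
```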

5. Finally, we handle relative URLs. Double dot notation ".." (also known as dot-segments) may be used in URLs to reach a parent segment in the URL. Each segment is separated by a forward slash "/", similar to how directories are separated on a hard disk partition.

Note. While it makes no sense to use double dots in absolute URLs, in some browsers, it does actually seem to work regardless. But, in this function, we will only handle dots included in relative URLs.

For this, we may use preg_replace in a do while loop:

do {
  $url = preg_replace('/[^\/]+\/\.\.\//', '', $url, 1, $count);
} while ($count);
return $request_protocol . '://' . $_SERVER['HTTP_HOST'] . '/' . $url;

The pattern matches a normal segment that is followed by a double dot segment, i.e. something/../, and the loop removes one such pair per iteration until no more matches are found.

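If there are more dot segments than normal segments, the surplus is not resolved reliably: a single leftover "../" stays in the URL, while a leftover pair "../../" is removed, because the pattern treats the first ".." as if it were a normal segment.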
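An alternative to the regular expression is to split the path on "/" and walk through the segments one at a time. The sketch below uses this different technique; the helper name resolve_dot_segments is hypothetical. It assumes the input is a relative path without a query string or fragment, and it simply discards any ".." that has no parent left to step up to:

```php
<?php
// Hypothetical helper, not part of the original function: resolves "." and
// ".." by walking the path segments instead of using a regular expression.
function resolve_dot_segments(string $path): string {
  $resolved = [];
  foreach (explode('/', $path) as $segment) {
    if ($segment === '.') {
      continue; // "." refers to the current level, so it is skipped
    }
    if ($segment === '..') {
      array_pop($resolved); // step one level up; does nothing at the top
      continue;
    }
    $resolved[] = $segment;
  }
  return implode('/', $resolved);
}

echo resolve_dot_segments('blog/2020/../index.html'), "\n"; // blog/index.html
echo resolve_dot_segments('a/../../b'), "\n";               // b
echo resolve_dot_segments('../b'), "\n";                    // b
```

The result can then be appended to the protocol and host in the same way as in the final return statement of the function.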
