PHP: HEAD Request (cURL)

How to send an HTTP HEAD request using cURL in PHP.

By Jacob

Edited: 2021-04-12 08:42

When a server receives a HEAD request, it should return only the response headers of the given resource. If content (a response body) is returned anyway, clients should ignore it.

Clients may send an HTTP HEAD request to check whether a resource has been updated, by comparing the response headers with the timestamp of a cached copy. If the cached copy is outdated, it is typically invalidated, and a fresh GET request for the resource is performed.
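The comparison described above can be sketched as follows. This is a minimal illustration rather than a complete caching client: the function name cache_is_outdated is hypothetical, and it assumes the response headers are already available as an array keyed by lowercase header name.

```php
<?php
/**
 * Decide whether a cached copy is outdated, based on the Last-Modified
 * header from a HEAD response. Hypothetical helper for illustration.
 *
 * @param array $headers   Response headers, keyed by lowercase header name
 * @param int   $cached_at Unix timestamp of when the copy was cached
 */
function cache_is_outdated(array $headers, int $cached_at): bool
{
    // Without a Last-Modified header we cannot tell; treat the copy as outdated
    if (!isset($headers['last-modified'])) {
        return true;
    }

    // HTTP dates look like "Fri, 22 Jan 2021 12:41:43 GMT"
    $modified_at = strtotime($headers['last-modified']);
    if ($modified_at === false) {
        return true;
    }

    // The copy is outdated if the resource changed after it was cached
    return $modified_at > $cached_at;
}

// Sample data: resource modified on 22 Jan 2021, copy cached on 23 Jan 2021
$headers = ['last-modified' => 'Fri, 22 Jan 2021 12:41:43 GMT'];
$cached_at = gmmktime(0, 0, 0, 1, 23, 2021);

var_dump(cache_is_outdated($headers, $cached_at)); // bool(false)
```

Treating a missing or unparsable Last-Modified header as "outdated" is a cautious default chosen for this sketch; a real client may prefer other rules.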

Servers should exclude the body when responding to a HEAD request. While this is mostly useful for caching mechanisms, it is also useful to developers who want to test request and response headers in their applications.

In PHP, you can send a HEAD request through the cURL extension by setting the CURLOPT_NOBODY option to true. To have the response returned to you instead of printed, the CURLOPT_RETURNTRANSFER option should also be set:

$url = 'https://beamtic.com/Examples/ip.php';
$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_HEADER, true);

$response = curl_exec($ch);
if (curl_errno($ch)) {
    echo 'Error:' . curl_error($ch);
    exit();
}
curl_close($ch);

// Output the response:
echo $response;

Note: The CURLOPT_HEADER option includes the response headers in the string returned by curl_exec. Without it, the returned string will be empty, since a HEAD response has no body.

Verbose information

The cURL library has an option to output verbose information about the request, which is very useful while debugging. Enable it by setting the CURLOPT_VERBOSE option to true; the output is written to STDERR unless another stream is given with CURLOPT_STDERR:

curl_setopt($ch, CURLOPT_VERBOSE, true);

This lets you confirm the request method, and it also exposes other useful details, such as information about the TLS handshake and the server certificate.
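When the script runs behind a web server, output on STDERR is easy to miss. One possible approach, sketched below, is to point CURLOPT_STDERR at a temporary stream and read the log back afterwards. The function name head_with_verbose_log is hypothetical, and the example assumes the cURL extension is loaded.

```php
<?php
/**
 * Send a HEAD request and capture cURL's verbose log in memory.
 * Hypothetical helper; returns [response headers, verbose log].
 */
function head_with_verbose_log(string $url): array
{
    // Temporary stream that will receive the verbose output
    $log_stream = fopen('php://temp', 'w+');

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_NOBODY, true);
    curl_setopt($ch, CURLOPT_HEADER, true);
    curl_setopt($ch, CURLOPT_VERBOSE, true);
    // Redirect the verbose output from STDERR to our stream
    curl_setopt($ch, CURLOPT_STDERR, $log_stream);

    $response = curl_exec($ch);
    $error = ($response === false) ? curl_error($ch) : null;
    curl_close($ch);

    // Read back everything cURL wrote to the stream
    rewind($log_stream);
    $log = stream_get_contents($log_stream);
    fclose($log_stream);

    if ($error !== null) {
        throw new RuntimeException('cURL error: ' . $error);
    }

    return [$response, $log];
}
```

It could then be called as [$headers, $log] = head_with_verbose_log($url); with $url set as in the first example, after which $log can be printed or written to a log file.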

How HEAD requests are used

A HEAD request is similar to a GET, but it specifically means that the server should return only the HTTP response headers of the requested resource, without the response body. Sometimes a body is included anyway due to errors or carelessness, but clients will generally ignore it.

The main advantage of sending a HEAD request is that a client can compare the caching headers before deciding whether the full resource should be requested.

Servers may inform clients about the request methods supported for a given resource in the Allow header, with the methods separated by commas:

allow: GET, HEAD

According to RFC 7231, section 7.4.1, if a client uses a request method that the resource does not support, the server should respond with a 405 Method Not Allowed status, and it must include the Allow header to show the supported methods.

As you may have noticed, not all servers or web applications use the Allow header properly.
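On the server side, the rule about 405 responses could be handled along these lines. This is only a sketch: the function name check_method and the returned array layout are assumptions made for illustration, not part of any library.

```php
<?php
/**
 * Build the status code and Allow header for a request, given the
 * methods a resource supports. Hypothetical helper for illustration.
 */
function check_method(string $method, array $allowed): array
{
    $allow = implode(', ', $allowed);

    // Method names are case-sensitive, so compare them strictly
    if (!in_array($method, $allowed, true)) {
        // Unsupported method: 405 plus the mandatory Allow header
        return ['status' => 405, 'headers' => ['Allow' => $allow]];
    }

    return ['status' => 200, 'headers' => ['Allow' => $allow]];
}

$result = check_method('POST', ['GET', 'HEAD']);
echo $result['status'] . ' ' . $result['headers']['Allow'] . "\n"; // 405 GET, HEAD
```

In a real script, the result would typically be sent with http_response_code() and header(), using $_SERVER['REQUEST_METHOD'] as the method.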

The format of an HTTP response

In HTTP/1.1, response headers are sent before the response body, and look like this in plain text:

HTTP/1.1 200 OK
Date: Fri, 22 Jan 2021 12:41:43 GMT
Server: Apache
Upgrade: h2
Connection: Upgrade, Keep-Alive
Vary: Accept-Encoding
Keep-Alive: timeout=5, max=100
Transfer-Encoding: chunked
Content-Type: text/plain; charset=utf-8

<p>Hallo World</p>

The response headers and the response body are separated by two consecutive CRLF sequences (a carriage return followed by a line feed character).

In PHP, CRLF is written as \r\n inside a double-quoted string; you can output a CRLF using the following:

echo "\r\n";

Each header is terminated by a single CRLF, while the header and body parts of the response are separated by CRLFCRLF (the equivalent of \r\n\r\n in PHP).
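To make the separators concrete, here is a small sketch that splits a raw response string by hand. The response text is sample data modelled on the example above, and the variable names are arbitrary.

```php
<?php
// A raw HTTP/1.1 response as it would appear on the wire (sample data)
$raw = "HTTP/1.1 200 OK\r\n"
     . "Server: Apache\r\n"
     . "Content-Type: text/plain; charset=utf-8\r\n"
     . "\r\n"
     . "<p>Hallo World</p>";

// Split at the first CRLFCRLF; the limit of 2 keeps a body that
// itself contains \r\n\r\n in one piece
$parts = explode("\r\n\r\n", $raw, 2);
$head = $parts[0];
$body = $parts[1] ?? '';

// Within the header part, each line ends with a single CRLF
$lines = explode("\r\n", $head);

echo count($lines) . "\n"; // 3
echo $body . "\n";         // <p>Hallo World</p>
```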

The idea is to allow HTTP clients to check caching headers such as Last-Modified and ETag before deciding whether a client-side cache should be invalidated. If a client determines that its cached copy is outdated, the resource is typically re-downloaded with a fresh GET request. The main benefit of supporting HEAD is that clients avoid re-downloading resources that have not changed, while servers avoid wasting resources on generating and sending content that a client has already downloaded.
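For the ETag case, the check a client performs might look like the sketch below. The function name etag_matches and the sample ETag values are made up for illustration. The comparison is the weak form, meaning an optional W/ prefix is ignored on both sides.

```php
<?php
/**
 * Compare the ETag from a HEAD response with the ETag stored alongside
 * a cached copy. Hypothetical helper; uses weak comparison, so an
 * optional "W/" prefix is ignored on both sides.
 */
function etag_matches(string $response_etag, string $cached_etag): bool
{
    $strip_weak = static function (string $etag): string {
        $etag = trim($etag);
        return (strncmp($etag, 'W/', 2) === 0) ? substr($etag, 2) : $etag;
    };

    return $strip_weak($response_etag) === $strip_weak($cached_etag);
}

// Sample values; real ETags are opaque strings chosen by the server
var_dump(etag_matches('W/"5e1b-abc"', '"5e1b-abc"')); // bool(true)
var_dump(etag_matches('"5e1b-abc"', '"77ff-def"'));   // bool(false)
```

If the ETags match, the cached copy can be kept; otherwise the client would issue a fresh GET request.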

Far from all web resources support HEAD requests. In fact, if you are using a server-side scripting language such as PHP, you will have to add support for caching on dynamically generated pages manually; a good CMS will already support client-side caching without users having to do anything.
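As a rough idea of what adding such support manually involves, the sketch below builds a cache-aware response for a dynamically generated page. The function name build_response and its return layout are assumptions for illustration. Deriving the ETag from an MD5 hash of the content is one simple choice among many.

```php
<?php
/**
 * Build a cache-aware response for dynamically generated content.
 * Hypothetical helper: returns [status, headers, body] instead of
 * sending anything, so the logic is easy to follow and test.
 */
function build_response(string $method, string $content, ?string $if_none_match): array
{
    // An ETag derived from the content itself
    $etag = '"' . md5($content) . '"';
    $headers = ['ETag' => $etag, 'Content-Type' => 'text/html; charset=utf-8'];

    // The client's cached copy is still current: no body needed
    if ($if_none_match !== null && trim($if_none_match) === $etag) {
        return [304, $headers, ''];
    }

    // HEAD gets the same headers as GET, but never a body
    $body = ($method === 'HEAD') ? '' : $content;

    return [200, $headers, $body];
}

[$status, $headers, $body] = build_response('HEAD', '<p>Hallo World</p>', null);
echo $status . ' ' . strlen($body) . "\n"; // 200 0
```

In a real script, the method and the client's ETag would come from $_SERVER['REQUEST_METHOD'] and $_SERVER['HTTP_IF_NONE_MATCH'], and the result would be sent with http_response_code(), header(), and echo.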

Parsing Response headers

To work with the response headers easily, it can be helpful to place them in an associative array. However, since cURL returns the headers as part of a single response string rather than as an array, you will first need to cut them out of the response.

1. Obtain the headers after performing a request (this must be done before calling curl_close, since curl_getinfo needs the open handle):

$header_size = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
$headers = substr($response, 0, $header_size);
$body = substr($response, $header_size);

Now you have the response headers and the response body stored in separate variables.

2. Now you can iterate over the lines in the $headers variable, creating an array in the process. Using the strtok function is probably the fastest way to do it:

// Define the $response_headers array for later use
$response_headers = [];

// Get the first line (the status line, e.g. "HTTP/1.1 200 OK")
$line = strtok($headers, "\r\n");
$status_code = trim($line);

// Parse the remaining lines, saving them into the array
while (($line = strtok("\r\n")) !== false) {
    $parts = explode(':', $line, 2);
    // Skip lines without a colon; explode never returns false here
    if (count($parts) === 2) {
        // Header names are case-insensitive, so store them in lowercase
        $response_headers[strtolower(trim($parts[0]))] = trim($parts[1]);
    }
}

3. Since the headers are now stored in an associative array that uses the header names as keys, you can use isset to check whether a given header was returned:

if (isset($response_headers['allow'])) {
  echo '<p>The <b>allow</b> header was present, here are its contents:</p>';
  var_dump($response_headers['allow']);
  exit();
}
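The steps above can also be wrapped in a single reusable function. The following is a sketch under a few assumptions: the name parse_response_headers is hypothetical, header names are lowercased, and repeated headers are joined with a comma, which is not appropriate for every header (Set-Cookie being the usual exception).

```php
<?php
/**
 * Parse a raw header block into [status line, headers]. Hypothetical
 * helper. Header names are lowercased; repeated headers are joined
 * with ", ".
 */
function parse_response_headers(string $raw): array
{
    $lines = explode("\r\n", trim($raw));
    $status_line = array_shift($lines);
    $headers = [];

    foreach ($lines as $line) {
        $parts = explode(':', $line, 2);
        if (count($parts) !== 2) {
            continue; // Not a "Name: value" line
        }
        $name = strtolower(trim($parts[0]));
        $value = trim($parts[1]);
        $headers[$name] = isset($headers[$name])
            ? $headers[$name] . ', ' . $value
            : $value;
    }

    return [$status_line, $headers];
}

// Sample header block, as cut out of a response with CURLINFO_HEADER_SIZE
$raw = "HTTP/1.1 200 OK\r\nAllow: GET, HEAD\r\nVary: Accept-Encoding\r\n\r\n";
[$status_line, $headers] = parse_response_headers($raw);

echo $status_line . "\n";      // HTTP/1.1 200 OK
echo $headers['allow'] . "\n"; // GET, HEAD
```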

For more information about this subject, you may want to read: Parsing HTTP Response Headers

Links

  1. List of HTTP Status Codes

Sources

  1. RFC 7231, section 7.4.1 - ietf.org

Tools:

You can use the following API endpoints for testing purposes:

https://beamtic.com/api/user-agent
https://beamtic.com/api/request-headers
