Validating user input to prevent injection attacks in PHP

Validating user input is one of the most important ways to increase security and prevent hacks. In this article, I will show how to easily validate GET and POST parameters, and how to prevent your app from being abused to send spam.


By Jacob

Edited: 2021-02-13 16:17


Ideally, you should validate user input both on the front-end, using HTML and JavaScript, and on the back-end, in your PHP code. Front-end validation is easy to bypass, so the PHP validation is the part that must never be skipped; doing it only in PHP is also fine.

There are a few ways you can validate input in PHP. The best way is probably to use the built-in filter functions for user input, but we can also use regular expressions (RegEx); however, using a RegEx is only recommended if you know what you are doing. If you do use a RegEx to validate user input, then you may also need to account for language-specific characters.
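As a rough sketch of both approaches, the example below validates a number with the built-in FILTER_VALIDATE_INT filter, and a name with a Unicode-aware RegEx. The function names is_valid_age and is_valid_name, the range of 1 to 150, and the set of allowed punctuation are assumptions made for illustration; adjust them to what your own system considers valid.

```php
<?php
// Hypothetical helper: accept only whole numbers from 1 to 150.
function is_valid_age($value): bool {
    $options = ['options' => ['min_range' => 1, 'max_range' => 150]];
    return false !== filter_var($value, FILTER_VALIDATE_INT, $options);
}

// Hypothetical helper: \p{L} matches a letter in any language, and the
// "u" modifier makes the pattern treat the input as UTF-8.
// \z anchors at the very end of the string, so a trailing newline is rejected.
function is_valid_name(string $value): bool {
    return 1 === preg_match('/^[\p{L}][\p{L} \'.-]*\z/u', $value);
}
```

Note the strict comparison against false in the first function: filter_var returns the filtered value on success, so a loose comparison could misjudge values such as 0.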

Validating e-mail addresses is not an easy thing to do, and often the only way to know if an e-mail address is valid and active is to try to send an e-mail to it. Because malicious bots and users submit web forms, validating e-mail addresses before using them can help prevent wasting server resources on processing invalid user input.

You can use PHP's built-in filter_var function to validate an e-mail address. Note that it can be a good idea to validate user input as early in your application as possible, preferably before loading the application itself, as doing so will save resources:

$email = 'user@example.com'; // Placeholder address; normally this comes from user input
if (false === filter_var($email, FILTER_VALIDATE_EMAIL)) {
    http_response_code(400);
    echo '<p>The email address was not valid.</p>';
    exit();
}

According to php.net, this generally validates the e-mail address against the addr-spec syntax in RFC 822, with the exceptions that comments, whitespace folding, and dotless domain names, such as "localhost", are not supported.
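To illustrate this behaviour, here is a small sketch that runs a few sample addresses through filter_var. The helper name is_valid_email and the sample addresses are assumptions for illustration only.

```php
<?php
// Hypothetical helper that wraps filter_var, so the check reads as a yes/no question.
function is_valid_email(string $email): bool {
    return false !== filter_var($email, FILTER_VALIDATE_EMAIL);
}

// One ordinary address, one with a dotless domain, and one with no "@" at all.
$samples = ['user@example.com', 'user@localhost', 'not-an-email'];
foreach ($samples as $sample) {
    echo $sample, ': ', is_valid_email($sample) ? 'valid' : 'invalid', PHP_EOL;
}
```

Only the first address should be reported as valid; the dotless domain is rejected even though such an address may work on a local network.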

Other ways to validate e-mail

The simplest type of e-mail validation just checks whether the at (@) character exists in the string. This might prevent bogus data from getting submitted, but it will not prevent someone from abusing your subscription form to send unsolicited e-mails.

You can check if a string contains "@" like this:

$email = 'user@example.com'; // Placeholder address; normally this comes from user input
if (false === strpos($email, '@')) {
    http_response_code(400);
    echo '<p>The email address was not valid.</p>';
    exit();
}

A more sophisticated validation would employ a regular expression; in PHP this can be done with the preg_match function. It is important to understand that, if you are using a RegEx, the goal is probably not to allow every conceivable valid e-mail address, but rather to only allow those that you consider valid in your system. Here is a simple example:

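// preg_match returns 1 on a match, 0 on no match, and false on error,
// so we reject everything that is not exactly 1.
if (1 !== preg_match('/^[a-z0-9_.-]+@[a-z0-9_.-]+\.[a-z_.-]+$/i', $email)) {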
    http_response_code(400);
    echo '<p>The email address was not valid.</p>';
    exit();
}

Validating form input

In PHP you will usually access data from submitted forms through the $_POST superglobal, although this is not always advisable. For example, in an object-oriented context it may make sense to have a superglobals class; read: Direct Access to Superglobals.

Regardless of how you access the form data, you should remember to carefully validate it before using it. This applies in almost any context, since you do not always know when someone will take that data and use it somewhere it can cause damage.

To avoid notices or warnings about undefined indexes, you should first check that a variable is defined before you attempt to validate its contents. This can be done efficiently with PHP's isset function:

if (false === isset($_POST['email'])) {
    http_response_code(400);
    echo '<p>Missing required parameter.</p>';
    exit();
}
if (false === filter_var($_POST['email'], FILTER_VALIDATE_EMAIL)) {
    http_response_code(400);
    echo '<p>The email address was not valid.</p>';
    exit();
}
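The two checks can also be wrapped in one function that returns a result instead of ending the request, which makes the logic easier to reuse and test. This is only a sketch; the function name get_valid_email and the choice to return null on failure are assumptions, not something PHP provides.

```php
<?php
// Hypothetical helper: returns the validated e-mail address, or null when
// the parameter is missing, is not a string, or fails validation.
function get_valid_email(array $data, string $key): ?string {
    if (!isset($data[$key]) || !is_string($data[$key])) {
        return null;
    }
    $email = filter_var($data[$key], FILTER_VALIDATE_EMAIL);
    return (false === $email) ? null : $email;
}

// Usage: pass $_POST in a real request; a plain array is used here.
$email = get_valid_email(['email' => 'user@example.com'], 'email');
if (null === $email) {
    echo 'The email address was missing or not valid.', PHP_EOL;
}
```

The is_string check matters because a request can send a parameter as an array, for example email[]=x, which isset alone does not catch.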

In a PHP application, it is even easier if you predefine a list of parameters that you allow and expect to be used, and then simply run over this list with a function or method. The below example works for both $_POST and $_GET.

// First define allowed parameters (GET and POST)
// Required is indicated by true | false
$allowed_post_parameters = ['name' => true, 'email' => true];
$allowed_get_parameters = ['form_id' => false];

// Note that it is possible to combine GET and POST parameters,
// and sometimes it makes sense to mix them, so if needed, we can test both
if ('POST' === $_SERVER['REQUEST_METHOD']) {
  validate_parameters($_POST, $allowed_post_parameters);
}
// GET parameters can be present for all request types.
validate_parameters($_GET, $allowed_get_parameters);


/**
 *
 *  Function to check that used POST and GET parameters,
 *  are allowed, and to ensure that required parameters
 *  are included in the request before they are used.
 *
 */
function validate_parameters(array $used_parameters, array $allowed_parameters) {

  // Check that the used parameters are allowed
  foreach ($used_parameters as $key => $parm) {
    if (false === isset($allowed_parameters["$key"])) {
      http_response_code(400);
      echo '<p>Invalid request. Please do not fool around!</p>';
      exit();
    }
  }

  // Check that required parameters are used (defined)
  foreach ($allowed_parameters as $key => $required) {
    if ((true === $required)  &&  (false === isset($used_parameters["$key"]))) {
      http_response_code(400);
      echo '<p>Missing required parameter: '.$key.'</p>';
      exit();
    }
  }

}
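The same idea can be taken one step further by also storing how each parameter should be validated. The sketch below is one possible variation, not part of the function above: the names $post_rules and check_parameters, the "age" parameter, and the decision to return a list of error messages rather than exit are all assumptions made for illustration.

```php
<?php
// Hypothetical rule list: parameter name => [required, filter constant].
$post_rules = [
    'name'  => [true, FILTER_DEFAULT],
    'email' => [true, FILTER_VALIDATE_EMAIL],
    'age'   => [false, FILTER_VALIDATE_INT],
];

// Returns a list of error messages; an empty list means the input passed.
function check_parameters(array $input, array $rules): array {
    $errors = [];

    // Reject parameters that are not on the list.
    foreach (array_keys($input) as $key) {
        if (!isset($rules[$key])) {
            $errors[] = 'Unexpected parameter: ' . $key;
        }
    }

    // Check required parameters, and run each present value through its filter.
    foreach ($rules as $key => [$required, $filter]) {
        if (!isset($input[$key])) {
            if ($required) {
                $errors[] = 'Missing required parameter: ' . $key;
            }
            continue;
        }
        if (false === filter_var($input[$key], $filter)) {
            $errors[] = 'Invalid value for parameter: ' . $key;
        }
    }

    return $errors;
}
```

A caller could respond with a 400 status code when the returned list is not empty. Since unexpected parameter names come from the request, they should be escaped, for example with htmlspecialchars, before being printed in HTML.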

When to validate e-mail addresses

The only reliable way to validate an e-mail address is to send an e-mail to it and wait for the user to verify it. Unfortunately, receiving e-mail servers tend to block servers that send them "too much" e-mail, so doing this for every submitted address is not without risk.

If you require e-mail validation as part of a user-creation process, then you risk having your SMTP server blocked by other e-mail servers, simply due to spam user-creations. The same problem happens with e-mail newsletters and public subscription forms. Applying some e-mail validation before trying to send an e-mail will help to mitigate the risk of getting blocked by other servers, but it is not the only step you need to take.

In addition to validating e-mail addresses, you also need to keep a record of the e-mail addresses that you have already contacted. If a user needs to verify their subscription or their user account creation, then you should refuse to send another e-mail to the same e-mail address until the user has confirmed that they requested your messages.
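A minimal sketch of such a record could look like the following. The function name may_send_confirmation, the in-memory array standing in for a database table, and the lower-casing of addresses to treat different spellings as the same address are all assumptions for illustration.

```php
<?php
// Hypothetical in-memory record; a real application would use a database table.
// Keys are lower-cased e-mail addresses, values become true once confirmed.
$contacted = [];

// Returns true when a confirmation e-mail may be sent to the address.
function may_send_confirmation(array &$contacted, string $email): bool {
    if (false === filter_var($email, FILTER_VALIDATE_EMAIL)) {
        return false; // Never contact addresses that fail validation.
    }
    $key = strtolower($email);
    if (array_key_exists($key, $contacted)) {
        return false; // Already contacted; do not send a second request.
    }
    $contacted[$key] = false; // Remember the address as contacted but unconfirmed.
    return true;
}

// Usage: only the first request for an address triggers an e-mail.
if (may_send_confirmation($contacted, 'user@example.com')) {
    echo 'Sending confirmation e-mail.', PHP_EOL;
}
```

In this sketch, the record is updated in the same call that answers the question, so two quick submissions of the same address cannot both pass the check within one request. With a real database, a unique index on the address column could serve the same purpose.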

Still, this is not enough to prevent your SMTP server from getting blocked. Ideally, you should not send e-mails to unconfirmed users at all. Of course, that is not realistic at this point, so the problem is more likely to be with the companies that operate the servers that receive our e-mail. It is bad practice to block e-mails that are valid, but at the same time, there is just too much spam on the internet. It is very difficult to find a balance between allowing e-mail to go through and trying to prevent unsolicited e-mail.

This is why it is also a good idea to validate e-mail using JavaScript. At least traditionally, bots would not run JavaScript, so it was an effective way to prevent spam subscriptions, and it still is in some cases. Nowadays you would probably require SMS verification rather than e-mail verification, but this can sadly be too expensive.

With user accounts, the question is if you need to validate them at all, because an e-mail address is no guarantee that a user is legit.

Links

  1. filter_var - php.net

Tell us what you think:

Oliver

Very insightful and easy understandable post on better input validaton, thanks a lot! I also just started using filter_var instead of traditional and more complicated methods (like you exampled in Other Methods) :)

Got some fresh ideas from your post, like using an array for validation for the expected params.
