
Using Preg_match to Validate Input in PHP

How to properly validate input in PHP using preg_match, and how to avoid undefined index notices.


By Jacob

Edited: 2019-03-28 22:48

I have made it a mandatory process to validate user input, and for this the preg_match function is excellent. However, using it does require some knowledge of regular expressions.

If you do not know how to write regular expressions, I highly recommend you take the time to learn them, since they are very valuable when working with text strings.

Validating input is, however, extremely complex. Ideally, we should always aim to support all valid input, because we do not always know when and if our code is going to be used in a broader context. For example, when internationalizing an app, you may also need to support Chinese characters. E-mail addresses are almost impossible to validate fully with a regular expression, which is why you probably should not try.
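As a rough illustration of the internationalization point, the sketch below compares an ASCII-only pattern with one that uses the Unicode letter class \p{L} together with the u modifier. The function names is_ascii_name() and is_unicode_name() are made up for this example, and "one or more letters" is only one possible definition of a valid name.

```php
<?php
// Hypothetical helpers for illustration; neither is part of PHP itself.

// Accepts only the ASCII letters a-z and A-Z.
function is_ascii_name(string $input): bool
{
    return preg_match('/^[a-z]+$/i', $input) === 1;
}

// Accepts one or more letters from any script. \p{L} is the Unicode
// "letter" property, and the u modifier makes PCRE treat both the
// pattern and the input as UTF-8.
function is_unicode_name(string $input): bool
{
    return preg_match('/^\p{L}+$/u', $input) === 1;
}

var_dump(is_ascii_name('Jacob'));    // bool(true)
var_dump(is_ascii_name('张伟'));      // bool(false)
var_dump(is_unicode_name('张伟'));    // bool(true)
var_dump(is_unicode_name('Jacob1')); // bool(false)
```

Whether digits, spaces, hyphens or apostrophes should also be accepted depends on the application, so treat the pattern as a starting point only.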

In PHP we can use the preg_match() function to validate input provided by users. This will normally be data sent as GET or POST requests, available in $_GET and $_POST. But it could also be data found in request headers sent by the browser. Injection attacks in headers are probably less common, but it is still important to guard against them.
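Request headers can be checked in the same way as GET and POST data. PHP exposes them in $_SERVER with an HTTP_ prefix. The sketch below assumes a made-up X-Request-Id header and a made-up format for it (1 to 64 letters, digits or hyphens); read_request_id() is a hypothetical helper, not a PHP function.

```php
<?php
// Hypothetical helper: returns the request ID if it is present and
// well-formed, or null otherwise. The header name and the allowed
// format are assumptions made for this example.
function read_request_id(array $server): ?string
{
    $value = $server['HTTP_X_REQUEST_ID'] ?? '';

    // 1 to 64 letters, digits or hyphens; \z anchors at the very end.
    if (is_string($value) && preg_match('/^[a-z0-9-]{1,64}\z/i', $value) === 1) {
        return $value;
    }
    return null;
}

// In a real request you would pass $_SERVER. Plain arrays are used here
// so the example also runs from a terminal.
var_dump(read_request_id(['HTTP_X_REQUEST_ID' => 'abc-123']));          // string(7) "abc-123"
var_dump(read_request_id(['HTTP_X_REQUEST_ID' => "abc\r\nX-Evil: 1"])); // NULL
var_dump(read_request_id([]));                                          // NULL
```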

Before data is inserted in a database, it is extremely important to validate it. Simply making sure that the data is properly escaped is not enough, since invalid data in the wrong place might break your application.

preg_match() returns 1 when the input matches the pattern, 0 when it does not, and false if an error occurs. Since both 0 and false count as false in a condition, we can therefore use it directly in a simple if statement like below:

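$name = ''; // Default value, so $name is always defined

// The pattern and the input are two separate arguments to preg_match()
if (preg_match('|^[a-z]+$|i', $_GET['name'])) {
  $name = $_GET['name'];
}
echo "\n\n"; // A couple of line breaks if testing from a terminal

echo $name . "\n\n";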
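To make the return values of preg_match() concrete, here is a small sketch that can be run from a terminal. The helper matches_letters() is a hypothetical name used only in this example, and the third call deliberately uses a broken pattern to trigger the error case.

```php
<?php
// preg_match() returns the integer 1 when the pattern matches and
// the integer 0 when it does not.
var_dump(preg_match('|^[a-z]+$|i', 'Jacob'));   // int(1)
var_dump(preg_match('|^[a-z]+$|i', 'Jacob42')); // int(0)

// The closing delimiter is missing on purpose, so the pattern cannot be
// compiled. preg_match() raises a warning (silenced here with @) and
// returns false.
var_dump(@preg_match('|^[a-z]+$', 'Jacob'));    // bool(false)

// Hypothetical helper: true only on an actual match, so neither 0 nor
// false can be mistaken for success.
function matches_letters(string $input): bool
{
    return preg_match('|^[a-z]+$|i', $input) === 1;
}

var_dump(matches_letters('Jacob'));   // bool(true)
var_dump(matches_letters('Jacob42')); // bool(false)
```

Comparing against 1 with === is optional inside a plain if statement, but it keeps "no match" and "error" from being confused when the result is stored or returned.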

Basic regular expressions

How exactly to accomplish different validations of input simply takes practice. But I will explain some basic patterns to help the novice RegEx ninja on their way.

The pattern I used in the above example, |^[a-z]+$|i, only allows input that consists of one or more letters. The i modifier means case-insensitive. If we leave it out, we can still allow capital letters by including them directly in the character class, i.e.: |^[a-zA-Z]+$|
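The effect of the i modifier can be checked with a short script. The sketch below runs three variants of the pattern against a few sample inputs; the sample values and the array keys are made up for this example.

```php
<?php
$inputs  = ['jacob', 'Jacob', 'JACOB', 'jacob1'];
$results = [];

foreach ($inputs as $input) {
    $results[$input] = [
        // Case-insensitive via the i modifier
        'with_i'     => preg_match('|^[a-z]+$|i', $input),
        // Case-insensitive by listing both ranges
        'explicit'   => preg_match('|^[a-zA-Z]+$|', $input),
        // No modifier and lowercase range only
        'lower_only' => preg_match('|^[a-z]+$|', $input),
    ];
}

print_r($results);
```

The first two variants accept jacob, Jacob and JACOB alike, the lowercase-only variant accepts just jacob, and all three reject jacob1.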

The pipe (|) character is the delimiter that marks where the regular expression begins and ends; any modifiers, such as i, come after the closing delimiter. You could also have used a forward slash (/), and it would still have worked, i.e.: /^[a-zA-Z]+$/
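A quick sketch of how the choice of delimiter plays out in practice. The sample inputs are made up, and the hash sign (#) is included as a third commonly used delimiter.

```php
<?php
// The same pattern written with three different delimiters.
var_dump(preg_match('|^[a-zA-Z]+$|', 'Jacob')); // int(1)
var_dump(preg_match('/^[a-zA-Z]+$/', 'Jacob')); // int(1)
var_dump(preg_match('#^[a-zA-Z]+$#', 'Jacob')); // int(1)

// If the delimiter also occurs inside the pattern, it must be escaped...
var_dump(preg_match('/^\/[a-z]+$/', '/about')); // int(1)

// ...so picking a delimiter that is not used in the pattern is often
// easier to read.
var_dump(preg_match('#^/[a-z]+$#', '/about'));  // int(1)
```

One trade-off of the pipe is that regular expressions use the same character for alternation, so patterns that list alternatives are easier to write with / or # as the delimiter.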

The caret (^) and the dollar sign ($) mark the beginning and the end of the string you are validating. Including them is optional, and you can even opt to include just one of them, depending on what you are trying to accomplish. Without them, the pattern only has to match somewhere in the input, so a value like abc123 would pass. Usually, it is a good idea to include both, since you then effectively validate the entire string.
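The difference the anchors make is easiest to see side by side. The sketch below uses a made-up input, abc123, and also shows one detail worth knowing: by default $ matches just before a trailing newline as well, whereas \z only matches at the very end of the string.

```php
<?php
$input = 'abc123';

var_dump(preg_match('/[a-z]+/', $input));   // int(1): "abc" is found somewhere
var_dump(preg_match('/^[a-z]+/', $input));  // int(1): the input starts with letters
var_dump(preg_match('/[a-z]+$/', $input));  // int(0): the input does not end with letters
var_dump(preg_match('/^[a-z]+$/', $input)); // int(0): the input is not letters only

// $ tolerates a single trailing newline; \z does not.
var_dump(preg_match('/^[a-z]+$/', "abc\n"));  // int(1)
var_dump(preg_match('/^[a-z]+\z/', "abc\n")); // int(0)
```

If a trailing newline must be rejected, \z (or the D modifier, which makes $ match only at the very end) is the stricter choice.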

Dealing with undefined index notices

Validating input in this manner normally works fine. But if the name index does not exist in the $_GET array, it will trigger a notice (not an error). To avoid this, we can check for the existence of the $_GET['name'] array index first.

Simply wrapping the if statement in another if block will not work when we also have an else clause, because the else branch then belongs to the inner if and is skipped when $_GET['name'] is undefined, leaving $name unset. I.e.:

if (isset($_GET['name'])) { // Avoids PHP notice about undefined index

  if (preg_match('|^[a-z]+$|i', $_GET['name'])) {
    $name = $_GET['name'];
  } else {
    $name = ''; // Default value, will not be executed when $_GET['name'] is undefined
  }

}

Instead, we need to check both that $_GET['name'] is set and that it is valid in the same if statement. This is easily done with a pair of ampersands (&&, the logical and operator), as shown below:

if (isset($_GET['name']) && preg_match('|^[a-z]+$|i', $_GET['name'])) {
  $name = $_GET['name'];
} else {
  $name = ''; // Default value, also used when $_GET['name'] is undefined
}

PHP evaluates the left-hand side of && first, which in this case is the isset() call. preg_match() is only called if the isset() check passes, so the undefined index is never read.
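The combined check can also be wrapped in a small function so that it is easy to reuse and to try out from a terminal. This is only a sketch: read_name() is a hypothetical helper, and the extra is_string() test is my own addition, there because PHP turns a query string such as ?name[]=x into an array rather than a string.

```php
<?php
// Hypothetical helper: returns the validated name, or '' as the default.
function read_name(array $query): string
{
    // isset() runs first. When the key is missing, the remaining checks
    // are skipped, so the undefined index is never read.
    if (isset($query['name'])
        && is_string($query['name'])
        && preg_match('|^[a-z]+$|i', $query['name']) === 1) {
        return $query['name'];
    }
    return '';
}

// In a real request you would call read_name($_GET).
var_dump(read_name(['name' => 'Jacob']));   // string(5) "Jacob"
var_dump(read_name(['name' => 'Jacob42'])); // string(0) ""
var_dump(read_name(['name' => ['x']]));     // string(0) ""
var_dump(read_name([]));                    // string(0) ""
```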

It should be mentioned that a PHP notice is not the same as an error, and we should therefore be able to safely ignore it in most cases. Note that PHP 8.0 and later report the same situation as a warning ("Undefined array key") rather than a notice.

Some PHP developers consider it good practice to account even for notices, because there might be situations where you did not expect a variable to be empty, and in such cases a script that otherwise runs without notices is easier to debug.

A PHP Notice can look like this:

PHP Notice: Undefined index: name in /var/www/index.php on line 100
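On PHP 7.0 and later, the null coalescing operator (??) is another way to avoid this notice. The sketch below is one possible variant, not the only way to do it; read_name_or_default() and its parameters are hypothetical names chosen for this example.

```php
<?php
// Hypothetical helper; pass $_GET as $query in a real request.
function read_name_or_default(array $query, string $default = ''): string
{
    // ?? yields '' when the key is missing or null, without raising
    // an undefined index notice.
    $raw = $query['name'] ?? '';

    // is_string() guards against array input such as ?name[]=x.
    if (is_string($raw) && preg_match('|^[a-z]+$|i', $raw) === 1) {
        return $raw;
    }
    return $default;
}

var_dump(read_name_or_default([]));                            // string(0) ""
var_dump(read_name_or_default(['name' => 'Jacob']));           // string(5) "Jacob"
var_dump(read_name_or_default(['name' => 'no spaces'], 'n/a')); // string(3) "n/a"
```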
