Creating a MySQL Database With UTF8

How to create a MySQL Database with UTF-8 encoding.

By. Jacob

Edited: 2020-07-26 17:26

Working with the character set (Unicode or Latin-1) and encoding (UTF-8 or ISO-8859-1) of your app can be rather problematic. One way you can run into problems is when Windows users edit files in Notepad, since it may change the encoding of the file and cause "invisible" problems that are hard to solve without specialized knowledge.

I have personally had .htaccess files that stopped working because Notepad changed the encoding of my files. If you are on Windows, it can be a good idea to use another editor, such as Visual Studio Code, Eclipse or Notepad++.

Making sure your website, and the browsers engaging with it, use the character encoding you intend all the time can be difficult. For those interested, I have included more information about this later in the article.

For those who know the details:

CREATE DATABASE name_of_new_database CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
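The statement above uses utf8mb4 rather than plain utf8. In MySQL, utf8 is an alias for utf8mb3, which stores at most three bytes per character, so characters that need four bytes in UTF-8 (emoji, for example) do not fit. The following is a minimal sketch in PHP that counts the bytes of a few sample characters; the sample characters are my own choice for illustration, not something from the original article.

```php
<?php
// Sample characters, chosen for illustration: ASCII, Danish, euro sign, emoji.
$samples = ['a', "\u{00E6}", "\u{20AC}", "\u{1F600}"];

// strlen() counts bytes, not characters, so it reveals the UTF-8 width.
$byteCounts = array_map('strlen', $samples);

foreach ($samples as $i => $char) {
    printf("%s uses %d byte(s) in UTF-8\n", $char, $byteCounts[$i]);
}
// The emoji needs 4 bytes, which utf8mb3 cannot store but utf8mb4 can.
```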

Why using UTF-8 is recommended

UTF-8 is recommended for the web because it supports all Unicode characters, and thereby allows you to use language-specific characters without having to use HTML entity replacements. It also avoids problems with user input in HTML forms, as well as browsers' different ways of handling characters that fall outside of whatever "limited" character encoding you might otherwise use (for example Latin-1, ISO-8859-1).

If you do use UTF-8 and still have problems with characters not displaying correctly, keep in mind that if there is a mismatch between the encoding of the data in your database, database connection or static file and the encoding specified in the HTTP Content-Type header, certain letters will be replaced with gibberish characters (also known as mojibake). Some people do not understand this problem, and so they simply recommend using HTML entities like:

  1. æ = &aelig;
  2. ø = &oslash;
  3. å = &aring;

This does not solve the problem! You still have a mismatch, and that mismatch is what should be fixed.
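To make the mismatch concrete, here is a small sketch that reproduces the gibberish on purpose. It takes the three letters from the list above as UTF-8 and then reads the same bytes as if they were ISO-8859-1, which is roughly what a browser does when the declared encoding is wrong. It assumes the mbstring extension is available; the variable names are only for illustration.

```php
<?php
// The three letters from the list above, as UTF-8 bytes: æøå
$utf8 = "\u{00E6}\u{00F8}\u{00E5}";

// Simulate the mismatch: treat every UTF-8 byte as one ISO-8859-1
// character, then re-encode the result as UTF-8 so it can be printed.
$mojibake = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
echo $mojibake, PHP_EOL; // Ã¦Ã¸Ã¥

// Reversing the wrong step restores the original text, which shows
// that the data was never damaged; only the interpretation was wrong.
$repaired = mb_convert_encoding($mojibake, 'ISO-8859-1', 'UTF-8');
echo $repaired, PHP_EOL; // æøå
```

Each letter turns into two characters because æ, ø and å each take two bytes in UTF-8, and ISO-8859-1 reads every byte as a separate character.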

It is not always as simple as creating a database with UTF-8. If you have a form in an HTML page, you need to make sure the page is delivered with the right character set in the HTTP Content-Type header; doing this will make the browser submit the form as UTF-8. For HTML content, the header will look like this:

Content-Type: text/html; charset=utf-8

With PHP you can set the Content-Type using the header function like this:

header('Content-Type: text/html; charset=utf-8');

Note: It might be easier to configure your server to use UTF-8 as its default character encoding.
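As a rough sketch of that note: PHP has a default_charset setting (UTF-8 by default since PHP 5.6) that it appends to the Content-Type header it sends on your behalf. Setting it in php.ini covers every script on the server; the snippet below sets it for a single script with ini_set instead, which is just one possible place to put it.

```php
<?php
// Set the charset PHP appends to its default Content-Type header.
// This must run before any output is sent. The php.ini equivalent is:
//   default_charset = "UTF-8"
ini_set('default_charset', 'UTF-8');

// Read the setting back to confirm what PHP will advertise.
$charset = ini_get('default_charset');
echo "Default charset: {$charset}", PHP_EOL;
```

If the pages are static files served by Apache, the AddDefaultCharset directive plays a similar role.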

Alternatively, you can use the http-equiv meta tag in your HTML file:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

This is not recommended, since it might encourage a lazy author to not configure things correctly. It is also ignored if the HTTP header declares a charset, so it cannot be used to override the header.

Finally, you should also make sure your database connection uses UTF-8. In PHP, this may be done like this:

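// $mysqli is assumed to be an existing, connected mysqli object.
if (!$mysqli->set_charset("utf8mb4")) {
    // Report why the character set could not be changed, then stop.
    echo 'Failed to set character set: ' . $mysqli->error;
    exit();
}
echo 'Successfully set character set';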
