Creating a MySQL Database With UTF8
How to create a MySQL Database with UTF-8 encoding.
By. Jacob
Edited: 2020-07-26 17:26
Working with the character set (Unicode or Latin-1) and encoding (UTF-8 or ISO-8859-1) of your app can be rather problematic. One common source of trouble is Windows users editing files in Notepad, which may change a file's encoding and cause "invisible" problems that are hard to solve without specialized knowledge.
I have personally had .htaccess files stop working because Notepad changed the encoding of my files. If you are on Windows, it can be a good idea to use another editor, such as Visual Studio Code, Eclipse or Notepad++.
Making sure that your website, and the browsers talking to it, always use the character encoding you intend can be difficult. For those interested, I have included more information about this later in the article.
For those who know the details:
CREATE DATABASE name_of_new_database CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
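The statement above asks for utf8mb4 rather than MySQL's older utf8 character set. In MySQL, utf8 is an alias for utf8mb3, which stores at most three bytes per character, while some Unicode characters (emoji, for example) take four bytes in UTF-8. The PHP sketch below is an added illustration, not part of the original setup; it simply counts the UTF-8 bytes of a few sample characters, and the helper name utf8ByteLength is made up for this example.

```php
<?php
// Hypothetical helper: number of bytes a string occupies as UTF-8.
// strlen() counts bytes, and "\u{...}" escapes always yield UTF-8 (PHP 7+).
function utf8ByteLength(string $s): int
{
    return strlen($s);
}

$samples = [
    'a'     => "a",          // ASCII letter
    'ae'    => "\u{00E6}",   // æ
    'euro'  => "\u{20AC}",   // €
    'emoji' => "\u{1F600}",  // 😀
];

foreach ($samples as $name => $char) {
    echo $name, ': ', utf8ByteLength($char), " byte(s)\n";
}
```

The last sample is the kind of character a three-byte character set cannot store, which is the usual argument for picking utf8mb4 from the start.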
Why using UTF-8 is recommended
UTF-8 is recommended for the web because it supports all Unicode characters, and thereby allows you to use language-specific characters without resorting to HTML entities. It also avoids problems with user input from HTML forms, as well as browsers' differing ways of handling characters that fall outside whatever "limited" character encoding you might otherwise use (for example Latin-1, also known as ISO-8859-1).
If you do use UTF-8 and still have problems with characters not displaying correctly, keep in mind that a mismatch between the encoding of the data in your database, the database connection, or a static file and the encoding specified in the HTTP Content-Type header will cause certain letters to be replaced with gibberish characters (also known as mojibake). Some people do not understand this problem, and so they simply recommend using HTML entities like:
- æ = &aelig;
- ø = &oslash;
- å = &aring;
This does not solve the problem! You still have a mismatch, and the mismatch is what should be fixed.
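To see what such a mismatch does to the bytes, it can help to reproduce it on purpose. The PHP sketch below is an added illustration rather than code from the original setup; it assumes the mbstring extension is available, and the function name simulateMojibake is hypothetical. It takes UTF-8 bytes, treats them as ISO-8859-1, and converts them to UTF-8 once more, which is roughly what happens when one layer assumes the wrong encoding.

```php
<?php
// Hypothetical helper: treat UTF-8 bytes as if they were ISO-8859-1 and
// convert them to UTF-8 a second time. Requires the mbstring extension.
function simulateMojibake(string $utf8): string
{
    return mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
}

$original = "\u{00E6}\u{00F8}\u{00E5}"; // æøå as UTF-8 bytes
$garbled  = simulateMojibake($original);

echo bin2hex($original), "\n"; // c3a6c3b8c3a5
echo $garbled, "\n";           // Ã¦Ã¸Ã¥

// Reversing the wrong conversion recovers the original text.
$repaired = mb_convert_encoding($garbled, 'ISO-8859-1', 'UTF-8');
var_dump($repaired === $original); // bool(true)
```

Each two-byte character turns into two visible characters, which is the typical look of this kind of mojibake. The data itself is intact; only the interpretation is wrong, which is why fixing the declared encoding is the real remedy.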
It is not always as simple as creating a database with UTF-8. If you have a form in an HTML page, you also need to make sure the page is delivered with the right character set in the HTTP Content-Type header; browsers then submit the form data in that same encoding, here UTF-8. For HTML content, the header looks like this:
Content-Type: text/html; charset=utf-8
With PHP you can set the Content-Type using the header function like this:
header('Content-Type: text/html; charset=utf-8');
Note: It might be easier to configure your server to use UTF-8 as its default character encoding.
Alternatively, you can use the http-equiv meta tag in your HTML file:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
This is not recommended, since it might encourage a lazy author to not configure the server correctly. It is also ignored when the HTTP header specifies a charset, so it cannot be used to override the header.
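Even with the correct header in place, a script cannot be certain that every client really submits valid UTF-8. As a hedged sketch (assuming the mbstring extension; the function name isValidUtf8 is hypothetical and not from the original article), incoming form values could be checked before they are stored:

```php
<?php
// Hypothetical helper: true when the byte string is well-formed UTF-8.
// Requires the mbstring extension.
function isValidUtf8(string $input): bool
{
    return mb_check_encoding($input, 'UTF-8');
}

var_dump(isValidUtf8("\u{00E6}")); // bool(true): æ encoded as UTF-8
var_dump(isValidUtf8("\xE6"));     // bool(false): a lone Latin-1 byte for æ
```

In a real form handler the value would come from something like $_POST, and what to do with invalid input (reject it or try to convert it) is a design decision this sketch leaves open.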
Finally, you should also make sure your database connection uses UTF-8. In PHP with mysqli, this may be done like this:
// Assumes $mysqli is an already connected mysqli object.
if (!$mysqli->set_charset("utf8mb4")) {
    echo 'Failed to set character set';
    exit();
}
echo 'Successfully set character set';