MySQL

MySQL, PHP, Apache, and UTF-8 Issues

UTF-8 is hell. I have run into problems with UTF-8 before (see Migrating MySQL to UTF-8 encoding).

I moved a site that was displaying fine from an old Apache server to a new Apache2 server and quickly identified that the Apache2 server was displaying odd characters.

The site is running a PHP application with a MySQL backend. The MySQL database is using latin1_swedish_ci character encoding. The old site was displaying correctly. Both the new and old site are using the same database.

So what was going on?

MySQL Error: Specified key was too long; max key length is 1000 bytes

When updating a MySQL database table to use UTF-8 I came across an error:

ERROR 1071 (42000) at line 1: Specified key was too long; max key length is 1000 bytes

Initial research indicates that the maximum key size in MySQL is 1000 bytes. Given that latin1 uses one (1) byte per character and UTF-8 uses three (3) bytes per character some indexes will exceed the 1000 byte limit. This accounts for the generated error when converting tables with large or compound indexes from latin1 to UTF8 encoding.

According to the MySQL bug tracker this issue has yet to be resolved. Currently the only solution is to limit indexing on UTF-8 encoded tables or use latin1 instead. Not the best of choices to say the least.

Migrating MySQL to UTF-8 encoding

While developing a content management system (CMS) using default MySQL settings an issue came to light in regards to UTF-8 encoding. Clients were sending in documents in Microsoft Word format that were encoded with UTF-8. When the data was copied from the document and pasted into the CMS WYSIWYG editor strange characters would be displayed after saving the document.

A second issue was identified where exported MySQL tables that contained UTF-8 encoded characters were being not being read correctly by Debian. Note that I am using an old installation of Debian (fully updated of course) that was rolled out before widespread adoption of UTF-8. Newer installations may already have UTF-8 enabled by default.

The goals of this HOWTO are as follows:

  1. set newly created tables in MySQL to use UTF-8 encoding
  2. convert existing MySQL tables to UTF-8
  3. set default environment (command line) encoding to use UTF-8

Automated MySQL Backups

The HOWTO will show how to automate the backup of your local MySQL server on a daily basis. The backup process uses a bash script and the local logrotate functionality to achieve an automated systems. The backup is processed daily via cron (see /etc/cron.daily/logrotate).

While this HOWTO is written for a local development environment it could easily by modified for remote multi-server environments.

Step1: Create the backup script

Create a file at /usr/local/sbin/backup_mysql.sh using your favorite text editor. I use joe:

Syndicate content