Archive for May, 2008
Change collation on all columns of a database
Sunday, May 25th, 2008It was recently required for me to change the collation of each and every column of every table in a database from ‘latin1′ to ‘utf8′. Although the table collations were correct, the column collations were incorrect. It’s a cumbersome process to perform manually, and there’s apparently no real automated way to do it without a script. Although collation information is only meta-data, not actual data, I found this problem interesting.
Changing one column collation information is easy enough to do with one MySQL query:
ALTER TABLE `moods` CHANGE `mood_label` `mood_label` text CHARACTER SET utf8 COLLATE utf8_unicode_ci;
Changing all the columns is more difficult. Here’s a small script that I came up with to do it recently:
dionyziz@orion:~$ mysqldump -u root --password=1234 \ --no-data --no-create-db --compact ccbeta \ |egrep 'CREATE TABLE|latin1' \ |sed 's/CREATE TABLE `\(.*\)` (/;ALTER TABLE `\1`/' \ |sed 's/character set latin1/CHARACTER SET utf8 COLLATE utf8_unicode_ci/' \ |sed 's/ `\(.*\)`/ CHANGE `\1` `\1`/'>columns dionyziz@orion:~$ php -r 'file_put_contents( "columns", preg_replace( "#^;|ALTER TABLE `.*`(\\s*;|$)#", "", preg_replace( "#,(\\s*);#", ";\\1", file_get_contents( "columns" ) ) ) );' dionyziz@orion:~$ mysql -u root --password=1234 ccbeta <columns
Let’s go through it step-by-step.
mysqldump -u root --password=1234 --no-data --no-create-db --compact ccbeta
This creates a list of CREATE TABLE statements for all our tables. That’s good because it’ll allow us to determine whether the collation of a column is incorrect. Here’s an example CREATE TABLE statement:
CREATE TABLE `albums` ( `album_id` int(11) NOT NULL auto_increment, `album_userid` int(11) NOT NULL default '0', `album_created` datetime NOT NULL default '0000-00-00 00:00:00', `album_name` text character set latin1 NOT NULL, `album_description` text character set latin1 NOT NULL, PRIMARY KEY (`album_id`), KEY `album_userid` (`album_userid` ) ) ENGINE=MyISAM AUTO_INCREMENT=55 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
In this example, the `album_name` and `album_description` columns are wrong and need their collations changed.
egrep 'CREATE TABLE|latin1'
This simple line limits our results to only lines that contain “CREATE TABLE” or “latin1″. That’s useful since it’ll only show the table names followed by a list of all incorrectly collated columns, if any. The result would be something like this:
CREATE TABLE `relations` ( CREATE TABLE `searches` ( `search_query` text character set latin1 NOT NULL, CREATE TABLE `shoutbox` ( `shout_text` text character set latin1 NOT NULL, `shout_delreason` text character set latin1 NOT NULL,
(with more entries potentially)
Good. Now all we need to do is modify these lines to make them ALTER TABLE lines:
sed 's/CREATE TABLE `\(.*\)` (/;ALTER TABLE `\1`/'
Ah, the magic of regular expressions. This removes the final “(” of every CREATE TABLE line, as we don’t need it and also changes the word “CREATE” into “ALTER”. It also adds a semicolon in front of the ALTER TABLE statement (to terminate the previous statement).
sed 's/character set latin1/CHARACTER SET utf8 COLLATE utf8_unicode_ci/'
Straightforward enough, this replaces the existing character set instruction from latin1 to utf8, and adds the correct collation as well.
sed 's/ `\(.*\)`/ CHANGE `\1` `\1`/'
Finally, this adds the word “CHANGE” in front of every column line and repeats the column name (as we want to tell MySQL which column to change (first repetition) and to which to change it (second repetition)). The result is:
;ALTER TABLE `relations` ;ALTER TABLE `searches` CHANGE `search_query` text CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL, ;ALTER TABLE `shoutbox` CHANGE `shout_text` text CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL, CHANGE `shout_delreason` text CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
Pretty close to what we actually want. You’ll notice three problems:
- There are empty ALTER statements
- There’s an extra comma at the end of every column (providing all your tables have a primary key, as they should)
- There’s a redundant semicolon at the beginning
These problems cannot easily be fixed by sed because sed performs a line-to-line processing. A sed expert might have been able to provide us with a better solution, but I’ll prefer to use the PREG feature of PHP. To use PHP, first let’s save our current result into a file:
>columns
Time to run our PHP code on the target file:
php -r 'file_put_contents( "columns",
preg_replace( "#^;|ALTER TABLE `.*`(\\s*;|$)#", "",
preg_replace( "#,(\\s*);#", ";\\1",
file_get_contents( "columns" ) ) ) );'
Let’s analyze it in short.
file_get_contents( "columns" );
This, simply enough, reads the “columns” file into memory. Now we’ll perform two regular expression replacements:
First, we’ll match the following regular expression:
#,(\s*);#
(notice that the # are separators that wrap the regular expression for clarity — they aren’t part of the actual regular expression)
Anything matching this will be replaced by ;\1. This means that a comma followed by any whitespace (including a new line) followed by a semicolon will be replaced by only a semicolon (and the same whitespace). This simply removes the redundant comma at the end of every ALTER statement.
Second, we’ll match the following:
#^;|ALTER TABLE `.*`(\\s*;|$)#
Anything matching will be removed. You’ll notice that this regular expression matches basically two things (separated by the first alternation (pipe) character).
The first part is:
#^;#
It’ll remove the first line if it only contains a single semicolon (which it does in our example).
The second part is:
#ALTER TABLE `.*`(\\s*;|$)#
This will look for empty ALTER TABLE statements (an ALTER TABLE statement followed only by whitespace and a semicolon or an end-of-file) and remove them.
Finally, we’ll write the result back to the file we read from:
file_put_contents( "columns", ... );
Now if we cat that file we’ll see that it contains all ALTER statements in the form we want them:
ALTER TABLE `searches` CHANGE `search_query` `search_query` text CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL; ALTER TABLE `shoutbox` CHANGE `shout_text` `shout_text` text CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL, CHANGE `shout_delreason` `shout_delreason` text CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL;
Excellent. Finally, let’s execute it:
mysql -u root --password=1234 ccbeta <columns
You can also add ‘time’ in front of it to measure how long it’ll take. We can now validate that the collations were changed successfully by, again, performing our initial dump and grepping for ‘latin1′ to confirm that there are none.
Accessing MySQL using a SSH-Tunnel
Wednesday, May 21st, 2008Due to security aspects MySQL server often only listen on localhost and accessing them remotely is not possible. With a simple trick you can access them with your favourite tool aswell.
ssh -l login_name -N -L [bind_address:]port:host:hostport remoteServer
login_name is your username on the remote machine
bind_address is the local interface’s address where the tunnel should bind to
port is the local port the tunnel should bind to
host is the remote host to which you want to connect
hostport is the remote port to which you want to connect
remoteServer is the server using which you want to etablish the tunnel
-N should disable starting the login shell
So for creating a tunnel for MySQL the command looks like this:
ssh -l user-remote -N -L 123456:localhost:3306 remoteServer
This create a tunnel to MySQL running on remoteServer. You are able to access MySQL on port 123456 on localhost. For example you can use the following command:
mysql --port=123456 -h 127.0.0.1 --user=mysqlUser --password=mysqlPassword database