Textpattern CMS support forum
You are not logged in. Register | Login | Help
- Topics: Active | Unanswered
Character set problem with MySQL export/import
This is a continuation from the rvm_latin1_to_utf8 plugin thread.
I have a database on TextDrive which I have to export to the new Joyent servers using OpenSolaris.
I am seeing foreign characters messed up in the Solaris installation after export/import. Having tried many different things, I am still stuck.
For what its worth, I have this:
Old server:
MySQL 4.1.22 (but database was created on an earlier version of MySQL)
Table Collation: latin1_swedish_ci
New server:
MySQL 5.0.37
Table collation: utf8_general_ci
Imported via phpMyAdmin as a UTF8 file. Mangled characters….!!
Offline
Re: Character set problem with MySQL export/import
What if you export as latin1, change collation/charset in the dump file, then import as utf8?
Offline
Re: Character set problem with MySQL export/import
Hi Ruud. I tried that, as its part of the instructions for importing DB’s into Joyent (in the wiki). No joy.
Offline
Re: Character set problem with MySQL export/import
You think this error is caused by the fact that some data has been encoded in Latin1, and other data in UTF8? Considering that this site has a php.ini with default_charset=latin1?
The DB is now UTF8, but if the characters are garbled that must mean that they have been saved in Latin1?
Offline
Re: Character set problem with MySQL export/import
If the data on the old server looks correct, then it should look correct on the new server. If it doesn’t, then something wrong happens during the export/import process, but you’d have to document it precisely with every option you do/don’t select during export/import, because right now it’s near impossible to see what the problem is.
What would also help is having an example of how a piece of text looks on the old server (correct) and how it shows up on your website on the new server (corrupted).
Offline
Re: Character set problem with MySQL export/import
Thanks Ruud. Ok for the export using phpMyAdmin on Nicola, I use the default settings, without changing anything:
1. Select all
2. Structure:
Add AUTO_INCREMENT value
Enclose table and field names with backquotes (both checked)
3. Data:
Complete inserts
Extended inserts
Use hexadecimal for binary fields (all checked)
4. Export type: Insert
5. Compression: gzipped
Joyent says to change the charset from Latin1 to UTF8 before import.
For import to Joyent:
1. Character set of the file: UTF8
2. Partial import: Allow interrupt of import…. (checked)
That’s it, apart from opening Text Wranger to replace all instances of charset=“Latin1” with charset=“UTF8”.
I will try and make some images. What’s the best way to attach images here?
Offline
Re: Character set problem with MySQL export/import
Here are a couple of images that show the mangled text. First image is Nicola (current site), second image is Prospect after export/import. I also tried this with MAMP, and get the same mangled text.


Last edited by jstubbs (2007-12-05 07:34:36)
Offline
Re: Character set problem with MySQL export/import
The problem seems to be that the data that was originally UTF8 stored in a latin1 table, was treated as latin1 in a latin1 table and then converted to UTF8 in an UTF8 table. That’s why the those garbled characters are now 3 chars long instead of 1.
What is the dbcharset in config.php on the new server? Check if switching that from latin1 to utf8 or the other way around helps.
Offline
Re: Character set problem with MySQL export/import
Hi Ruud, thanks again for your help. dbcharset in config.php on Nicola is Latin1, while on the new server Prospect it is UTF8, which is because everything on the new Joyent OpenSolaris servers is UTF8 by default.
I changed the dbcharset to Latin1 on the new server – and it worked! I thought I already tried that!
So – why does it work and how (or should I?) can I change this, considering that we really should move to UTF8 on the new server???
Offline
Re: Character set problem with MySQL export/import
Still can’t access diagnostics page on the old server, but this is what I have on the new server after changing the dbcharset to Latin1:
Charset (default/config): latin1/latin1
character_set_client: latin1
character_set_connection: latin1
character_set_database: utf8
character_set_filesystem: binary
character_set_results: latin1
character_set_server: utf8
character_set_system: utf8
Offline
Re: Character set problem with MySQL export/import
Using PHPmyAdmin, check the character set of the varchar/text columns in the textpattern table. If those are still set to latin1 and assuming that, below that list of character_set_* lines you posted above, it shows that the tables are OK, you can use my plugin to migrate to UTF8.
Don’t erase that backup yet, which you now successfully imported on the new server.
You wondered why setting it to latin1 works:
I think your tables are still setup with latin1 character set. When storing UTF8 in a latin1 table, it doesn’t matter for US-ASCII characters, because they are the same in latin1 and UTF8. For the characters outside US-ASCII, like the ø, in UTF8 that’s one character, but it takes up 3 bytes, so when stored in a latin1 table (latin1 always uses 1 byte per character) that equals 3 characters. If textpattern treats a latin1 table as a UTF8 table (which isn’t correct) then those 3 characters show up as separate characters instead of being displayed as that one ø which it really represents.
Offline
Re: Character set problem with MySQL export/import
You mean on the old server or the new one?
On the new server, all the varchar/text columns are set to utf_general_ci.
Or did you mean on the old server?
Edit: On the old server, the textpattern table varchar/text columns are set to latin1_swedish_ci. Thats the original site db textpattern table.
Last edited by jstubbs (2007-12-05 19:23:02)
Offline