Textpattern CMS support forum
[4.4 -> 4.5 upgrade] garbled charset
Ok, I upgraded my installation yesterday as I had already done for several websites, but this time something went wrong. The site in question is in Italian, the character set is UTF-8, and since the upgrade all accents and hyphens are garbled. You can see for yourself:
http://papuasia.afasici.net/article/317/isolate-the-enemy
I don’t know if my hosting provider has changed something on the server and I have very little understanding of how the character set is managed in MySQL/PHP/HTML.
I need help :-)
#2 2012-11-06 01:24:15
- uli
- Moderator
- From: Cologne
- Registered: 2006-08-15
- Posts: 4,306
Re: [4.4 -> 4.5 upgrade] garbled charset
I’ve seen something similar when I copied text from a PDF, and also with some German umlaut characters that I then had to change to HTML entities. But this is too much to come from PDF copying alone, and it is in the page source, so it’s not a font issue.
- When you post a new article, what happens to the characters in question therein?
- What value for “collation” do you see for the “textpattern” table? (In phpMyAdmin: Click the database name for your TXP installation, look for the line beginning with “textpattern”.)
- In phpMyAdmin, click “textpattern” in the left frame/column, then click the “Structure” tab. What values for “collation” do you see now for Body and Body_html?
- Now switch to the “Browse” tab for table “textpattern”: Are the weird characters contained in the database in both fields, Body and Body_html?
In bad weather I never leave home without wet_plugout, smd_where_used and adi_form_links
Re: [4.4 -> 4.5 upgrade] garbled charset
uli wrote:
I’ve seen something similar when I copied text from a PDF, and also with some German umlaut characters that I then had to change to HTML entities. But this is too much to come from PDF copying alone, and it is in the page source, so it’s not a font issue.
When you post a new article, what happens to the characters in question therein?
They display perfectly, all accents and hyphens work as they should
What value for “collation” do you see for the “textpattern” table? (In phpMyAdmin: Click the database name for your TXP installation, look for the line beginning with “textpattern”.)
utf8_unicode_ci
In phpMyAdmin, click “textpattern” in the left frame/column, then click the “Structure” tab. What values for “collation” do you see now for Body and Body_html?
utf8_unicode_ci
Now switch to the “Browse” tab for table “textpattern”: Are the weird characters contained in the database in both fields, Body and Body_html?
Yes. Garbled in both fields
Thank you very much!
Last edited by harri (2012-11-06 10:16:53)
#4 2012-11-06 13:24:39
- uli
- Moderator
- From: Cologne
- Registered: 2006-08-15
- Posts: 4,306
Re: [4.4 -> 4.5 upgrade] garbled charset
harri wrote:
They display perfectly, all accents and hyphens work as they should
So we can probably rule out Textile being involved. Let’s see:
Yes. Garbled in both fields
Are the garbled characters the same in both fields, i.e. if you find a è in the Body field, do you find it exactly like that in the Body_html field too? No hidden invisible characters in either field? (To test that, put your cursor into a word before an occurrence and watch the cursor move while you press the right arrow key.)
utf8_unicode_ci
That’s as it should be. And I don’t think that such conversions take place during a TXP update. But your site looks bilingual: do you happen to have MLP installed, and have you maybe installed the update it needs? If so, please post in the MLP topic so the MLP devs are informed.
OK, back to fixing the problem: if the garbled characters are exactly the same in both fields, I’d do a search and replace with phpMyAdmin, after backing up the database, of course. If you don’t know how to do that, just ask.
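For anyone preparing such a search and replace, here is a minimal Python sketch of how the list of pairs could be built. It assumes the garbling is UTF-8 bytes that were read as Windows-1252 (cp1252), which would explain a pair like è but is not confirmed for harri’s site. The names `AFFECTED`, `garble`, `build_replacements` and `repair` are made up for illustration, and the character list is only a guess at what an Italian site might need.

```python
# Characters assumed to be affected: Italian accented vowels, en dash, apostrophe.
AFFECTED = "àèéìòù–’"


def garble(ch):
    """Return what `ch` looks like when its UTF-8 bytes are read as cp1252."""
    return ch.encode("utf-8").decode("cp1252")


def build_replacements(chars=AFFECTED):
    """Map each garbled sequence back to the character it came from."""
    return {garble(ch): ch for ch in chars}


def repair(text, replacements):
    """Apply the replacements, longest garbled sequence first."""
    for bad in sorted(replacements, key=len, reverse=True):
        text = text.replace(bad, replacements[bad])
    return text


if __name__ == "__main__":
    table = build_replacements()
    for bad, good in table.items():
        # repr() makes invisible characters such as a no-break space visible.
        print(repr(bad), "->", good)
    print(repair("perchÃ© Ã¨ cosÃ¬", table))  # prints: perché è così
```

Printing the pairs with `repr()` is deliberate: the garbled form of à ends in a no-break space, which is exactly the kind of invisible character that is easy to miss when typing a search string by hand.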
Re: [4.4 -> 4.5 upgrade] garbled charset
Hi Harri.
Check if the rvm_latin1_to_utf8 plugin helps you.
Just remember to back up your database before using the plugin.
#6 2012-11-06 13:47:14
- uli
- Moderator
- From: Cologne
- Registered: 2006-08-15
- Posts: 4,306
Re: [4.4 -> 4.5 upgrade] garbled charset
Didn’t know that plugin, Julían, thanks. I just tested it in a sandbox, and the mentioned è character pair remained unchanged.
Re: [4.4 -> 4.5 upgrade] garbled charset
Hi Uli.
Was your sandboxed DB on latin1 to begin with?
#8 2012-11-06 23:49:49
- uli
- Moderator
- From: Cologne
- Registered: 2006-08-15
- Posts: 4,306
Re: [4.4 -> 4.5 upgrade] garbled charset
That’s a legitimate question. No, like harri’s it was utf8_unicode_ci. But your question made me read Ruud’s post more carefully this time, and I saw that his plugin is only for DBs created with MySQL < 4.1, so it’s no use simply “reverting” my newer DB to latin1.
But harri will perhaps remember in which year the initial installation took place. It won’t be too difficult to find out whether it might have been one of the affected MySQL versions back then.
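To illustrate the kind of latin1/UTF-8 mix-up being discussed, here is a small Python sketch that tries to undo one round of “UTF-8 bytes read as cp1252” on a single string. It is not what rvm_latin1_to_utf8 does internally (the thread does not say), and the function name `undo_double_encoding` is hypothetical. It only shows that this type of garbling is mechanical and, in the simple case, reversible.

```python
def undo_double_encoding(text):
    """Reverse one round of 'UTF-8 bytes read as cp1252'.

    Returns the text unchanged when the round trip is not possible,
    for example when the string is already correct UTF-8 text.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


if __name__ == "__main__":
    print(undo_double_encoding("Ã¨"))  # prints: è
    print(undo_double_encoding("è"))   # already fine, prints: è
```

Note the limitation: a string that mixes garbled and correct accents fails the round trip as a whole and is returned untouched. On a site like harri’s, where old articles are garbled and new ones are fine, any such repair would have to be applied per field, not to the whole dump at once.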
Re: [4.4 -> 4.5 upgrade] garbled charset
What if you add AddDefaultCharset UTF-8 to your .htaccess?
Yiannis
Re: [4.4 -> 4.5 upgrade] garbled charset
Hello everybody and thank you very much for your help!
The database was indeed very old: this site was set up around 2003, and since then I have changed hosting providers several times. I’ve been with the current host for two or three years now. However, I tried both rvm_latin1_to_utf8 and the .htaccess solution, neither of which worked, so I simply replaced the weird characters in a dump of the textpattern database and reimported it with phpMyAdmin.
Rough solution, if you want, but it worked :-)
I also realized that I still had a bunch of old plugins active but no longer in use, things like rss_suparchive and chh_article_custom (no MLP though), so maybe this could have been a factor.
However, it is all solved now. Thank you very much for your help guys!
#11 2012-11-07 13:03:50
- uli
- Moderator
- From: Cologne
- Registered: 2006-08-15
- Posts: 4,306
Re: [4.4 -> 4.5 upgrade] garbled charset
harri wrote:
I simply replaced the weird characters in a dump of the textpattern database and reimported it with phpMyAdmin.
Rough solution, if you want, but it worked :-)
Glad it worked for you!
This is not criticism after the fact, just a word of warning to those looking for a solution to their own character problems:
I wouldn’t recommend this method, because you never know exactly which character pairs are used in other tables, e.g. by plugins or by additional scripts sharing the same database. I’d have done a search/replace from inside phpMyAdmin, and only on those tables that are affected. And once again: back up first!
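A replace limited to the affected tables could be prepared roughly like this. The Python sketch below only generates SQL text for review; it does not touch a database. The table and column names are the ones mentioned in this thread, while the replacement pairs and the helper names `sql_quote` and `replace_statements` are assumptions for illustration. The statements rely on MySQL’s `REPLACE()` string function, and the quoting assumes MySQL’s default mode in which a backslash is an escape character.

```python
# Tables and columns to touch; everything else in the database is left alone.
TARGETS = {"textpattern": ["Body", "Body_html"]}

# Example pairs only; build the real list from what is actually in the fields.
REPLACEMENTS = {"Ã¨": "è", "Ã©": "é"}


def sql_quote(value):
    """Quote a string literal for MySQL (default backslash-escape mode assumed)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def replace_statements(targets, replacements):
    """Return one UPDATE statement per table, column and replacement pair."""
    statements = []
    for table, columns in targets.items():
        for column in columns:
            for bad, good in replacements.items():
                statements.append(
                    "UPDATE `{t}` SET `{c}` = REPLACE(`{c}`, {b}, {g});".format(
                        t=table, c=column, b=sql_quote(bad), g=sql_quote(good)
                    )
                )
    return statements


if __name__ == "__main__":
    for statement in replace_statements(TARGETS, REPLACEMENTS):
        print(statement)
```

The printed statements can be read through, tried on a copy of the database, and only then pasted into phpMyAdmin’s SQL tab, after a backup.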