
Textpattern CMS support forum


#1 2013-06-20 18:36:51

alesh
Member
From: Miami, FL
Registered: 2005-04-13
Posts: 228
Website

Character encoding problem on re-import of textpattern database

Recently, the main table in my Textpattern install was deleted (apparently by a hacker). It turns out that the only backup I have is two years old. Ouch. I cut everything except that table out of the .sql file and re-imported it through phpMyAdmin, and voilà!

Only trouble is with extended characters like curly quotes and ® — they look like black diamond question mark symbols, e.g. here.

If I try to access these articles through the Textpattern backend, the body field is blank. If I enter anything there, it overwrites the article content.

I’d like to fix all the affected articles at once … by downloading the DB and doing something to it? A PHPmyAdmin command? Not really sure. Any help would be appreciated.


Yes, I have tried turning it off and on.

Offline

#2 2013-06-21 15:13:34

gaekwad
Server grease monkey
From: People's Republic of Cornwall
Registered: 2005-11-19
Posts: 4,134
GitHub

Re: Character encoding problem on re-import of textpattern database

Try this

Offline

#3 2013-06-21 16:19:57

uli
Moderator
From: Cologne
Registered: 2006-08-15
Posts: 4,303

Re: Character encoding problem on re-import of textpattern database

gaekwad wrote:

Try this

Note what’s only mentioned there at the end of the topic: back up first.

I assume that, as long as you see these black diamond characters even in phpMyAdmin, you won’t get a satisfying result, because AFAIK all characters end up as that same substitute character, and phpMyAdmin can’t tell which is which. You might then try a search/replace in the cutout of the old backup file.


In bad weather I never leave home without wet_plugout, smd_where_used and adi_form_links

Offline

#4 2013-06-21 20:03:42

alesh
Member
From: Miami, FL
Registered: 2005-04-13
Posts: 228
Website

Re: Character encoding problem on re-import of textpattern database

Ahh, but here’s the thing — in PHPmyAdmin everything looks good — I just looked up the article and where I see the substitute characters it’s got curly quotes and whatnot.

So am I trying to get the DB to spit them out correctly? Or to re-encode everything somehow? Or should I be looking for a way (as per the linked article) to replace everything with its equivalent escape sequence (single-curly-quote for ‘, etc.)? What’s the strategy?



Offline

#5 2013-06-22 00:33:01

maniqui
Member
From: Buenos Aires, Argentina
Registered: 2004-10-10
Posts: 3,070
Website

Re: Character encoding problem on re-import of textpattern database

If it looks right in PHPMyAdmin, then there is hope that characters haven’t been mangled beyond the point of possible restoration.

Even after reading a lot about encoding/decoding, charsets, DBs & connections, I still have no idea what I’m doing.

Some suggestions:

  • do a database dump to a file, and check how (and if) the characters look (hopefully good) there too. If not, how do they look? Post them here.
  • if they look good in the file, then I’d say the issue may be in the way TXP is connecting to the DB. Have you checked the dbcharset setting in textpattern/config.php?
  • you could check if the rvm_latin1_to_utf8 plugin helps

Please, remember to do a backup before doing anything :)
If nothing works, come back with your findings and let’s see if there is something else to try.
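
For the first suggestion above, one rough way to check a dump is to test whether its bytes decode as UTF-8. This is only a sketch: `classify_dump_bytes` is a made-up helper name, and `dump.sql` in the usage comment is a placeholder for wherever your dump lives. It can't tell you which single-byte encoding a non-UTF-8 file uses, only that the bytes are not valid UTF-8.

```python
def classify_dump_bytes(data: bytes) -> str:
    """Return 'ascii', 'utf-8' or 'not-utf-8' for a chunk of dump bytes."""
    try:
        data.decode("ascii")
        return "ascii"  # no extended characters at all
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"  # extended characters, validly encoded as UTF-8
    except UnicodeDecodeError:
        return "not-utf-8"  # probably a single-byte charset such as latin1


# Small in-memory samples standing in for lines of a dump file.
samples = {
    "plain": b"INSERT INTO textpattern VALUES ('hello');",
    "utf8 quotes": "\u201chello\u201d".encode("utf-8"),
    "cp1252 quotes": "\u201chello\u201d".encode("cp1252"),
}

for label, raw in samples.items():
    print(label, "->", classify_dump_bytes(raw))

# Hypothetical usage on a real file (the path is a placeholder):
#   from pathlib import Path
#   print(classify_dump_bytes(Path("dump.sql").read_bytes()))
```

If the dump comes back as not valid UTF-8 while the site expects utf8, that points at a charset mismatch rather than at destroyed data.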


La música ideas portará y siempre continuará

TXP Builders – finely-crafted code, design and txp

Offline

#6 2013-06-22 04:22:20

colak
Admin
From: Cyprus
Registered: 2004-11-20
Posts: 9,007
Website GitHub Mastodon Twitter

Re: Character encoding problem on re-import of textpattern database

If you are using utf8 already, adding AddDefaultCharset utf-8 to your .htaccess might help.


Yiannis
——————————
NeMe | hblack.art | EMAP | A Sea change | Toolkit of Care
I do my best editing after I click on the submit button.

Offline

#7 2013-06-25 15:22:06

alesh
Member
From: Miami, FL
Registered: 2005-04-13
Posts: 228
Website

Re: Character encoding problem on re-import of textpattern database

Crap … this is totally unsatisfying. For the last three days I’ve been struggling at the command line (not my natural habitat) to download and re-import the entire DB to see if that’d make a difference. No luck yet on that front, but I just tried adding AddDefaultCharset utf-8 to .htaccess and it seemed to work. Then I tried removing it again, and the problem still seems to be resolved anyway.

Perhaps I did something inadvertently that fixed it? No idea. I’ll post if I figure out what the problem was, but in the meantime thanks everyone SO MUCH for the help. It means a lot not to be stuck alone in the wilderness when esoteric disaster strikes!



Offline

#8 2016-08-11 12:36:56

alesh
Member
From: Miami, FL
Registered: 2005-04-13
Posts: 228
Website

Re: Character encoding problem on re-import of textpattern database

Not sure if this would have helped me back in 2013, but: a seldom-noticed line in config.php is $txpcfg['dbcharset'] = 'utf8'; unless you have an old Textpattern install that was originally set up on an old version of MySQL, in which case you might well have $txpcfg['dbcharset'] = 'latin1'; instead.

If you import database tables from a latin1 install into an install that’s expecting utf8, you will get garbage in place of (in my case) smart quotes, en dashes and the like. Simply changing config.php to specify latin1 solves the problem.
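
To illustrate the mismatch, here is a minimal Python sketch, not Textpattern's actual code. It assumes the old tables hold single-byte data in MySQL's latin1, which closely matches Windows-1252 (`cp1252` in Python). The helper name `read_with_charset` and the sample string are made up for the example. The real path through MySQL's connection charset handling is more involved; this only shows why bytes written in one encoding turn into substitute characters when read as another.

```python
def read_with_charset(stored: bytes, charset: str) -> str:
    """Decode stored bytes, substituting U+FFFD for undecodable bytes."""
    return stored.decode(charset, errors="replace")


# Sample text: curly apostrophe, curly quotes, en dash, registered sign.
original = "It\u2019s a \u201ctest\u201d \u2013 Acme\u00ae"

# Bytes as an old single-byte (cp1252-style) table would hold them.
stored = original.encode("cp1252")

as_utf8 = read_with_charset(stored, "utf-8")     # mismatched assumption
as_latin1 = read_with_charset(stored, "cp1252")  # matching assumption

# ascii() keeps the output printable on any console.
print(ascii(as_utf8))    # every extended character shows up as \ufffd
print(ascii(as_latin1))  # the original escapes come back intact
```

Note that all five different characters collapse into the same U+FFFD under the wrong assumption, so the already-decoded text can't be repaired; the stored bytes, however, are untouched, which is why switching the charset setting is enough.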

More info here.



Offline
