Textpattern CMS support forum
Character Encoding Problems
When I try to save these:
č or ć
In an article or form they turn into �? and don’t show up after save/publish. Can anyone provide some suggestions as to what I could do?
Proud Canadian. Toronto Locksmith , Actualize Consulting
Re: Character Encoding Problems
Please post your high level diagnostics.
Re: Character Encoding Problems
(EDIT: REMOVED UNREQUIRED ELEMENTS FOR THE SOLUTION OF THIS PROBLEM)
——————————————————————————-
Charset (default/config): latin1/latin1
character_set_client: latin1
character_set_connection: latin1
character_set_database: latin1
character_set_results: latin1
character_set_server: latin1
character_set_system: utf8
character_sets_dir: /usr/share/mysql/charsets/
19 Tables: textpattern is utf8, txp_category is utf8, txp_css is utf8, txp_discuss is utf8, txp_discuss_ipban is utf8, txp_discuss_nonce is utf8, txp_file is utf8, txp_file: 2 clients are using or haven’t closed the table properly, txp_form is utf8, txp_image is utf8, txp_lang is utf8, txp_log is utf8, txp_page is utf8, txp_plugin is utf8, txp_prefs is utf8, txp_priv is utf8, txp_section is utf8, txp_users is utf8
(EDIT: REMOVED UNREQUIRED ELEMENTS FOR THE SOLUTION OF THIS PROBLEM)
Last edited by 1beb (2006-10-18 12:45:33)
Re: Character Encoding Problems
One of the following things happened:
- You moved your database, or backed it up and restored it in the wrong way.
- You manually edited config.php, or replaced it with one from a different installation.
- Your host made changes or updates to the server without informing you.
Here is your problem:
Your config.php is configured to use latin1 tables as storage, yet the Textpattern tables in MySQL were all created as utf8 tables. Textpattern cannot cause or change this on its own.
Charset (default/config): latin1/latin1 means configured to use latin1.
19 Tables: textpattern is utf8,[...] means the tables are utf8 (Unicode).
The solution is that the config value (which is read from Textpattern’s config.php) and the character set of the tables must match. When that is the case, Textpattern diagnostics will simply display “xx Tables: OK”.
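The match check described above can be sketched in a few lines of Python. This is only an illustration that parses diagnostics text shaped like the excerpt posted in this thread; the function name `find_charset_mismatches` is hypothetical, and treating an empty config value as "fall back to the default" is an assumption, not documented Textpattern behaviour.

```python
import re


def find_charset_mismatches(diagnostics):
    """Return {table: charset} for tables whose charset differs from the config value."""
    config_charset = None
    table_charsets = {}
    for line in diagnostics.splitlines():
        line = line.strip()
        header = re.match(r"Charset \(default/config\):\s*(\w*)/(\w*)", line)
        if header:
            # Assumption: an empty config value falls back to the default value.
            config_charset = header.group(2) or header.group(1)
        elif re.match(r"\d+ Tables:", line):
            # "19 Tables: OK" contains no "<name> is <charset>" pairs, so nothing is added.
            table_charsets.update(re.findall(r"(\w+) is (\w+)", line))
    return {t: cs for t, cs in table_charsets.items() if cs != config_charset}


# Shortened sample in the shape of the diagnostics posted above.
sample = """Charset (default/config): latin1/latin1
character_set_client: latin1
19 Tables: textpattern is utf8, txp_category is utf8"""

print(find_charset_mismatches(sample))
```

An empty result corresponds to the “xx Tables: OK” case; any entries are the tables that disagree with config.php.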
Here is how you could go about a solution:
1) Back up the database, and check that the backup is correct and complete and that restoring it works.
2) Background: Now you have to decide whether you are going to change the value in config.php OR convert all the tables. The problem is that we don’t know what configuration you originally used and what was changed, so it’s a little bit of guesswork. Either way will work fine for future posts/saves (the important thing is just that they match); the concern is about the data already in the database. You can disregard everything you saved to the database since the problem first occurred, as that will require manual fixing afterwards anyhow.
3) If you have backups from the time before the problem first appeared, restore that backup and work from there.
4) Change the config.php value to ‘utf8’, then go into diagnostics and check what the configuration values quoted above say. (The config value has to match the value of the tables; if it just says 19 Tables: OK, that means they match.)
5) Now check older posts (posts that you made before the problem occurred, and that you haven’t touched since). Do they appear OK? If yes, you’re done and can move on to manually fixing the things that you changed after the problem appeared and that may still display broken.
6) If NO (i.e., all older articles still display funny characters), then you have two choices: either you manually edit the old articles and/or leave them as they are, because the important thing is that things now work again; or you try the other, somewhat more complicated route, where you change the config.php value back to latin1 and take the necessary steps to also change the tables back over to latin1.
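For step 5, checking many old articles by eye can be tedious. A rough heuristic, sketched here in Python under the assumption that you have the article text available as strings (for example pasted from the admin side), is to test whether a string can be reinterpreted from latin-1 back into valid UTF-8 and comes out different. The helper name `looks_misdecoded` is hypothetical, and the check can miss cases or give false positives on unusual text.

```python
def looks_misdecoded(text):
    """Heuristic: True if text looks like UTF-8 bytes that were decoded as latin-1."""
    try:
        repaired = text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text contains characters outside latin-1 (so it was
        # decoded properly), or its bytes are not valid UTF-8.
        return False
    # Pure ASCII survives the round trip unchanged and is not suspicious.
    return repaired != text


# What "č" looks like when its UTF-8 bytes are shown as latin-1 characters.
garbled = "č".encode("utf-8").decode("latin-1")

print(looks_misdecoded(garbled))
print(looks_misdecoded("č or ć"))
print(looks_misdecoded("plain ascii"))
```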
Btw: You wrote that your problems appeared after an upgrade of Textpattern. That is a coincidence, as upgrading Textpattern cannot cause such a problem. However, in the process of updating Textpattern you might have taken other steps that caused the problem (i.e., anything related to config.php or the database might have to do with it).
Re: Character Encoding Problems
So much to learn. Thanks for the explanation. I think I know which character set I was in previously. It was latin1. However, I had specified that textpattern switch to utf8 (at least that must be it because of this). I’m not sure exactly how the database was switched but that question is irrelevant at this point. I’m looking at it now and the main question that I have is this:
Is data stored in the database mutually exclusive of the manner in which textpattern displays it? Meaning, if I make it such that everything matches, will it work again from the backups that I have of the database?
Last edited by 1beb (2006-10-16 10:15:47)
Re: Character Encoding Problems
Is data stored in the database mutually exclusive of the manner in which textpattern displays it?
I am not sure I understood that… Let me make the following statements, which may or may not answer your questions:
- The character set specified for the rendered pages that are sent to the browser is entirely separate from, and unconnected to, what we are talking about here.
- Textpattern always handles and displays UTF-8, even when the tables are specified as latin1. We retrieve data the same way we saved it, so for the most part it doesn’t make much difference. It does make a difference in cases where MySQL needs to know about the type of data, for example when searching or doing direct string manipulation.
- UTF-8 support in MySQL was introduced with MySQL 4.1; that’s why we have to deal with things being one way or the other at all.
- The mismatch (see above) between what is configured in Textpattern’s config.php and how the tables are set in MySQL affects one thing specifically: the MySQL client/driver that handles the communication between PHP and MySQL. Some automagic things happen there, which lead to problems when there is a mismatch. Most PHP-based applications do not set this value at all and just rely on the defaults working (which evidently is not the case on several hosts). To work around this we have an explicit setting for it, which means Textpattern works correctly all of the time on its own, instead of most of the time, unless external changes are introduced.
- Writes made to the database while there is a mismatch are permanent, i.e., you have to fix things manually (overwrite the old content by resaving) once things are configured correctly again. Everything that was not written to will work again by itself, because no wrong data was saved; it was only read the wrong way during the mismatch.
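The point above about searching and string manipulation can be illustrated with a small Python analogy. It does not reproduce MySQL's collation logic; it only shows, using Python's own codecs, what happens when string operations are applied to UTF-8 bytes that are being treated as latin-1 characters. The helper `misread_as_latin1` is hypothetical.

```python
def misread_as_latin1(text):
    """Simulate UTF-8 bytes being interpreted as latin-1 characters."""
    return text.encode("utf-8").decode("latin-1")


word = "България"
garbled = misread_as_latin1(word)

# Length: each Cyrillic letter takes two bytes in UTF-8, so the count doubles.
print(len(word), len(garbled))

# Case folding works on the correctly decoded letters...
print("Б".lower() == "б")

# ...but not on the misread versions, because lower() then acts on the
# latin-1 characters that happen to share those byte values.
print(misread_as_latin1("Б").lower() == misread_as_latin1("б").lower())
```

Anything that counts characters or compares them case-insensitively is therefore working on the wrong units when the declared character set does not describe the stored bytes.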
If you want to know the details, I’ve written about it here:
http://textpattern.net/wiki/index.php?title=Unicode_Support
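The difference between data that was only read the wrong way and data that was written the wrong way can be sketched as follows. Python's codecs stand in for the MySQL client here, so the exact bytes MySQL would produce during a mismatch are an assumption and may differ; the sketch only shows why one case is reversible and the other is not.

```python
original = "č or ć"
stored = original.encode("utf-8")  # the bytes sitting in the table

# Read the wrong way: the text looks garbled, but the stored bytes are
# untouched, so reading them correctly again restores the text.
misread = stored.decode("latin-1")
recovered = misread.encode("latin-1").decode("utf-8")
print(recovered == original)

# Written the wrong way: latin-1 has no "č" or "ć", so a conversion on save
# has to substitute something, and the original characters are gone.
lossy = original.encode("latin-1", errors="replace")
print(lossy)
```

This is why untouched articles come back once config.php and the tables agree, while anything saved during the mismatch has to be retyped and resaved.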
Meaning, if I make it such that everything matches, will it work again from the backups that I have of the database?
It depends on what point in time the backup is from. If it is from a time before things went wonky, then yes, things will work when you restore the backup and make sure that config.php matches the tables.
Last edited by Sencer (2006-10-16 11:15:41)
Re: Character Encoding Problems
Hey guys! I’ve just moved hosts and I guess the DB got messed up in the process. Anyway – here’s my question…
In another thread I read that character encoding problems can result in problems with search results. My site is entirely in Bulgarian (Cyrillic), and before I moved to a new host the search worked great (well, it was case sensitive, but that’s OK). Now I cannot get any results, except for English words (mostly names I use in my posts).
Here are the details from the diagnostics:
Charset (default/config): latin1/
character_set_client: latin1
character_set_connection: latin1
character_set_database: utf8
character_set_results: latin1
character_set_server: latin1
character_set_system: utf8
character_sets_dir: /usr/share/mysql/charsets/
19 Tables: OK
I suppose I have to manually convert the tables in some fashion, but I have no idea – how, what, where? Any help appreciated!
Re: Character Encoding Problems
There is nothing messed up in what you’ve posted. It looks like you’ve been on latin1 in the past, and still are on latin1 (see the line that says “19 Tables: OK”). I am not sure why search does not work for you anymore, but that is a different problem, which is best dealt with in a separate thread (with complete diagnostics information posted).
Re: Character Encoding Problems
Thanks, will do.