Textpattern CMS support forum
You are not logged in. Register | Login | Help
- Topics: Active | Unanswered
Character set issues after hosts migration to new cloud platform
Bloke wrote back in 2016 #303432:
Ah, good ol’ character set issues.
They’ve struck again! Some pages are displaying with ? characters, and the body / excerpt fields are blank in TXP. This after a server migration to new cloud infrastructure. The site is running TXP 4.5.7.
Webhost support have drawn a blank on what to do next:
At first I thought it was a MySQL encoding issue, but I’ve verified the MySQL tables match the encoding settings in your config file.
There seems to be other encoding things being done by the CMS so the developers who are familiar with it will be the best people to find the problem.
When I view the tables everything seems fine (correct utf-8 characters); here’s a sql snippet from an affected page
'Choosing the best kitchen benchtop material: Caesarstone, Natural Stone or Ceramic Slab?', '', 'h4. *The problems with Caesarstone, engineered stone and reconstituted stone*\r\n\r\nWhile Caesarstone is a specific brand, you’ll see the name used interchangeably with “engineered stone” and “reconstituted stone”.
I’ve tried Ruud’s rvm_latin1_to_utf8 plugin, but TXP simply said MySQL was too old and deleted the plugin.
-- phpMyAdmin SQL Dump
-- version 4.7.7
-- https://www.phpmyadmin.net/
--
-- Host: localhost:3306
-- Generation Time: Jun 14, 2018 at 10:56 AM
-- Server version: 10.2.13-MariaDB-cll-lve
-- PHP Version: 5.6.30
Any advice on how I might coerce things back into shape?
Thanks!
Offline
Re: Character set issues after hosts migration to new cloud platform
For completeness, what output do you get if you run this (safe) query from phpMyAdmin:
SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%'
Offline
Re: Character set issues after hosts migration to new cloud platform
Thanks.
I get:
character_set_client: utf8mb4
character_set_connection: utf8mb4
character_set_database: utf8
character_set_filesystem: binary
character_set_results: utf8mb4
character_set_server: latin1
character_set_system: utf8
collation_connection: utf8mb4_unicode_ci
collation_database: utf8_general_ci
collation_server: latin1_swedish_ci
Does this mean the server needs reconfiguring to use utf8mb4, not my database?
Offline
Re: Character set issues after hosts migration to new cloud platform
As I understand it, utf8
can support U+000000 to U+10FFFF, and the characters you’re having trouble with are straightforward enough for utf8
out of the box, which implies that the enhanced range in utf8mb4
is sort of a red herring.
The empty body / excerpt is a bit of a bother, and although I’ve had it happen for a few of my clients, I ended up rebuilding the site piecemeal instead of investigating (time constraints). With that in mind, and depending on the scope of the site, here’s what I would do:
- full file + database dump from old (working) site on old hosting
- spin up a new, blank Textpattern inside the new hosting with default content on a subdomain or dev area, check article excerpt + body is appearing, check diagnostics for anything whacky
- dump that database as-is, and don’t overwrite your old dump
- look at the table constructs on each database and see what’s different (if anything)
- build a new database dump using the constructs from the new site, and merge in the presentational data from the old site
- import one article/page that was causing problems with characters and see if that resolves the issue
- if yes, victory beverage and rinse + repeat with the remaining content
Offline
Re: Character set issues after hosts migration to new cloud platform
Thank you Pete.
I’d been hoping that I could simply inform the web host ‘Hey, the server character set & collation don’t match’. :)
Gulp! I’ll give it a go.
Interestingly enough, 3 TXP installations were migrated to the new infrastructure:
- v4.5.7 – encoding problems, blank body and excerpt fields, but only on affected articles
- v4.6.2 – encoding problems, body and excerpt fields normal
- v4.7.0 – no problems
Offline
Re: Character set issues after hosts migration to new cloud platform
giz wrote #312583:
I’d been hoping that I could simply inform the web host ‘Hey, the server character set & collation don’t match’. :)
That’s one option – you can switch a database’s collation with (untested – backup first, etc):
ALTER DATABASE database_name CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
There’s also phpMyAdmin -> Operations -> Collation, from memory.
Offline
Re: Character set issues after hosts migration to new cloud platform
Update:
A clean install of v4.5.7 fails drastically on the new server (fails to create tables). Setup database connection shows as latin1
1. You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'TYPE=MyISAM PACK_KEYS=1 AUTO_INCREMENT=2' at line 42
2. Table 'artedomu_test.v0_textpattern' doesn't exist
3. You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'TYPE=MyISAM PACK_KEYS=1' at line 10
4. Table 'artedomu_test.v0_txp_category' doesn't exist
5. Table 'artedomu_test.v0_txp_category' doesn't exist
etc
Next-up, v4.7.0 installs, but with warnings. Setup database connection on the identical database shows as utf8mb4
Warning: date(): It is not safe to rely on the system's timezone settings. You are *required* to use the date.timezone setting or the date_default_timezone_set() function. In case you used any of those methods and you are still getting this warning, you most likely misspelled the timezone identifier. We selected the timezone 'UTC' for now, but please set date.timezone to select your timezone. in /home/artedomu/public_html/textpattern-4.7.0/textpattern/vendors/Textpattern/L10n/Lang.php on line 481
Warning: date(): It is not safe to rely on the system's timezone settings. You are *required* to use the date.timezone setting or the date_default_timezone_set() function. In case you used any of those methods and you are still getting this warning, you most likely misspelled the timezone identifier. We selected the timezone 'UTC' for now, but please set date.timezone to select your timezone. in /home/artedomu/public_html/textpattern-4.7.0/textpattern/vendors/Textpattern/L10n/Lang.php on line 481
Warning: mktime(): It is not safe to rely on the system's timezone settings. You are *required* to use the date.timezone setting or the date_default_timezone_set() function. In case you used any of those methods and you are still getting this warning, you most likely misspelled the timezone identifier. We selected the timezone 'UTC' for now, but please set date.timezone to select your timezone. in /home/artedomu/public_html/textpattern-4.7.0/textpattern/vendors/Textpattern/DB/Core.php on line 188
Warning: mktime(): It is not safe to rely on the system's timezone settings. You are *required* to use the date.timezone setting or the date_default_timezone_set() function. In case you used any of those methods and you are still getting this warning, you most likely misspelled the timezone identifier. We selected the timezone 'UTC' for now, but please set date.timezone to select your timezone. in /home/artedomu/public_html/textpattern-4.7.0/textpattern/vendors/Textpattern/DB/Core.php on line 188
Upon login, I get a Javascript alert which needs dismissing scores of times before the page displays:
Warning "mktime(): It is not safe to rely on the system's timezone settings. You are *required* to use the date.timezone setting or the date_default_timezone_set() function. In case you used any of those methods and you are still getting this warning, you most likely misspelled the timezone identifier. We selected the timezone 'UTC' for now, but please set date.timezone to select your timezone.".
I got rid of the alert by setting the site status to ‘Live’. Other than that, 4.7.0 runs normally.
Is there any way to to get rid of the time-zone settings alert?
Thanks!
Offline
Re: Character set issues after hosts migration to new cloud platform
giz wrote #312585:
Is there any way to to get rid of the time-zone settings alert?
Yes — add this to your config.php
:
date_default_timezone_set('Pacific/Auckland');
Offline
Re: Character set issues after hosts migration to new cloud platform
Thanks Pete.
I’m going to clone the problematic 4.5.7 site to my Mac, upgrade it to 4.7.0, and then copy this back onto the new cloud server. With a bit of luck I’ll kill 2 birds with 1 stone…
Offline
Re: Character set issues after hosts migration to new cloud platform
giz wrote #312587:
I’m going to clone the problematic 4.5.7 site to my Mac, upgrade it to 4.7.0, and then copy this back onto the new cloud server. With a bit of luck I’ll kill 2 birds with 1 stone…
Go for it. Be aware that you might create different issues – unless you spin up a VM with the same versions of PHP, MariaDB etc – so don’t rush things and maybe go 4.5.7 -> 4.6 -> 4.7 in steps rather than single jump.
Offline
Re: Character set issues after hosts migration to new cloud platform
i once had a devil of a time getting an old site upgraded to utf8
i was using an old setup of:
character_set_server: latin1
collation_server: latin1_swedish_ci
And i wanted to go to utf8
it was a long time ago and i have in my notes this site.
https://docs.moodle.org/23/en/Converting_your_MySQL_database_to_UTF8
BACKUP before preceding. I repeat BACKUP
…. texted postive
Offline
Re: Character set issues after hosts migration to new cloud platform
Thanks Giovanni – thats very useful.
Offline