Textpattern CMS support forum

#1 2010-01-29 08:41:34

Ingemar
New Member
Registered: 2010-01-29
Posts: 3

Article title disappears when it contains "å", "ä" or "ö".

Hi

I have a strange problem with my article titles. My web host (one.com) recently upgraded their servers to PHP 5.3.1, which caused some problems with my page. I have solved all of them except one:

When I use the Swedish characters “å”, “ä” or “ö” in the article title, the article link disappears from the page. The URL still works: I can enter it manually in a web browser and see the article, but the article title is not there. The URL itself is named correctly, with “ä” changed to “ae” and so on.

I hope I have described the problem in an understandable way.

Can anyone help me here?

/Ingemar

#2 2010-01-29 10:23:26

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 11,270
Website GitHub

Re: Article title disappears when it contains "å", "ä" or "ö".

Ingemar wrote:

When I use the Swedish characters “å”, “ä” or “ö” in the article title, the article link disappears from the page.

Hmmm, very odd. I can’t replicate that here (though I’m only running 5.2.8 so maybe that’s the issue… or perhaps it’s a MySQL or character encoding thing?)

I made an article entitled Tack så mycket and its clickable URL title appears on my article list page as expected using <txp:permlink><txp:title /></txp:permlink> in my default Form.

Would you post your High diagnostics info please?


The smd plugin menagerie — for when you need one more gribble of power from Textpattern. Bleeding-edge code available on GitHub.

Txp Builders – finely-crafted code, design and Txp

#3 2010-01-29 10:35:56

Ingemar
New Member
Registered: 2010-01-29
Posts: 3

Re: Article title disappears when it contains "å", "ä" or "ö".

Yep, here it is:

Textpattern version: 4.0.6 (r2805)
Last Update: 2008-11-07 20:02:59/2008-11-07 18:50:40
Document root: /customers/ulrikawielander.se/ulrikawielander.se/httpd.www
$path_to_site: /customers/ulrikawielander.se/ulrikawielander.se/httpd.www
Textpattern path: /customers/ulrikawielander.se/ulrikawielander.se/httpd.www/textpattern
Permanent link mode: section_title
open_basedir: /customers/ulrikawielander.se/ulrikawielander.se:/var/www/diagnostics:/usr/share/php
upload_tmp_dir: /customers/ulrikawielander.se/ulrikawielander.se/tmp
Temporary directory path: files
Site URL: ulrikawielander.se
PHP version: 5.3.1
GD Image Library: version bundled (2.0.34 compatible), supported formats: GIF, PNG
Server Local Time: 2010-01-29 10:34:40
MySQL: 5.0.32-Debian_7etch11-log
Locale: en_US.UTF-8
Server: Apache
PHP Server API: cgi-fcgi
RFC 2616 headers: 0
Server OS: Linux 2.6.18-6-vserver-amd64

Pre-flight check:
————————————
File directory path is not writable: files
Temporary directory path is not writable: files
Some Textpattern files have been modified: /../index.php, /index.php, /publish.php
The following PHP functions (which may be necessary to run Textpattern) are disabled on your server: disk_total_space, diskfreespace, proc_nice
————————————

.htaccess file contents:
————————————
#DirectoryIndex index.php index.html

#Options +FollowSymLinks
#Options -Indexes

<IfModule mod_rewrite.c>
RewriteEngine On
#RewriteBase /relative/web/path/

RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.+) - [PT,L]

RewriteRule ^(.*) index.php

RewriteCond %{HTTP:Authorization} !^$
RewriteRule .* - [E=REMOTE_USER:%{HTTP:Authorization}]
</IfModule>

#php_value register_globals 0

————————————

Charset (default/config): latin1/
character_set_client: latin1
character_set_connection: latin1
character_set_database: latin1
character_set_filesystem: binary
character_set_results: latin1
character_set_server: latin1
character_set_system: utf8
character_sets_dir: /usr/share/mysql/charsets/
17 Tables: textpattern is utf8, txp_category is utf8, txp_css is utf8, txp_discuss is utf8, txp_discuss_ipban is utf8, txp_discuss_nonce is utf8, txp_file is utf8, txp_form is utf8, txp_image is utf8, txp_lang is utf8, txp_link is utf8, txp_log is utf8, txp_page is utf8, txp_plugin is utf8, txp_prefs is utf8, txp_section is utf8, txp_users is utf8

PHP extensions: Core/5.3.1, date/5.3.1, ereg, libxml, openssl, pcre, sqlite3/0.7-dev, zlib/1.1, bcmath, calendar, ctype, curl, dba, dom/20031129, hash/1.0, filter/0.11.0, gd, gettext, session, iconv, standard/5.3.1, json/1.2.1, mbstring, mcrypt, mysql/1.0, SPL/0.2, mysqli/0.1, PDO/1.0.4dev, pdo_mysql/1.0.2, pdo_sqlite/1.0.1, Reflection/$Revision: 287991 $, imap, SimpleXML/0.1, soap, exif/1.4 $Id: exif.c 287372 2009-08-16 14:32:32Z iliaa $, sysvshm, tokenizer/0.1, wddx, xml, xmlreader/0.1, xmlrpc/0.51, xmlwriter/0.1, xsl/0.1, cgi-fcgi

pretext_data: array ( 'id' => '', 's' => '', 'c' => '', 'q' => '', 'pg' => '', 'p' => '', 'month' => '', 'author' => '', 'request_uri' => '/e7760658aa2422efdd2d854996ee05f3/?txpcleantest=1', 'qs' => 'txpcleantest=1', 'subpath' => '\\/', 'req' => '/e7760658aa2422efdd2d854996ee05f3/?txpcleantest=1',
)

/../index.php: r2774 (589604ea755bd3d30fd279719044b82b)
/css.php: r2772 (4807cbc15661213f2b4d0fd26c7179ff)
/include/txp_admin.php: r2729 (0c2b3cf59ff433c943bcc293a526651a)
/include/txp_article.php: r2680 (49a7155d831f843bcf3e8de306dfe7f1)
/include/txp_auth.php: r2728 (c472bfbe49a71fd35e89000c8a18de08)
/include/txp_category.php: r2243 (0ed99b6f44b5d221bdf35674240141ab)
/include/txp_css.php: r2730 (7974aa87728b39d3afaba5a3b18cf6b5)
/include/txp_diag.php: r2791 (aeb96445180b68c31821e237b6150332)
/include/txp_discuss.php: r2774 (852a8a4d4307358e161e0501124b7247)
/include/txp_file.php: r2530 (9f34fdbf98b9b649d65e2ced4c9ca763)
/include/txp_form.php: r1913 (780340d28f384113c72924843194b43e)
/include/txp_image.php: r2668 (11269b464db6cfa3affff47674533a50)
/include/txp_import.php: r1238 (86f0e64d2c9362066e6c48b9cd486e37)
/include/txp_link.php: r2463 (2379d25f83b37ec6c8d5f3edb1122ce8)
/include/txp_list.php: r2725 (1ed6c6f729eaeb7f8a582b27cd5b9e78)
/include/txp_log.php: r2796 (f249e0962a996f05041b899fea91ccae)
/include/txp_page.php: r2717 (807ff04b4a649b54b3d710c1ab0a428f)
/include/txp_plugin.php: r2774 (e9fdc47a3ed9bdd13197d929161c6a13)
/include/txp_prefs.php: r2528 (50bd3be8c22e17d5ca2855ccea081bac)
/include/txp_preview.php: r1238 (c45992b3273ac8019477e2f959d63120)
/include/txp_section.php: r2759 (9208297e0bd7b3d41bd0e6f9fc9ab120)
/include/txp_tag.php: r2774 (f371b400e8d7318e2ac48e032fe6c274)
/index.php: r2805 (c905bfea9031934a35e1f5cfddb3bd89)
/lib/IXRClass.php: r765 (0120eb4713c9b6446a0eebe8b1039d1c)
/lib/admin_config.php: r1747 (b972529744cb37a7695fe00316dada41)
/lib/class.thumb.php: r2329 (c7f66a32531f32d6dfcbe5c7d26c7852)
/lib/classTextile.php: r2779 (b6d5b9cecbc5bc6475b5d1ee6a5231ea)
/lib/constants.php: r2361 (5338211ece1b2592804acdd204c9df33)
/lib/taglib.php: r2612 (727737ebd08127c632b9822bae87fee0)
/lib/txplib_admin.php: r2726 (c4f65bac2ddef62867f5bfee97ad7dfe)
/lib/txplib_db.php: r2748 (3feb369b1c34f251815cd6085a216d62)
/lib/txplib_forms.php: r2759 (a2d3de62110e582fab2a3a20224661f4)
/lib/txplib_head.php: r2783 (74ced647523a94da307af9853d7ed596)
/lib/txplib_html.php: r2696 (57985ebd2501bc303d2e97ae7538db1f)
/lib/txplib_misc.php: r2788 (7ecfaa5d4fabefbf411d01615dea9485)
/lib/txplib_update.php: r1239 (e3bd2d0c2b491d4028a656b8301a0086)
/lib/txplib_wrapper.php: r2800 (4ad38ee67f3ee8d9e7b51544a4f0f58b)
/publish.php: r2777 (dc367ce5182948ca6cac6cd5334d3a04)
/publish/atom.php: r2774 (50aa384a2edf7cc07effee9020e0893b)
/publish/comment.php: r2776 (0e1ea64316087edcd75f394494b42100)
/publish/log.php: r1637 (f69237dc2ff39bd7a691c8ca1bc87808)
/publish/rss.php: r2793 (022caa22c756c64f2255aae6625686d8)
/publish/search.php: r1748 (ea84e04b2c688b0bb8b5a9ecf395749a)
/publish/taghandlers.php: r2774 (59dc36e6dabc619e23c43f722fe7b8f1)
/update/_to_1.0.0.php: r711 (0f49fca8fbd8e6fca0fc48b0f69f0461)
/update/_to_4.0.2.php: r711 (e77c0e0d972868f19eaee4565bd0b4c4)
/update/_to_4.0.3.php: r711 (f5506cfd0fbc3ad4bd9a9b2299468775)
/update/_to_4.0.4.php: r711 (4d867b42ee87a7f11d2bff3a8e91bed0)
/update/_to_4.0.5.php: r2464 (dbe80cd4a775d3a43a203c3c4a2d0e3f)
/update/_to_4.0.6.php: r2464 (7e5ae73eb64c24438918697089a1f321)
/update/_update.php: r2792 (6ff7b4dedb2c7735a01e76b13b3f1fb1)

#4 2010-01-29 11:36:55

Bloke
Developer
From: Leeds, UK
Registered: 2006-01-29
Posts: 11,270
Website GitHub

Re: Article title disappears when it contains "å", "ä" or "ö".

Thanks. Great looking site, btw.

I’m not fabulously well versed in character set issues (being a lowly Englishman) so I’m sure someone else will be able to chime in if your issue is character-set based. A few observations that might be worth investigating:

Textpattern version: 4.0.6 (r2805)

Any chance of upgrading? There may have been fixes that deal with this since then.

Some Textpattern files have been modified: /../index.php, /index.php, /publish.php

Are they big changes that might impact article titles?

I flicked through the PHP 5 changelog to see if I could spot anything that might affect TXP’s ability to display non-ASCII chars, but so far I have not turned up anything concrete. Hopefully someone else who knows more about this kind of stuff can shed some light on it. Or, if someone else could confirm it’s an issue, we can start narrowing it down a bit to work out where the problem lies.


#5 2010-01-29 13:04:42

Gocom
Developer Emeritus
From: Helsinki, Finland
Registered: 2006-07-14
Posts: 4,533
Website

Re: Article title disappears when it contains "å", "ä" or "ö".

Bloke wrote:

I flicked through the PHP 5 changelog

PHP is famous for incomplete changelogs. They have dirty little secrets ;)

Me: looking at php changelog.
Brain: Okay should be fine.
Finger: Clicks Download.
Mouse: Press extract.
Keyboard: installs.
TXP: I don’t like PHP5, throws up.
Me: bitch slaps TXP.
PHP5: laughs.
MySQL: Gangs up with TXP against me.

(Actually PHP’s changelogs are pretty damn fine, some missing things tho. Just saying).

#6 2010-01-29 18:01:57

the_ghost
Plugin Author
From: Minsk, The Republic of Belarus
Registered: 2007-07-26
Posts: 907
Website

Re: Article title disappears when it contains "å", "ä" or "ö".

Do other non-ASCII chars crash the site?


Providing help in hacking ATM! Come to courses and don’t forget to bring us notebook and hammer! What for notebook? What a kind of hacker you are without notebok?

#7 2010-01-29 18:08:43

Ingemar
New Member
Registered: 2010-01-29
Posts: 3

Re: Article title disappears when it contains "å", "ä" or "ö".

No, the article itself works just fine with å, ä and ö. It is only the title that disappears.

#8 2012-12-07 08:38:39

lazlo
Member
Registered: 2004-02-24
Posts: 110

Re: Article title disappears when it contains "å", "ä" or "ö".

If this hasn’t been answered, I think the encoding on the MySQL server may be part of the problem.

It should be like this:
mysql> show variables like 'char%';

Variable_name             Value
character_set_client      utf8
character_set_connection  utf8
character_set_database    utf8
character_set_filesystem  binary
character_set_results     utf8
character_set_server      utf8
character_set_system      utf8
character_sets_dir        /usr/share/mysql/charsets/

More info here: Getting out of MySQL Character Set Hell

The second part of the problem may be how Textpattern stores <txp:title />.

If the title contains both UTF-8 and Latin ISO-8859-1 characters and the MySQL database is set to Latin, then I believe it may crap out.
AND something about <txp:title /> screws up my PayPal receipt. I believe it has Latin-encoded non-breaking spaces that are improperly parsed by my MySQL when passed to an outside entity. The offending character “Â” appears in the title of my PayPal receipt, where I assume <txp:title /> is inserting the Latin character for non-breaking space, 0xA0. Non-breaking space is 0xA0 in ISO-8859-1, and in UTF-8 it is 0xC2A0. This will look like “Â ”.

The solution to this on the Textpattern side would be to use &nbsp; not the non-breaking space 0xA0 of ISO-8859-1. This needs to be confirmed and I don’t know how to do it.
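
The byte values quoted above can be checked with a few lines of Python. This is only a sketch of the encoding arithmetic, not of anything Textpattern or PayPal actually does; the variable names are made up for the illustration.

```python
# The non-breaking space as a Unicode code point (U+00A0).
nbsp = "\u00a0"

# One byte in ISO-8859-1 (latin1), two bytes in UTF-8.
latin1_bytes = nbsp.encode("latin-1")
utf8_bytes = nbsp.encode("utf-8")

# Reading the UTF-8 bytes as if they were ISO-8859-1 yields two characters:
# 0xC2 becomes "Â" and 0xA0 stays a non-breaking space.
misread = utf8_bytes.decode("latin-1")

print(latin1_bytes.hex())  # a0
print(utf8_bytes.hex())    # c2a0
print(len(misread), misread[0])
```

So a stray “Â” in front of a space usually means UTF-8 text was decoded as ISO-8859-1 somewhere along the way, rather than that an ISO-8859-1 byte was inserted.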

Last edited by lazlo (2012-12-07 09:01:08)

#9 2012-12-07 21:32:22

ruud
Developer Emeritus
From: a galaxy far far away
Registered: 2006-06-04
Posts: 5,068
Website

Re: Article title disappears when it contains "å", "ä" or "ö".

Topic starter has this setup:

Charset (default/config): latin1/
17 Tables: textpattern is utf8, txp_category is utf8, txp_css is utf8, txp_discuss is utf8, txp_discuss_ipban is utf8, txp_discuss_nonce is utf8, txp_file is utf8, txp_form is utf8, txp_image is utf8, txp_lang is utf8, txp_link is utf8, txp_log is utf8, txp_page is utf8, txp_plugin is utf8, txp_prefs is utf8, txp_section is utf8, txp_users is utf8

That first line should be: Charset (default/config): latin1/utf8
Perhaps $txpcfg['dbcharset'] = 'utf8' is missing from the config.php file.
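
To illustrate why a latin1 connection in front of utf8 tables matters, here is a minimal Python sketch. It assumes MySQL converts result text to the connection character set and that the consumer expects strict UTF-8; whether that is what blanks the title on this particular site is an assumption, not something confirmed in this thread.

```python
title = "Tack så mycket"

# With a latin1 connection, text stored as utf8 is handed back
# converted to latin1 bytes.
latin1_result = title.encode("latin-1")

# A strict UTF-8 consumer cannot decode those bytes: 0xE5 ("å" in latin1)
# is not a valid UTF-8 sequence on its own.
try:
    latin1_result.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False

# With a utf8 connection the bytes round-trip cleanly.
utf8_result = title.encode("utf-8")
round_trip = utf8_result.decode("utf-8")

print(decodes_as_utf8)      # False
print(round_trip == title)  # True
```

A title made of plain ASCII would survive either way, which would fit the symptom that only titles with å, ä or ö are affected.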

#10 2012-12-08 17:02:57

Gocom
Developer Emeritus
From: Helsinki, Finland
Registered: 2006-07-14
Posts: 4,533
Website

Re: Article title disappears when it contains "å", "ä" or "ö".

lazlo wrote:

If the title contains both UTF-8 and Latin ISO-8859-1 in the title

Contains two character sets? … Um, what?

I assume <txp:Title> is inserting the latin character for non-breaking space 0xA0. Nonbreaking space is 0xA0 in ISO-8859-1, and when it comes up in UTF-8 it is 0xC2A0. This will look like “Â “.

Textpattern doesn’t add non-breaking space characters to titles. The title tag has a no-widow feature that prevents the last word from wrapping onto a second line by replacing the last space with a non-breaking space. It does so with an HTML entity, though, not with an ISO-8859-1 encoded character. Any data handled by Textpattern is stored and kept intact as UTF-8.
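
As a rough idea of what a no-widow feature does, here is a small Python sketch. It is not Textpattern’s code; the function name `no_widow` and the default entity `&#160;` are assumptions chosen for the illustration.

```python
def no_widow(title: str, entity: str = "&#160;") -> str:
    """Replace the last space in a title with an HTML entity so the
    final word cannot wrap onto a line of its own."""
    head, sep, last = title.rstrip().rpartition(" ")
    if not sep:
        # Single-word title: nothing to join.
        return title
    return f"{head}{entity}{last}"


print(no_widow("Tack så mycket"))  # Tack så&#160;mycket
```

The output is pure ASCII markup, so no ISO-8859-1 byte is introduced; the entity only turns into a real non-breaking space when something decodes the HTML.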

The solution to this on the Textpattern side would be to use &nbsp; not the non-breaking space 0xA0 of ISO-8859-1. This needs to be confirmed and I don’t know how to do it.

We don’t use either. Replacing characters with HTML entities internally isn’t a fix for anything. It’s the same as encoding your pages with one method, serving them as another, and then fixing the resulting issues by encoding everything as HTML entities. The wrong way to do it.

As far as I can see, there is no issue on Textpattern’s end. Likely the character you see in your PayPal receipts is that HTML entity. You can remove the entity by setting no_widow="0" in the tag. What is likely contributing to the issue is that you are not encoding URLs, together with how UTF-8 characters are handled at PayPal’s end and the settings your PayPal account uses. For instance, note that you should URL-encode any values you send to PayPal. You can’t directly use the title tag in a PayPal URL, or in any other query string.
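
For reference, URL-encoding a title before it goes into a query string looks roughly like this in Python. The parameter name `item_name` and the `example.com` URL are placeholders for the illustration, not a statement about what your PayPal integration expects.

```python
from urllib.parse import quote, urlencode

title = "Tack så mycket"

# Percent-encode the title as UTF-8; spaces become %20.
encoded = quote(title, safe="")

# Build a whole query string; here spaces become "+".
query = urlencode({"item_name": title})

print(encoded)  # Tack%20s%C3%A5%20mycket
print("https://example.com/pay?" + query)
```

The raw title, with its spaces and multi-byte characters, never appears in the URL; the receiving end decodes the percent-escapes back to UTF-8.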

Last edited by Gocom (2012-12-08 17:10:16)
