Textpattern CMS support forum

#1 2004-02-22 20:19:55

mamash
Member
From: Prague
Registered: 2004-02-21
Posts: 127
Website

Textile Internalization

Some issues when using Textile to store non-English texts:
  • Encoding special characters as entities is generally a bad idea for non-Western scripts. I don’t see the point of entity encoding as long as a correct charset declaration (UTF-8) is in place. Two issues arise when using Textile:
    • Htmlspecialchars destroys Czech (and possibly other) scripts
    • mb_encode_numericentity() is safer, though it only encodes some foreign characters while leaving others intact; therefore it’s useless IMHO.
  • The way Textile converts quotes and other characters is incompatible with the typographic rules of languages other than English. For example:
    • Czech rules require different „quoting characters“.
    • „Nested quotes have to ‚differ‘ in Czech“.
    • «Russian quotes» are another fine example.

Now, wouldn’t it be nice if Textile had some sort of language detection mechanism, allowing the user to select a default language (in Textpattern) and switch to others using a LANG tag? This would trigger different rendering behaviour. The list of supported languages could be expanded by competent users; otherwise Textile would default to English rules.
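The per-language idea could look roughly like the sketch below. It is written in Python purely for illustration and is not Textile's code: the `QUOTE_RULES` table and the `curl_quotes` function are hypothetical names, the Czech marks are taken from the examples above, and the inner Russian pair is an assumption. Unknown language codes fall back to the English rules, as proposed.

```python
import re

# Hypothetical rule table: (outer open, outer close, inner open, inner close).
QUOTE_RULES = {
    "en": ("\u201c", "\u201d", "\u2018", "\u2019"),
    "cs": ("\u201e", "\u201c", "\u201a", "\u2018"),
    "ru": ("\u00ab", "\u00bb", "\u201e", "\u201c"),  # inner pair is an assumption
}


def curl_quotes(text, lang="en"):
    """Replace straight quote pairs with the marks configured for lang."""
    # Fall back to English when the language is not in the table.
    o_open, o_close, i_open, i_close = QUOTE_RULES.get(lang, QUOTE_RULES["en"])
    # Paired straight double quotes become the outer marks.
    text = re.sub(r'"([^"]*)"',
                  lambda m: o_open + m.group(1) + o_close, text)
    # Paired straight single quotes that are not glued to a word
    # (so most apostrophes are skipped) become the inner marks.
    text = re.sub(r"(?<!\w)'([^']*)'(?!\w)",
                  lambda m: i_open + m.group(1) + i_close, text)
    return text


czech = curl_quotes("\"Nested quotes have to 'differ' in Czech\"", "cs")
```

A real implementation would also have to cope with apostrophes, unbalanced quotes and quotes inside markup, which this sketch ignores.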


Who’s gonna textdrive you home tonight?

#2 2004-02-23 16:23:37

mamash
Member
From: Prague
Registered: 2004-02-21
Posts: 127
Website

Re: Textile Internalization

Not sure what you mean by “putting a lang parameter in the Textile class”, but Textile already recognizes a language setting tag. It doesn’t, however, trigger any special behavior (well, the browsers don’t really respond to it, either).

Some more food for thought: in Czech there are single-character prepositions that are not allowed to appear at the end of the line. Adding a non-breakable space is an easy solution to that. I’ve also been experimenting with word-hyphenation, but that would require referencing a hyphenation dictionary/database which would take more time than I’m currently willing to sacrifice.
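The non-breaking space fix is small enough to sketch. The following Python is only an illustration, not part of Textile; the preposition list (`k`, `s`, `v`, `z`, `o`, `u`) and the name `bind_prepositions` are assumptions made for the example.

```python
import re

NBSP = "\u00a0"  # non-breaking space

# Assumed set of single-character Czech prepositions, both cases.
PREPOSITIONS = "ksvzouKSVZOU"

# A single preposition letter standing alone as a word, followed by
# one ordinary space and then the next word.
_PATTERN = re.compile(r"(?<!\w)([" + PREPOSITIONS + r"]) (?=\w)")


def bind_prepositions(text):
    """Glue single-letter prepositions to the following word."""
    return _PATTERN.sub(lambda m: m.group(1) + NBSP, text)


bound = bind_prepositions("Jdu k lesu a v lese")
```

Because the preposition is now joined to the next word by a non-breaking space, a browser can no longer leave it stranded at the end of a line.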

My post was merely trying to suggest some possible future framework, as there is IMHO a great potential in turning Textile into a general typographical tool for virtually any language. Hmmm… is this sane enough, Dean? :)


Who’s gonna textdrive you home tonight?

#3 2004-02-23 16:34:25

Dean
Founder (Gone, but not forgotten)
From: Languedoc
Registered: 2004-02-14
Posts: 235
Website

Re: Textile Internalization

It surely is, and it raises much to think about. I want to do a lot of work on internationalising Textile. Let’s discuss it further when the other rhino has left the dinner table.


text*

#4 2004-02-23 16:41:09

pospel
Member
From: Ukraine
Registered: 2004-02-23
Posts: 40
Website

Re: Textile Internalization

> mamash wrote:

** Htmlspecialchars destroys Czech (and possibly other) scripts

Not only Czech, but any text in a non-ISO-8859-1 encoding.

#5 2004-02-23 16:58:55

Dean
Founder (Gone, but not forgotten)
From: Languedoc
Registered: 2004-02-14
Posts: 235
Website

Re: Textile Internalization

htmlspecialchars() is surprisingly shitty for a PHP function – this is why Textile relies whenever possible on multibyte string functions. Mamash is right, however, in noting that mbstring encoding is only as effective as the map it is given – and the one currently made available on the Textile demo page is still decidedly (western) Euro-centric.

Though it’s still miles kilometres better than htmlspecialchars().
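To illustrate why the map matters: PHP's mb_encode_numericentity() takes a conversion map describing which code point ranges to encode. The Python sketch below models that map as plain (start, end) ranges, which is a simplification, and both maps shown are hypothetical rather than the one used on the Textile demo page.

```python
def encode_numeric_entities(text, ranges):
    """Encode only the characters whose code point falls in one of ranges."""
    out = []
    for ch in text:
        cp = ord(ch)
        if any(start <= cp <= end for start, end in ranges):
            out.append("&#%d;" % cp)
        else:
            out.append(ch)  # outside the map: left untouched
    return "".join(out)


# Hypothetical Western-centric map: Latin-1 Supplement only.
WESTERN_MAP = [(0x80, 0xFF)]
# Hypothetical map covering everything above ASCII.
FULL_MAP = [(0x80, 0x10FFFF)]

partial = encode_numeric_entities("\u00e9 \u017e", WESTERN_MAP)  # "é ž"
complete = encode_numeric_entities("\u00e9 \u017e", FULL_MAP)
```

With the narrow map the French é is encoded while the Czech ž passes through unchanged, which is the "some characters but not others" behaviour described earlier in the thread.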


text*

#6 2004-02-23 18:38:12

mamash
Member
From: Prague
Registered: 2004-02-21
Posts: 127
Website

Re: Textile Internalization

I agree.

Still I’m not convinced that entity encoding is necessary in general. I mean that’s what charset declarations are good for, right?
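A rough sketch of that approach, in Python for illustration only: escape just the characters that are significant to HTML, leave everything else alone, and send the page as UTF-8 with a matching declaration. The function names `render_fragment` and `render_page` are hypothetical.

```python
import html


def render_fragment(text):
    """Escape only &, <, > and quotes; other characters stay as they are."""
    return html.escape(text, quote=True)


def render_page(text):
    """Wrap the fragment in a page that declares UTF-8 and return bytes."""
    page = (
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        "<p>" + render_fragment(text) + "</p>"
    )
    return page.encode("utf-8")


fragment = render_fragment('P\u0159\u00edli\u0161 <b> & "k\u016f\u0148"')
page_bytes = render_page("k\u016f\u0148")
```

This only works if the declared charset really matches the bytes sent, and if the browser honours the declaration.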

Which brings up another issue: programming for a Firebird/PostgreSQL database interface would probably be a good thing, since Unicode support in MySQL is still alpha.


Who’s gonna textdrive you home tonight?

#7 2004-02-23 19:31:03

Dean
Founder (Gone, but not forgotten)
From: Languedoc
Registered: 2004-02-14
Posts: 235
Website

Re: Textile Internalization

Believe me I’d love to remove entities from the equation altogether, but bad browsers are still stinking up the place.

Every interaction Txp has with MySQL now involves vanilla SQL passed through a single safe_query() function, thus I believe a port to PostgreSQL, if someone wants to do it, would be pretty easy to pull off.
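The single-choke-point design can be sketched as follows. This is Python with the standard sqlite3 module standing in for the real database, not Textpattern's PHP; only the name safe_query comes from the post above, and its signature here is an assumption.

```python
import sqlite3

# Stand-in connection; in a port this is the only place that would change.
_connection = sqlite3.connect(":memory:")


def safe_query(sql, params=()):
    """Run one SQL statement through the single database entry point."""
    cursor = _connection.execute(sql, params)
    rows = cursor.fetchall()
    _connection.commit()
    return rows


# Callers only ever see safe_query(), never the driver underneath.
safe_query("CREATE TABLE articles (id INTEGER, title TEXT)")
safe_query("INSERT INTO articles VALUES (?, ?)", (1, "k\u016f\u0148"))
titles = safe_query("SELECT title FROM articles WHERE id = ?", (1,))
```

As long as the SQL passed in stays portable, swapping the backend means rewriting the body of one function.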


text*

#8 2004-02-23 19:47:48

mamash
Member
From: Prague
Registered: 2004-02-21
Posts: 127
Website

Re: Textile Internalization

Which “bad browsers” do you mean?

(Hm, maybe I should stop bugging you now, so that you could finally publish the gamma and allow me to start playing with it…)


Who’s gonna textdrive you home tonight?

#9 2004-02-24 07:23:29

RickCogley
New Member
Registered: 2004-02-24
Posts: 3

Re: Textile Internalization

Re Japanese:

  • basic things seem to work.
  • I have three articles, but search does not work.
  • some textile markup does not work – i.e. text where text is some Japanese string.

Regards


Rick Cogley :: Tokyo Japan

#10 2004-02-24 07:27:10

RickCogley
New Member
Registered: 2004-02-24
Posts: 3

Re: Textile Internalization

Re Japanese: text searching is NOT working from the site main page, as of the first gamma.

However it DOES work from the admin area.


Rick Cogley :: Tokyo Japan

#11 2004-02-24 15:45:07

pospel
Member
From: Ukraine
Registered: 2004-02-23
Posts: 40
Website

Re: Textile Internalization

I’ve changed utf-8 to windows-1251 in every “charset” place where it appears. I also changed it in all functions that use the defined encoding.

Everything seems to work, but Cyrillic text that is entered gets converted to numeric character codes (&#symbol), and the source of any page with Russian letters ends up bloated. I guess it can be patched easily, but I haven’t found where exactly yet. Maybe someone in the community has already done it?
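The size overhead is easy to measure. The Python below is only an illustration with a made-up sample word and a hypothetical helper, `as_numeric_entities`; it compares the same Cyrillic text stored as windows-1251, as UTF-8, and as decimal numeric entities.

```python
def as_numeric_entities(text):
    """Replace every non-ASCII character with a decimal numeric entity."""
    return "".join(
        ch if ord(ch) < 128 else "&#%d;" % ord(ch) for ch in text
    )


sample = "\u041f\u0440\u0438\u0432\u0435\u0442"  # six Cyrillic letters

sizes = {
    "windows-1251": len(sample.encode("cp1251")),
    "utf-8": len(sample.encode("utf-8")),
    "numeric entities": len(as_numeric_entities(sample).encode("ascii")),
}
```

For this sample the entity form takes 7 bytes per letter, against 1 byte in windows-1251 and 2 bytes in UTF-8.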
