Textpattern CMS support forum

#1 2006-06-11 05:24:19

colak
Admin
From: Cyprus
Registered: 2004-11-20
Posts: 9,351

numbers of special characters

Does anyone here know of any online resource listing the chr numbers of special characters?
ie something like

… , ellipsis , chr(133)

Yiannis
——————————
NeMe | hblack.art | EMAP | A Sea change | Toolkit of Care
I do my best editing after I click on the submit button.

#2 2006-06-11 05:35:21

Mary
Sock Enthusiast
Registered: 2004-06-27
Posts: 6,236

Re: numbers of special characters

I just use the list from the HTML spec. I keep meaning to make my own nice list of them, but never get around to it.

#3 2006-06-11 06:57:58

colak
Admin
From: Cyprus
Registered: 2004-11-20
Posts: 9,351

Re: numbers of special characters

Mary wrote:

I just use the list from the HTML spec. I keep meaning to make my own nice list of them, but never get around to it.

Thanks Mary, but it’s not the HTML entities I’m looking for; it is the chr numbers. To explain: NeMe has just started a forum, and some of its users copy and paste articles from word processors (Word etc.). Those entries are not parsed properly by punBB, causing the page not to validate. Up to here I have no problem, except that “version 1.3 is intended to be capable of being served as application/xhtml+xml and this problem could well cause a page to fail completely” (quote from the punBB forum). I asked the question there too, and they offered this solution:

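<pre>function CleanupSmartQuotes($text)
{
    // Windows-1252 bytes that word processors emit for "smart" punctuation.
    $badwordchars = array(
        chr(145), // left single quote
        chr(146), // right single quote
        chr(147), // left double quote
        chr(148), // right double quote
        chr(151)  // em dash
    );
    // Replacements, in the same order as the bytes above.
    $fixedwordchars = array(
        "'",
        "'",
        '&quot;',
        '&quot;',
        '&mdash;'
    );
    return str_replace($badwordchars, $fixedwordchars, $text);
}</pre>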

My problem is that, apart from the above, other characters are used too, and I do not know where to find their chr() numbers.



#4 2006-06-11 08:02:19

Mary
Sock Enthusiast
Registered: 2004-06-27
Posts: 6,236

Re: numbers of special characters

They are the same as the entity numbers. e.g: a copyright symbol is &copy; or &#169; or chr(169)
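
A minimal sketch of where this equivalence holds, assuming the text is ISO-8859-1 (latin1) and the byte lies in the 160 to 255 range, where the byte value and the Unicode code point are the same number. The helper name `latin1_byte_to_reference` is made up for illustration.

```php
<?php
// Hypothetical helper: build a numeric character reference from one byte.
// Assumes ISO-8859-1 input in the 160-255 range, where the byte value and
// the Unicode code point are the same number.
function latin1_byte_to_reference($byte)
{
    return '&#' . ord($byte) . ';';
}

echo latin1_byte_to_reference(chr(169)), "\n"; // prints &#169;
```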

#5 2006-06-11 14:36:43

colak
Admin
From: Cyprus
Registered: 2004-11-20
Posts: 9,351

Re: numbers of special characters

Mary wrote:

They are the same as the entity numbers. e.g: a copyright symbol is &copy; or &#169; or chr(169)

Are you sure? I see, for example, that
… , ellipsis , chr(133), has an entity of &#8230;

Last edited by colak (2006-06-11 14:38:37)



#6 2006-06-11 15:48:18

Sencer
Archived Developer
From: cgn, de
Registered: 2004-03-23
Posts: 1,803

Re: numbers of special characters

colak wrote:

Does anyone here know of any online resource listing the chr numbers of special characters?
ie something like … , ellipsis , chr(133)

chr() is a one-byte function. It only works with PHP’s internal character encoding, which is latin1. It breaks with Unicode characters, which are often encoded with multiple bytes (especially the special characters). It also doesn’t work well with Windows-1252, which differs from latin1 in a few places. See:

http://www.intertwingly.net/stories/2004/04/14/i18n.html (it also has tips on how to convert). [edit:fixed link]

What character set is used on the forum? Is it also explicitly declared on the pages that contain the forms? Check that, because if not, it can lead to the problems you described.
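
To make the one-byte point concrete, here is a small sketch. It assumes the forum receives either a lone Windows-1252 byte or UTF-8 text. `is_valid_utf8` is a hypothetical helper name, and the check relies on preg_match() refusing subjects that are not well-formed UTF-8 when the `/u` modifier is set.

```php
<?php
// chr() always returns exactly one byte, whatever number it is given.
$single = chr(133);
echo strlen($single), "\n";       // prints 1

// The ellipsis U+2026 encoded as UTF-8 is three bytes, so no single
// chr() call can produce it.
$utf8Ellipsis = "\xE2\x80\xA6";
echo strlen($utf8Ellipsis), "\n"; // prints 3

// Hypothetical helper: true when $text is well-formed UTF-8.
// With the /u modifier, preg_match() fails on malformed UTF-8 subjects.
function is_valid_utf8($text)
{
    return preg_match('//u', $text) === 1;
}

var_dump(is_valid_utf8($utf8Ellipsis)); // bool(true)
var_dump(is_valid_utf8($single));       // bool(false)
```

A check like this can tell you whether a pasted entry arrived as UTF-8 or as raw single-byte text before any chr()-based replacement is attempted.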

Last edited by Sencer (2006-06-11 15:53:38)

#7 2006-06-11 16:00:18

Sencer
Archived Developer
From: cgn, de
Registered: 2004-03-23
Posts: 1,803

Re: numbers of special characters

Or you can look at this table:

http://www.microsoft.com/globaldev/reference/sbcs/1252.mspx

It has the hex values for Windows-1252 and the corresponding Unicode code points. 133 decimal is 85 in hex. Looking at the table for 85, you’ll see 85 = U+2026 : HORIZONTAL ELLIPSIS.
2026 is hex; the corresponding decimal value is 8230. If you are converting to numeric character references you don’t have to do the last step, because &#x2026; and &#8230; both work.
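
The lookup described above can be sketched in PHP roughly as follows. This is an illustration, not the punBB fix: the function names are hypothetical, the map is the standard Windows-1252 assignment for bytes 0x80 to 0x9F, and it assumes the input really is single-byte Windows-1252 text. Running it over UTF-8 would corrupt multi-byte characters, because their continuation bytes fall in the same range.

```php
<?php
// Windows-1252 bytes 0x80-0x9F and their Unicode code points.
// 0x81, 0x8D, 0x8F, 0x90 and 0x9D are unassigned and therefore omitted.
function cp1252_code_points()
{
    return array(
        0x80 => 0x20AC, 0x82 => 0x201A, 0x83 => 0x0192, 0x84 => 0x201E,
        0x85 => 0x2026, 0x86 => 0x2020, 0x87 => 0x2021, 0x88 => 0x02C6,
        0x89 => 0x2030, 0x8A => 0x0160, 0x8B => 0x2039, 0x8C => 0x0152,
        0x8E => 0x017D, 0x91 => 0x2018, 0x92 => 0x2019, 0x93 => 0x201C,
        0x94 => 0x201D, 0x95 => 0x2022, 0x96 => 0x2013, 0x97 => 0x2014,
        0x98 => 0x02DC, 0x99 => 0x2122, 0x9A => 0x0161, 0x9B => 0x203A,
        0x9C => 0x0153, 0x9E => 0x017E, 0x9F => 0x0178,
    );
}

// Replace each of those bytes with a decimal numeric character reference.
// Assumption: $text is Windows-1252, not UTF-8.
function cp1252_to_numeric_references($text)
{
    $search = array();
    $replace = array();
    foreach (cp1252_code_points() as $byte => $codePoint) {
        $search[] = chr($byte);
        $replace[] = '&#' . $codePoint . ';';
    }
    return str_replace($search, $replace, $text);
}

// The arithmetic from the post: 133 decimal is 85 hex, 2026 hex is 8230 decimal.
echo dechex(133), "\n";    // prints 85
echo hexdec('2026'), "\n"; // prints 8230

echo cp1252_to_numeric_references('Wait' . chr(133) . ' ' . chr(147) . 'done' . chr(148)), "\n";
// prints Wait&#8230; &#8220;done&#8221;
```

Generating the search and replace arrays from one map keeps the byte and its reference side by side, so adding a character means adding one entry instead of editing two parallel lists.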
