Textpattern CMS support forum
#1 2008-01-17 03:29:07
- guiguibonbon
- Member
- Registered: 2006-02-20
- Posts: 296
&#38; in urls
Textpattern has the wisdom to change any ampersand in URLs into an HTML entity. That’s great. Except it would be better to use &amp; instead of &#38;. Here’s why: all programming languages have an equivalent to PHP’s htmlspecialchars_decode() function, but none of these functions (including PHP’s) seem to tackle &#38;. They only transform &amp;. Also, there’s an issue with functions such as urlencode(), which transform # into %23, thereby ruining the ampersand entity entirely. That of course does not happen with amp because it contains no special character.
This poses an issue in the case of newsletter apps which track clicks. They use redirects, and just perform a regular decoding function on the URLs they find in the HTML of the email. Meaning &#38; just stays. The bad news is that URLs containing the &# combination are plainly not valid and get cut right after the & by the servers. There’s no way of detecting it.
I emailed Campaign Monitor about the issue, and it sounds like they never had anyone complain about this. Maybe we should do like the rest of the world and use &amp;.
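To illustrate the truncation described above, here is a minimal Python sketch (not PHP, and not any tracker's real code). `decode_named_only` is a hypothetical stand-in for a decoder that only knows named entities, and `example.com` is a placeholder URL. It suggests why the URL appears cut right after the &: the leftover # is read as the start of a fragment.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for a click tracker's decoder: named entities only.
# "&amp;" is listed last so that "&amp;lt;" is not decoded twice.
NAMED_ENTITIES = {"&lt;": "<", "&gt;": ">", "&quot;": '"', "&amp;": "&"}


def decode_named_only(text: str) -> str:
    """Replace named entities; numeric references such as &#38; are left as-is."""
    for entity, char in NAMED_ENTITIES.items():
        text = text.replace(entity, char)
    return text


named = decode_named_only("http://example.com/?a=1&amp;b=2")
numeric = decode_named_only("http://example.com/?a=1&#38;b=2")

named_parts = urlsplit(named)
numeric_parts = urlsplit(numeric)

print(named_parts.query)       # a=1&b=2
print(numeric_parts.query)     # a=1&
print(numeric_parts.fragment)  # 38;b=2
```

With the numeric form, everything after the # ends up in the fragment, which is normally not sent to the server, so the second parameter is lost.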
Last edited by guiguibonbon (2008-01-17 03:46:20)
Re: &#38; in urls
As I understand it, the numeric entities are used to avoid problems in RSS/Atom feeds, due to readers not being able to understand named entities.
Re: &#38; in urls
I do not know anything about the matter but just in case guiguibonbon is right re the observation, what if the numeric entities are only used in the excerpt field?
Yiannis
——————————
NeMe | hblack.art | EMAP | A Sea change | Toolkit of Care
I do my best editing after I click on the submit button.
Re: &#38; in urls
colak wrote:
I do not know anything about the matter but just in case guiguibonbon is right re the observation, what if the numeric entities are only used in the excerpt field?
Colak, what do you mean?
The only issue is that instead of special chars we should use entities, as PHP has its own functions for that too. And special chars aren’t for URLs, entities are.
Cheers!
Re: &#38; in urls
htmlspecialchars() produces the escaping that ggbb wants, but TXP uses its own escape_output() function instead.
Depending on your preferences, body or excerpt is used in the feeds, so doing this just in the excerpt doesn’t help.
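As a rough analogue of the two escaping styles, here is a Python sketch. `html.escape` plays the role of PHP’s htmlspecialchars() here, and the numeric variant is a hypothetical one-liner for illustration only; it is not TXP’s escape_output(). It shows that both forms decode to the same URL when the consumer uses a full entity decoder.

```python
import html

raw_url = "http://example.com/?a=1&b=2"

# Named escaping, comparable in spirit to PHP's htmlspecialchars().
named_form = html.escape(raw_url, quote=True)

# Hypothetical numeric escaping, for illustration only.
numeric_form = raw_url.replace("&", "&#38;")

print(named_form)    # http://example.com/?a=1&amp;b=2
print(numeric_form)  # http://example.com/?a=1&#38;b=2

# A full entity decoder handles both named and numeric references.
print(html.unescape(named_form) == html.unescape(numeric_form))  # True
```

So the difference only bites when the consuming side uses a decoder limited to named entities.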
Re: &#38; in urls
ruud wrote:
As I understand it, the numeric entities are used to avoid problems in RSS/Atom feeds, due to readers not being able to understand named entities.
That would be strange. There are 5 predefined (named) entities in XML, and all XML processors are required to support those.
I’m not sure what Atom and RSS specs have to say on this.
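The point about predefined entities can be checked with a small Python sketch using the standard library’s XML parser. The `<link>` element and the `href_of` helper are made up for the example and do not come from any feed spec. A plain XML parser accepts both &amp; and &#38;, while an HTML-only named entity such as &nbsp; is rejected when no DTD declares it.

```python
import xml.etree.ElementTree as ET


def href_of(xml_text: str) -> str:
    """Parse a single element and return its href attribute."""
    return ET.fromstring(xml_text).get("href")


# &amp; is one of the five predefined XML entities; &#38; is a numeric reference.
named = href_of('<link href="http://example.com/?a=1&amp;b=2"/>')
numeric = href_of('<link href="http://example.com/?a=1&#38;b=2"/>')

# &nbsp; is an HTML entity, not predefined in XML, so parsing should fail.
try:
    ET.fromstring("<p>&nbsp;</p>")
    nbsp_accepted = True
except ET.ParseError:
    nbsp_accepted = False

print(named == numeric)  # True
print(nbsp_accepted)     # False
```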
Where is that emoji for a solar powered submarine when you need it ?
Sand space – admin theme for Textpattern
phiw13 on Codeberg