
Textpattern CMS support forum


#1 2005-11-14 01:50:37

zem
Developer Emeritus
From: Melbourne, Australia
Registered: 2004-04-08
Posts: 2,579

Assignment: RSS and Atom tests, take 2

Following on from this thread, here’s a repost with the same data, hopefully presented in a clearer fashion.

Here is a collection of sample Textpattern articles, in plain text and MySQL formats:

articles.txt
articles.sql.txt

Here are the RSS and Atom feeds produced by Textpattern 4.0.2 from those articles in “long” (syndicate body) mode:

atom_1_0_long.xml
rss_0_92_long.xml
rss_2_0_long.xml

And here are the RSS and Atom feeds in “short” (syndicate excerpt) mode:

atom_1_0_short.xml
rss_0_92_short.xml
rss_2_0_short.xml

There are plenty of problems demonstrated here. The sample articles have been carefully constructed to include lots of tricky cases: textile and html, txp tags, various types of entities, utf-8, and so on.

In some cases, there are simple problems of the kind detected by feedvalidator. This is most obvious with the “RSS 2.0” feed, which is not a valid 2.0 feed at all, but a straight copy of the 0.92 feed with the version number changed.

The problems we need help with are more subtle than validation: things that the feed validator would pass as valid but that are not correct, i.e. where the final output as displayed by a feed reader doesn’t match the original article. (Hypothetical example: a timestamp that’s formatted correctly, but shows the wrong time; feedvalidator won’t know it’s wrong.) Other things might be correct, but we simply don’t know for sure. (For example, the encoding rules for the RSS description field: are we supposed to entity-encode, double-encode, CDATA-encode, or not encode at all?)

What we need is for people to examine these XML feeds, compare them with the sample articles, and tell us where things are wrong, and what the output should be. Check them by hand in a text editor, if that’s your thing, or load them up in various feed readers and check them there. (Though those links above are static files, you should be able to subscribe to them as if they were real feeds.)

If you find something wrong, edit and correct the XML source of the feed in a text editor, and email a copy to me, or post it here. Or simply post a clear description of what’s wrong, and what it needs to change to. I’ll endeavour to keep those URLs updated with changes as they’re submitted, so we can fix one thing at a time.

Once we know exactly what the feed output is supposed to be, updating the code to produce it is a piece of cake.

Last edited by zem (2005-11-14 02:08:34)


Alex


#2 2005-11-22 01:55:12

zem
Developer Emeritus
From: Melbourne, Australia
Registered: 2004-04-08
Posts: 2,579

Re: Assignment: RSS and Atom tests, take 2

“For example, the encoding rules for the RSS description field: are we supposed to entity-encode, double-encode, CDATA-encode, or not encode at all?”

Things are even more complex in RSS 2.0, it seems. Some possibilities:

1. Stripped HTML in the description field
2. Entity-encoded HTML in the description field
3. CDATA-encoded HTML in the description field
4. Entity-encoded HTML in the content:encoded field
5. CDATA-encoded HTML in the content:encoded field
6. Unencoded HTML in the xhtml:body field

Further complicating matters, it seems likely that no single option will work well with all aggregators (except for (1), which is hardly desirable). So it’s probably necessary to include the description field as well if one of (4), (5), or (6) is used. Then there’s the excerpt/body complication: which goes where, and what do we do if one or both are empty?
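
As far as an XML parser is concerned, options 2 and 3 should be interchangeable: both decode to the same text, while double-encoding leaves a layer of entities behind. Here is a minimal Python sketch of that check, using only the standard library. The sample HTML string and the helper name `description_text` are made up for illustration and are not taken from the sample articles.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical article body containing markup and an entity.
html = '<p>Fish &amp; chips</p>'

def description_text(inner_xml):
    """Wrap inner_xml in a <description> element, parse it, return its text."""
    return ET.fromstring('<description>' + inner_xml + '</description>').text

entity_encoded = escape(html)                # option 2: entity-encoded
cdata_encoded = '<![CDATA[' + html + ']]>'   # option 3: assumes html has no "]]>"
double_encoded = escape(escape(html))        # encoded twice

# True means the parser hands back exactly the original HTML.
print(description_text(entity_encoded) == html)
print(description_text(cdata_encoded) == html)
print(description_text(double_encoded) == html)
```

This only shows what the parser sees. Whether a given aggregator then treats the decoded text as HTML or as plain text is a separate question that still needs testing in real readers.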

Seriously, people: this stuff is taking up a huge amount of time that could otherwise be spent creating new features. Help free up the dev team to work on new things by contributing here. You don’t have to write code, just investigate the options, modify the sample feeds, and test them in aggregators and the feed validator.


Alex


#3 2005-11-22 02:32:31

Mary
Sock Enthusiast
Registered: 2004-06-27
Posts: 6,236

Re: Assignment: RSS and Atom tests, take 2

Um, all the xml files above have parsing errors according to Firefox. I wish I could help more than that, but I haven’t a clue what should be displayed. For my own feeds I did #5 of your list, and it took me quite some searching and time to figure even that out.

So for Atom:

<content type="text/html" mode="escaped">
<![CDATA[
html here
]]>
</content>

and RSS:

<content:encoded>
<![CDATA[
html here
]]>
</content:encoded>

But, that’s probably not helpful, right?


#4 2005-11-22 03:01:37

zem
Developer Emeritus
From: Melbourne, Australia
Registered: 2004-04-08
Posts: 2,579

Re: Assignment: RSS and Atom tests, take 2

That’s one possibility, but the RSS 2.0 spec says <description> is a required element, and it seems some aggregators ignore <content:encoded>.


Alex


#5 2005-11-22 04:08:12

squaredeye
Member
From: Greenville, SC
Registered: 2005-07-31
Posts: 1,495

Re: Assignment: RSS and Atom tests, take 2

zem,
Not sure if this helps, but I wasn’t able to read any of the RSS feeds in Safari’s native reader.
The Atom feed seemed to display appropriately, though.

Matthew



#6 2005-11-22 04:23:20

zem
Developer Emeritus
From: Melbourne, Australia
Registered: 2004-04-08
Posts: 2,579

Re: Assignment: RSS and Atom tests, take 2

The Atom feeds are pretty good, as far as I’ve been able to determine. There are a couple of cases it gets wrong, but they’re relatively obscure and pedantic (named entity in title, empty excerpt/body). Most aggregators that correctly support Atom 1.0 appear to display them correctly (with the exception of FeedDemon, apparently).


Alex


#7 2005-11-22 05:51:15

Mary
Sock Enthusiast
Registered: 2004-06-27
Posts: 6,236

Re: Assignment: RSS and Atom tests, take 2

description is only required for the channel element, not item; an item needs at least one of title or description, but it doesn’t need both.

Ignores content:encoded? Remember if you use it you have to declare that namespace. See my feed. But if they’re ignoring valid xml/rss, should we care? ;D
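
For anyone who wants to experiment with the “description plus content:encoded” combination, here is a rough Python sketch (standard library only) that builds such an item with the content module namespace declared. The function name `build_item`, the channel values, and the sample strings are invented for illustration; this is not how Textpattern generates its feeds.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the RSS content module; it must be declared in the feed.
CONTENT_NS = 'http://purl.org/rss/1.0/modules/content/'
ET.register_namespace('content', CONTENT_NS)

def build_item(title, excerpt, body_html):
    """Build an <item> carrying both <description> and <content:encoded>."""
    item = ET.Element('item')
    ET.SubElement(item, 'title').text = title
    ET.SubElement(item, 'description').text = excerpt
    ET.SubElement(item, '{%s}encoded' % CONTENT_NS).text = body_html
    return item

rss = ET.Element('rss', version='2.0')
channel = ET.SubElement(rss, 'channel')
# Placeholder channel metadata (hypothetical values).
ET.SubElement(channel, 'title').text = 'Example'
ET.SubElement(channel, 'link').text = 'http://example.com/'
ET.SubElement(channel, 'description').text = 'Example feed'
channel.append(build_item('Hello', 'A short excerpt', '<p>Full <b>body</b></p>'))

# The serializer entity-encodes the HTML and emits the xmlns:content declaration.
xml_text = ET.tostring(rss, encoding='unicode')
print(xml_text)
```

The serializer entity-encodes the body here (option 4 in the list above); a CDATA section would need a different writer, but a parser should read both the same way.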


#8 2006-01-11 18:39:12

mr.riff
Member
From: H-bad, India
Registered: 2006-01-11
Posts: 10

Re: Assignment: RSS and Atom tests, take 2

I must admit that I’m not completely certain of what’s being discussed here, but I’ve recently coded up an RSS 2.0 feed for my Textpattern install using PHP & Textpattern’s sections/pages/articles and it works perfectly everywhere I’ve tested it (Firefox, Flock, Google Reader, Opera, FeedDemon, etc).

It’s available here: http://itch.in/feeds/journal-itch.xml/

Here is a snippet of what I’m using:

<code>
<item>
<title><txp:title /></title>
<link><txp:permlink />/</link>
<description><txp:php> echo str_replace(array('&', '"', "'", '<', '>'), array('&amp;', '&quot;', '&apos;', '&lt;', '&gt;'), $thisarticle['body']); </txp:php></description>
<pubDate><txp:posted format="%a, %d %b %Y %H:%M:%S +0530" /></pubDate>
</item>
</code>

The only drawback is that it’s served as text/html (txp’s default for pages) instead of the required application/rss+xml. Hope this helps somewhat.
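
For comparison outside of Textpattern, the same two steps (escaping the body and formatting the date) could be sketched in Python roughly as follows. The function names `escape_for_rss` and `rss_pub_date`, the sample string, and the sample date are hypothetical; the one intended difference from the snippet above is that the timezone offset comes from the datetime object rather than being hard-coded.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from xml.sax.saxutils import escape

def escape_for_rss(text):
    """Apply the same five replacements as the str_replace call above.

    escape() handles &, < and > (ampersand first, so nothing is encoded
    twice); the extra mapping covers the two quote characters.
    """
    return escape(text, {'"': '&quot;', "'": '&apos;'})

def rss_pub_date(posted):
    """Format a timezone-aware datetime in the RFC 822 style RSS expects."""
    return format_datetime(posted)

# Hypothetical sample values, using a +05:30 offset like the snippet above.
ist = timezone(timedelta(hours=5, minutes=30))
print(escape_for_rss('<a href="x">Tom & Jerry\'s</a>'))
print(rss_pub_date(datetime(2006, 1, 11, 18, 39, 12, tzinfo=ist)))
```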


I’ve PIZZA.


#9 2006-01-17 21:54:10

zem
Developer Emeritus
From: Melbourne, Australia
Registered: 2004-04-08
Posts: 2,579

Re: Assignment: RSS and Atom tests, take 2

I’ve added a page to the Feature Requests section in TextBook.


Alex


#10 2006-01-17 22:08:34

Dawk
Member
Registered: 2005-02-22
Posts: 74

Re: Assignment: RSS and Atom tests, take 2

I get parse errors on all 6 .xml feeds using the Sage ext for Firefox.

Firefox: v1.0.7
Sage: v1.3.6

mr.riff’s feed in post#8 loaded & displays fine.

I just checked the feeds on my new 4.0.3 txp install; both the RSS & Atom feeds load fine.

I’ll try to help out on this as much as I can.

Last edited by Dawk (2006-01-17 22:46:11)


#11 2006-01-17 22:30:18

KurtRaschke
Plugin Author
Registered: 2004-05-16
Posts: 275

Re: Assignment: RSS and Atom tests, take 2

Dawk, have you seen the post Troubleshooting Feed Issues? It may give you some useful guidance in getting your feeds working.

-Kurt

Last edited by KurtRaschke (2006-01-17 22:30:38)


kurt@kurtraschke.com


#12 2006-01-17 22:47:18

Dawk
Member
Registered: 2005-02-22
Posts: 74

Re: Assignment: RSS and Atom tests, take 2

Thanks Kurt,

It was an unencoded ampersand in my tagline (doh). Both validate & work fine now.
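
In case it helps anyone else hitting parse errors: a bare ampersand makes the whole feed ill-formed, and a quick well-formedness check catches it before an aggregator does. Below is a small Python sketch using only the standard library; the function name `feed_parse_error` and the sample feed strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

def feed_parse_error(xml_text):
    """Return None if xml_text is well-formed XML, else the parser's message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None

# Hypothetical tagline with a bare ampersand, and the corrected version.
bad = ('<rss version="2.0"><channel>'
       '<description>News & views</description>'
       '</channel></rss>')
good = bad.replace('News & views', 'News &amp; views')

print(feed_parse_error(bad))   # a parser error message
print(feed_parse_error(good))  # None
```

This only checks that the XML is well-formed; it says nothing about whether the feed is valid RSS or Atom, so feedvalidator is still worth running afterwards.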

