Misunderstanding RSS

Thu, 05 Apr 2007 00:35:24 GMT

Alex Iskold at Read/WriteWeb repeats several misunderstandings of what RSS is in his article today, The Future of RSS. I want to point out these flaws so that other people don't make the same mistakes Alex made. Let's start with the basics. The RSS that he presents is invalid. In fact, if you try to validate it, the validator tells you little more than that this is not RSS.

<channel>
  <title>Read/WriteWeb</title>
  <link>http://www.readwriteweb.com/</link>
  <description>Web Technology news, reviews and analysis</description>
  <lastBuildDate>Mon, 02 Apr 2007 15:23:01 -0800</lastBuildDate>

  <item>
    <title>Morfik Patents AJAX Compiler</title>
    <description>Morfik Patents AJAX Compiler...</description>
    <link>http://www.readwriteweb.com/...</link>
    <category>News</category>
    <pubDate>Mon, 02 Apr 2007 15:23:01 -0800</pubDate>
    <author>Richard MacManus</author>
  </item>

  <item>
    <title>EMI Music DRM-free</title>
    <description>Morfik Patents AJAX Compiler...</description>
    <link>http://www.readwriteweb.com/...</link>
    <category>News</category>
    <pubDate>Mon, 02 Apr 2007 15:23:01 -0800</pubDate>
    <author>Richard MacManus</author>
  </item>

</channel>

He made two mistakes in writing this sample RSS. First, he forgot to wrap the channel in the root <rss> element. Second, he uses the <author> element inappropriately. The <author> element must contain an email address. It could be of the form <author>randy@kbcafe.com</author> or <author>randy@kbcafe.com (Randy Charles Morin)</author>, but it cannot be simply the author's name. If you want to give the author's name without an email address, then you should use the Dublin Core creator element, as in this sample:

      <dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Richard MacManus</dc:creator>

I rewrote his RSS using proper semantics. You can download it here. But even this version lacks the recommended <guid> element. Omitting it is not so much an error, but it makes your RSS less useful, as you'll see later.
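The checks described above can be sketched in a few lines of Python. This is only an illustration, not a replacement for a real feed validator: the function name check_feed, the SAMPLE string, and the deliberately loose email pattern are all my own assumptions.

```python
import re
import xml.etree.ElementTree as ET

# Loose pattern for illustration only (an assumption, not the RSS spec's
# grammar): "user@host", optionally followed by " (Real Name)".
AUTHOR_RE = re.compile(r"^[^@\s]+@[^@\s]+(\s+\(.+\))?$")


def check_feed(xml_text):
    """Return a list of human-readable problems found in the feed text."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != "rss":
        problems.append("root element is <%s>, expected <rss>" % root.tag)
    for item in root.iter("item"):
        title = item.findtext("title", default="(untitled)")
        author = item.findtext("author")
        if author is not None and not AUTHOR_RE.match(author.strip()):
            problems.append("%s: <author> is not an email address" % title)
        if item.find("guid") is None:
            # Not an error, only a recommendation.
            problems.append("%s: no <guid> (recommended)" % title)
    return problems


# A trimmed-down copy of the sample under discussion.
SAMPLE = """<channel>
  <title>Read/WriteWeb</title>
  <item>
    <title>Morfik Patents AJAX Compiler</title>
    <author>Richard MacManus</author>
  </item>
</channel>"""

for problem in check_feed(SAMPLE):
    print(problem)
```

Run against the trimmed sample, this reports the missing <rss> root, the name-only <author>, and the absent <guid>.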

Alex also makes several common misstatements about RSS. Let's address each.

pubDate Does Not Indicate New Content

Alex says:

The on-demand aspect of RSS is enabled by two timestamps - the lastBuildDate in the channel indicates the last time this channel changed, while the pubDate of the item indicates when the item was published. RSS aggregators (a.k.a. RSS readers) take advantage of these timestamps to decide when new content is available.

I'm sure many RSS aggregators use the pubDate to determine new content, but this is incorrect. For instance, an item could be re-published and the pubDate may move forward. This does not make it new content. RSS aggregators should be using the <guid> element to determine if an item is new content. Remember that missing element I mentioned earlier? When this element is missing, RSS aggregators use a variety of techniques to determine which items are new. Some use the pubDate, but that element is also optional and is not always present. Some use the title, but that is optional too. Regardless, when a <guid> is present, it's the only element that should be used to determine if there is new content.
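Here is a minimal sketch of how an aggregator might apply that rule. The names item_key and new_items are hypothetical, and the fallback of link plus title is just one of the heuristics an aggregator could choose when there is no <guid>. It is my assumption, not something the RSS specification prescribes.

```python
import xml.etree.ElementTree as ET


def item_key(item):
    """Pick an identity for an <item>: the guid when present, else a fallback."""
    guid = item.findtext("guid")
    if guid and guid.strip():
        return ("guid", guid.strip())
    # Fallback heuristic (an assumption): link plus title. Note that pubDate
    # is deliberately not part of the identity.
    return ("fallback",
            (item.findtext("link") or "").strip(),
            (item.findtext("title") or "").strip())


def new_items(feed_xml, seen):
    """Return the items whose key is not yet in `seen`, recording their keys."""
    fresh = []
    for item in ET.fromstring(feed_xml).iter("item"):
        key = item_key(item)
        if key not in seen:
            seen.add(key)
            fresh.append(item)
    return fresh


# The same item fetched twice; only the pubDate has moved forward.
FIRST = """<rss version="2.0"><channel><item>
  <title>EMI Music DRM-free</title>
  <guid>urn:example:emi-drm-free</guid>
  <pubDate>Mon, 02 Apr 2007 15:23:01 -0800</pubDate>
</item></channel></rss>"""
SECOND = FIRST.replace("Mon, 02 Apr", "Tue, 03 Apr")

seen = set()
print(len(new_items(FIRST, seen)))   # first fetch: the item is new
print(len(new_items(SECOND, seen)))  # re-published with a later pubDate: not new
```

Because the identity comes from the <guid>, the re-published item with a later pubDate is not reported a second time.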

RSS is Not Push

Alex says:

RSS is basically a filtered push - the user subscribes (pulls in) to channels that he/she likes, and after that content is delivered automatically.

RSS is not push, and it's certainly not filtered push. RSS is based on polling: the aggregator requests the feed again and again on its own schedule, and the server never initiates delivery. Filtered push, by contrast, is where items are filtered against some sort of preferences and then pushed. I don't see what this has to do with RSS.
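To make the polling model concrete, here is a rough sketch of a single poll using Python's standard urllib. The function name poll_once and its opener parameter are hypothetical. The use of ETag and Last-Modified headers is ordinary HTTP conditional GET, which is my assumption about how a polite aggregator would poll, not something the RSS format itself defines.

```python
import urllib.error
import urllib.request


def poll_once(url, etag=None, last_modified=None,
              opener=urllib.request.urlopen):
    """Ask the server for the feed once.

    Returns (body, etag, last_modified). body is None when the server
    answers 304 Not Modified, meaning nothing changed since the last poll.
    """
    request = urllib.request.Request(url)
    if etag:
        request.add_header("If-None-Match", etag)
    if last_modified:
        request.add_header("If-Modified-Since", last_modified)
    try:
        with opener(request) as response:
            return (response.read(),
                    response.headers.get("ETag"),
                    response.headers.get("Last-Modified"))
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return None, etag, last_modified
        raise
```

An aggregator would call something like poll_once on a timer, passing back the validators it received last time. Every request originates from the client; the server only ever answers.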

RSS Can't Do Everything

Alex says:

Suppose your bank wants to deliver you statements in RSS instead of email. However if you use RSS as it is today, then the bank statements would need to be encoded in HTML - meaning no financial application would be able to manipulate the data. When your Quicken software connects to the bank, the information gets downloaded in a structured format. But with RSS, it is simply not possible currently - because there is no way to describe bank transactions using standard RSS.

What he is saying is that you cannot transmit banking information to banking software using RSS without using an extension. This is not actually true. You could easily use microformats to do this. But what is really puzzling about this statement is why you would need RSS as an envelope for financial data in the first place. We have OFX, which predates RSS and works just fine. RSS shouldn't be expected to do everything. Not that it can't: you could also use RSS as an envelope for OFX, but why not just use OFX? I don't expect my dishwasher to do the laundry, even if it can.

Categories: rss

Reader Comments

Thu, 05 Apr 2007 01:55:51 GMT

User comment

Randy, I was planning to post about the problems in the R/W article, but it seems you've beat me to it :) Nice summary of the mistakes and false assumptions.

Thu, 05 Apr 2007 14:38:00 GMT

User comment

Thanks Nick!

Randy