The RSS Blog

News and commentary from the RSS and OPML community.

Recently, I wrote a Schematron schema to validate Atom version 0.3. I wrote the rules to conform as closely as I could to the actual Atom 0.3 specification. I finished the schema earlier this week and immediately set out to find out how valid the existing Atom feeds are.

Blog                              Really Simple Validation   FeedValidator
Intertwingly (Sam Ruby)           Invalid                    Invalid
evhead (Evan Williams)            Invalid                    Invalid
Six Apart                         Valid                      Valid
The Real Geek on Blogger (me)     Valid                      Valid
.Conform (Philippe Janvier)       Valid                      Valid
.Conform Blogmarks                Invalid                    Valid
Salad w/ Steve (Steve Jenson)     Valid                      Valid
Atom Enabled                      Invalid                    Invalid

That should be enough to prove my point. About half of the Atom feeds I sampled are invalid. I should also point out that the FeedValidator was incorrect more often than not, reporting validation issues that don't exist in the spec and missing others that do. Why is it so difficult to create a valid Atom feed? The problem is that Atom is simply too complex. Examples of this complexity follow:

  • Some date timezones are optional and others are required.
  • Some date timezones must be UTC and others can be any timezone.
  • Relative URLs with xml:base.
  • The content construct's type and mode attributes.
  • The defaults for type and mode are text/plain and xml, which causes confusion when both defaults apply.
  • How do you interpret an entity-encoded (or CDATA) string with a type of text/html and a mode of xml?
  • How do you interpret an entity-encoded (or CDATA) string with a type of text/plain and a mode of xml?
  • There are simply too many optional elements to choose from.
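
To make the date and xml:base items above concrete, here is a minimal Python sketch. It is not taken from my Schematron rules. The function names and the policy labels ("optional", "required", "utc") are hypothetical, and which Atom 0.3 element falls under which policy is left for the reader to check against the spec.

```python
import re
from urllib.parse import urljoin

# Trailing timezone designator of a W3C Date-Time string: "Z" or +hh:mm / -hh:mm.
_TZ = re.compile(r"(Z|[+-]\d{2}:\d{2})$")


def classify_timezone(value):
    """Return 'utc', 'offset', or 'none' for a W3C Date-Time string."""
    match = _TZ.search(value.strip())
    if match is None:
        return "none"
    # Note: "+00:00" is reported as 'offset' here, not 'utc'.
    return "utc" if match.group(1) == "Z" else "offset"


def check_timezone(value, policy):
    """Check a date against a hypothetical per-element timezone policy."""
    kind = classify_timezone(value)
    if policy == "optional":
        return True
    if policy == "required":
        return kind != "none"
    if policy == "utc":
        return kind == "utc"
    raise ValueError("unknown policy: %r" % policy)


def resolve_href(href, base_chain):
    """Resolve href against the xml:base values in scope, outermost first."""
    base = ""
    for value in base_chain:
        # Each nested xml:base is itself resolved against the enclosing one.
        base = urljoin(base, value)
    return urljoin(base, href)
```

A consumer has to carry both pieces of state (the policy for each date element and the chain of xml:base values in scope) just to read one entry, which is the kind of bookkeeping I am complaining about.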

Update: Here's another example of why Atom is complex. This fragment is from the feed of Kevin Marks, blogger extraordinaire at Technorati.

<info mode="xml" type="text/html">
   <div xmlns="http://www.w3.org/1999/xhtml">...</div>
</info>

I added a new test case to my validator to flag this.
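
For readers who want to experiment, here is a rough Python sketch of a check along these lines. It is not the Schematron test itself. It assumes the thing worth flagging is XHTML-namespaced markup labeled text/html under mode xml, and it also flags the entity-encoded (or CDATA) case from the list above. The function names and warning strings are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def effective_type_and_mode(elem):
    """Apply the defaults named in the post: type text/plain, mode xml."""
    return elem.get("type", "text/plain"), elem.get("mode", "xml")


def content_warnings(elem):
    """Return warnings for a content construct (illustrative rules only)."""
    ctype, mode = effective_type_and_mode(elem)
    warnings = []
    if mode != "xml":
        return warnings
    children = list(elem)
    # Assumption: XHTML child elements under type text/html are a mismatch.
    if ctype == "text/html" and any(
        child.tag.startswith("{%s}" % XHTML_NS) for child in children
    ):
        warnings.append("XHTML child elements under type text/html")
    # Entity-encoded or CDATA markup parses as plain text with no children.
    if not children and "<" in (elem.text or ""):
        warnings.append("escaped markup under mode xml")
    return warnings


fragment = ET.fromstring(
    '<info mode="xml" type="text/html">'
    '<div xmlns="http://www.w3.org/1999/xhtml">...</div>'
    "</info>"
)
escaped = ET.fromstring("<content>&lt;b&gt;hi&lt;/b&gt;</content>")
```

Calling content_warnings(fragment) reports the type mismatch, and content_warnings(escaped) reports the escaped-markup case, where both attributes fall back to their defaults.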

Reader Comments
Perhaps it would be better to conduct such a poll on a larger sample, though ;)

--philippe
It would. Maybe I'll put something together.

Randy

Your feed and Six Apart's are valid; they just get warnings in the feed validator. That's different.