Really Simple Syndication
Copyright 2003-4 Randy Charles Morin
A resulting template might look like the following.
<?xml version="1.0" encoding="<$MTPublishCharset$>"?>
<dc:date><MTEntries lastn="1"><$MTEntryDate format="%Y-%m-%dT%H:%M:%S" language="en"$><$MTBlogTimezone$></MTEntries></dc:date>
<admin:generatorAgent rdf:resource="http://www.movabletype.org/?v=<$MTVersion$>" />
<cc:license rdf:resource="<$MTBlogCCLicenseURL$>" />
<rdf:li rdf:resource="<$MTEntryPermalink encode_xml="1"$>" />
<item rdf:about="<$MTEntryPermalink encode_xml="1"$>">
<dc:date><$MTEntryDate format="%Y-%m-%dT%H:%M:%S" language="en"$><$MTBlogTimezone$></dc:date>
Note, I also added the content namespace declaration as an attribute in the rdf:RDF element.
With RSS usage growing like mad, does having every client pull their own copy still make sense? Might it make more sense for some centralized services to aggregate the data? Or maybe a push system? PubSub? Other ideas?
Let's toss some ideas around.
Or see the session page.
Randy: I think the scaling concern is exaggerated. More to come! First supper.
Supper finished, let's continue. Let's begin by stating my theory: those who are complaining are not implementing existing techniques for reducing the RSS load. If the load is so light that they don't implement existing techniques, then why do we need more techniques?
The current outbreak of bandwidth complaining started w/ Chad Dickerson. He complained, w/ popups in place, that his company InfoWorld's readers were wreaking havoc on their Web server. As Dare pointed out, Chad's ignorance was to blame.
And now we have Jeremy Zawodny. Jeremy is telling the world, via his RSS feed, to pull his feed every hour for the rest of time. He could easily have used <skipHours> to reduce this considerably. This from a person who posts on average once per day. Why is he telling me to pull his feed hourly, when he posts daily? But a closer look at his feed reveals that we are actually pulling twice as much content as is necessary. He provides both a partial and full content version of his blog posts. Why do you need a partial, if you already have a full content version? Further, his feed includes posts that are over two weeks old. Why? This is a complete waste of bandwidth.
Now, let's bring this one level of indirection outward. Jeremy works for Yahoo! I'm not going to blame him for Yahoo!'s problems, but I'd like to take the opportunity here to critique Yahoo! feeds. Which isn't hard at all. Their feed has a TTL caching hint of 5 minutes and no other syndication hints are provided. Should I poll each and every feed on their site 288 times per day? No wonder it doesn't scale. I wonder, do they implement a conditional GET? I'll let the reader guess, if they don't already know the answer. Off topic, please tell me their GUID is unique.
Now don't get me wrong, not every bandwidth reducing hint is a good hint. Over at Sam's blog it was suggested that you respond HTTP 301 to a live link. That doesn't make any sense at all. Why would a link on your own Website respond "permanently moved"? I understand, if someone has bookmarked the link, that you would want to respond accordingly. Or, if someone else linked to this address, then by all means. But never, ever respond HTTP 301 to a link on this same Website. A better approach is to change the originating link.
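To put numbers on the hints above, here's a rough sketch of how many times a well-behaved reader would poll in one day. It assumes the reader polls exactly once per TTL interval and stays away during any hour listed in <skipHours> (hours 0-23); the function name polls_per_day is mine, not from any aggregator.

```python
def polls_per_day(ttl_minutes, skip_hours=frozenset()):
    """Count polls in one day for a reader that polls once per TTL
    interval, except during hours listed in skipHours (0-23)."""
    polls = 0
    for minute in range(0, 24 * 60, ttl_minutes):
        if minute // 60 not in skip_hours:
            polls += 1
    return polls


# A 5 minute TTL and no other hints: the 288 polls per day noted above.
five_minute_ttl = polls_per_day(5)

# An hourly TTL that skips every hour but one leaves a single daily poll.
daily = polls_per_day(60, frozenset(range(1, 24)))
```

A publisher who posts once per day gives up nothing w/ the second setup, and the reader makes one request instead of 24.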
Randy: An awesome article on a void filled by the blogger.
Marc Canter: I really don't care about the Democratic Convention (though I did in 1968) and I'm not really into following what bloggers do in general (unless it's about social networking or media as well) - but did I tell you how proud I am of Dave Winer and the other Convention Bloggers? Go dudes and dudesses - go.
Steve Rubel: I love Technorati, but this smells like dot com spirit all over again. Where's the moolah coming from to support a PR team of five? Hiring a PR firm before you can handle demand and squash bugs is looking for trouble. Hope they are ready for all the added attention. Just my two cents.
David Sifry in Steve's comments: Thanks for the feedback and criticism. A lot of your commentary is dead-on, for example, our biggest issue is making sure that the service is reliable and accurate, 100% of the time. We're not there yet. I'm sorry that we haven't met your expectations, there's no good excuse for service failures.
Tim Bray: Before too much longer, there are going to be a lot of Web resources named this.atom, that.atom, and the-other.atom being dished out by Web Servers everywhere, and by default those servers are gonna look at the names and say "Dot-atom what? Yer text/plain, punk." So I appealed to Greg Stein of Apache and Google, and he had a pow-wow and reported back: "I've gone ahead and done this: the application/atom+xml (for .atom) type will appear in our next releases (Apache 1.3.32 and Apache 2.0.51), whenever those come out." Well, Apache's not the only server out there, so I wrote off to Obasanjo and Scoble and said "Here's the problem, how about IIS?". So Scoble did some digging and got routed to Thomas Deml, lead program manager on IIS, and I saw a forwarded email saying "The change goes into Win2K3, SP1."
Randy: Sometimes, it's the small things that make big things happen. Down the road, we'll all forget this took place. Let's bookmark it to make certain that doesn't happen.
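For anyone wondering what this change amounts to, here's a small sketch using Python's standard mimetypes module. It only illustrates the extension-to-type mapping; it is not how Apache or IIS implement it, and the helper name content_type_for and the text/plain fallback are my assumptions, the latter borrowed from Tim's joke.

```python
import mimetypes

# Register the .atom extension, mirroring the mapping described above.
mimetypes.add_type("application/atom+xml", ".atom")


def content_type_for(filename):
    """Return the type a simple server would send, text/plain if unknown."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "text/plain"


atom_type = content_type_for("this.atom")
```

Without the registration, a server that doesn't know the extension falls through to its default type, which is the problem Tim was heading off.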
Ted: Fifty bloggers were credentialed as journalists to blog the convention from the Fleet Center. I just saw CNN cut to Dave Sifry of Technorati fame to tell CNN what the blogosphere was saying. Keep up with the bloggers at ConventionBloggers.com.
Randy: The blogosphere grows. But this is a Neil Armstrong-like giant leap. Thanks Sifry!
Randy: A new color scheme even; green and a couple grays. Unfortunately, it's as unreliable. I haven't got one query today to respond w/ any results. The part I don't get, most of all, is Sifry's complete denial that the service is effectively dead. His blog is littered w/ angry user comments.
leobard: Browsing along the PlanetRDF I came to Danny Ayers' site which led me to the Syndication Subscription Service. This is a good example of how localhost integration works. It shows 15 popular news-aggregation systems and provides links that enable the user to add Danny's blog to their newsreader. Interesting is that there is no common way of doing this: There is no "add this rss feed to my newsreader" system call in the operating system or the browser. All products implement different ways to do it. The locally installed systems open http ports at the localhost and wait for http requests. The web based systems run on their servers (e.g. Yahoo) and wait for the command there.
Sifry: A few minutes ago CNN announced that Technorati will be providing real-time analysis of the political blogosphere at next week's Democratic National Convention. I will be on-site in CNN's convention broadcast center, along with Mary Hodder, and I'll be providing regular on-air commentary on what bloggers are saying about politics and the convention. And on Sunday, July 25, we'll launch a new section of our site for political coverage: politics.technorati.com.
Randy: Gotta wonder if this Website will respond once in a while, during the convention. Funny, this page has 11 lines of XHTML that don't validate. I just don't get it. I understand, if you have a few warnings in a 100-1000 line page of HTML, but how can 11 simple lines not validate :(
Chad Dickerson, the CTO of Infoworld: Several months ago, I spoke to a Web architect at a large media site and asked why his site didn't support RSS. He raised the concern that thousands (or even millions) of dumb clients could wreak havoc on a popular Web site. [cut] As the popularity of RSS feeds at InfoWorld started to surge, I began to notice that most of the RSS clients out there requested and downloaded our feeds regardless of whether the feeds themselves had changed. At the time, we hadn't quite reached the RSS tipping point, so I filed these thoughts away for later -- but "later" came sooner than I thought.
Dare: At this point I'd like to note that HTTP provides two mechanisms for web servers to tell clients if a network resource has changed or not. The basics of this mechanism is explained in the blog post HTTP Conditional Get for RSS Hackers which provides a way to prevent clients such as news readers from repeatedly downloading a Web document if it hasn't been updated. At this point I'd like to point out that at the current time, the InfoWorld RSS feed supports neither.
Randy: Dare shows that Chad is a not very proactive complainer. This is also a great read for anybody looking to cut down on their RSS bandwidth.
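The two mechanisms Dare mentions are the ETag / If-None-Match pair and the Last-Modified / If-Modified-Since pair. Here's a minimal server-side sketch of the decision, assuming the request headers arrive as a plain dict. The function name conditional_get is hypothetical, and comparing the date as an exact string is a simplification; a real server would parse and compare dates.

```python
def conditional_get(request_headers, etag, last_modified, body):
    """Return (status, headers, body) for a feed request.

    Answers 304 with an empty body when the client's validators
    still match the current feed, otherwise 200 with the full feed.
    """
    validators = {"ETag": etag, "Last-Modified": last_modified}
    if "If-None-Match" in request_headers:
        # When both are sent, If-None-Match takes precedence.
        unchanged = request_headers["If-None-Match"] == etag
    elif "If-Modified-Since" in request_headers:
        unchanged = request_headers["If-Modified-Since"] == last_modified
    else:
        unchanged = False
    if unchanged:
        return 304, validators, b""
    return 200, validators, body


feed = b"<rss version='2.0'/>"
stamp = "Thu, 08 Jul 2004 19:02:04 GMT"
first = conditional_get({}, '"v1"', stamp, feed)
repeat = conditional_get({"If-None-Match": '"v1"'}, '"v1"', stamp, feed)
```

The first request pays for the whole feed; every repeat request costs only a handful of header bytes until the feed actually changes.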
Scott: I know there is a wonderful I love RSS graphic that I want to use in the UI of a new Feedster feature but damn if I can find it. I'm certain that Bryan Bell did his magic on it but I've just not been able to find it. Any suggestions? Thanks in advance.
Dave: Tim only acknowledges the flames. But I've been quoting him on Scripting News for years. Yeah I'm angry with him, no question about that. But I think we have to work together, kind of like the Republicans and Democrats. He led a really awful anti-Dave jihad. That always ends a friendship. He doesn't want to own up to it, be a man, and retract what he said. Okay, I accept that. But I also know that I've done a lot to help his ideas get heard by the influential and smart people who read my blog. And he's using all the work I did with weblogs, aggregators and RSS, and by the way, not giving me very much credit for that, either.
Randy: The blogosphere is rife w/ characters.
Evan: We just launched a spankin' new Blogger post editor, with rich-text, what-you-see-is-what-you-get formatting. Screenshot:
Randy: Gotta luv WYSIWYG.
Tantek: I have been a happy Feedster user for quite some time now. I've been impressed by the speed of their searches, not to mention their site stability. However, Feedster was updated recently, and as a part of their update, they apparently decided to be inaccessible to IE/Mac users, and therefore they've lost at least one user. I don't mean that their website breaks in IE5/Mac, I mean they literally send IE/Mac users a rejection page:
We hate to do this to you but if you want to use Feedster, you're going to need to update your browser to Safari, FireFox or another more modern browser. Internet Explorer on the Macintosh hasn't been updated now in years and we want to deliver you the best possible experience. We know updating a browser is just plain annoying but we also know that the newer browsers just plain work better and you'll probably be much happier.
Quote: Generic FeedParser interface and concrete implementations for Atom, FOAF, OPML and RSS. These FeedParser implementations are based on JDOM and Jaxen and is based around XPath and JDOM iteration. While the implementation is straight forward it has not been optimized for performance. A SAX based parser would certainly be less memory intensive but with the downside of being harder to develop.
Scott: You've seen the crashes, you've tolerated the slowness and you didn't yell (too) loudly. You're a great set of users and we love you all. So it is with great pride that I give you: Feedster Version 2.
Randy: Great news! A review of the new system will follow later today.
Update: My notes on the new Feedster.
Overall, my impression is that the site is completely broken. I have to laugh now at all the people who said the new site was great. I imagine they didn't actually use the site beyond viewing the new homepage before sticking their foot in their mouth. What complete ass kissing!
Update: I discussed this w/ Scott at Feedster and he is fixing it.
I stole the idea of commonAttributes from Norman Walsh's Atom schemas. Thanks Norman!
There's a suggestion on the Atom issue list to dictate the way clients determine the character encoding of feeds retrieved over HTTP. Most clients to date have used a lax method to allow as many feeds as possible to be internally parsed. Truth is, that character encoding should be determined using RFC 3023, but many Web developers believe this RFC to be broken.
Atom Wiki: RFC 3023 defines rules for determining the character encoding of a feed (or any other XML document served over HTTP). The default configuration for most web servers is to serve ".xml" files as "text/xml" with no charset parameter. According to RFC 3023, all of these feeds MUST be parsed as "us-ascii". This is nonintuitive and unacceptable for UnprivilegedUsers, who are left without a way to publish Atom feeds in any other encoding.
I don't entirely agree w/ the suggestions on the Atom Wiki, but I agree w/ the motivation. The motivation is to help users who have no choice but to serve their feeds as "text/xml". Redefining new rules because we disagree w/ them only means that existing tools that support RFC 3023 will not be fully useful.
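To make the rule concrete, here's a simplified sketch of how a strict client might pick the encoding. It assumes three cases only: an explicit charset parameter wins, a text/* type w/o one is us-ascii, and anything else falls back to the XML declaration and then utf-8. The function name charset_per_rfc3023 is mine, and the sketch ignores byte order marks and UTF-16 detection.

```python
import re
from email.message import Message


def charset_per_rfc3023(content_type, body):
    """Pick the encoding a strict client would use for an XML response."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")
    if charset:
        # An explicit charset parameter is authoritative.
        return str(charset).lower()
    if msg.get_content_maintype() == "text":
        # text/xml w/o a charset: the XML declaration is ignored.
        return "us-ascii"
    # application/xml and friends: fall back to the XML declaration.
    match = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', body)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"


declared_utf8 = b'<?xml version="1.0" encoding="utf-8"?><rss/>'
surprise = charset_per_rfc3023("text/xml", declared_utf8)
```

The surprise value is the complaint in a nutshell: the feed declares utf-8, the server says text/xml, and a strict client must read it as us-ascii.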
Joe Gregorio: I initially proposed PacePutDelete and now I would like to withdraw it.
Randy: This should mean that the SOAP version of Atom can move forward unimpeded.
Elias Torres: I like how Randy and Tim are setting up the discussion. I tend to agree with both: we need to have order (be strict) for "dumb" tools to understand and we need to be as simple as possible for users to publish feeds. But, IF in fact this is a cost-benefit discussion, I would say that the users would benefit the most (especially since there are more of them) if we don't impose order. The number of definitive Atom libraries will be much smaller and good programmers write those, every other developer can just use them. Now, if a developer chooses to start from scratch, I think by using XML we have already given them a starting point, plus a XML Schema with choice groups. Lastly, all they would have to do is code up cardinality and their personal tweaks to their library.
Randy: Hyperlinks to context were added by myself.
Quote: Now you can access Alexa's Hot Search Terms, Movers & Shakers, or Top Sites with RSS.
Randy: The feeds are quite static, but the information is interesting. Take a peek at the top English sites. Dominated by scumware.
What I found most interesting about this feed is that it competes w/ my RFS feed, as one of the funkiest ever seen. Notes on the feed funkiness follow.
Norman Walsh: Ok, I started with the draft-...-00.txt spec and (re)built a RELAX NG Grammar for it.
Randy: Great stuff. Here's a pointer to Dave Pawson's original work.
Tim: The people who thought things were more or less OK offered these reasons: 1. Don't be a standards policeman, innovation is good, go with the flow. [cut] On the first argument, I need only respond: "What if Microsoft were doing this?"
Randy: What if IBM and Sun were doing this? I guess perspective is important. We don't like people to rewrite our own favorite standards, but surely other standards should be rewritten.
I thought it would be very interesting to list a top ten blogs to visit to find out how to implement or better your RSS feed and blog. You want to write your own blogging software or RSS feed, then you should be subscribed to each of the following. I excluded this blog, which is obviously #1 :).
Please suggest others. And don't forget Conforme for our French brothers.
Big in the blogosphere this week is an attempt to have feed publishers respect RFC 3023. Even the FeedValidator itself is being changed to warn users of pending doom, if they don't fixup those HTTP headers. And I think we've matured enough that we can expand the meaning of a valid RSS feed. So, here's my contribution to this adventure in RFC 3023-land.
Here's an ASPX file that examines an XML file's declaration and tries to fixup the content charset.
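<%@ Page language="c#" %>
<script runat="server">
// Serve the XML file named by the "xml" parameter and copy the encoding
// from its XML declaration into the charset of the Content-Type header.
void Page_Load(object sender, EventArgs e)
{
    string uri = Request.Params["xml"];
    if (uri == null || uri == string.Empty)
        return; // assumption: the handling of a missing parameter is not shown above
    if (!uri.StartsWith("http://") && !uri.StartsWith("https://"))
        uri = Server.MapPath(uri);
    System.Xml.XmlDocument doc = new System.Xml.XmlDocument();
    doc.Load(uri);
    Response.ContentType = "text/xml";
    if (doc.FirstChild.NodeType == System.Xml.XmlNodeType.XmlDeclaration)
        Response.Charset = ((System.Xml.XmlDeclaration)doc.FirstChild).Encoding;
    doc.Save(Response.OutputStream);
}
</script>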
Here are the HTTP headers before...
HTTP/1.1 200 OK
Last-Modified: Thu, 08 Jul 2004 19:02:04 GMT
Date: Thu, 08 Jul 2004 20:34:05 GMT
...and after the transformation.
HTTP/1.1 200 OK
Date: Thu, 08 Jul 2004 20:34:48 GMT
Content-Type: text/xml; charset=utf-8
I'll have to fixup the caching.
Tim Bray: The child elements of <atom:feed> may appear in any order, subject only to the constraint that the <atom:entry> children appear in a group after all other child elements. That is to say, the "feed-level" elements must all appear before the first <atom:entry> element. The order in which the child elements of <atom:feed> appear is not considered significant. It's also been proposed that we consider imposing a *strict* ordering, something like title/author/copyright/dates/.../entries.
Randy: And cosmic balance returns.
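The one constraint Tim describes is easy to check. Here's a sketch that walks the children of the feed element and fails as soon as a feed-level element shows up after an entry. The function name entries_come_last is hypothetical, and matching on the local element name alone (ignoring the namespace) is a shortcut I took to keep it short.

```python
import xml.etree.ElementTree as ET


def entries_come_last(feed_xml):
    """Return True if no feed-level element follows an entry element."""
    root = ET.fromstring(feed_xml)
    seen_entry = False
    for child in root:
        # Strip any "{namespace}" prefix to get the local element name.
        local_name = child.tag.rsplit("}", 1)[-1]
        if local_name == "entry":
            seen_entry = True
        elif seen_entry:
            return False
    return True


good = "<feed><title>t</title><author/><entry/><entry/></feed>"
bad = "<feed><title>t</title><entry/><author/><entry/></feed>"
results = (entries_come_last(good), entries_come_last(bad))
```

Note that the check says nothing about the order of title and author, which is exactly the freedom the looser proposal leaves in place.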
David Winer: I tried to explain to Tim then (not that he was listening of course) that RSS was just part of the picture, and to see it only as an XML format was to miss the point, that there were applications on both sides of RSS, content management software and aggregators, and lots of people, that made it really work. To think you could swap out the format was as silly as thinking you could swap out HTML or HTTP in 1994.
Randy: Les War du Blogosphere.
Quote: Bloglines, the world's most popular free Internet service for searching, subscribing, publishing and sharing news feeds (RSS and Atom) and blogs, today celebrated its one-year anniversary by launching a raft of expanded features for its rapidly growing user base.
Randy: Happy birthday Bloglines! The new look is very impressive. The best new feature is that it shows exactly how many people are subscribed to each feed. iBLOGthere4iM has 26 subscribers. BoingBoing has 6851. iM almost #1 :)
This is the first entry in a monthly feature on this Really Simple Syndication blog where I induct one person into my RSS HoF. In order to qualify, you have to have an RSS feed and have contributed to the growth of the RSS industry.
Please suggest further candidates.
Checking my RSS Wishlist. #1 and #5 are done. Eight to go. Things I wish I'd get off my ass and do.
Some of you may wake up this morning to find that your RSS feed is invalid. If you use a standard setup of IIS and serve your RSS as rss.xml or any filename w/ the .xml extension, then your RSS feed may now be considered invalid. This was caused by a change to the FeedValidator over the weekend. The change may also cause Manila flavored RSS feeds to be issued warnings and even errors by the FeedValidator.
Instructions to fix can be found here.
Update: The errors may have been caused by a bad refactoring of the FeedValidator code. This would be good news and should result in a timely fix of the FeedValidator.
Update: The bad refactoring was confirmed by Joe on the iBLOGthere4iM blog. A fix is on its way.
Update: The problem w/ the FeedValidator has been fixed.
A new change to the Atom WSDL. The SOAP responses were previously wrapped. In discussion w/ Dave Orchard, I realized this is unnecessary.
The POST wrapping element in this response is unnecessary.
Quote: Steve Zellers has worked for over 15 years in Silicon Valley as a software engineer. Currently he works for the Spotlight team at Apple Computer, and is also responsible for various high level interapplication communication mechanisms such as AppleEvents, XML-RPC and SOAP. Formerly, he performed the initial port of the Java virtual machine for Macintosh as a contractor to Sun Microsystems, and is the author of the best selling screen saver, After Dark 3.0 for Macintosh. Steve is a fan of scripting languages and clean, incremental system designs.
Randy: It would seem that Steve Zellers has silently joined the RSS advisory board. Congrats!
Brad Feld: The misperception is that NewsGator is only an Outlook plug-in. While the most popular product from NewsGator is currently their Outlook-based aggregator, what really turned us on when we dug into NewsGator as a potential investment is NewsGator Online Services (NGOS). Greg Reinacker's vision is much broader than simply an RSS aggregator - his goal is to provide RSS content on any device. NewsGator currently provides clients for Outlook, the Web, POP email, mobile devices (web-based and wap), and Microsoft Media Center (how cool is it to get an RSS feed on your TV?).
Randy: It's great to have tech capitalists in the blogosphere. This will promote a good understanding of how to create new companies out of Winer's chaos.
This is my new blog dedicated to Really Simple Syndication. By RSS I don't mean the format, flavor 2.0 or any other; rather I mean the bigger being (no, not Winer). I used to blog about RSS, the bigger concept, on my primary blog, iBLOGthere4iM, but it became a two phase blog, me and RSS. So, iM moving all that is RSS to this new blog and you can follow all about me and my projects on my old blog. Subscribe to both, if you like, but the other blog, my primary blog, is pretty boring.
Topics covered here...