RSS Scaling - RSS
Really Simple Syndication
Copyright 2003-4 Randy Charles Morin
Fri, 30 Jul 2004 00:04:46 GMT
RSS Scaling

Jeremy Zawodny: If you had ideas about the previously noted RSS Scaling Problem and are at OSCON now, stop by Douglas Fir to chat about it. Here's the session blurb:

With RSS usage growing like mad, does having every client pull their own copy still make sense? Might it make more sense for some centralized services to aggregate the data? Or maybe a push system? PubSub? Other ideas?
Let's toss some ideas around.

Or see the session page.

Randy: I think the scaling concern is exaggerated. More to come! First, supper.

Supper finished, let's continue. Let me begin by stating my theory: those who are complaining are not implementing the existing techniques for reducing RSS load. If the load is so light that they don't bother to implement existing techniques, then why do we need more techniques?

The current outbreak of bandwidth complaining started with Chad Dickerson. He complained, with popups in place, that his company InfoWorld's readers were wreaking havoc on their Web server. As Dare pointed out, Chad's ignorance was to blame.

And now we have Jeremy Zawodny. Jeremy is telling the world, via his RSS feed, to pull his feed every hour for the rest of time. He could easily have used <skipHours> to reduce this considerably. This from a person who posts on average once per day. Why is he telling me to pull his feed hourly, when he posts daily? But a closer look at his feed reveals that we are actually pulling twice as much content as is necessary. He provides both a partial and full content version of his blog posts. Why do you need a partial, if you already have a full content version? Further, his feed includes posts that are over two weeks old. Why? This is a complete waste of bandwidth.
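To make the hints concrete, here is a minimal sketch of how an aggregator might honor the RSS 2.0 <ttl> and <skipHours> elements before pulling a feed. The sample feed, its hint values, and the function names (read_hints, should_poll) are made up for illustration; they are not taken from Jeremy's feed or from any particular reader.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed for a blog that posts about once a day.
# The hint values below are illustrative assumptions only.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <link>http://example.com/</link>
    <description>Posts about once a day</description>
    <ttl>1440</ttl>
    <skipHours>
      <hour>0</hour><hour>1</hour><hour>2</hour><hour>3</hour>
    </skipHours>
  </channel>
</rss>"""


def read_hints(feed_xml):
    """Return (ttl_minutes or None, set of GMT hours to skip)."""
    channel = ET.fromstring(feed_xml).find("channel")
    ttl_text = (channel.findtext("ttl") or "").strip()
    ttl = int(ttl_text) if ttl_text.isdigit() else None
    skip = set()
    for hour in channel.findall("skipHours/hour"):
        text = (hour.text or "").strip()
        if text.isdigit():
            skip.add(int(text))
    return ttl, skip


def should_poll(feed_xml, hour_gmt, minutes_since_last_fetch):
    """Decide whether a polite client would fetch the feed right now."""
    ttl, skip = read_hints(feed_xml)
    if hour_gmt in skip:
        return False  # publisher asked not to be polled during this hour
    if ttl is not None and minutes_since_last_fetch < ttl:
        return False  # cached copy is still considered fresh
    return True
```

With hints like these, a client that last fetched an hour ago would simply wait, rather than pulling a once-a-day blog every hour.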

Now, let's take this one level of indirection outward. Jeremy works for Yahoo! I'm not going to blame him for Yahoo!'s problems, but I'd like to take the opportunity here to critique Yahoo!'s feeds, which isn't very hard at all. Their feed has a TTL caching hint of 5 minutes, and no other syndication hints are provided. Should I poll each and every feed on their site 288 times per day (24 hours x 60 minutes / 5)? No wonder it doesn't scale. I wonder, do they implement a conditional GET? I'll let the reader guess, if they don't already know the answer. Off topic, please tell me their GUID is unique.
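For readers who haven't met it, a conditional GET lets the server answer "304 Not Modified" with an empty body when the client's copy is still current. Below is a rough sketch of the server-side decision, plus the polling arithmetic from above. The function names and the dict-based request headers are assumptions for illustration; this is not how Yahoo! or any specific server implements it.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def polls_per_day(ttl_minutes):
    """How many times a client polls per day if it obeys the TTL exactly."""
    return (24 * 60) // ttl_minutes


def conditional_get(request_headers, etag, last_modified, body):
    """Return (status, response_headers, body) for a feed request.

    last_modified must be a timezone-aware UTC datetime without
    microseconds, since HTTP dates only carry whole seconds.
    """
    validators = {
        "ETag": etag,
        "Last-Modified": format_datetime(last_modified, usegmt=True),
    }
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # When the client sends an ETag, it takes precedence over the date.
        tags = [tag.strip() for tag in if_none_match.split(",")]
        if etag in tags or "*" in tags:
            return 304, validators, b""
        return 200, validators, body
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            since_dt = parsedate_to_datetime(since)
        except (TypeError, ValueError):
            since_dt = None  # unparseable date: fall through to a full reply
        if since_dt is not None and since_dt.tzinfo is not None:
            if last_modified <= since_dt:
                return 304, validators, b""
    return 200, validators, body
```

A client that stores the ETag and Last-Modified values from its last fetch and sends them back costs the server a few hundred bytes per poll instead of the whole feed, which matters a great deal at 288 polls per day.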

Now don't get me wrong, not every bandwidth-reducing hint is a good hint. Over at Sam's blog it was suggested that you respond HTTP 301 to a live link. That doesn't make any sense at all. Why would a link on your own Website respond "permanently moved"? I understand that if someone has bookmarked the link, you would want to respond accordingly. Or, if someone else linked to this address, then by all means. But never, ever respond HTTP 301 to a link on the same Website. A better approach is to change the originating link.

Reader Comments


Please delete... it was just a test.