RSS, OPML and the XML platform.
Copyright 2012 World Readable
I thought I'd prepare a blog entry describing the differences between PubSubHubbub and rssCloud. I'm doing this mostly for myself, as I'm currently implementing a desktop client based entirely on PubSubHubbub and rssCloud. My goal is to solve the NAT traversal problem using long polling through a notification gateway. It's not the optimal solution, but maybe we can add a notification gateway to PubSubHubbub and/or rssCloud and make them work behind NATs and firewalls.
Both protocols are classic publish and subscribe. Publishers have a relationship with a hub. Clients subscribe to a hub. Publishers send updates to the hub. The hub pushes notifications out to the clients. Nobody invented anything here; we've been doing this in computer science for a long time. Let's examine each of the four major interactions within the system: subscribing, unsubscribing, pinging and notification.
Subscribing is when a client tells the hub that it wishes to receive notifications from one of its publishers. The client might be a Web-based RSS aggregator (Google Reader) or a desktop RSS client (FeedDemon). In both cases, the client has polled an RSS or Atom feed and discovered that the feed has PubSubHubbub or rssCloud notification support. The client uses the information within the feed to send a simple HTTP request to the hub, with varying parameters, to set up the subscription. In this respect the two protocols are very similar.
There are a couple of small differences. PubSubHubbub supports only its REST API, while rssCloud supports XML-RPC, SOAP and REST. This makes rssCloud slightly more difficult to implement, as you have to account for three possible transports. Another difference is that an rssCloud request does not specify the IP address of the notification target. Rather, it is assumed that the host making the request is also the notification target. You'll see later on that this makes implementing a notification gateway more difficult. The rssCloud protocol may include a parameter to allow passing of the notification target's IP address in the near future.
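To make the comparison concrete, here is a rough sketch of the two subscribe request bodies. The PubSubHubbub parameter names follow the 0.3 draft and the rssCloud names follow the rssCloud walkthrough as best understood; the function names, the example URLs and the port number are made up for illustration, so check the specs before relying on any of this.

```python
from urllib.parse import urlencode, parse_qs


def pubsubhubbub_subscribe_body(topic_url, callback_url, verify="sync"):
    """Form-encoded body a subscriber POSTs to a PubSubHubbub hub."""
    return urlencode({
        "hub.mode": "subscribe",
        "hub.topic": topic_url,        # the feed being subscribed to
        "hub.callback": callback_url,  # the subscriber names its own endpoint
        "hub.verify": verify,          # "sync" or "async" in the 0.3 draft
    })


def rsscloud_subscribe_body(feed_url, port, path):
    """Form-encoded body for an rssCloud REST pleaseNotify request.

    There is no host or IP field here: the cloud is assumed to use the
    address the request came from as the notification target.
    """
    return urlencode({
        "notifyProcedure": "",    # only meaningful for XML-RPC and SOAP
        "port": str(port),
        "path": path,
        "protocol": "http-post",  # selects the REST flavor
        "url1": feed_url,
    })


if __name__ == "__main__":
    # Hypothetical feed and client addresses, for illustration only.
    feed = "http://example.com/feed.xml"
    print(parse_qs(pubsubhubbub_subscribe_body(feed, "http://client.example.net/push")))
    print(parse_qs(rsscloud_subscribe_body(feed, 5337, "/notify")))
```

The interesting part is what is missing from the second body: a gateway that subscribes on behalf of a client has nowhere to say "notify that other machine instead".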
There is one big difference. Because PubSubHubbub allows the subscriber to specify the notification endpoint, there is a greater possibility of malicious hackers or code subscribing a notification endpoint against its will. PubSubHubbub therefore follows up all subscribe and unsubscribe requests by verifying with the client that the request was really intended. This adds additional, but required, complexity to PubSubHubbub. The rssCloud protocol may include subscriber verification in the near future.
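The verification step can be sketched from the subscriber's side. In the 0.3 draft the hub sends a GET to the callback carrying hub.mode, hub.topic and hub.challenge, and the subscriber confirms by echoing the challenge back. The function name, the pending set and the return shape below are assumptions for illustration, not part of either spec.

```python
def handle_verification(query, pending):
    """Answer a hub's verification GET.

    query   -- dict of query parameters from the hub's request
    pending -- set of (mode, topic) pairs this client really asked for
    Returns a (status_code, body) pair.
    """
    mode = query.get("hub.mode")
    topic = query.get("hub.topic")
    challenge = query.get("hub.challenge")
    if challenge is None or (mode, topic) not in pending:
        # Somebody else tried to subscribe this endpoint: refuse.
        return 404, ""
    pending.discard((mode, topic))
    # Echoing the challenge tells the hub the request was intended.
    return 200, challenge


if __name__ == "__main__":
    pending = {("subscribe", "http://example.com/feed.xml")}
    print(handle_verification(
        {"hub.mode": "subscribe",
         "hub.topic": "http://example.com/feed.xml",
         "hub.challenge": "abc123"},
        pending,
    ))
```

The client only has to remember what it asked for; anything it did not ask for gets a 404 and the hub drops the request.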
Unsubscribing is when a client stops receiving notifications. With PubSubHubbub, unsubscribing involves the client sending an unsubscribe request to the hub. With rssCloud, there is no request. Rather, all subscriptions are automatically dropped after 24 hours. I don't think either technique is better than the other; there are advantages and disadvantages to both.
First off, I don't know of any software that is smart enough to unsubscribe when you close your laptop. Second, what happens when my laptop is closed and the hub is trying to send notifications? Are the notifications queued? How many failures should the hub tolerate before it unsubscribes the misbehaving (not really) client? Neither protocol is airtight, and neither addresses numerous scenarios that arise frequently in homes and offices all across this Internet-enabled planet.
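The rssCloud style of unsubscribing, where subscriptions simply lapse, might look something like the sketch below on the cloud side. The class and method names are hypothetical, the 24-hour lifetime is the figure mentioned above, and the injectable clock is only there so the behavior can be checked without waiting a day.

```python
import time

LEASE_SECONDS = 24 * 60 * 60  # the 24-hour lifetime described in the text


class SubscriptionTable:
    """Hypothetical in-memory table of expiring subscriptions."""

    def __init__(self, lease_seconds=LEASE_SECONDS, clock=time.time):
        self._lease = lease_seconds
        self._clock = clock
        self._expires = {}  # (feed_url, endpoint) -> expiry timestamp

    def subscribe(self, feed_url, endpoint):
        """Create or renew a subscription; renewing just resets the clock."""
        self._expires[(feed_url, endpoint)] = self._clock() + self._lease

    def active_endpoints(self, feed_url):
        """Drop expired entries, then list endpoints still subscribed."""
        now = self._clock()
        self._expires = {k: t for k, t in self._expires.items() if t > now}
        return sorted(ep for (url, ep) in self._expires if url == feed_url)


if __name__ == "__main__":
    table = SubscriptionTable()
    table.subscribe("http://example.com/feed.xml", "203.0.113.5:5337/notify")
    print(table.active_endpoints("http://example.com/feed.xml"))
```

A closed laptop needs no special handling under this scheme: it stops renewing, and within a day the cloud forgets it. The cost is that the cloud keeps pinging a dead endpoint until the lease runs out.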
The ping component of both services is very similar. Both support a REST ping.
In addition to a basic REST ping, rssCloud allows the publisher to ping the cloud using REST, XML-RPC or SOAP. There doesn't appear to be a discovery mechanism that tells publishers which of the protocols are accepted by the cloud, but this shouldn't be much of a problem, since discovery can occur via trial and error and the REST ping is likely to be supported by all rssClouds.
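A minimal sketch of the two REST pings, plus the trial-and-error discovery mentioned above. The hub.mode and hub.url names come from the PubSubHubbub 0.3 draft and the single url parameter from the rssCloud walkthrough as best understood; the function names are invented, and the sender callables stand in for real REST, XML-RPC and SOAP clients that are not shown.

```python
from urllib.parse import urlencode


def pubsubhubbub_ping_body(topic_url):
    """Body a publisher POSTs to a PubSubHubbub hub after updating a feed."""
    return urlencode({"hub.mode": "publish", "hub.url": topic_url})


def rsscloud_ping_body(feed_url):
    """Body a publisher POSTs to an rssCloud REST ping endpoint."""
    return urlencode({"url": feed_url})


def ping_by_trial(feed_url, senders):
    """Try each transport in order; return the name of the first that works.

    senders -- ordered list of (name, callable) pairs. Each callable takes
               the feed URL and returns True on success. They are
               placeholders for real transport clients.
    """
    for name, send in senders:
        try:
            if send(feed_url):
                return name
        except OSError:
            continue  # network-level failure: try the next transport
    return None


if __name__ == "__main__":
    print(pubsubhubbub_ping_body("http://example.com/feed.xml"))
    print(rsscloud_ping_body("http://example.com/feed.xml"))
```

Putting REST first in the senders list matches the guess that every rssCloud will accept the REST ping, so the fallbacks should rarely run.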
rssCloud does provide an additional, lightly specified interface that pushes the RSS feed to the cloud, allowing the cloud to host the RSS feed on behalf of the publisher. I highly doubt this will be widely used unless the cloud implements more feed hosting services.
Notification is likely the biggest difference between rssCloud and PubSubHubbub.
rssCloud again allows for REST, XML-RPC and SOAP packages. This greatly increases the complexity of the cloud. The rssCloud notification is effectively a reverse ping, where the cloud pings the subscriber to tell it to fetch the feed and find out what changed.
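On the subscriber side, handling the REST flavor of this reverse ping can be very small. The sketch below assumes the cloud POSTs a form-encoded body with a single url parameter naming the changed feed; the function name and the subscribed set are hypothetical.

```python
from urllib.parse import parse_qs


def feed_to_refetch(post_body, subscribed):
    """Return the feed URL named in a notification, or None to ignore it.

    post_body  -- form-encoded body of the cloud's POST
    subscribed -- set of feed URLs this client actually follows
    """
    urls = parse_qs(post_body).get("url", [])
    if len(urls) == 1 and urls[0] in subscribed:
        return urls[0]  # caller fetches the feed and diffs it locally
    return None


if __name__ == "__main__":
    subscribed = {"http://example.com/feed.xml"}
    print(feed_to_refetch("url=http%3A%2F%2Fexample.com%2Ffeed.xml", subscribed))
```

The notification carries no content at all, so all the work of figuring out what changed stays with the subscriber, and all the fetch traffic lands on the publisher.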
PubSubHubbub implements a much more complex notification. It's not a simple ping, but rather a POSTed XML package containing the feed that has been updated, with only the entries that are new or have been updated. This creates a large number of state problems within the hub. What happens when the previous notification failed? Do you send multiple updates in the next one? This could mean sending different notifications to different subscribers, or subscribers missing new and updated entries. On the other hand, this avoids flooding the publisher with simultaneous feed fetches from all the subscribers who've been notified of the change. Neither approach is optimal, neither is horrible.
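One way a subscriber might cope with repeated or bundled deliveries is to key everything on the Atom entry id. This is only a sketch under that assumption: the store layout and function name are invented, and it handles Atom bodies only.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def merge_notification(atom_xml, store):
    """Merge the entries of a pushed Atom document into a local store.

    store maps entry id -> (updated, title). Returns the ids that were
    added or changed, so redelivered entries are silently ignored.
    """
    changed = []
    root = ET.fromstring(atom_xml)
    for entry in root.findall(ATOM + "entry"):
        entry_id = entry.findtext(ATOM + "id")
        if entry_id is None:
            continue  # without an id there is nothing to key on
        record = (
            entry.findtext(ATOM + "updated", default=""),
            entry.findtext(ATOM + "title", default=""),
        )
        if store.get(entry_id) != record:
            store[entry_id] = record
            changed.append(entry_id)
    return changed


if __name__ == "__main__":
    body = (
        '<feed xmlns="http://www.w3.org/2005/Atom">'
        "<entry><id>tag:example.com,2012:1</id>"
        "<updated>2012-01-01T00:00:00Z</updated><title>Hello</title></entry>"
        "</feed>"
    )
    store = {}
    print(merge_notification(body, store))
    print(merge_notification(body, store))
```

This makes duplicate deliveries harmless on the client, but it does nothing for entries the hub never managed to deliver; that part of the state problem stays with the hub.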
Both protocols have a major failing, in that they rely on servers connecting to subscribing clients. If a client sits behind an unfriendly NAT or firewall, then the protocols simply fail. You can implement UPnP and other protocols and break your way through some NATs and firewalls, but the problem will still exist on a large piece of the Internet pie.
Long polling is the solution to the NAT and firewall problem. It is not the optimal solution to the notification problem, though, because long polling involves holding open connections from the client to the server. This means 10,000 clients will hold 10,000 connections open. Yikes! The real solution is to detect the capabilities of the client and use direct notification where possible, UPnP where possible and long polling when everything else fails. It might be difficult to convince a developer of rssCloud subscribing software to implement UPnP when he can simply resort to long polling from the beginning.
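The core of a notification gateway can be sketched as a set of per-client mailboxes. Everything here is hypothetical (class name, method names, client ids), and the HTTP layer is left out: notify would be called when the hub or cloud reaches the gateway, and poll would sit behind the long-lived request a NATed client holds open.

```python
import queue
import threading


class NotificationGateway:
    """Hypothetical in-memory core of a long-polling gateway."""

    def __init__(self):
        self._lock = threading.Lock()
        self._mailboxes = {}  # client_id -> queue.Queue of feed URLs

    def _mailbox(self, client_id):
        with self._lock:
            box = self._mailboxes.get(client_id)
            if box is None:
                box = self._mailboxes[client_id] = queue.Queue()
            return box

    def notify(self, client_id, feed_url):
        """Store a notification received on the client's behalf."""
        self._mailbox(client_id).put(feed_url)

    def poll(self, client_id, timeout):
        """Block until a notification arrives or the timeout passes.

        Returns the feed URL, or None on timeout so the client can
        simply issue the next poll.
        """
        try:
            return self._mailbox(client_id).get(timeout=timeout)
        except queue.Empty:
            return None


if __name__ == "__main__":
    gateway = NotificationGateway()
    threading.Timer(0.1, gateway.notify,
                    args=("laptop", "http://example.com/feed.xml")).start()
    print(gateway.poll("laptop", timeout=5))
```

Each blocked poll is one held connection, which is exactly the 10,000-connections cost described above. The queue also means notifications that arrive between polls wait for the client instead of being lost.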
One great advantage of PubSubHubbub is that it is tightly specified, with lots of examples and code. rssCloud, on the other hand, is very loosely specified, with pieces of code and text scattered across interlinked documents on many websites.
Please submit corrections in my comments or via email (firstname.lastname@example.org) where this document is incorrect. Thanks!