RSS, OPML and the XML platform.
Copyright 2003-5 Randy Charles Morin
TechCrunch has the best review of Yahoo! Mail and Yahoo! Alerts' new RSS integration. I'll have to check this one out for myself. An all-in-one Gmail-Google Reader?
Today is the last day of voting for round 1 of the Canadian Blog Awards. I have one blog up for nomination in the category of Best Sports Blog. Please take the time to vote for me today, thanks. Step-by-step instructions follow.
Thank you very much!
About: Find out fast when fares to your favorite destinations drop by at least 20%! It's easy, thanks to Travelocity's new RSS feed.
Randy: I had to blog this. Why? Because it's absolutely the worst implementation of RSS I've ever seen. It's not really a technical problem; it's more that they don't get it. You see, I viewed a few of their feeds and they all returned exactly the same thing. If you subscribe to their feed, then you get a constant reminder in your RSS aggregator that Travelocity has 20% off fares to your chosen destination. That's it :-(
Question: I have an account on Flickr due to a digital imaging class I am taking at the local college. I want to start a blog to chronicle my RV travels and get it up and going before I leave in Jan on my Snowbird trip. I would like to be able to ask you some questions, such as how to have photos on Flickr that can be viewed from the blog, any advice you may have on setting up either the blog or Flickr (my experience on Flickr is limited and non-existent on a blog), etc.
And don't forget to check out my RVing blog. The last two winters, I've driven to DisneyWorld in March.
Randy: Scale (damn)it! Every Web 2.0 company eventually doesn't scale. Does anybody other than Google write Web-ware that scales?
Robert Scoble: I kept trying to open my OPML file in the OPML Editor and it wouldn't open. [cut] I tried both the OPML file that NewsGator exported as well as the one that Bloglines exported. Newsgator's OPML file wouldn't even open (gave me an error) but Bloglines opened with blank titles.
Randy: Sam Ruby runs Robert's OPML thru the top three OPML validators. Hmmm! This is a great use-case. When I export from Bloglines, I know they use the title attribute instead of text. A classic bug. I simply search-and-replace "title=" with "text=" and it works.
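The same repair can be scripted instead of done by hand. Here is a minimal sketch in Python, assuming the export is well-formed XML and the only problem is `<outline>` elements that carry `title` but no `text`. The function name and the sample document are made up for illustration. Unlike a blind search-and-replace, it copies `title` into `text` and leaves the original attribute alone.

```python
import xml.etree.ElementTree as ET


def add_missing_text(opml_source: str) -> str:
    """Copy `title` into `text` on every <outline> that lacks a `text` attribute."""
    root = ET.fromstring(opml_source)
    for outline in root.iter("outline"):
        if "text" not in outline.attrib and "title" in outline.attrib:
            outline.set("text", outline.get("title"))
    return ET.tostring(root, encoding="unicode")


# Hypothetical export with the title-instead-of-text problem.
sample = """<opml version="1.1">
  <head><title>Subscriptions</title></head>
  <body>
    <outline title="Example Blog" type="rss" xmlUrl="http://example.com/feed.xml"/>
  </body>
</opml>"""

fixed = add_missing_text(sample)
print(fixed)
```

Working on parsed attributes also avoids touching a "title=" that happens to appear inside some other attribute value.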
Dave Winer: We're pretty close to having feeds.scripting.com ready for a reboot. Should happen this week. [cut] That is, Murphy-willing of course. ;->
Randy: For those that don't know, feeds.scripting.com allowed users to share their OPML and the results were aggregated into a top 100 list. About six months ago, it broke. I asked Dave and he said it was a resource allocation problem (he didn't have time for it). Now it's back! Or soon, Murphy-willing. Great News!
Jeff Jarvis: Hey, My Yahoo, Google Reader, Pluck, Newsgator Enterprise and other RSS readers: Hand over my numbers. You are taking my RSS feed and caching it to serve more efficiently, which would be fine if only you told me how many times you are doing that. But you're not.
Randy: I love my Dell laptop. It's my second Dell laptop. I gave the first one to my wife. It's still working many years and dozens of hardware failures later. Their on-site, at-home customer support is amazing. I love Dell servers too. I used to run a Website with a farm of them on a baker's rack.
I think some people open their mouths just to catch flies. Yes Jeff!
Hello Memeorandum. I wonder how much crap I have to write to get myself on the top entry. The top article is currently Jeff's latest rant about Web aggregators caching the RSS and spoiling his FeedBurner reader counts. I wonder if Jeff knows that Web proxies around the Web have been caching some of his Webpages for years.
What I'm finding is that people are writing long-winded and absolutely boring blog entries like this one, in order to game Memeorandum. I'm not entirely aware of how the algorithm works, so this is my first experiment in gaming Memeorandum. I don't know how much more crap I can write.
BTW, if you think you might commit suicide in the next 20 yrs, then try birth control. My daughter has too many fatherless friends already.
Update: The answer was about 5 minutes.
Note: Please sign in the comments if you read this. It's a study! All signers will be given an unlimited supply of dihydroxide.
Of late, I'm receiving an enormous amount of comment SPAM that links to splogs on Blogspot.com. Unfortunately, this means that I'm gonna have to add ".blogspot.com" to my kill file. This will prevent any new comments with links to blogspot.com blogs and delete any existing comments with links to blogspot.com. Even legitimate comments. This will apply across all blogs hosted on kbcafe.com.
Danny Ayers: People care about applications, but if HTTP and HTML were so badly specified that no-one had a clear idea how they should be implemented, we wouldn't have a Web.
AttentionTech: This week OPML got a big boost following a commitment from Dave Winer, the developer of RSS and OPML, to include support for namespaces that will make it easier for people using RSS feeds to use OPML to export and import not only their feeds, but their personal rankings of those feeds across different applications and devices. Joining us today to explain why this is important to the future of the Web is Nick Bradbury, architect of the FeedDemon news aggregator that is now owned by NewsGator.
Scott Johnson: One of the things we saw as we analyzed the data is that blogs from Weblogs, Inc have a striking showing in all this. [cut] They show up strongly in the Top 500 because the way they chose to model their content is very similar to ours.
Scott Johnson: We had to pull [cut] features at the very last minute due to some qa issues - [cut] OPML output [cut].
Randy: That was my favorite feature :-(
Update: I found an OPML of the Feedster top 500. OPML heaven.
Tim Bray: I made a graph of how often the ongoing feeds have been fetched so far this year, and the popularity of RSS vs. Atom 1.0.
Dick Costolo: Back in October of 2003, when we first started building FeedBurner with hammer and chisel, RSS was, for many people, synonymous with blogs. [cut] Today, however, while almost all blogs still have feeds, there are innumerable feeds that are unrelated to blogs. Commercial publishers have embraced feeds wholeheartedly; most web services and many search engines now provide subscribed results; and podcasts and videocasts are entirely feed-based while not necessarily tied to blogs.
Randy: A great essay on the state of feeds by Dick Costolo. I guess you could call this the FeedBurner State of Feeds address to the Webosphere (no longer just the blogosphere).
Feedster just released their blog rankings for November/December. A pleasant surprise at #20.
Rich Lafferty shows us how to include a Google blog search widget for any blog.
<form method="get" action="http://blogsearch.google.com/blogsearch">
<input type="text" name="as_q" size="31" maxlength="255" value="" />
<input type="hidden" name="bl_url" value="http://www.kbcafe.com/rss" />
<input type="submit" name="btnG" value="Search The RSS Blog" />
</form>
Replace the references to The RSS Blog with your own.
I experimented with uploading RSS to Google Base and found it quite easy. Following Niall Kennedy's instructions, I first created a Webpage that enhanced RSS with Google Base's RSS extensions using an XSLT.
Then I began uploading, which is a very manual task of naming and selecting files. I encountered a Beta limitation after uploading my 10th monthly archive (10 bulk uploads max). Later, I was able to search and discover these blog entries at Google Base. Next, I plan to remove what I've already uploaded and start anew, but with much larger RSS files (working around the max 10 issue). Hopefully, I can get the majority of my blog entries indexed tomorrow.
Niall Kennedy: Create a new feed containing additional elements from the "news and articles" information type. These additional elements include author name, tags and categories (label), and a publication date. I set the number of pages to 1 because all my posts exist on their own individual web page.
Randy: This is a great way to get more referrer hits. Put all your blog entries in Google Base. I'll give it a test try.
Note: I would avoid using the obsolete Atom 0.3 format. Stick with RSS 2.0 until Google upgrades to Atom 1.0, because you know RSS 2.0 won't get deprecated.
David Sifry: Since beginning our infrastructure improvements, Technorati's uptime has improved significantly. [cut] According to GrabPerf, even while our overall traffic has increased, our response times have consistently decreased. [cut] The index is over 3 years old, currently 21.5 million blog posts. [cut] Our median time to index is now under 3 minutes from the moment a blog post is created.
Randy: Congrats to the Technorati team. I have to agree, Technorati has been a lot faster this last month. But, Technorati is returning mostly Blogspot splog referrers now. They need some severe splog spotting or maybe just remove all Blogspot posts from their index. I'm pretty sure David meant 21.5 million blogs, not blog posts. BTW, my own calculations indicate we are now beyond 100 million blogs, so Technorati is only indexing about 20% of them. Last, David's claim of a median time to index under 3 minutes is beyond ridiculous. For instance, the Bad Politics blog is updated several times per day and pings Technorati each time. Yet, Technorati hasn't indexed the blog in 3 days. iBLOGthere4iM pings Technorati several times per day, but hasn't been indexed in 20 days. The RSS Blog pings Technorati several times per day, but hasn't been indexed in 3 days. The Besting Adwords blog pings Technorati several times per day, but hasn't been indexed in 3 days.
Randy: I suspect Wordpress will quickly become one of the most popular blogging platforms and later splogging platforms. Matt, please quickly hire scalability and splog detection experts. BTW, Wordpress doesn't have a login link on their homepage, a big usability gap.
Randy: Wow, it's amazing how many errors Sam found. Had they tossed that document at Sam, Danny or myself prior to the release, I'm certain most would have been easily fixed.
RSS Blog reader: Is it difficult to place the xml feed of your del.icio.us posts on your blog? I noticed you've been doing this for a while and I'm intrigued.
Randy: There are more than a few ways of getting your del.icio.us links in your RSS feed.
Robyn of the GamingAndTech blog has a podcast interview (podcasterview?) with Sphere.com CEO Tony Conrad.
Dave Winer: I think OPML should have a way to include elements from other namespaces [cut] so I plan to include this advice in the new OPML spec.
Randy: Awesome! A new version of OPML is in the works. Clarifications? Extensions? Point Dave to samples of extensions you would like to see. BTW, OPML is already very extensible, as you can include pretty much any attribute in the outline element. But, it's true that clarification of its extensibility points would be a community win.
I personally maintain a lot of OPML files where I embed the RSS associated with the <outline> within the element. I use PASS to include the RSS within a namespace. Sample...
Coming to a blogosphere near you.
With the launch of Google Base, Google has created a new RSS extension. The RSS extension facilitates bulk uploads of classifieds to Google Base and supports RSS 1.0, RSS 2.0 and Atom 0.3. I'm unsure why they didn't do Atom 1.0. Could've been a timing thing. Their development complete date may have been too near the Atom 1.0 finalization date. Yet another reason to have gotten Atom 1.0 done earlier (is anybody using it yet?).
Dave Winer asks "Where's the API?" It's RSS Dave. Remember, you helped create it.
Following up on my article on the State of the Blogosphere, Mark Cuban sent me the following additional comments that I thought should be shared.
Thx again, a couple of quick notes..
Any feedback or ideas are always welcome. Feel free to post this ! Thx for all you have done in this space
- We do allow blogspot blogs, but have very specific rules.
- We no longer worry about how quickly we convert from ping to result. It's more important to scrub out the splogs.
- We will be adding features, but I think it's important to note that we aren't trying to be a blogosphere portal. We are a tracking tool that lets users search based on freshness. Not relevance. We aren't trying to help you find the best blog to read, we are here to help you find what people are talking about, whether it's a blog, a discussion forum or other resource.
And thank you Mark for kicking off the fight against splogs.
It's been a while since I reported on blogosphere search. This is mostly because it's not getting any better, with a few exceptions. The problems mainly arise from the broken blogosphere ping infrastructure and the unyielding supply of splogs. Let's start with blogosphere ping and move onto splogs.
I've tried manually pinging Technorati and the like. I've tried using Ping-o-Matic. I've written my own ping agents. Currently, I'm using FeedBurner's PingShot to ping Technorati, My Yahoo!, PubSub, Ping-o-Matic, NewsGator, Feedster, IceRocket, Weblogs.com and Blogdigger. The results are not great. According to The RSS Blog profile in Technorati, it hasn't been updated in 3 days. My other blogs have similar issues: iBLOGthere4iM hasn't been updated in 13 days, Besting Adwords in 4 days, Destroy all Malware in 13 days, Bad Politics in 28 days, R-mail in 28 days, Game Certainty in 13 days. Yet every one of these blogs is updated quite regularly (many several times per day). Now, Technorati is by far the worst of the ping sinks, but others are experiencing very similar problems. According to IceRocket, iBLOGthere4iM hasn't been updated in 3 days, The RSS Blog hasn't been updated in 24 hours and so on. PubSub reports 0 posts on my Website most days, yet I post about 20 times per day and others post on KBCafe.com another 10 times per day. Google blog search hasn't updated its cache of iBLOGthere4iM in 3 days or its cache of Besting Adwords in 3 days. I don't want to leave anyone out, so I'll continue. Blogdigger hasn't updated its cache of The RSS Blog in 3 days, and BlogPulse hasn't updated its cache of The RSS Blog in 2 days. I honestly couldn't even find a recent post in Feedster from the KBCafe.com domain. Why do we bother pinging? They surely don't actually do much with the pings. All of these sources would do better to simply poll our feeds once in a while and stop all these ignored pings. Personally, I've decided that redundantly pinging them is now fair game. Anyhow, onto a second rant.
Splogs are killing me. Particularly from two domains: Weblog.ro and Blogspot.com. I'm not sure if Weblog.ro is a legitimate Romanian blog hosting service, but I can't believe how many invalid trackbacks, blog comment spams and referrers I'm getting from there. I've blocked it completely and most of the search engines have too, from what I can tell. Blogspot is definitely legitimate, but its presence as the #1 hosting service makes it the primary target of sploggers. I don't know why Google is struggling so much with banning sploggers, but I'm at the point where I prefer search engines that ignore all referrers from Blogspot. Funny thing, Google blog search seems to filter out the Blogspot splogs better than most. The best seems to be IceRocket, which I assume has completely banned new Blogspot blogs from getting into its index.
Anyhow, with the major subjects addressed, let's move onto evaluating how the individual blogosphere search engines are performing. I'll start with the ones I like the best and move onto the ones I don't think work at all.
The #1 blogosphere search engine IMHO is IceRocket. It responds in about 1 second. It filters out splogs better than all else. Its index is reasonably current and large. It has link search, keyword search, tag search. The results are available via RSS. What else can you say?
Google Blog Search
The #2 blogosphere search engine IMHO is Google Blog Search. It responds faster than IceRocket, but 1 second or 1/3 second doesn't really affect my overall experience. It filters out splogs better than most. The index seems to be more current, but not as large. It has link and keyword search, but lacks tag search. The results are available via RSS. Very good!
BlogPulse has gone thru the best improvement of late. The response times are much better, although it can be somewhat slow at times. It filters splogs well. The index is somewhat current, but not very large. It has link and keyword search, but lacks tag search. The results are available via RSS. The BlogPulse profiles are somewhat cool! Very good!
Technorati is the biggest disappointment this month. The response times are usually pretty good. It doesn't filter splogs at all, from what I can tell. I usually use Technorati to find new splogs to report to Matt Cutts at Google. Technorati often responds with new results that are actually days, weeks and sometimes months old. One of the most common emails that I get is users asking how they can get indexed by Technorati. Honestly, I don't know. I've set up profiles for all my blogs and ping Technorati regularly (sometimes even manually) and I don't know what works and what doesn't. Effectively, Technorati is mostly broken. Lots of great features in there, if it actually worked.
Bloglines is actually a pretty good link search tool when it works. Unfortunately, the two most common responses are errors. The third most common response is no response at all. Even when it does respond with positive results, so much time has elapsed that I forget what I was looking for in the first place. Bloglines is not mostly broken. It is broken.
PubSub is infested with mostly splogs and it's very hard to find anything useful anymore. Beyond broken.
Yahoo! Blog Search
A promising future star. It currently doesn't have a homepage and it's really hard to use. I'd point at it, but there's nothing really to point to yet.
Onfolio Support: When I tried to access feeds last night, every single one of them gave me the red x icon, even the feed for this support forum. [later] I opened IE to see what was up with that program, and it told me that it was working "offline". When I told it to work online, it showed me that I had a proxy set up, and the proxy was not working properly (since I'm back at home). When I had checked the proxy information in Firefox, all was correct, i.e., no proxy.
Randy: This is a very common problem across almost all of the native Windows RSS readers. It's not exclusive to Onfolio. I run across it a few times per year in support forums. I just wanted to save it for easy reference in the future and just maybe somebody will Google it and end up here. The problem happens when you move your computer between networks and use a Web proxy in one of the two networks. The second network may not have access to the Web proxy and your RSS reader may be using those proxy settings, thus failing. Disabling the proxy settings removes the problem and you may have to re-enable the proxy settings when you return to the other network.
This morning was another bad day for Blogger.com and Google. I found 57 splog referrers from BlogSpot and forwarded them to Matt at Google. Basically, Google doesn't seem to be capable of stopping the splogs. IMHO, the recent splogs on Blogspot have easily identified patterns that make them easy to pick off. How Google is not able to identify and remove them is beyond me. IceRocket already has kicked them out of their index. I now have to agree with Mark Cuban and Blake Rhodes. This is now the only way that a Blogosphere search engine can move forward with reasonably clean results. I also note that Google doesn't show these blogs in Google's blog search, but hasn't disabled the accounts. That seems like dirty pool to me. If you host on Blogspot, then don't expect me to point to you or find you anymore. BlogSpot is Dead!
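For anyone writing a reader, the stale-proxy failure can be softened by retrying without the proxy. The sketch below is an illustration in Python, not how Onfolio or any other reader actually works. It assumes the program picks up the system proxy settings through `urllib`, and the helper names `make_openers` and `fetch_with_fallback` are made up.

```python
import urllib.request


def make_openers():
    """Return openers to try in order: system proxy settings first, then no proxy."""
    via_system_proxy = urllib.request.build_opener(urllib.request.ProxyHandler())
    direct = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    return [via_system_proxy, direct]


def fetch_with_fallback(url, openers, timeout=10):
    """Try each opener in turn and return the first successful response body."""
    last_error = None
    for opener in openers:
        try:
            with opener.open(url, timeout=timeout) as response:
                return response.read()
        except OSError as error:  # URLError and socket errors are OSError subclasses
            last_error = error
    if last_error is None:
        raise ValueError("no openers given")
    raise last_error
```

A call would look like `fetch_with_fallback(feed_url, make_openers())`. Whether silently bypassing a configured proxy is acceptable depends on the network, so a real reader would probably ask the user first.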
I'll assume the reader does not have a FeedBurner account. If you already have an account, then skip to step 6. If you already have a FeedBurner feed, then skip to step 11.
FeedBurner will automagically check your feed every 30 minutes and when it changes, it'll automagically ping the ping sinks you selected. You no longer have to do that yourself. BTW, Ping-o-matic is also a ping source and it will ping several other ping sinks, maybe even redundantly.
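The check-then-ping idea is simple enough to sketch. What follows is a rough Python illustration of the general technique, not FeedBurner's implementation, which isn't public. It assumes change detection by hashing the feed body and a ping sink that accepts the usual `weblogUpdates.ping` XML-RPC call. The function names and the endpoint passed to `make_xmlrpc_ping` are placeholders.

```python
import hashlib
import xmlrpc.client


def fingerprint(feed_bytes: bytes) -> str:
    """Hash the raw feed so two fetches can be compared cheaply."""
    return hashlib.sha256(feed_bytes).hexdigest()


def ping_if_changed(feed_bytes, last_fingerprint, send_ping):
    """Call send_ping() only when the feed content changed; return the new fingerprint."""
    current = fingerprint(feed_bytes)
    if current != last_fingerprint:
        send_ping()
    return current


def make_xmlrpc_ping(endpoint, blog_name, blog_url):
    """Build a callable that sends weblogUpdates.ping to one ping sink (endpoint is an assumption)."""
    def send():
        server = xmlrpc.client.ServerProxy(endpoint)
        return server.weblogUpdates.ping(blog_name, blog_url)
    return send


# Demonstration with a fake ping so nothing touches the network.
sent = []
state = None
for body in (b"<rss>v1</rss>", b"<rss>v1</rss>", b"<rss>v2</rss>"):
    state = ping_if_changed(body, state, lambda: sent.append("ping"))
print(len(sent))  # 2: once for the first fetch, once for the change
```

In real use the loop would fetch the feed on a timer (every 30 minutes, say) and pass one `make_xmlrpc_ping(...)` callable per ping sink. Hashing the whole body will also fire on cosmetic changes such as a regenerated timestamp, so a stricter version might compare item GUIDs instead.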
Dave Winer: The thing that makes podcasting special is that it is accessible to everyone. [cut] Basically MP3 can't be rigged up to serve the purpose of advertisers, and that's why I love MP3. And only MP3 provides the portability and compatibility that users depend on. Any other method will force them to jump through hoops that they will resist.
Randy: MP3 is not the only road to accessibility. My rule of thumb is simple. It's gotta work (audio or video) in a typical installation of XP and OS X. MP3 does that. What other formats work? Not Real media. Not Quicktime. Not MPEG-4. Not WMV. If you are picking one of these formats, then you are limiting your audience. If your audience is OS X only, then Quicktime is likely OK. If your audience is XP only, then WMV is likely OK. MHO.
Randy: I suspect the next 12 months are gonna prove difficult for Outlook plug-in based RSS readers. BTW, Alexander has lots of screenshots.
Mark Evans: It's probably my fault given my enthusiasm to "apply" for new Web 2.0 services but it's getting out of hand. In the past few weeks, I've tried out Sphere, Flock, Tailrank, Rollyo, SearchFox, Wordpress, Slawsome and Remember the Milk. They're all interesting and some of them are even useful but I feel like I'm at an all-you-can-eat buffet and my appetite is disappearing.
Kailash Nadh: Pingoat is dying. The server is choking to death. Please immediately contact me at email@example.com if you can offer help in any way. It wouldn't be more than a few days before the whole server collapses :| So please do offer a helping hand if possible. URGENT http://www.pingoat.com/goatlog/post/index/35/URGENT-Help-needed
Randy: Although Kailash isn't offering details, I would assume that Sploggers are pounding his server into submission. Pingoat joins Ping-o-matic in the valley of dead blogosphere services. Let's just hope Kailash's SplogSpot doesn't meet the same end. I used to run a blog ping service called Blogomatic, but it didn't take ping requests, rather it would passively check for updates and ping the blogosphere when changes occurred. FeedBurner now provides a better version of this same service called PingShot.
Randy: Great news! Now, if they can work on selecting themes that don't burn my eyes ;-) BTW, this is not good news for Atom API, which seems to be stuck in committee. The original milestone for Atom API was April 2005. Had they hit that date, would MSN be implementing MetaWeblogAPI? There's still action on the mailing list and wiki, but the Atom news has slowed considerably.
RSS Blog reader: I use sharpreader as my rss reader and tonight noticed that it has started treating one of my blogs differently to the rest (ie mine...@$@$#%!). Whereas it used to import and display each full post with pictures, formatting and links, it now only reads the first 40 words or so and strips all the rest (including formatting, piccies etc). My friend also runs a blog on blogger.com yet HIS feeds are still coming through beautifully. WHY oh WHY would blogger.com or sharpreader suddenly start treating the randomselections posts in this way?
Randy: Most likely your blog was set to partial content. You can change it back with the following instructions. Log in to blogger.com, go to Settings | Site Feed and set Description to Full.
I've received dozens of requests to re-enable the chicklet generator. I figured, instead of re-enabling it on a case-by-case basis, I would simply create a version that cannot be abused by sploggers and the like. So, I created a brand new version of the chicklet generator, with no bandwidth requirements or restrictions on use. Hope you like it!
CNN: Chances are, if she weren't the co-founder of a successful Web log publishing company (Six Apart), her Web log probably wouldn't get much press.
Randy: Agree, if she didn't create Six Apart, then few Web logs would be getting any press and we'd still be asking "What's a Blog?"
Dave Winer has proposed a new addition to the OPML vocabulary. A new OPML nodetype to include one OPML document within another.
<outline text="Florida" type="include" url="http://hosting.opml.org/dave/florida.opml"/>
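How an aggregator might resolve such a node can be sketched in a few lines of Python. This is one possible reading of the proposal, not Dave's specification. It assumes the outlines in the included document's `<body>` become children of the include node. The function name, the `fetch` callback, the depth limit and the sample documents are all made up for illustration.

```python
import xml.etree.ElementTree as ET


def expand_includes(parent, fetch, max_depth=5):
    """Attach the body outlines of each type="include" target under the include node."""
    for node in list(parent):
        if node.tag != "outline":
            continue
        remaining = max_depth
        if node.get("type") == "include" and node.get("url"):
            if max_depth == 0:
                continue  # stop following includes; guards against cycles
            body = ET.fromstring(fetch(node.get("url"))).find("body")
            if body is not None:
                node.extend(list(body))
            remaining = max_depth - 1
        expand_includes(node, fetch, remaining)


# In-memory stand-in for the Web; the URL and contents are hypothetical.
documents = {
    "http://example.com/florida.opml": """<opml version="1.1"><head/><body>
        <outline text="Miami"/><outline text="Orlando"/></body></opml>""",
}
main = ET.fromstring("""<opml version="1.1"><head/><body>
    <outline text="Florida" type="include" url="http://example.com/florida.opml"/>
</body></opml>""")

expand_includes(main.find("body"), documents.__getitem__)
names = [outline.get("text") for outline in main.iter("outline")]
print(names)  # ['Florida', 'Miami', 'Orlando']
```

The depth limit is there because nothing stops two OPML documents from including each other.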
I nominated 5 blogs. No particular order, they are just the ones I link to a lot and read everyday.
Randy: This shows great community leadership on Microsoft's part. It also means that some Windows-based RSS readers will refuse to use the RSS platform in Vista and claim their RSS aggregator accepts more feeds. When that happens I'm gonna be ready to pounce.
Robert Scoble: We CAN NOT chase Google's tailpipes.
Feedster (August): Each month, Feedster brings you a list of 500 of the most interesting and important blogs.
Scott Johnson (October): Why do you think that the Feedster 500 is late? It's not that we're lazy; it's that the bulk of my time over the past two months has gone not into the next rev of the Top 500 but into dealing with spam.
Randy: November 2nd.
It appears that some of the sploggers have caught onto the chicklet generator and are abusing it. I mean who would subscribe to a splog anyway? The end result is that this is not gonna scale and I'm gonna have to reconsider making this available publicly. Any insight would be appreciated.
Can you say branding exercise? Anyhow, lots to check out.
Oops! I give up. Microsoft now insists that I'm French (ok, maybe I am) and isn't giving me an option to switch back to English. Goodbye! BTW, they offer a smaller subset of the features to lang="fr" people. Maybe Microsoft should rename this buggy.com. I'm kinda getting tired of this whole beta thing.
Update: I checked out Live Safety Center. That was actually pretty good and could be very useful to the mundane user who doesn't already have virus protection. In fact, I'm using AVG, which is malware that scans for malware. Live Safety Center means I can uninstall AVG tonight. Now I don't have to worry about AVG thrashing my hard drive every morning.
Randy: Good evaluation of RSS readers and their implementation of Conditional-GET.
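For readers who haven't met Conditional-GET: the aggregator sends back the validators it received last time, and the server answers 304 Not Modified with no body if the feed hasn't changed. Below is a minimal Python sketch using only the standard library. The function names are made up, and a real reader would also persist the validators between runs.

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build the request headers that make a GET conditional."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_if_changed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None when the server answers 304."""
    request = urllib.request.Request(url, headers=conditional_headers(etag, last_modified))
    try:
        with urllib.request.urlopen(request) as response:
            return (response.read(),
                    response.headers.get("ETag"),
                    response.headers.get("Last-Modified"))
    except urllib.error.HTTPError as error:
        if error.code == 304:  # urllib reports 304 as an HTTPError
            return None, etag, last_modified
        raise
```

On the first poll both validators are `None`, so the request is an ordinary GET. Every later poll passes back whatever `ETag` and `Last-Modified` the previous response carried.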
Today was the big move. R|mail is no longer a service offered on the KBCafe Website, or at least not for long. I put $400 where my mouth is and bought the r-mail.org domain for 10 years plus one year prepaid hosting. Today and tomorrow, I'll be modifying all the R|mail pages on KBCafe and forwarding them to our new homepage. I moved the blog, but the FeedBurner feed remains the same. Email me immediately if you encounter any issues during the move. This will allow me to offer more services without overloading the KBCafe domain, which if you haven't noticed has been less than responsive. I've got a lot of plans and I need a way of paying for them. As such, I'm selling links on the r-mail.org homepage for $1/d. I've already got one sponsored link from the RRBBS Discussion Group and in their first week, they've acquired 8 R-mail Subscribers. Send me an email if you want in on this deal.