RSS, OPML and the XML platform.
Copyright 2003-5 Randy Charles Morin
Tech.Life.Blogged: I hacked up my Google Search Bookmarklet and changed the code to work with OPML Surfer. It helps stop all that messy copy and pasting. When you have an OPML page open in your browser you just click the 'Surf this OPML' link and a new window will open up containing the OPML in browsable form. Magical!
First, you need to drag this link to your browser's tool bar:
>>> Surf this OPML <<<
Now go to an OPML file like: http://hosting.opml.org/techcrunch.opml.org/techcrunch.opml and then click your new 'Surf this OPML' bookmarklet.
Randy: Note, in IE, you may have to right-click on the link, select Add to Favorites and choose the Links folder.
Earlier this week, I started another test of the blogosphere search engines. You can read my notes and where/how I got the results on that blog entry.
The final score is here and now.
The winner by a long shot is Bloglines. It's very slow and down at least once per day, but it's simply better at finding links. Disappointing are the results from Google. It seems my blogs are well indexed by Google, but most others are not.
I think I'm gonna make this a weekly feature starting every Monday or Tuesday and running thru to Thursday, Friday or the weekend.
I just installed RSS Xpress. It's a free RSS reader. Classic 3-pane. A lot of functionality. Import via OPML worked great. There do appear to be a lot of bugs. Not sure what makes it different from many other 3-pane RSS readers.
Bug enumeration follows.
Technorati Weblog: We've been making a lot of updates and tweaks to our backend, as described at the beginning of the month. Most of these changes have been around speed and performance, and I hope that you've noticed the difference.
Randy: Yes, I've noticed a difference. Not only is Technorati competing in my blogosphere search tests, they were one link away from the #1 ranking (Bloglines still winning).
James Robertson: Really crappy. I had massive headaches implementing OPML support for import/export in BottomFeeder. Why? Because there's no real specification. Like everything Dave Winer has ever been involved with, the specs are all in his head, and it's up to the rest of us to figure out wtf he actually meant. Here's the "spec" - and look at all the meaningless crap in it.
Randy: I gotta agree that OPML, like RSS, is very loosely specified. Something James fails to point out is that the spec he references is for version 1.0, but the most commonly used version is 1.1. Where's the spec for version 1.1? It doesn't exist. That said, like RSS, OPML is really simple. If you're having a hard time implementing stuff in it, then you're working too hard.
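To illustrate how little there is to OPML, here is a minimal sketch in Python that pulls feed URLs out of a subscription list. It assumes the common aggregator convention of outline elements carrying an xmlUrl attribute. The sample document and the function name feed_urls are made up for illustration and do not come from any spec.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; real subscription lists vary in which attributes they carry.
SAMPLE_OPML = """<?xml version="1.0"?>
<opml version="1.1">
  <head><title>Example subscriptions</title></head>
  <body>
    <outline text="News">
      <outline text="Example Blog" type="rss" xmlUrl="http://example.com/rss.xml"/>
    </outline>
    <outline text="Another Blog" type="rss" xmlUrl="http://example.org/feed"/>
  </body>
</opml>"""


def feed_urls(opml_text):
    """Return every xmlUrl found on an outline element, at any nesting depth."""
    root = ET.fromstring(opml_text)
    # iter() walks the tree in document order, so nested folders are covered too.
    return [o.get("xmlUrl") for o in root.iter("outline") if o.get("xmlUrl")]


if __name__ == "__main__":
    for url in feed_urls(SAMPLE_OPML):
        print(url)
```

Outlines without an xmlUrl (folders, for instance) are simply skipped, which is about as much tolerance as a loosely specified format calls for.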
Kailash Nadh: Feedburner launched a pinging service named "PingShot" recently. Its supposed to ping to a number of services, just like Pingoat does, every time you update you feed. But odd enough, they have included certain Ping services like Pingoat, that again ping the same sites Pingshot pings individually.
Randy: PingShot, Pingoat and Pingomatic are blogosphere ping sources, which I'll define as a ping source that pings ping sinks. MHO, one ping source should never ping another ping source or accept a ping from another ping source. Of course, you could devise an algorithm to minimize the damage of redundant pings, but wouldn't it be simpler if you just followed rule #406?
Rule #406: One ping source should never ping another ping source or accept a ping from another ping source.
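Rule #406 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the set PING_SOURCES and the functions allowed_targets and accept_ping are hypothetical names, and identifying services by a plain name string is a simplification (a real service would more likely match on endpoint URLs).

```python
# Hypothetical registry of services that relay pings (sources), as opposed to
# services that only receive them (sinks).
PING_SOURCES = {"pingshot", "pingoat", "pingomatic"}


def allowed_targets(targets, sources=PING_SOURCES):
    """Outbound half of rule #406: never ping another ping source."""
    return [t for t in targets if t.lower() not in sources]


def accept_ping(sender, sources=PING_SOURCES):
    """Inbound half of rule #406: never accept a ping from another ping source."""
    return sender.lower() not in sources


if __name__ == "__main__":
    print(allowed_targets(["Technorati", "Pingoat", "Bloglines"]))
    print(accept_ping("Pingomatic"))
```

With both halves in place, a sink hears about an update once per source the publisher chose, not once per hop.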
Earlier this month, before Google launched its blog search, I documented each blog search engine's capability of finding new links. When Google blog search was released, I was amazed how much better it was than the other engines. Since then, Technorati has improved, IceRocket has improved and maybe now we have a four-way challenge with Bloglines (the best last month) and Google (the newcomer). Let's start it again. For any links discovered on posts tomorrow or later, I will document and aggregate the results to see who's the best blogosphere search engine this month.
Enumeration of competitors
Am I missing anybody? Splogs or self-referential blogs will not be included in the results.
I've always struggled using Technorati. It seems that my blog entries are rarely indexed and blogs that point to me are usually indexed days, if not weeks, after the post. But it seems that every time my name appears on Memeorandum, I get those results in zero time. We've all heard the rumours that Technorati is pinged first, but is Technorati also indexing blogs based on a preference for high profile blogs? Judging by the enthusiasm of some a-listers, I'm pretty certain I know the answer.
Rok Hrastnik: If as a marketer you're trying to generate RSS subscribers, simply using an RSS subscribe button is the worst way to go for you and for the RSS industry as a whole as well.
Randy: Rok provides alternatives for acquiring RSS subscribers. If you follow Rok's advice, then you'll get yourself a couple extra subscribers, but why not go the whole nine yards? Take a look at the right sidebar of The RSS Blog. I provide eight chicklets from the most popular blog aggregators and an easy way for users to simply subscribe via email, not to mention a (What is this?) link immediately beside my RSS chicklet. Don't cut corners; you need all this stuff to truly generate subscribers. If you end up with a user-friendly XSLTed feed like your typical Blogger.com blog, then you'll end up with fewer subscribers, not more.
It's time again for a list of top subscribing RSS clients to my FeedBurner hosted RSS feed.
The list really didn't change much at all. Seems to be moving into a stable state. I might decrease the frequency of these posts. Boring!
Robert Scoble: Yesterday Dave Winer noted that Yahoo's RSS aggregator lets you export your list of feeds to OPML which is really great cause you can then try out other aggregators. [cut] So, yesterday, I was interviewing Sanaz who is one of the people who works on the start.com team [cut]. I asked "why don't you do what Yahoo does?" She answered "we already do." Yeah, Start.com lets you both import and export OPML lists of your RSS feeds.
Randy: The best part is that you can now manage your subscriptions at My Yahoo or start.com and simply export them for use with various readers. Now +My Yahoo chicklets have value.
Danny Ayers: Remember that little orange [XML] icon, which when you clicked on it took you to a page of raw XML? Thing of the past. From now on it'll say [Subscribe], and "When you click on it you should subscribe to whatever it is you're looking at."
Randy: If Danny and David can agree on this, then I'm buying me some flying pigs.
Randy: It's not really river of news, but it's a lot closer than Bloglines straight up.
I report any splogs with Adsense ads to Google. I do this by clicking on the Ads by Goooogle link in the offending ad unit and filling out the form. Google has responded by email with a better way of reporting Adsense splogs.
Google Adsense Team: In the future, to allow us to investigate any issues more efficiently, you can email our specialists directly at firstname.lastname@example.org.
ClickZ: PubSub is today expected to unleash a new site ranking tool, called LinkRanks, that measures the "strength, persistence, and vitality" of links pointing to and from a given Web site. Additionally, PubSub has begun an effort to compile lists of influential Weblogs by category, which could be of use to media buyers and planners eager to buy advertising in blogs.
Randy: Weird, they've had LinkRanks running for over a month. Not sure what has changed? They do have an OPML top 1000 list. Which I'll add to KBCafe profiles tonight. BTW, on the weekend, I added MyFeedster's top 100.
Update: KBCafe is ranked 843.
Kevin Burton: If you're a blog pinging service I'd love to get your ping data.
Randy: I wonder what that is?
About: We're introducing the world's most powerful and personalized weblog ranking system (at least we think so). [cut] If you'd like to keep track of TailRank's progress you can always read our blog. We're also hanging out on #tailrank on irc.freenode.net if you want to chat.
Randy: My immediate reaction was "How did that get there?" Here's how. news.google.com is a blog. Yesterday, I didn't think it was. Today, I know it is. Did you know it has an RSS feed? Is it the unedited voice of a person? No, but I never liked that definition. It kinda feels like a community link blog. Autogenerated. No different than Memeorandum. I notice The RSS Blog finds itself on Memeorandum once or twice per day. Anybody know how they pick the discussion blogs? Oh, that's how. Link to me. Memeorandum is a blog and has a blog. Subscribed.
These guys are great! I just noticed FeedBurner has a new chart in their analysis tools called the Red Cross Ad Provider Ad Performance chart. It tracks impressions and clickthrus. Kind of gives you a good feeling seeing that The RSS Blog got 65 clickthrus for Red Cross donations. Do you need another reason to get burned (ala FeedBurner)?
Dick Costolo: If you use FeedBurner, we now provide a service that notifies aggregators, search engines, and directories about your content updates as quickly as possible. When you use the new PingShot service, we will 'ping' a collection of services that you choose whenever we detect that your feed has new content. It's that simple. Select the new PingShot service for your feed from the "Publicize" tab, and you're done.
Randy: This pretty much does everything blogomatic does. I'm gonna shut it down and refer everybody to FeedBurner. BTW, you should be able to burn a feed at FeedBurner for free, not publicize the FeedBurner feed URL, and take sole advantage of this one feature.
BusinessWeek: The long-awaited tool is fast and sorts by relevance. But it misses results and lets in too much spam.
Randy: WTF? Technorati and Google. 13 hours since Technorati found an article linking to BusinessWeek's blog. Google has found 5 in the last 13 hours. One is a splog. Three are not extremely current. Results will change. But my feeling is that Google is finding more results. I'm seeing more and more of these supposed tests showing that Technorati is winning. But when I run the tests myself, the opposite seems to be true. Although not perfect, I'd put Bloglines and IceRocket ahead of Technorati, behind Google.
Two days ago, I fired off an email to Jason Calacanis about the new KBCafe profiles project I'm working on. Jason noticed that I was linking to an Engadget feed hosted on FeedBurner. Jason was rightfully concerned that he no longer had control of his RSS feed.
I CCed the good folks at FeedBurner into the conversation and next thing you know we're joined by a couple more a-listers. Long story short, it would seem that somebody unrelated to Weblogsinc.com set up a feed for Engadget on FeedBurner and set the feed loose on the world. I think the feed had 5 figures of subscribers. FeedBurner has since redirected that feed back to Engadget's true feed. I sourced the feed URL from Feedster's top 500 OPML file.
I then decided that getting the correct feed URL had to be a priority for KBCafe profiles, so I rewrote the engine to source only blog homepages and to auto-discover the RSS feed. I then deleted the entire database and repopulated from scratch. As a result, I've introduced some new bugs, which John Roberts has helped uncover.
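Auto-discovery of this kind can be sketched as follows. This is not the KBCafe engine, just a minimal Python illustration that assumes the homepage advertises its feed with a link element of rel="alternate" and an RSS or Atom MIME type. The class FeedLinkFinder and the function discover_feeds are hypothetical names, and fetching the page is left out so the sketch runs offline.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types commonly used to advertise feeds in <link> elements.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collects feed URLs advertised via <link rel="alternate" ...>."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rel = (a.get("rel") or "").lower().split()
        mime = (a.get("type") or "").lower()
        if "alternate" in rel and mime in FEED_TYPES and a.get("href"):
            # Resolve relative hrefs against the homepage URL.
            self.feeds.append(urljoin(self.base_url, a["href"]))


def discover_feeds(html, base_url):
    """Return the feed URLs a homepage advertises, in document order."""
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    return finder.feeds


if __name__ == "__main__":
    page = (
        '<html><head>'
        '<link rel="alternate" type="application/rss+xml" href="/rss.xml">'
        '<link rel="stylesheet" href="/style.css">'
        '</head><body></body></html>'
    )
    print(discover_feeds(page, "http://example.com/blog/"))
```

Sourcing the homepage and trusting only what it advertises avoids picking up a feed URL that a third party set up, which is the failure described above.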
Bob Wyman: Computerworld has posted a special report on the winners of their first annual "Horizon Awards." PubSub.com is one of the winners and they profile us in their story: "PubSub Concepts' Prospective Search Search Tool for Tomorrow".
Randy: Congrats to Bob and team! Deserved!
Dick Costolo: As you can see from the front page of the FeedBurner Web site, we now manage over 100,000 feeds.
Try to keep it quiet until it's more stable. Lots of bugs.
Scott Schecter: Until recently I had been using Sage for my news reader of choice. I tried Pluck the other day and I have converted. Pluck is really a well written piece of software and will work with Internet Explorer and Firefox for both PC and Mac. However, probably my favorite feature is how it allows you to access and sync all your news from whatever PC you happen to be at.
Leland Rucker of NewsGator sings "The RSS Blues."
James Robertson: What's new? Here's the list I posted:
Scott Isaacs: For defining manifests, we basically reused the RSS schema. This format decision was driven by the fact we already have a parser in Start.com's application model for RSS, there is broad familiarity with the RSS format, and I personally did not want to invent yet another schema :-).
Randy: RSS stands for Really Simple Syndication. The purpose is syndication. Are we syndicating gadgets here? No! Are we trying to get the gadgets to work in an RSS reader? No! What happens if I import these RSS feeds in my RSS reader? They tell me the feeds are invalid? It seems that RSS was re-used, in this case, for the purpose of re-using RSS and nothing else. I don't get it!
Update: Phil Ringnalda also has some reservations.
PR: The Pluck InSite suite of solutions offers interactive blogging and industry leading RSS capabilities as fully-managed, hosted services ready to be embedded into any web site. [cut] Key components of the Pluck InSite Suite include:
Mark Cuban: Google has done a great job of putting together a blog search engine. Its exactly what you would expect from Google. Fast and simple. But there are a lot of things it does and doesnt do, that when you add them all up, will result in increased traffic for Icerocket.com
Randy: Google's entrance into blog search means more traffic for everybody in the blogosphere, more hits for blogs and more interest in blog related Websites, like IceRocket. I'm carefully watching my traffic. Wednesday, the same day they released Google blog search, was my best traffic day in nearly two months. But Tuesday was also my best traffic day in nearly two months and within 1000 pageviews of Wednesday. Thursday dropped off to a level below that of Tuesday. Overall, I don't see any significant traffic changes yet. Yet!
Update: I broke down the numbers and noticed that I was getting more hits on less popular blogs (+25%) and about the same on my more popular blogs. Of course, the popular blogs have higher overall hit counts that make the increase in the less popular blogs insignificant on the aggregate numbers.
Odeo: The Odeo Player is an Apple Widget that makes it easier to listen to your favorite shows. If you have an Odeo account, the Player gives you quick and easy access to the shows in your Queue.
AOL: A new AOL blog survey shows most bloggers are not aspiring "cyber journalists" or political activists; they blog as a form of therapy.
Randy: Since I started blogging, I stopped growing new gray hairs. My older brother and my younger sister have continued to gray. I blame it on blog therapy.
Kevin Burton: It seems that Google Blogsearch only has about 8M posts. [cut] If you break it out to posts per day with an 8M document index size this is about 106k posts per day. This is 160% smaller than Technorati's current index rate. These numbers might be wrong though. I'm not sure if my 8M index size number is correct. I'm also not sure if Google is removing more duplicate posts or spam posts.
Geek News Central: After 6 hours of being back online here are the preliminary results.
This is after I have had the mt-tb.cgi file removed from the server for over 60 days!
Randy: When I get a trackback, an agent loads the referring page and checks that it actually has a link to the trackbacked blog entry. This eliminates 99.99% of that trackback SPAM. I got 380 trackback pings yesterday (mostly all, if not all trackback SPAM).
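The link-back check described above can be sketched in Python. This is a minimal illustration, not the agent actually running on this blog: the names LinkCollector and links_back are hypothetical, the comparison is an exact URL match after resolving relative links, and the page fetch is omitted so the example runs offline.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Gathers the absolute URL of every <a href> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.add(urljoin(self.base_url, href))


def links_back(referrer_html, referrer_url, entry_url):
    """True only if the referring page really links to the trackbacked entry."""
    collector = LinkCollector(referrer_url)
    collector.feed(referrer_html)
    return entry_url in collector.links


if __name__ == "__main__":
    entry = "http://example.com/blog/entry-1.html"
    honest = '<p>See <a href="http://example.com/blog/entry-1.html">this post</a>.</p>'
    spam = '<p><a href="http://pills.example.net/">Buy now</a></p>'
    print(links_back(honest, "http://other.example.org/post", entry))
    print(links_back(spam, "http://pills.example.net/page", entry))
```

A trackback whose referring page fails this check can be dropped without ever reaching the blog. A production version would also want to normalize URLs (trailing slashes, fragments) before comparing.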
It's time again for a list of top subscribing RSS clients to my FeedBurner hosted RSS feed.
I cut off the list today at a minimum of 10 subscribers. I should note that I redirected people who were on my old feed to the new feed. This seems to have helped Bloglines, NewsGator and Rojo the most. Google Desktop dropped considerably, after starting in the 6 slot in its launch week. In other words, a lot of people tried Google Desktop and have since stopped using it.
Randy: After one day, it seems pretty obvious to me that Google blog search is the blog search we always wanted. I find only two other blog search engines are even approaching Google; Bloglines and IceRocket. The results from Technorati, Feedster, PubSub, BlogPulse, etc. seem completely inadequate now that we finally have a blog search engine that works.
Joël Céré: A dedicated blog search could prompt Google to remove blogs from its main search index, thus "improving" the quality of its search results. This speculation was based on Google removing Usenet postings from search results after acquiring Deja.com. [via Inside Google]
Q: How do you report a splog with Adsense to Google?
Randy: When you find a splog (a SPAM blog) that has Adsense on it, then...
I've been playing with Google blog search all morning. Here's my evaluation.
Kicks ass! Technorati and the others have always struggled with response times that are not acceptable. Google blog search always responds sub-second.
Kicks ass! Technorati and the others often don't pick up links until the next day and Technorati sometimes a month later. Google blog search seems to pick up every link within one hour.
Kicks ass! Technorati and the others pick up less than 70% of the links each (and sometimes much less). Google blog search seems to pick up every link (100%).
Game over! There's still tag search where Technorati is the leader, but I suspect Google will have a tag search shortly that will kick ass. If Technorati wants to hang onto that pie, then I suggest Dave Sifry get to work on making a tag search that is responsive, timely and has full coverage.
Update: A chink in the armor, Google has not reported this link. Mind you, neither has anybody else. Google missed it because the RSS feed is partial content. Anybody want a new reason to publish full content feeds?
Randy: The giant rolled over. This is fast! It's backed up with both Atom and RSS feeds. Everybody else is dead.
Dave Winer: Speaking of Technorati, okay, Google's blog search isn't perfect, but now there's a benchmark to compare against. It sure performs well. [cut] Neither company is much-loved in the community these days, but in balance I trust Google more, which is pretty amazing, considering how small Technorati is. We knew the day would come when Technorati would have to compete with Google. They could have prepared much better for this day, imho.
Anil Dash: First, the new Blog Search works. All the basic functions you'd expect from Google search results are present, including ranking results by date or by relevance. (Interestingly, the default is by relevance, like other Google searches, instead of by date, which is the default for most blog displays.) But more importantly, the advanced search offers powerful functionality such as searching by date ranges and limiting to individual blog authors, in addition to features like searching for words in a blog post title or by language, which have been deployed in the past on other services.
Blogfresh: So some of the "zippy, side-scrolling goodness wasn't working too well for me in Firefox this morning, but that's OK. Here's another search tool for us to enjoy. There's a pretty darn groovy feature I wanted to highlight. When your search results (for a keyword, say) display a post that has inbound links, you'll see a "references" link next to the name of the author, & you can jump straight to the cosmos for that post. How long 'til supported tagging? Soon, I think?.....
Jason Calacanis: When Google puts blog search on the home page traffic to blogs, all blogs, will double or triple. In the case of smaller blogs it might grow 100x.
Randy: Missing in the advanced search is an inbound link search. I wanna know who's linking to me. Although it's not in the Advanced Search options, link:kbcafe.com seems to work.
One thing I don't like is the browser-friendly formatting of their Atom and RSS feeds. That's ugly.
NewsGator: NewsGator Technologies, Inc., the leading RSS platform company, today announced that the company is hosting a competition, open to developers around the world, to see who can develop the most cutting edge application using the NewsGator Online API that was recently made available to the public. Contest details include:
Randy: For some reason, I think I need a new laptop.
Scott A. Golder and Bernardo A. Huberman: In this paper we analyze the structure of collaborative tagging systems as well as their dynamical aspects.
funponsel: Since I found KBCafe/Tags tool, I usually spend my time there to find a good topic to blogged. To refresh your memory, KBCafe/Tags is a tags-based search engine which will search for del.icio.us, Flickr, and Technorati based on the tags you submitted.
Randy: Gotta love positive user feedback.
Michael Gartenberg: RSS support in Office is nice. Not only does the system track RSS via subscriptions in IE 7 (and has an RSS gadget built into the new Vista Sidebar) but Office supports RSS as well. Outlook finally adds integrated RSS support (which is my preferred way to read RSS). In general, it looks like RSS aggregators are indeed a commodity.
Piaras Kelly: Here are some tips on writing content for your blog:
Rok Hrastnik: Here are the absolute 101 basics you really shouldn't ignore ...
Randy: Cool, another blogging top 10 (now 13) list. Rok's always got the best lists.
Tim Yang: I have a list called "Readers" in my RSS feed reader, but it's still very thin and I'd like to boost it. So leave me a comment or write me an email. Thanks!
Randy: This is a great idea. If you are reading this blog regularly, then leave me a post or email once in a while with a pointer back to your blog. Even if it's just a "Hello, this is my blog."
Chicago Sun-Times: Nearly 70,000 publishers use FeedBurner, ranging from Chicago-based Coudal Partners, an advertising, design and interactive firm, to VNU Business Media in Europe, a global media company and parent of ACNielsen, Billboard and other major brands. [cut] FeedBurner has won the attention of venture capitalists, attracting more than $7 million in investment from Portage Venture Partners in Chicago, Mobius Venture Capital in Denver and Sutter Hill Ventures in Silicon Valley. [cut] FeedBurner, which does not disclose revenues, has 14 employees, up from five in March.
Randy: A great article on FeedBurner services and its history.
TechCrunch: I believe that if more companies approached us differently, a much higher percentage would be blogged.
Paul Thurrott: Here's how the product editions will break down [two excerpts follow]:
Robert Scoble: Orange XML icons makes a site seem more up-to-date. Sites with orange XML icons look cooler to me than sites that don't have orange XML icons.
Randy: I'm even drinking more orange juice than I used to. I think it helps me look cool!
Greg Reinacker: I was on a call the other day with some folks in the industry, and someone made a comment to the effect of "we really need to come up with some kind of solution for securing RSS feeds - then we can really do some cool stuff." Before I could get on my soapbox, someone else on the call concurred with the first person. When I mentioned that this stuff has been figured out already, and started describing the existing widely-used mechanisms, they were both a bit surprised, and suggested I write something about it.
Randy: From now on, when somebody asks about RSS security, I'm gonna point them here.
Today, I'm testing a long requested feature of R|mail. Sending multiple R|mails in the same hour. If you posted twice or more in an hour, then R|mail would only send the most recent post. I did that to prevent 821 bombing, that is, filling up your inbox with R|mail alerts. Today, I changed the algorithm to send up to two postings per hour (Atom 0.3 and RSS 2.0 only, RSS 1.0 is still limited to once per hour). Give me some feedback. If the tests go well and the requests for more R|mails continue, then I'll increase the limit to three.
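A per-feed hourly cap like the one described can be sketched in Python. This is an assumption-laden illustration rather than R|mail's actual algorithm: HourlyLimiter is a hypothetical name, the cap is modeled as a sliding 60-minute window (R|mail may well batch by clock hour instead), and timestamps are passed in as plain seconds so the behavior is easy to test.

```python
from collections import defaultdict, deque


class HourlyLimiter:
    """Allows at most `limit` alerts per feed within any `window` seconds."""

    def __init__(self, limit=2, window=3600):
        self.limit = limit
        self.window = window
        self.sent = defaultdict(deque)  # feed URL -> timestamps of recent alerts

    def allow(self, feed_url, now):
        recent = self.sent[feed_url]
        # Forget alerts that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False  # cap reached; skip this alert to avoid flooding the inbox
        recent.append(now)
        return True


if __name__ == "__main__":
    limiter = HourlyLimiter(limit=2)
    feed = "http://example.com/rss.xml"
    for t in (0, 600, 1200, 3600):
        print(t, limiter.allow(feed, t))
```

Raising the cap to three per hour would then be a one-argument change (limit=3), and a per-format cap could be handled by keeping one limiter per feed format.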
WSJ: David Sifry, chief executive of Technorati, says his company gets an edge from exclusive deals in which some blog-hosting companies ping Technorati before anyone else. After receiving a heads-up, Technorati visits the blog and updates its database.
Bob Wyman: If these claims are true, then the WSJ has revealed behavior most foul... [cut] This is not a good thing.
Randy: The Internet routes around obstacles. Blogging clients can ping PubSub before they even post to LiveJournal. Businessmen who go around with a negative attitude will get pinged after businessmen who go around with a positive attitude. Stuff happens.
I'm checking my inbound links and I came across a Blogspot blog and clicked on it. Then it hit me. Where did all the blogspot splog inbound links go? This blogspot inbound link isn't SPAM. It's not a splog. It's legit. I started looking around and all those Blogspot splogs are 404. What happened? Some investigation and I came across a lot of blogs like this one. This splog was using Adsense to monetize itself and Google has cut them off.
Three cheers for Google! Hurray! Hurray! Hurray!
...has made the following two sentences very common across the Web.
"You need to be registered to comment on this site."
If I were a young entrepreneur, looking for my next big thing, then I'd be looking for solutions to this problem. I'm certain Google or Six Apart would pay for anonymous comments without the comment SPAM. For that matter, why don't Google or Six Apart do something that works and make it available to all?
Johannes Ernst: The brilliance of RSS 2.0 does not lie in all the features it has, because it has very few. Instead, it lies in how many features are left out, so it could be a trivially simple format, while remaining extremely powerful. [cut] We need a version of FOAF, the social networking format, that is like RSS 2.0 is to RSS 1.0. [cut] We need something much simpler, so FOAF-type information can explode like RSS has.
Randy: Johannes is correct. Further, we need a completed FOAF spec, not the half-baked FOAF spec we have today.
PR: Mitsui & Co., Ltd., today announced it has made an investment in Feedster.
Island Dave: Antisplog.net is a barebone product at current (it just launched days ago), and honestly, I'm not completely clear on what it does. I've used both the blog pointer mentioned here, and gotten it to return 1s and 0s a few times, as well as the bookmarklet, and gotten fairly accurate reads on the site's I've reported.
Brian Alvey: This afternoon we took another step and added those Red Cross corner buttons you see on all of our sites. [cut] You can add the same buttons to your own site.
Randy: Brian has instructions for adding some pretty cool graphics to your site that allow your users to donate to the Red Cross Katrina relief. Check out the top-right corner of many (maybe all, didn't check) Weblogsinc blogs for samples.
Randy: An awesome plug-in which enables one-click subscription via RSS auto-discovery in Firefox by overriding Live Bookmarks.
Tim Yang via email: Could I make a request? A couple weeks ago I was reading about your study on blog search tools. What did you make of it? I'm really interested to know which is the best and for what each of them is good for.
Randy: Tim was referring to my It's about Finding New Links study. Beyond the final numbers, this is my in-depth (or very shallow) analysis.
I found most of the links first with Bloglines. Bloglines is really the best at finding new links, except when it goes AWOL for days at a time. PubSub finds a lot of links too, but 12 hours after I first find them with Bloglines. When Bloglines is experiencing those sick days, I turn to IceRocket for timely links (you can translate that to 2nd best). Blogpulse finds fewer real links and a lot of blog SPAM. Technorati finds few links these days and usually I find those links days after Bloglines, but sometimes weeks and even months. Feedster finds few real links and mostly blog SPAM. I didn't find anything at all with Blogdigger.
Dick Costolo: In response to requests from a number of publishers, we have added a service that enables publishers to splice into their feeds the following graphic and link for Red Cross donations in support of the Hurricane Katrina relief effort. The service is located on the "optimize" tab for your feed.
Randy: +10 million karma for FeedBurner.
Randy: Some results that include The RSS Blog; folksonomy #5, longhorn #2, adsense #3, screencast #1, myyahoo #1, friendster #1, feeddemon #1, gnomedex #7, blogpulse #1, pluck #1, odeo #2, greatnews #1, mediarss #1.
This is pretty cool! Congrats to the Technorati team #1 #2 #3 #4 #5