RSS, OPML and the XML platform.
Copyright 2003-5 Randy Charles Morin
Sam Ruby has introduced OPML validation into his FeedValidator. As usual, he's created a set of contrived samples to show the difference between the three validators: Dave's OPML Validator, my Really Simple Validator and the FeedValidator. The varying results concern me. The Really Simple Validator validates against the spec, as written. The other two validators validate part against the spec, part against Dave's guidelines and part against something unknown to me. The concern? The guidelines seem to introduce new constraints to the grammar, and validating against an unknown seems wrong. Is this OPML 1.1 or 1.2, and where's the spec?
First, Scoble complains about Molly Holzschlag's blog not having an obvious mechanism for subscription. Looking at Molly's blog, she has autodiscovery. That's all you need. Any RSS tool should be able to figure out her RSS feed from any page on her blog. If the tool can't, then the tool is broken (e.g. IE). I'd prefer that she provide a Subscribe chicklet or even an array of chicklets, and surely she should also have a (What is this?) link. But she's got a Subscribe link, and a simple search of her Webpage for subscribe or RSS reveals it (Ctrl+F in both IE and Firefox). By the way, try searching for RSS or subscribe on Scoble's blog. NADA! And What is this? NADA! IMHO, it's easier to subscribe to Molly's blog than Scoble's. Scoble needs a mirror!
Second, he complains that his Wordpress RSS feed is funky. I'm not sure he understands what this funky stuff is about. Funky means you are using a non-RSS element, in place of an equivalent RSS element. Scoble seems to think that using a CDATA is somehow confusing and funky. Using a CDATA is one of two ways to XML encode your HTML within the content of an element. His old feed would escape the HTML, which is no better than CDATA. Those are two methods of accomplishing the same goal. In fact, CDATA is slightly superior because it makes the feed easier to read (by humans).
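To make the CDATA point concrete, here is a minimal sketch using Python's standard XML parser. The two <description> fragments are made up for illustration; they are not taken from Scoble's feed. An XML parser should hand back the same HTML string for both encodings.

```python
import xml.etree.ElementTree as ET

# Two hypothetical <description> elements carrying the same HTML payload.
# The first escapes the markup, the second wraps it in a CDATA section.
ESCAPED = "<description>&lt;p&gt;Hello &lt;b&gt;world&lt;/b&gt;&lt;/p&gt;</description>"
CDATA = "<description><![CDATA[<p>Hello <b>world</b></p>]]></description>"


def element_text(xml_fragment: str) -> str:
    """Parse a single XML element and return its character data."""
    return ET.fromstring(xml_fragment).text


escaped_text = element_text(ESCAPED)
cdata_text = element_text(CDATA)

# Both encodings decode to the identical HTML string.
print(escaped_text == cdata_text)
```

The only difference is on the wire: the CDATA version keeps the angle brackets readable for a human looking at the raw feed.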
Next, Scoble makes a leap and says Matt is favoring Atom over RSS. I'm not the greatest fan of Atom, but on this point, he's simply wrong. I think he's spent too much time listening to Dave Winer.
Mark Evans: My first impression is it has a long way to go before it can seriously go after the established players.
Randy: I had a similar impression. Sphere didn't seem to be producing nearly the same quality of results as IceRocket and others.
Have you noticed that a lot of MSN Spaces bloggers are blogging the same thing? ;-)
|Space not available|
|This space is temporarily unavailable. Please try again later.|
James Huff: My main complaint with Google Reader is that it prefers to organize new posts by date or "relevance" and not by feed.
Randy: To read by feed, click on Your Subscriptions, then click on the feed and it'll show you all the items for that feed. Click Show read items visible to see items you've already read. It seems a little buggy, but it works for your style of reading.
James Huff: Unfortunately, I couldn't find an easy subscription button for Google Reader that matched the style of the rest of my easy subscription buttons, so I whipped this one up a few nights ago. Feel free to use it on your blog too.
Google Reader Blog: Earlier this week we pushed out a new release of Reader. Most of the changes are under the hood and should make for a faster, smoother experience. However, there were a few user interface tweaks too. My favorite is support for the space keyboard shortcut.
Randy: What? My favorite Web-based RSS reader is even better? Love the spacebar shortcut. Bye-bye j.
Update: After a few moments of use, I realize I like the j keyboard shortcut better than the spacebar.
One thing I don't mention enough is how awesome the appropriately named GreatNews RSS reader is. They have a new fan today.
Mark Berthelemy: I've found the best of both worlds; an offline RSS reader that synchronizes with Bloglines. It's called "GreatNews". It's a tiny download, fires up immediately, and has all the features of RSS Bandit that I used, but more (for example, instead of fixed "flags" to put against items, I can create user-defined "labels" to pull things together into groups that I want to come back to). And an item can have more than one label. Ideal for bringing stuff together for a research project.
Chris Nolan: I checked with Blake Rhodes and he says they are just running some tests. [cut] For me, I don't like them of course.
Randy: I like them! I hate free services. I like to know people are making some money. If they don't make money, then you can count on the freebie ending eventually.
Ben Trott: This has been a bad month for TypePad's performance and general availability, and I'd like to talk about a number of the issues we've faced, how frustrated they make us, and what we're doing about them.
FeedBurner: If your overall feed circulation has dropped considerably in the past three days, it is likely that a big part of the drop is due to an interruption of subscriber reporting from My Yahoo. [cut] Please note that FeedBurner feeds are still completely accessible and viewable in My Yahoo. [cut] Our understanding is that this is just a bug on the Yahoo side and it will be fixed in the next couple of weeks.
The Newest Industry: The TypePad application is currently experiencing performance degradation.
Randy: This was actually very confusing for me. I expected to get results filtered with the tag I entered and was surprised to find the search was filtering on the title and description text. I guess del.icio.us would be limiting itself if it remained solely a folksonomy company, but this change makes del.icio.us less appealing. I guess I could type tag:rss instead of just rss, but do I really want to?
mnot: The trivial answer is that HTML only talks about GET and POST.
Randy: And that's the only answer. One day, a committee made a decision. Don't speculate otherwise. Simply look up the members who made that decision and ask them why.
Google Reader can be used to convert any feed to Atom 1.0. Check out The RSS Blog in Atom 1.0.
Phil Wilson: Just use http://www.google.com/reader/atom/feed/ followed by the URL of your own feed.
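Phil's recipe is plain string concatenation, so a sketch is short. The example feed URL is hypothetical, and I'm assuming (per the quote) that the feed URL is appended as-is without any percent-encoding.

```python
# Prefix quoted by Phil Wilson above.
GOOGLE_READER_ATOM_PREFIX = "http://www.google.com/reader/atom/feed/"


def google_reader_atom_url(feed_url: str) -> str:
    """Build the Google Reader Atom URL for a feed.

    Assumption: the feed URL is appended verbatim, as the quote describes.
    """
    return GOOGLE_READER_ATOM_PREFIX + feed_url


# Hypothetical feed URL, for illustration only.
atom_url = google_reader_atom_url("http://example.com/rss.xml")
print(atom_url)
```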
Update: It broke again, it's now giving me the items out of sequence again. If this thing sorted by date, then I might abandon all other news readers. Maybe not all of them.
I had almost forgotten about Google Reader, when I found a referrer link from The Official Google Reader Blog. Very cool! The blog features The RSS Blog in their blogroll. I'll have to take another look to see if Google Reader has improved. It was quite buggy on my first experience, but I could see the potential.
Wouldn't it be fun to just create a grid of search features and search engines and rank the engines either good, bad or none (not present) on each feature?
|Web search RSS||none||good||good||good||none||none||none|
|Blog search RSS||good||none||good||good||good||none||good|
|Tag search RSS||none||none||good||good||good||none||none|
|Link search RSS||good||bad||good||good||good||none||bad|
|News search RSS||good||good||good||good||none||none||none|
|Image search RSS||none||none||good||none||none||none||none|
If you think I'm wrong on any point, then feel free to comment. Am I missing anybody? I didn't include BlogPulse, PubSub, Blogdigger because all three seem to be experiencing brokenness these days.
Notes: I included Bloglines as part of AskJeeves. Bad usually means it's there, but it doesn't really work. Yahoo! actually had multiple offerings in many areas (for example, Flickr and Yahoo! search both have image and tag searching).
Who'd a thunk it. You can now read entire novels over RSS, thanks to Charles of Surfarama. Simply subscribe to this feed and you'll be delivered Cory's new book "Someone Comes to Town, Someone Leaves Town", one chapter at a time for the next few weeks.
Randy: I think this will all depend on momentum. Back in 2003-4, people kept finding new ways of using RSS: weather, news, search results. Will the same happen to OPML? It already is. rssWeather.com maintains its list of RSS feeds in an OPML hierarchy. Gada.be enables subscription to its aggregated services thru an OPML file. More and more aggregators are announcing import and export of RSS reading lists in OPML format.
Randy: The published list of deleted subdomains is a great idea. If this were ongoing and there was participation from all the major blog hosts, then search engines could quickly clean up their databases when new splogs are detected. Maybe we need a splogs.com like weblogs.com for sharing newly uncovered splogs.
SingleSub is a set of services for HTML coders to provide new ways for blog readers to subscribe to blogs and other RSS-enabled content. They have an amazing OPML listing of many of the more popular URL-based subscription services. They also have a chicklet generator. The project is open source.
Jakob Nielsen: Test your weblog against the following usability problems.
Randy: These are actually much better than the usual top ten mistakes made by bloggers. I'm gonna work on #1, 2 and 5.
The topic of the day is Blogspot splogs. I gotta admit, my own RSS reader is giving me more SPAM than anything these days. Many in the blogosphere have pointed the finger at Google and want Blogspot taken down until they can fix the splog problem. This is very short-sighted. You see, if Blogspot wasn't hosting these splogs, then somebody else would be. The reason sploggers have chosen Blogspot is that it's a very popular blogging platform that supports an API. If not Blogspot, then sploggers would host on 21publish or Blogspirit or Blogware or Wordpress. Turning off Blogspot might quell the splogs in the short-term, but the long-term plan has to involve the blogosphere search engines figuring out what is worth indexing and what is not. When the blogosphere search engines point their fingers at Blogspot, then they are simply playing a blame game. How hard would it be for the search engines to detect these splogs? Not very. It could be automated quite easily. If every blog entry contains the exact same content pattern "(<b>(any*)</b><br/>(any*)<br/>)*", then you can be 99% sure it's blog SPAM.
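Here's a rough sketch of how that pattern check could be automated. The regular expression is my translation of the informal pattern above, with two assumptions: "any*" is read as "any run of characters", and at least one repetition is required so that an empty entry doesn't count as a match. The function name and sample entries are made up for illustration.

```python
import re

# Translation of the informal pattern (<b>(any*)</b><br/>(any*)<br/>)*.
# Assumption: "+" instead of "*" so an empty entry is not flagged.
SPLOG_PATTERN = re.compile(r"(?:\s*<b>.*?</b><br/>.*?<br/>\s*)+", re.DOTALL)


def looks_like_splog(entries):
    """Return True when every entry consists solely of the repeated pattern."""
    return bool(entries) and all(
        SPLOG_PATTERN.fullmatch(entry) is not None for entry in entries
    )


# Hypothetical blog entries, for illustration only.
spam_blog = [
    "<b>Cheap pills</b><br/>buy now<br/><b>More pills</b><br/>click here<br/>",
    "<b>Loans</b><br/>apply today<br/>",
]
real_blog = [
    "<p>Today I wrote about OPML.</p>",
    "<b>Loans</b><br/>apply today<br/>",
]

print(looks_like_splog(spam_blog), looks_like_splog(real_blog))
```

A real detector would want more than one signature, but the point stands: a blog whose every entry matches the same rigid template is cheap to flag.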
At this point, I'm getting pretty tired of the lame blogosphere search engines. I really wish somebody would create one that actually worked. Beyond the constant easily detected SPAM that escapes them and the finger pointing that follows, none of the blogosphere search engines are all that good at capturing the blogosphere conversation and an HTTP 500 error isn't exactly out of the ordinary. Enough complaining, here's my report on the state of the blogosphere.
This stuff generally works.
Blogosphere Search Engines
This stuff generally doesn't work.
Bloglines is the best blogosphere search engine at capturing link data, but is absolutely horrible at capturing entries via keyword search and is down a good fraction of the time. The positive is that it's the only blogosphere search engine that reports more than 50% of my inbound links. The negative is that the most common response from Bloglines is "There is a problem with the database. Please try again later" and the keyword search is simply broken. Try this: do a keyword search on Bloglines and make certain to sort by date. Now, scroll thru the entries with attention on the dates. Note, they are not sorted by date. Further, the blog matches on common keywords overwhelm the results, making it a chore to page thru to the entry matches. Where's the RSS?
Technorati works for brief periods of time, but is broken more often than not. I don't know how many bugs I've filed with them and most remain unfixed. Recently, I noticed all my blogs stopped showing up in Technorati altogether. When I checked my profile, all the records of my blogs had been corrupted and I had to reclaim them all. Not the first time. When Technorati tag or keyword search are working, they are clearly the best, but unfortunately they work infrequently.
IceRocket is the pleasant surprise in the bunch. IceRocket is one of two search engines that almost always responds in less than one second (the other is Google). In fact, all the other search engines often respond in ten seconds or more. I find myself using IceRocket more and more, simply because I know I won't be frustrated and they consistently report good results. That said, they are tracking much less than 50% of the blogosphere, which means you still have to complement it with other search engines to find the majority of the results you are looking for. I think IceRocket's biggest problem is that not enough blog hosts are set up to ping IceRocket by default.
Google Blog Search
Google is the little brother that could grow up and become that blogosphere search engine that I always wanted. I can see the promise, but it's still not there yet. Like IceRocket, it responds fast, but tracks much less than 50% of the blogosphere.
Blogpulse, Blogdigger, PubSub
These blog search engines generally don't work. They fail to capture 80% of the data and report more bad data than good.
David Sifry: It is that time of the year again, and I've got some new information on the continued growth of the blogosphere. I made this presentation as part of my 10 minute talk at Web 2.0 on October 6, 2005.
Randy: I think the state of the blogosphere can be summarized as "Technorati is now tracking 19.6 Million weblogs" and Blogspot has 18 blogs alone. Two causes. Blogosphere search engines like Technorati are tracking less than 50% of the conversation (excluding blog SPAM). The blogosphere (in particular Blogspot) is full of splogs.
Chris Pirillo: In the past few days, I've been inundated with an enormous amount of subscribed search spam for designated keywords. 99% of the crap coming in is directly from a single domain: blogspot.com.
Randy: Confirmed! I get much of the same. At one point, it looked like Google was shutting down the Blogspot SPAM and even shutting down Adsense on splogs. This seems to have stopped. I don't think Google needs to shut down Blogspot.com, but surely they need to devise a strategy to stop the splogs.
Nathan Weinberg: You know those posts Weblogs Inc feeds us about once a week, the ones that tell us "The Best Of Weblogs Inc"? They're all gaming Technorati. [cut] Since these posts are replicated across all of Weblogs properties, getting seven links in the post can translate to well over a hundred links in just one day. This is something all the blog search engines need to work around, or that Weblogs needs to stop doing until they do.
Philipp Lenssen: By the way, both the Weblogs inc Luxist Estates blog, the Autoblog, the TV Squad, HD Beat, as well as the Card Squad blog, have managed to sneak into the Blogpulse top blog posts list (sometimes, with multiple posts) for today: http://www.blogpulse.com/05_10_15/topWeblog.html.
Randy: I wouldn't consider this gaming the system at all. You see, Weblogsinc has a legitimate reason for posting these self-referential links. Rather, I'd blame the search engines for simply reporting bad data. It's not like nobody reads Weblogsinc blogs. It's not like Weblogsinc blogs are splogs (SPAM blogs). They are simply good at self-promotion. That self-promotion is a reason I gave for unsubbing from some of their blogs and I'm sure others have done the same.
Tech.Life.Blogged: I created an RSS feed on a server. There isn't anything in it so it doesn't change. [cut] Next, I created a Feedburner feed for that RSS file and used their Link Splicer tool to include a daily summary of my del.icio.us account. Then I used R|Mail to subscribe to the feed and have it sent to my GMail account where I have a filter to send it to Bloggers mail in tool.
Randy: This might be too difficult for the mundane user. What we need is for FeedBurner to directly post the del.icio.us links onto your blog using MetaWeblogApi and later Atom.
Steve Rubel: AOL and Intelliseek on Monday plan to unveil a blog content deal. Sue MacDonald at Intelliseek confirmed that the deal - set to be announced Monday at 7:30 a.m. - will give AOL access to rich blog data that they will deliver to consumers.
Randy: The Web 2.0 bubble continues to heat up. I'm still holding out for $1B.
Randy: I'm experiencing the same problem, but not as bad as Chris. I haven't been indexed in one week, Chris hasn't in two weeks. I use FeedBurner's PingShot with all my feeds and I have a few of them that get regular posts. Maybe PubSub is trying to help me with my bandwidth problems. Further, I now get a litter of PHP warnings whenever I access any page on PubSub.
Jon Hughes: The recent announcement of Yahoo! Podcasts has led to a number of podcasters asking us how we help them get listed in the Yahoo! podcasts directory and whether we're going to support one-click subscriptions. The answers are yes and yes, in no particular order. [cut] FeedBurner now supports the "pcast" one-click subscribe method for iTunes and Yahoo! Music Engine. Just ensure that your podcast uses the "Browser-Friendly" service with the "podcast" theme, and you'll be all set.
Randy: FeedBurner makes podcasting simple for the mundane users. They are driving us techies to the employment center.
Today, my domain's PubSub LinkRank is 55. On Monday, it was 24,058. My InLinks have increased a bit, but not substantially. Makes me wonder how they calculate this thing. Is there a random number generator in there? The jump placed me on their top 100 sites and 10th biggest rank gain on the day. I imagine on Tuesday, I must have been one of the top 2-3 gainers, but I didn't check. I also notice that RTGconsultants is ranked #88 with exactly 1 inbound link. Yep, it's a random number generator. Yellow Rat is #91 with exactly 1 inbound link. ipressroom is #19 with 3 inbound links. AlCantHang is #5 with 16 inbound links. Sounds like LinkRanks is broken again!
Ozh: This is such a simple idea that it must have been done before, but I just couldn't find any PHP script doing this : create an RSS feed from your daily Adsense earnings, so that you can easily track them in your regular feed reader. So well, here is mine : Adsense Earnings RSS Feed.
Nick Bradbury: This setting - which is available on the "Advanced" tab of each feed's properties (Edit|Feed Properties) - enables automatically removing a feed if it hasn't had any new posts within a specific number of days.
Randy: Very cool! Nick's always pushing the RSS aggregators to the next level. Everybody else will have this feature in a year or two.
TechCrunch: Memeorandum finds blog posts, newspaper articles and press releases that are being heavily linked to in near real time and puts them up on the site. The position and size of the headline is indicative of its importance (determined by number of links and other factors, such as how much people are writing about the linked content). The higher up and bigger the headline, the more important it is. And linking sites, the conversation, are clustered underneath the headline. This means you can find out in near real time what is important in technology (or politics), how important it is, and who's talking about it. If you then post on the subject, you will be linked into the discussion as well.
Randy: There's one big downfall to Memeorandum. We all seem to be discussing the same subjects now. We find them on Memeorandum and blog them. This is shortening the long-tail.
Steve Rubel: Bloglines has added some nice keyboard shortcuts that streamline RSS reading.
About a year ago, I added MessageCast chicklets to the right sidebar of The RSS Blog and iBLOGthere4iM. Since then, MessageCast was purchased by Microsoft. In the six months since, I've had to twice change the code on my pages at Microsoft's request. The emails I receive from Microsoft are no less than cryptic. I couldn't imagine a mundane user being able to handle their instructions. Why they can't simply redirect the existing clicks to their new pages can only be explained as stupidity. Further, when they send the alerts via email, they stick my email address in the From field. This results in about one email per week from someone trying to unsubscribe or a friendly "I'm on vacation" message. Very annoying. Next time I update my blog template, MSN alerts will be gone. Sorry!
PR: The partnership with FeedBurner enhances NewsGator's Private Label solution with new capabilities that provide online publishers and media companies the ability to closely measure performance, and monetize content delivered through NewsGator's Private Label solution.
Randy: This kinda sounds like NewsGator has integrated the FeedBurner APIs into their private label offering.
Nathan Weinberg: Google has silently added a Bookmarks feature to My Search History, enabling you to quickly tag and comment any web page you've visited.
Steve Rubel: Here's a bunch of bookmarklets that I use every day in Firefox.
PR: The newest version of the Attensa RSS reader for Outlook includes a Firefox toolbar that makes it easy to find, preview and subscribe to RSS newsfeeds from any Web page or blog offering RSS feeds.
Yahoo! 360 Team: One other new feature that you might find interesting is the Blog This and Blast This buttons on the Yahoo! Toolbar. With one click, you can blog about the web page you're at or add that link to your blast. Download the Y! toolbar, if you don't already have it. Once you have it, go to Add/Edit Buttons and select the Yahoo! 360° button in the Connect with Others section.
Randy: Yahoo! displays the blog search results in the right sidebar of the News search results. If you click on More Blog results, you'll get the blog search results full screen. At the bottom of the right sidebar, you'll find an RSS feed that is currently inaccessible.
There's no blog search form, but you can use mine below.
Tom Foremski: VeriSign is about to announce it acquired Moreover Technologies, the San Francisco based news aggregator. The acquisition price is around $25m according to SVW sources.
Randy: The acquisition of Weblogs.com is starting to make more sense. They can now use the data from Weblogs.com to turn Moreover into an Internet version of Reuters.
RSS Blog reader: Why you didn't have google reader and feedster links in the chicklet generator?
Randy: The Google Reader is not there because it was recently released and I haven't taken the time to update the chicklet generator. I removed My Feedster awhile back because it was too rarely clicked and I wanted to limit the size of the widget. There's also a half dozen other chicklets that I've added and removed over the years to limit the size of the widget.
Photo Matt: The state of the ping community is fairly bleak. What do we need to keep a BigCo from exploiting this space? A free, open, non-profit, and stable alternative supported by a consortium of organizations who understand that value should be built on top of pings, not in front of them.
Randy: Matt is just plain wrong. Sometimes, you have to realize that a small or not-for-profit company does not have the resources to make stuff like this happen and a BigCo becomes a necessity. This is a perfect case in point. Weblogs.com has been a great blogosphere resource and without it the blogosphere wouldn't exist. But, it's no longer moving forward because it needs financing. Thanks Verisign and thank you Dave for the first 10 years.
MSFT Team RSS Blog: The choice of what icon to use is challenging because it should be universally symbolic, but today there is no single icon for that represents feed. Instead there's a variety of mostly orange rectangles with the words "XML", "RSS", "ATOM", "FEED", or "Subscribe."
Randy: Microsoft is looking for help to determine what icon they should use in IE7 to represent a feed (RSS or Atom). The icons they are considering are all images. I'm partial to something readable, like Subscribe.
Amit Agarwal: Here's a short tutorial on how to enable trackbacks in Blogger.
Blogger Buzz: By turning on Backlinks, we include a "Links to this post" section on your post pages. This section is populated by links to that post that have been made from other blogs across the web.
Randy: I'm wondering how they'll handle backlink SPAM?
Chris Pirillo disabled anonymous comments in early July. At some point, he re-enabled them. I guess I'll be visiting his blog more. When I posted about this, I mentioned that I'd be turning trackback back on to see if any got thru. I think a couple did, but not enough and I turned it back off.
Around the same time, Scoble was talking about a new killer app that would replace comments. That turned out to be Memeorandum. Memeorandum has had a changing effect on the blogosphere, but it doesn't replace comments. Mostly, Memeorandum has ensured that everybody is blogging about the same 3-4 subjects every day.
Google has officially joined the Web based RSS aggregator business. It's called Google Reader. The UI is very pleasing.
Experience: I started by searching for my iBLOGthere4iM blog. I found a lot of blogs referring to it, but couldn't immediately find my blog. I then tried to import a small OPML file. It hung reporting "Your subscriptions are being imported..." After a while, I got bored as it was completely non-functional. I exited the browser, got back in and uploaded the big OPML file. Same results. I can't seem to get past first base. Definitely BETA.
Read the announcement on the Official Google Blog.
More Experience: Once it stabilized, I found Google Reader amazing. The UI is very cool! The problem is it just doesn't work. It constantly hangs up and it's showing me crap from 1932. It'll take Google a month or two to get rid of these usability bugs (remember Gmail?) and then Bloglines is in big trouble.
Problem: OK, here's the biggest problem. I set it to sort by date and the top item is from April? I start scanning thru and the items are definitely not ordered by date, it's all over the place. If they fixed this, then this would be usable.
This week, I was testing blog keyword searching on Technorati, IceRocket, Google, BlogPulse, Blogdigger and Feedster. This is part of a series of blog entries about finding new links using the various blog search tools.
Here are the final standings, based on 5 valid results for a search on the keyword "KBCafe".
Randy: Is this acquisition week? If true, congrats to Dave. I assume he's doing this to get some help building out weblogs.com. Oh, and the money doesn't hurt. I'm still accepting bids for KBCafe.com. I'll come down from my original asking price of $1B. How about $999M? ... Anybody? ... Hello? ... Is this on?
Jason Calacanis: I'm in a conference room at AOL right now running down a list of 15-20 press calls with Jim Bankoff... exciting news.
Randy: This is really exciting. A blog authoring company can be sold for $25 million. That's a valuation measuring stick that can now be used to value other blog authoring companies.
Darren Rowse: I've just been forwarded an email which was sent from Weblogs Inc Co-Founder Brian Alvey to Weblogs Inc's bloggers. It was sent to me by WIN blogger who wants to remain nameless.
Reuters: America Online Inc. has agreed to buy Weblogs Inc., a network of Internet sites focused on niche topics ranging from food to gadgets, for around $25 million.
Follow the story at Memeorandum.
This is a clever way of comparing blog search engines. He didn't like the results from IceRocket because he didn't get any recent entries from them. A small correction shows that IceRocket does show recent entries linking to him, so it would seem something is wrong with his test case (either in the test itself or at IceRocket).
The Web 2.0 acquisitions continue. Jason Calacanis's Weblogsinc.com has been sold to AOL.
Randy: Big congrats to Jason and team. Is the Web 2.0 bubble inflating or what? Are we gonna have acquisitions daily till NASDAQ hits 5000? I'm still waiting for Google to buy KBCafe for $1B.
Randy: I don't understand why Technorati, PubSub and Feedster have so much trouble compiling their top blogs lists. It's pretty easy. Even I have one and I doubt I'm missing a top 10 blog from my top 100.
A reader asked how they could put a KBCafe search widget on their blog. Like this.
It's really simple. Here's the code. <form action="http://www.kbcafe.com/search.aspx" method="get" >
Next week, I want to start an affiliate program with this code and allow select users to embed their Adsense client ID inside the results. If you wanna Beta test this, then drop me an email and your Adsense client ID.
Om Malik: Ranchero Software, the company behind Mac-a-licious RSS reader NetNewsWire. Sources in Silicon Valley tell me that this deal is pretty much done, and expect the deal to be announced sometime on Wednesday at Web 2.0. Greg Reinacker refused to comment.
Randy: More consolidation. NewsGator is definitely looking like the one stop shop for RSS readers. They now have NewsGator Online (Web aggregator), NewsGator Outlook (Outlook aggregator), FeedDemon (Windows aggregator) and NetNewsWire (Mac aggregator). That's four of the top 13 RSS readers of The RSS Blog according to FeedBurner readership stats.
Update: Rumours confirmed.
type is a string, it says how the other attributes of the <outline> are interpreted.
Type and text are the two most common attributes. The type describes the kind of the current <outline>. For instance, if the <outline> is a container of other outlines, the type attribute is most often missing, whereas when the OPML editor identifies a link to another Web document, the <outline> type is often set to "link". RSS readers that import and export blogrolls and reading lists often set the type to "rss". The capitalization of the type attribute is not standard and it is often written "RSS". I suggest OPML readers treat the type attribute as case insensitive and publishers avoid confusion by using lower-case "link" and "rss". On rare occasions, I've seen the type set to 'atom' when the <outline> points to an external Atom feed, but the convention has been to set the type to 'rss', even when the <outline> is an external Atom feed.
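The case-insensitive reading suggested above might look like the following in an OPML reader. This is only a sketch: the sample document, its URLs and the function name are made up, and returning None for a missing type is my own convention for container outlines.

```python
import xml.etree.ElementTree as ET

# Hypothetical OPML sample; titles and URLs are for illustration only.
SAMPLE = """<opml version="1.1"><body>
  <outline text="Feeds">
    <outline text="A" type="RSS" xmlUrl="http://example.com/a.xml"/>
    <outline text="B" type="rss" xmlUrl="http://example.com/b.xml"/>
    <outline text="C" type="Link" url="http://example.com/c.html"/>
  </outline>
</body></opml>"""


def outline_type(outline):
    """Return the type attribute lower-cased, or None when it is missing."""
    raw = outline.get("type")
    return raw.strip().lower() if raw else None


# iter() walks the outlines in document order: Feeds, A, B, C.
types = [outline_type(o) for o in ET.fromstring(SAMPLE).iter("outline")]
print(types)
```

A publisher following the same advice would simply always write the lower-case values.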
text is the string of characters that's displayed when the outline is being browsed or edited. There is no specific limit on the length of the text attribute.
As I'll explain later, some authors have chosen to use title in place of text. I believe the definition of text from the spec is obvious and doesn't warrant further explanation.
isBreakpoint is a string, either "true" or "false", indicating whether a breakpoint is set on this outline. This attribute is mainly necessary for outlines used to edit scripts that execute. If it's not present, the value is false.
isComment is a string, either "true" or "false", indicating whether the outline is commented or not. By convention if an outline is commented, all subordinate outlines are considered to be commented as well. If it's not present, the value is false.
The isComment and isBreakpoint attributes are IMHO very rare and I can't add any detail beyond the spec.
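For what it's worth, the isComment convention from the spec text above (a missing attribute means false, and a commented outline comments out everything beneath it) can be sketched as a short recursive walk. The helper names and the sample outline are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical outline body, for illustration only.
BODY = ET.fromstring(
    '<body>'
    '<outline text="a" isComment="true"><outline text="a1"/></outline>'
    '<outline text="b"><outline text="b1" isComment="true"/></outline>'
    '</body>'
)


def flag(outline, name):
    """Read a "true"/"false" attribute; a missing attribute counts as false."""
    return outline.get(name, "false") == "true"


def commented_texts(parent, inherited=False):
    """Collect the text of every outline that is effectively commented."""
    result = []
    for outline in parent.findall("outline"):
        # An outline is commented if it says so or if any ancestor did.
        commented = inherited or flag(outline, "isComment")
        if commented:
            result.append(outline.get("text"))
        result.extend(commented_texts(outline, commented))
    return result


commented = commented_texts(BODY)
print(commented)
```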
Following are the <outline> attributes, not described in the OPML spec, but often found in the wilderness of the Web.
A very common mistake in OPML is to use the title attribute instead of text attribute when giving the outline a title. This mistake was made by the implementers of the first RSS readers that imported and exported OPML. The error has propagated as most OPML programmers simply copied these initial implementations, without reading the specification. The mistake is confusing OPML authors. I suggest OPML readers should check for both text and title and publishers should avoid title entirely, or publish both title and text identically.
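A minimal sketch of that advice, for both sides. The function names are mine, and preferring text over title when both are present is an assumption based on text being the attribute the spec defines.

```python
import xml.etree.ElementTree as ET


def outline_label(outline):
    """Reader side: prefer text, fall back to title, else an empty string."""
    return outline.get("text") or outline.get("title") or ""


def label_attributes(label):
    """Publisher side: emit text and title identically."""
    return {"text": label, "title": label}


# Hypothetical outlines, for illustration only.
only_title = ET.fromstring('<outline title="The RSS Blog"/>')
both = ET.fromstring('<outline text="From text" title="From title"/>')
published = ET.Element("outline", label_attributes("The RSS Blog"))

print(outline_label(only_title), outline_label(both), outline_label(published))
```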
The OPML editor and many RSS readers use the url to indicate the external location of the <outline>. In the OPML editor, this may be a simple Webpage, Weblog, PDF or any document addressable on the Web. RSS readers usually use the url attribute to indicate the homepage of the blog or RSSified resource.
Although many RSS readers use url to indicate the blog or RSSified resources, others use htmlUrl to indicate the same. I would treat them as the same and suggest OPML readers check for both.
The xmlUrl is usually set to an RSS, Atom or OPML file. For instance, RSS readers will often set the type to "rss", the url to the Webpage and the xmlUrl to the RSS feed URL.
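Putting the url, htmlUrl and xmlUrl conventions together, a reader could normalize an outline roughly like this. It's a sketch under assumptions: the function name, the dictionary keys and the sample URLs are made up, and checking htmlUrl before url is an arbitrary choice since I'd treat the two as the same.

```python
import xml.etree.ElementTree as ET


def outline_links(outline):
    """Return the homepage and feed URLs, checking both homepage spellings."""
    homepage = outline.get("htmlUrl") or outline.get("url")
    feed = outline.get("xmlUrl")
    return {"homepage": homepage, "feed": feed}


# Hypothetical outlines, for illustration only.
with_url = ET.fromstring(
    '<outline type="rss" url="http://example.com/" '
    'xmlUrl="http://example.com/rss.xml"/>'
)
with_html_url = ET.fromstring(
    '<outline type="rss" htmlUrl="http://example.com/" '
    'xmlUrl="http://example.com/rss.xml"/>'
)

print(outline_links(with_url) == outline_links(with_html_url))
```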
The OPML editor uses the created attribute in an RFC 822 date format to indicate when the <outline> node was created (I guess that was obvious).
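Since created is an RFC 822 date, Python's standard email.utils module can parse it. The sample date below is made up, and returning None for a missing attribute is my own convention.

```python
import xml.etree.ElementTree as ET
from datetime import timedelta
from email.utils import parsedate_to_datetime


def outline_created(outline):
    """Parse the RFC 822 created attribute, or return None when it's absent."""
    raw = outline.get("created")
    return parsedate_to_datetime(raw) if raw else None


# Hypothetical outline with a made-up creation date.
node = ET.fromstring('<outline text="x" created="Mon, 10 Oct 2005 14:30:00 GMT"/>')
created = outline_created(node)
print(created.isoformat())
```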
This week, I'm gonna switch things up and do the same test using keyword search that I usually do with link search. That is, instead of looking for new links to http://www.kbcafe.com, I'm gonna search for the kbcafe keyword. The contestants and contesting URLs are...
I've done this primarily because Bloglines usually wins my blogsearch competitions, but lacks a competitive blog keyword search feature and secondarily because Feedster sucks at link search and is supposedly much better at keyword search. If you want to help, remember, you can't just link to this post or my domain, you have to put the term kbcafe in your post. Thanks!
Only results that were posted on Tuesday (Sept 4th) or later will be considered.
The Fishbowl: To allow the format to be flexible and extensible, OPML producers can add arbitrary attributes to outline elements. While types and attributes are arbitrary, the specification does not provide implementors a mechanism for finding out the meaning of either.
Randy: Definitely a major failing of OPML. Further, it's amazing to me that these authors continue to miss pointing out that there is no specification for the current version of OPML (1.1) and continue to point to the old spec.
Tim Bray: SPARQL is an answer to the question "What if I want to do SQL-like querying when I know perfectly well that everybody will be using their own incompatible database schema?" I've been a SemWeb skeptic [cut]. Hey, isn't Guha's Alpiri project more or less that back-end? And isn't Guha working at Google now?
Randy: Every time I see a SemWeb blog entry, I become more and more convinced that RDF is not worth the effort. Not because the ideas aren't great, but because the implementations are always half-baked. SPARQL is a great idea. The problem is that SPARQL has been a working draft for a year. I'm simply tired of these persistent half-baked working draft projects. Take a look at FOAF, it's been a working draft for 5 years. The RDF people don't seem to understand that half-baked specs end up on the cutting-room floor.
Russell Beattie: Has anyone noticed that Bloglines is really suffering since it got bought by Ask Jeeves?
Randy: I'm a big Bloglines fan. Let's see if I can back them up.
Russell: Bug #1: If you go to this page, you'll see that because my old Java system insisted on adding in a ;jessionid to the end of my feed url, it looks like I had hundreds and hundreds of feeds.
Randy: Russell, sounds more like a bug in your software. Mind you, in my own travels, I've stopped using the feed URL that was given to me and instead use auto-discovery on the channel/link inside the feed to rediscover the actual feed, a.k.a. reflexive auto-discovery. This eliminates Russell's problem and many others.
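Reflexive auto-discovery, as described above, can be sketched in two steps: pull channel/link out of the feed, then run ordinary auto-discovery on that page. This is not my production code, just an illustration. The names are made up, the page fetch is left to a caller-supplied function so the sketch stays network-free, and taking the first discovered feed is an assumption.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from urllib.parse import urljoin

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collect href values of <link rel="alternate"> feed elements."""

    def __init__(self):
        super().__init__()
        self.feed_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rel = (a.get("rel") or "").lower().split()
        mime = (a.get("type") or "").lower()
        if "alternate" in rel and mime in FEED_TYPES and a.get("href"):
            self.feed_urls.append(a["href"])


def reflexive_autodiscover(rss_xml, fetch):
    """Rediscover a feed URL from the feed's own channel/link.

    fetch is a hypothetical caller-supplied function: URL -> HTML text.
    """
    homepage = ET.fromstring(rss_xml).findtext("channel/link")
    if not homepage:
        return None
    parser = FeedLinkParser()
    parser.feed(fetch(homepage))
    if not parser.feed_urls:
        return None
    # Resolve a relative href against the homepage URL.
    return urljoin(homepage, parser.feed_urls[0])


# Hypothetical feed and page, for illustration only.
RSS = ('<rss version="2.0"><channel><title>t</title>'
       '<link>http://example.com/blog/</link></channel></rss>')
PAGES = {
    "http://example.com/blog/":
        '<html><head><link rel="alternate" type="application/rss+xml" '
        'href="/blog/rss.xml"></head><body></body></html>',
}

canonical = reflexive_autodiscover(RSS, PAGES.__getitem__)
print(canonical)
```

Whatever junk was tacked onto the URL you were originally handed, the feed you end up subscribing to is the one the site itself advertises.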
Russell: Bug #2: If you try to use the Blog Citations feature, most of the time it'll come back with a database problem.
Randy: Agreed, but when it does work, it kicks everybody's ass.
Russell: Lack of Innovation #1: Where's the updates to the UI!?!?
Randy: Agreed, the UI has always sucked. I guess, overall, I agree with Russell on each point, but disagree on the aggregate, Bloglines is still #1.