it@cork Web2.0 Conference

Event type: Conference

Date: 2006-06-08

Rating: 5 out of 5

Salim gave a superb presentation which for me, as expected, was the highlight of the conference. I have heard chunks of the presentation on various podcasts and it is still very focused on the ideas they built in PubSub despite the fact that Salim left there several months back. But given that those ideas remain as relevant now as they were when PubSub was founded, the presentation was compelling.

He started by laying the general groundwork of blogging and Web 2.0 and highlighted things like:

  • Web 1.0 vs Web 2.0 (courtesy of a certain O’Reilly)
  • Sun’s Schwartz “I don’t read blogs, I read”
  • The focus on Services rather than data
  • and
  • 30m bloggers in Korea that are not on the radar yet, 30m bloggers in China
  • FeedMesh
  • Business Blogs for Knowledge Management internally
  • Business Blogs for Marketing/PR/CRM externally

Salim then moved on to the visible web (standard web content) vs the hidden web (data inside proprietary systems). The big problems with the hidden web are:

  • The data is inside walled gardens
  • It can’t be syndicated (i.e. you can’t subscribe to a data feed from such systems)
  • It is not available to search engines
  • Those walled garden businesses are valued based on that data and so do not want people crawling it for their own purposes
  • The hidden web is estimated to be 500 times bigger than the visible web

As we move more and more data to the visible web, services can then be built around that data. Salim then gave some examples of where this is going including (of course) Structured Blogging, Job Ads to disintermediate Monster, reviews (like this one!), and subscribing to searches for things like your Social Security Number (and hoping you never see it appear anywhere!).

He sees that we are moving from the activity of Sending to Searching to Watching, and that search is not just historical but, more importantly, prospective: “Tell me whenever”. This is the business model that PubSub is built on and I am convinced that the idea is completely solid. Unfortunately I have tried PubSub several times over the past 18 months or so and it simply does not do the job for me. It either finds no data matching my simple terms or it floods me with spam on the RSS feed. Until they provide quality results quickly, they will always play second fiddle to Technorati (and the same basic idea of subscribing to search terms and having an RSS feed of the results is available in Technorati too).

Salim then moved on to the impact of these types of technologies on business. I found this section riveting but he clearly lost some of the audience at this point, as the questions asked at the end of his presentation showed that some of the questioners had misunderstood the ideas.

He pointed out that businesses really run on events (new customer, price change, delivery notice), not data. However, most systems are built around data. Moving to an event-based view of the world could bring major efficiency gains for businesses. He sees a big opportunity to build tools for the enterprise that monitor events in internally published RSS feeds based on detailed criteria.

Obviously these RSS feeds could not simply be raw data and would require some sort of structure. Whilst Salim mentioned microformats at one point, it is clear that they (or something like them) are key to enabling an intelligent publish and subscribe system.

The core engine in PubSub has an architecture that is the reverse of a normal database. In PubSub’s case, the system holds millions of queries against which each new feed-item is matched. Currently they run over a trillion matches a day.
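To make the "reverse database" idea concrete, here is a minimal sketch of my own in Python. It is not PubSub's code and says nothing about how their engine is really built: the class name `ProspectiveIndex`, its methods and the all-words-must-appear matching rule are all assumptions for illustration. The point is only that the queries are stored and indexed up front, and each new feed item is then looked up against them.

```python
from collections import defaultdict


class ProspectiveIndex:
    """Hypothetical store of standing queries, matched against each new item."""

    def __init__(self):
        # query id -> set of words that must all appear in an item
        self._terms_by_query = {}
        # word -> ids of the stored queries that require that word
        self._queries_by_term = defaultdict(set)

    def subscribe(self, query_id, text):
        """Register a standing query made of one or more required words."""
        terms = set(text.lower().split())
        if not terms:
            raise ValueError("a query needs at least one word")
        self._terms_by_query[query_id] = terms
        for term in terms:
            self._queries_by_term[term].add(query_id)

    def match(self, item_text):
        """Return the ids of all stored queries satisfied by a new item."""
        words = set(item_text.lower().split())
        hits = defaultdict(int)
        # Only queries sharing at least one word with the item are touched.
        for word in words:
            for query_id in self._queries_by_term.get(word, ()):
                hits[query_id] += 1
        # A query matches when every one of its words was seen in the item.
        return sorted(
            query_id
            for query_id, count in hits.items()
            if count == len(self._terms_by_query[query_id])
        )


index = ProspectiveIndex()
index.subscribe("alice", "cork conference")
index.subscribe("bob", "pubsub")
print(index.match("Notes from the Cork Web2.0 conference"))
```

The work per item depends on how many stored queries share a word with it, not on the total number of queries held, which is presumably the sort of property a real engine needs at much larger scale.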

Salim got a ton of questions!

He had mentioned having sentient devices generating feeds to which you could subscribe (a temperature gauge, for example). One question was how to handle so much data. The simple answer was “filter”.

Another question was about the skill sets needed to build these new aggregators. The answer was DB Trigger experts, Realtime Messaging experts and Supply Chain Management experts.

Salim mentioned the opportunities for Mashups in the enterprise. Imagine an executive dashboard system showing in realtime the special offers being redeemed in shops overlaid on a map so you can see which offers work best in which locations!

One excellent question addressed the issue of spam blogs (splogs) and how aggregators could avoid having their systems polluted with such data. Salim mentioned reputation systems and Identity 2.0. Obviously something like sxore fills this gap.

Another question was on existing data in silos - how to extract it into RSS feeds? Salim had no real solution for that other than to make all new data available. At this point the confusion in the audience between RSS feeds, microformatted data and blogs became apparent. None of the enterprise examples given by Salim require a blog - they do require new applications which subscribe to certain types of data and then present that to a user the way they need it. Salim gave another example of a Sales Manager who subscribes to all new sales leads in their CRM system but is only notified, in pseudo-real-time, of those over a certain value.
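The Sales Manager case is really just a feed plus a filter. A rough sketch in Python, with everything in it invented for illustration (the `Lead` record, the `large_leads` function and the sample figures are mine, not anything Salim showed):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lead:
    """Hypothetical sales-lead event as it might appear in an internal feed."""
    customer: str
    value: float


def large_leads(leads, minimum_value):
    """Yield only the leads worth strictly more than minimum_value."""
    for lead in leads:
        if lead.value > minimum_value:
            yield lead


# Made-up sample feed; a real one would arrive item by item from the CRM.
feed = [Lead("Acme", 500), Lead("Globex", 25000), Lead("Initech", 10000)]
for lead in large_leads(feed, minimum_value=10000):
    print(f"Notify sales manager: {lead.customer} ({lead.value})")
```

Because `large_leads` is a generator it handles each lead as it arrives rather than waiting for a batch, which is about as close to "pseudo-real-time" as a toy example gets.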

Salim is giving a keynote in six weeks’ time at an XML conference in North Carolina to delve into the implications of XML syndication. I’ll be watching out for podcasts of that.

One brave soul asked about the disgraceful tabloid-style posting on TechCrunch saying that PubSub was about to implode. Salim addressed the question head on and said that whilst he disagreed with the direction the company was heading in (hence his exit), they still had many opportunities open to them.

He mentioned that he is currently starting three companies all of which will go from start to launch in three months. Based on the focus of his presentation, I would be surprised if some or all of them are not looking at the enterprise market. He remains a man to watch very closely.

[tags]Web2.0, Web20, it@cork, Salim Ismail, PubSub, microformats, Structured Blogging[/tags]