Real-time Web and Streaming APIs


There was a lot of buzz a few years ago around the real-time web, and it has been bubbling along ever since. I have a financial/enterprise background, so real-time has a very different meaning to me: time is measured in microseconds. Web real-time seems to mean sub-second. My issue with the real-time web to date is that only parts of the web are web real-time. While data can be delivered to the browser using push technologies such as Comet and WebSockets, the vast majority of REST and SOAP APIs that provide access to application data still use the HTTP request-response model.

That’s starting to change, with more public streaming APIs appearing. A streaming API (aka HTTP Push) works by the client opening a socket and providing some criteria for the data it wants to receive; the server then delivers new data over the open socket as it is received. For those familiar with publish-subscribe models of delivering data, this all sounds familiar.
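
For readers who think in code, here is a minimal in-memory sketch of that subscribe-then-push model in Python. It is not a real streaming API: StreamHub, subscribe and publish are hypothetical names made up for illustration, and a function call stands in for the open socket.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, str]


class StreamHub:
    """Toy stand-in for a streaming server (hypothetical, in-memory only)."""

    def __init__(self) -> None:
        self._subscribers: List[Tuple[Callable[[Event], bool], Callable[[Event], None]]] = []

    def subscribe(self, criteria: Callable[[Event], bool],
                  on_event: Callable[[Event], None]) -> None:
        # Stands in for the client opening a connection and stating what it wants.
        self._subscribers.append((criteria, on_event))

    def publish(self, event: Event) -> None:
        # The server pushes each new event to every subscriber whose criteria match.
        for criteria, on_event in self._subscribers:
            if criteria(event):
                on_event(event)


received: List[Event] = []
hub = StreamHub()
hub.subscribe(lambda e: e["topic"] == "trades", received.append)

hub.publish({"topic": "trades", "body": "new trade"})
hub.publish({"topic": "news", "body": "not requested"})
print(received)
```

The client never asks "anything new?"; it registers its criteria once, and only matching events reach it.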

HTTP Push evolution

Most data streaming initiatives on the web focused initially on content feeds, such as Ping-o-matic, Spinn3r or RSS Cloud. This is great for web content, but new SaaS applications and services are not using Atom feeds to deliver application data.

One interesting HTTP push technology is PubSubHubbub (PSHB). It defines a protocol for doing publish and subscribe over HTTP using Atom as the message format. PSHB also provides a server that can be used to serve content to subscribers. It is used in things like WordPress, Tumblr, LiveJournal and more. If you have a WordPress blog you actually have your own PSHB hub.
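
As a rough illustration of the subscriber side of PSHB, the Python sketch below builds the form-encoded subscription request and answers the hub's verification callback by echoing the challenge. The parameter names (hub.mode, hub.topic, hub.callback, hub.challenge) come from the PubSubHubbub spec; the function names and URLs are placeholders I made up, and some spec versions require additional parameters (0.3, for example, also has hub.verify), so treat this as a sketch rather than a complete client.

```python
from typing import Tuple
from urllib.parse import parse_qs, urlencode


def build_subscribe_body(callback_url: str, topic_url: str) -> str:
    """Form-encoded body a subscriber would POST to the hub (core parameters only)."""
    return urlencode({
        "hub.mode": "subscribe",
        "hub.topic": topic_url,
        "hub.callback": callback_url,
    })


def answer_verification(query_string: str, expected_topic: str) -> Tuple[int, str]:
    """Answer the hub's verification GET: echo hub.challenge for a topic we asked for."""
    params = parse_qs(query_string)
    topic = params.get("hub.topic", [""])[0]
    challenge = params.get("hub.challenge", [""])[0]
    if topic == expected_topic and challenge:
        return 200, challenge  # confirms the subscription
    return 404, ""             # unknown topic: refuse


# Placeholder URLs, purely for illustration.
TOPIC = "https://blog.example/feed.atom"
body = build_subscribe_body("https://subscriber.example/callback", TOPIC)
print(body)
print(answer_verification(
    "hub.mode=subscribe&hub.topic=https%3A%2F%2Fblog.example%2Ffeed.atom&hub.challenge=abc123",
    TOPIC))
```

Once the subscription is verified, the hub POSTs new feed entries to the callback URL as they are published, so the subscriber never has to poll the feed.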

Streaming

Streaming APIs are provided by SaaS applications, social media platforms and other services to deliver data to clients in web real-time. The streaming API model is usually implemented for reading data. It is used to deliver data to consumers, not to make writes or deletes. In theory you could perform writes over a streamed connection, but it's very inefficient, and the request-response model offers a better interaction: if you perform a write, you want a response to that action.

Who Streams?

Streaming APIs are relatively new. Here are a few that I know of off-hand:

  • Salesforce – Just announced their new streaming API. We already support it in Mule; more on that in my next post.
  • Twitter – Getting real-time updates.
  • Facebook – Subscribe to real-time data changes in your social graph
  • SuperFeedr – push all sorts of feeds in one API (PSHB or XMPP)
  • Digg – stream submissions and comments.
  • Instagram – Real-time photo updates

Building Streaming APIs

There is currently no ‘standard way’ to build streaming APIs. Typically there are a few different approaches out there:

  1. HTTP using long poll – Long poll is a method used by some HTTP servers to hold a connection for a client until data becomes available on the server. If data is immediately available the connection is not held.
  2. Comet over HTTP – Comet is an umbrella term for server push over HTTP; CometD is a server-side pub-sub implementation designed around the Bayeux protocol. This is often used to enable AJAX capabilities on Java servers. Comet is stateful, which means you need to pass along information that is retained by the server.
  3. XMPP – Supports publish-subscribe (as an extension) but is not HTTP-based. XMPP has a lot of functionality beyond what is needed to build a streaming API.
  4. HTML5 EventSource (Server-Sent Events) – Seems to be an eventing protocol similar to Comet, but I have not dug into it yet.
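
To make the long-poll behaviour from item 1 concrete, here is a small Python sketch. It assumes an in-process queue as a stand-in for the server's pending data and a plain function call as a stand-in for the HTTP request; long_poll, updates and hold_seconds are hypothetical names, not part of any real server API.

```python
import queue
import threading
from typing import Optional


def long_poll(pending: "queue.Queue[str]", hold_seconds: float) -> Optional[str]:
    """Return the next item, holding the 'request' for up to hold_seconds if none is ready."""
    try:
        # Returns at once if data is queued; otherwise blocks until data
        # arrives or the hold time runs out.
        return pending.get(timeout=hold_seconds)
    except queue.Empty:
        # Nothing arrived while the request was held; a real client would poll again.
        return None


updates: "queue.Queue[str]" = queue.Queue()

# Case 1: data is already available, so the request is not held.
updates.put("already waiting")
first = long_poll(updates, hold_seconds=1.0)

# Case 2: data arrives while the request is being held.
threading.Timer(0.1, updates.put, args=["arrived later"]).start()
second = long_poll(updates, hold_seconds=1.0)

# Case 3: nothing arrives before the hold time expires.
third = long_poll(updates, hold_seconds=0.1)

print(first, second, third)
```

The hold time is the trade-off: a longer hold means fewer empty responses, but more connections kept open on the server.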

If you know of other streaming APIs or other ways people are building them, I’d love to hear about it.

Follow: @rossmason, @mulejockey




8 Responses to “Real-time Web and Streaming APIs”

  1. What is a good approach in Mule to building streaming APIs? I have some Flex clients that constantly get data from Mule Jersey services. It would be great to be able to implement a BlazeDS service in Mule; right now it seems I am stuck using short polling.

  2. HTML5 Server-Sent Events never seemed to take off in popularity; all the cool web people are using WebSockets (also HTML5): http://www.w3.org/TR/websockets/

  3. Jonathan, the issue with WebSockets is that it is just a socket; you still need a protocol on top for managing the publishing of data to clients that requested it. Comet, for example, could be used over WebSockets.

  4. Craig, I don’t know anyone working on a BlazeDS connector. If you know anyone who would be interested in doing it, please introduce me. The other option is to use Mule’s AJAX (CometD) connector in front of your Jersey resources, but that may complicate your front end (I don’t know much about Flex or whether you’d be able to mix Flex and the Mule JS client).

  5. […] Developers often need to call an API to get data updates, only to find that nothing has changed. Streaming APIs provide a more elegant solution to polling, allowing developers to subscribe to changes they are […]

  6. […] discussed recently in this blog, web streaming APIs are a hot topic. One goal of streaming APIs is to reduce polling and replace it […]

  7. […] The recently introduced PubSubHubbub module opened the door to server-push web-based integration with Mule. This approach, which is more resource friendly than the traditional pull integration that relies on polling resources, is currently gaining a lot of traction as the industry is moving towards realtime and streaming web APIs. […]

  8. I know this is an old article, but want to respond anyway. For websockets there is a protocol that supports publish/subscribe and remote procedure calls: http://wamp.ws/. I would love to see support for WAMP (Web Application Messaging Protocol) in Mule!