PubSubHubbub was conceived as a protocol for delivering push notifications of updates to Atom and RSS feeds, and I think most would agree that it has been somewhat successful in doing so. However, almost immediately people became interested in making it support other specific serialization formats (I penned a variant for streams of JSON objects, for example) and ultimately in making it general enough for any arbitrary data type.
Attempting to support arbitrary data formats exposed a number of weaknesses in the original protocol, the main one being that the signature used for authenticated notifications applies only to the body of the notification. This was not too big an issue when the payload was constrained to being a valid feed, but with support for arbitrary resources comes the need to support the HTTP headers that describe the payload, Content-Type in particular. These really need to be included in the signature too, in order to prevent a class of attack where a request is intercepted and altered with a new set of headers in order to obtain a more harmful interpretation of the existing payload.
At the 2010 Federated Social Web Summit I suggested the solution of making the notification body be an entire HTTP response rather than just a payload, which Joseph Smarr lovingly branded a "turducken solution". Of course the problem with this approach, as I acknowledged at the time, is that most web frameworks out there are not equipped to parse an HTTP response bytestream out of the body of an HTTP request, and so this can be tricky to implement on some popular web application stacks.
Today I offer a new solution that arises from looking at PubSubHubbub from a different angle. Rather than thinking of it as a means to notify of new items in a stream, instead we can think of it conceptually as a protocol for mirroring resources.
If you frame the problem in terms of resources — a fundamental HTTP concept — then this brings us closer to HTTP and allows us to make better use of the facilities that HTTP provides. In particular, we can represent update notifications with HTTP PUT requests:
    PUT /example.jpg HTTP/1.0
    Content-Type: image/jpeg
    Content-Length: 2545
    Host: example.com
    Authorization: HubSignature 103456 abcd1234abcd1234abcd1234abcd1234abcd1234

    (image payload)
HTTP already defines how to use entity header fields with a PUT request to provide the metadata for an entity body, so we can use this as the format of a "fat ping". The only new thing in my above example is the hypothetical HubSignature auth mechanism, which I imagine to be a signature generated in terms of the Content-* set of header fields, the request method, the request URI, the payload, a nonce (103456 in this example) and the hub secret.
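To make the HubSignature idea a little more concrete, here is a minimal Python sketch of how a hub might compute the value and how a subscriber might check it. The post only names the inputs, so everything about how they are combined is an assumption of this sketch: HMAC-SHA1 (chosen because the example signature is 40 hex characters), the newline-joined canonical form, and the function names are all hypothetical.

```python
import hashlib
import hmac


def hub_signature(secret, method, uri, headers, payload, nonce):
    """Return a hex HMAC-SHA1 over the request parts named in the post.

    Assumption: the canonical form (method, URI and nonce on their own
    lines, then lowercased and sorted Content-* headers, a blank line,
    then the raw payload) is invented for this sketch.
    """
    content_headers = sorted(
        (name.lower(), value.strip())
        for name, value in headers.items()
        if name.lower().startswith("content-")
    )
    lines = [method.upper(), uri, nonce]
    lines += [f"{name}:{value}" for name, value in content_headers]
    signing_input = "\n".join(lines).encode("utf-8") + b"\n\n" + payload
    return hmac.new(secret, signing_input, hashlib.sha1).hexdigest()


def authorization_header(secret, method, uri, headers, payload, nonce):
    """Build the hypothetical 'HubSignature <nonce> <signature>' value."""
    signature = hub_signature(secret, method, uri, headers, payload, nonce)
    return f"HubSignature {nonce} {signature}"


def verify(secret, method, uri, headers, payload, authorization):
    """Recompute the signature from the request and compare in constant time."""
    try:
        scheme, nonce, signature = authorization.split(" ")
    except ValueError:
        return False
    if scheme != "HubSignature":
        return False
    expected = hub_signature(secret, method, uri, headers, payload, nonce)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

Because Content-Type is part of the signed input, swapping it for another value makes `verify` fail, which is exactly the attack the body-only signature could not catch. A real subscriber would also need to remember nonces it has already seen in order to reject replays; that bookkeeping is left out here.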
This has the advantage of being very close to what lots of web server software already expects from a PUT. Web servers and frameworks generally provide a mechanism for integrating new HTTP auth mechanisms, so this protocol could be handled relatively easily in (for example) Apache HTTPD by combining its existing PUT support with a new auth module. We could also just use Basic auth over HTTPS to transmit a shared secret in a manner that allows the processing of notifications with no new software at all.
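For the Basic auth fallback, most servers will do the checking for you, but as a rough illustration of how little is involved, here is a Python sketch of both ends. The function names and the idea of using a fixed username alongside the hub secret are assumptions made for the example, and it only makes sense over HTTPS since the secret travels in the request.

```python
import base64
import hmac


def basic_auth_header(username, secret):
    """Build the Authorization header value the hub would send."""
    token = base64.b64encode(f"{username}:{secret}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def check_basic_auth(authorization, expected_username, expected_secret):
    """Return True if the header carries the expected username and secret."""
    scheme, _, token = authorization.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(token, validate=True).decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return False
    username, separator, secret = decoded.partition(":")
    if not separator:
        return False
    # Compare as bytes in constant time to avoid leaking the secret via timing.
    user_ok = hmac.compare_digest(
        username.encode("utf-8"), expected_username.encode("utf-8")
    )
    secret_ok = hmac.compare_digest(
        secret.encode("utf-8"), expected_secret.encode("utf-8")
    )
    return user_ok and secret_ok
```

Unlike the signature scheme, this proves only that the sender knows the secret; it says nothing about whether the headers or payload were altered, so it leans entirely on TLS for integrity.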
If we're willing to explore a less proven part of the HTTP stack we could also exploit the newer PATCH method as a means to re-introduce optional delta-based notifications in a more general and less ambiguous way. We'd just need to figure out a means for the subscriber and hub to negotiate which patch document formats they both support. Leaving that problem aside for now, here's a patch notification using a hypothetical Atom patch format I just made up:
    PATCH /example.atom HTTP/1.0
    Content-Type: application/atom-delta+xml
    Content-Length: 346
    Host: example.com
    If-Match: abcdabcd12341234abcdabcd12341234
    Authorization: HubSignature cheese 1234abcd1234abcd1234abcd1234abcd1234abcd

    <delta:feed xmlns:delta="whatever"
                xmlns:at="http://purl.org/atompub/tombstones/1.0"
                xmlns="http://www.w3.org/2005/Atom">
      <entry>
        <id>tag:example.org,2011:entry2</id>
        <title>A new entry</title>
        <!-- etc, etc -->
      </entry>
      <at:deleted-entry ref="tag:example.org,2011:entry1" />
    </delta:feed>
Naturally the DELETE verb can be used to complete this story by providing a means to indicate that a mirrored resource no longer exists, though of course the subscriber would be free to ignore this and keep the resource if desired.
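Putting the three verbs together, the subscriber side of this "mirroring" view can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Mirror` class, the in-memory store, the use of an MD5 hex digest as the entity tag, and the registry of patch functions keyed by media type are all made up for the example, and authentication is assumed to have happened before `handle` is called.

```python
import hashlib


class Mirror:
    """In-memory mirror of remote resources, updated by hub notifications."""

    def __init__(self, patchers=None):
        # path -> (content_type, body)
        self.resources = {}
        # patch media type -> function(old_body, patch_body) -> new_body
        self.patchers = patchers or {}

    def etag(self, path):
        # Assumption: the entity tag is the MD5 hex digest of the stored body.
        _, body = self.resources[path]
        return hashlib.md5(body).hexdigest()

    def handle(self, method, path, headers, body=b""):
        """Apply one notification and return an HTTP status code."""
        content_type = headers.get("Content-Type", "application/octet-stream")
        if method == "PUT":
            created = path not in self.resources
            self.resources[path] = (content_type, body)
            return 201 if created else 204
        if method == "DELETE":
            # A subscriber that prefers to keep the copy could skip this line.
            self.resources.pop(path, None)
            return 204
        if method == "PATCH":
            if path not in self.resources:
                return 404
            patcher = self.patchers.get(content_type)
            if patcher is None:
                return 415  # patch document format not supported
            if_match = headers.get("If-Match")
            if if_match is not None and if_match != self.etag(path):
                return 412  # our copy is not the version the patch targets
            stored_type, old_body = self.resources[path]
            self.resources[path] = (stored_type, patcher(old_body, body))
            return 204
        return 405
```

The If-Match check is what makes deltas safe: if the subscriber missed an earlier update, the patch is refused with 412 rather than applied to the wrong base, and the subscriber can fall back to fetching the full resource.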
As far as I can tell the only downside of this approach is its incompatibility with the established PubSubHubbub protocol, but I believe adoption of PubSubHubbub for arbitrary content types remains low enough that the community could suffer a breaking change in the interests of better integration with existing HTTP features and tools.
What do you think?