Jekyll and live feeds update
Before I switched to Jekyll, WordPress was running my blog. One thing I noticed while using WordPress was that Google and other blog search engines fetched my new posts a few seconds after I published them.
To achieve this responsiveness, WordPress uses two different systems:
It sends a ping to some services which in turn fetch your feeds. Some concentrators such as ping-o-matic allow you to ping them, and they in turn ping various search engines for you so that you don’t have to. Then each search engine decides whether or not it will crawl your blog again.
WordPress also uses the recent pubsubhubbub protocol (what a lovely name!). In your feed, you declare the address of a hub where interested parties can send subscription requests. Then, when a new article is published on your blog, WordPress sends a ping to the hub, and the hub retrieves your feed. If the feed has changed, it is sent to the subscribers using a callback address they registered when they subscribed. This way, interested services such as Google do not have to retrieve the feed themselves, as it will get pushed to them when it contains new items.
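To make the subscription step a bit more concrete, here is a minimal Ruby sketch of the form body a subscriber might send to the hub. The field names hub.mode, hub.topic and hub.callback are the ones used by the pubsubhubbub protocol; the helper name and both URLs are placeholders I made up for illustration. Nothing is sent over the network, and a real hub may expect additional fields.

```ruby
require 'uri'

# Build the URL-encoded body of a subscription request.
# topic_url:    the feed the subscriber is interested in
# callback_url: the address the hub should push new content to
# (both values are illustrative placeholders)
def subscription_body(topic_url, callback_url)
  URI.encode_www_form(
    'hub.mode'     => 'subscribe',
    'hub.topic'    => topic_url,
    'hub.callback' => callback_url
  )
end

puts subscription_body('https://blog.example.org/feed/',
                       'https://reader.example.com/push')
```

The subscriber would POST this body to the hub address declared in the feed.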
It is easy to enhance a Jekyll blog with the pubsubhubbub system, because:
- there are public, open pubsubhubbub hubs, such as the well-known https://pubsubhubbub.appspot.com;
- you may send the ping message from anywhere, not necessarily from the server.
The first thing to do is to add hub information to your Atom or RSS feeds. For an Atom feed, you may add the following to the feed section:
<feed xmlns="http://www.w3.org/2005/Atom">
  <link rel="hub" href="https://pubsubhubbub.appspot.com"/>
  ...
</feed>
while an RSS feed would contain
<rss xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <atom:link rel="hub" href="https://pubsubhubbub.appspot.com"/>
    ...
  </channel>
</rss>
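Once the hub link is in place, it can be worth checking that the generated feed really advertises it. Here is a small sketch using REXML (shipped with Ruby as a bundled gem); the hub_links helper is a name I chose for the example, not something Jekyll provides, and the sample feed is a stripped-down placeholder.

```ruby
require 'rexml/document'

# Return the href of every <link rel="hub"> element found in the feed XML.
# Works for both a plain Atom <link> and a prefixed <atom:link> in RSS,
# because Element#name is the local name without the namespace prefix.
def hub_links(feed_xml)
  doc = REXML::Document.new(feed_xml)
  REXML::XPath.match(doc, "//*[@rel='hub']")
              .select { |el| el.name == 'link' }
              .map { |el| el.attributes['href'] }
end

sample = <<~XML
  <feed xmlns="http://www.w3.org/2005/Atom">
    <link rel="hub" href="https://pubsubhubbub.appspot.com"/>
  </feed>
XML

puts hub_links(sample).inspect
```

An empty result means the feed does not declare a hub, so subscribers would not know where to register.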
Then you may want to ensure that you can tell the hub that your feed has some fresh, interesting content by pinging it. If you don’t, your feed will be retrieved at regular intervals, but you will lose the benefit of using pubsubhubbub. If you are using rake for your development, you may want to create a :ping task which will send the ping when you run it:
desc 'Ping pubsubhubbub server.'
task :ping do
  require 'cgi'
  require 'net/http'
  puts 'Pinging pubsubhubbub server'
  # Form body: publish mode plus the URL-encoded address of the updated feed.
  data = 'hub.mode=publish&hub.url=' + CGI.escape('https://address.of.your/feed/')
  # The hub is addressed over HTTPS, so use port 443 with TLS enabled.
  http = Net::HTTP.new('pubsubhubbub.appspot.com', 443)
  http.use_ssl = true
  resp = http.post('/publish', data,
                   { 'Content-Type' => 'application/x-www-form-urlencoded' })
  # Anything other than 204 No Content is reported as an error.
  puts "Ping error: #{resp.code} #{resp.body}" unless resp.code == '204'
end
If you prefer to use make, then a similar target using wget or curl would do the job. The only thing you need to do is send a POST request to https://pubsubhubbub.appspot.com/publish with a URL-encoded form containing the following two fields:
- hub.mode: the single string publish.
- hub.url: the URL of your updated feed. This can be repeated multiple times if several feeds have been updated at once.
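To show what repeating hub.url looks like on the wire, here is a short Ruby sketch that builds the form body for several feeds at once. The publish_body helper and the example.org URLs are placeholders for illustration; the code only builds the string and does not contact the hub.

```ruby
require 'uri'

# Build the body of a publish ping for one or more updated feeds.
# hub.mode appears once; hub.url is repeated once per feed.
def publish_body(feed_urls)
  pairs = [['hub.mode', 'publish']]
  feed_urls.each { |url| pairs << ['hub.url', url] }
  URI.encode_www_form(pairs)
end

puts publish_body(['https://blog.example.org/feed/',
                   'https://blog.example.org/tags/jekyll/feed/'])
```

The resulting string is what you would pass as the POST data, whether from Ruby, wget or curl.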
Note that in real life, my rake rule is much more complex: since I have separate feeds for the two languages I use on this blog, as well as one feed per tag, my Rakefile contains code to check whether posts have been updated in the last 24 hours, and all the feeds that might have changed (and only these) are signalled to the hub.
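This is not my actual Rakefile, but the idea can be sketched roughly as follows. The Post structure, the BASE address and the feed URL layout are all assumptions invented for this example; a real blog would read languages, tags and modification times from its own post files.

```ruby
require 'set'

# Hypothetical post record: language code, list of tags, last modification time.
Post = Struct.new(:lang, :tags, :updated_at)

BASE = 'https://blog.example.org' # placeholder site root

# Return the feed URLs touched by posts updated within the last `window`
# seconds (24 hours by default). Each recent post affects the feed of its
# language and the feed of each of its tags; duplicates are dropped.
def feeds_to_ping(posts, now: Time.now, window: 24 * 60 * 60)
  feeds = Set.new
  posts.each do |post|
    next if now - post.updated_at > window
    feeds << "#{BASE}/#{post.lang}/feed/"
    post.tags.each { |tag| feeds << "#{BASE}/tags/#{tag}/feed/" }
  end
  feeds.to_a
end

now = Time.now
posts = [
  Post.new('en', %w[jekyll], now - 3600),          # updated an hour ago
  Post.new('fr', %w[ruby],   now - 3 * 24 * 3600)  # updated three days ago
]
puts feeds_to_ping(posts, now: now)
```

Only the feeds touched by the first post are listed; each of them would then become one hub.url field of the ping.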
What can you do with those realtime updates? You can start using services such as twitterfeed to post Twitter notices of your blog posts right after they appear on your site, or you can use PuSH Bot to get live updates in your XMPP stream (in Google Talk for example). This is really as easy as pie; there is no reason your blog should not be using it right now.
How will I publish this very post? I will just do
rake install ping
and be done with it.