Experimental | Brave New Method

Apple Push Notifications with Node.js

December 9, 2010

When your iPhone app backend needs to send Apple Push Notifications, it must do so over a raw SSL socket using Apple's proprietary binary interface. Standard web REST is not supported. This kind of sucks, because if your entire backend is web based, you need to break that cleanliness with an external HTTP-to-APN proxy. One option is to use a service like Urban Airship, but you can also build the proxy yourself.

One potential platform for this is the much-hyped Node.js, the rising JavaScript engine for building ad-hoc web servers. The web is full of examples of building a simple HTTP-based server or proxy with Node.js, so this post covers only the part where we open a secure connection to the Apple server and send push notifications with plain Node.js JavaScript.

Please note that Apple assumes that you pool sockets and keep them open as long as you have notifications to send. So don’t write a naive implementation that opens a new socket for each HTTP request. Some simple pooling and reuse is a must for a real implementation.

In addition to sending push notifications, your backend also needs to poll the APNS feedback service to find out which devices have uninstalled the app and should not be sent new notifications. See more details in the post Apple Push Notification feedback service.

1. Get Certificates

Apple’s push notification server authenticates the application by its SSL certificate. There is no additional authentication handshake after the secure connection has been established.

First we need the PEM format certificates, which you can get by exporting them from Apple's Keychain Access. Also export the Apple Worldwide CA certificate. See this excellent blog post (up to step 5) for details on how to acquire the PEM files: http://blog.boxedice.com/2010/06/05/how-to-renew-your-apple-push-notification-push-ssl-certificate/

Now you should have the following certificate files.

app-cert.pem (Application certificate)
app-key-noenc.pem (Application private key)
apple-worldwide-certificate-authority.cer (Apple CA certificate)

2. Open Connection to Push Server

UPDATE: See a more complete TLS example here.

Moving on to the actual implementation in Node.js. This is quite simple: you just read the various certificate files as strings and use them as credentials.

You must also have SSL support built into your Node.js binary.

var fs = require('fs');
var crypto = require('crypto');
var tls = require('tls');

// The second argument of readFileSync is the encoding, passed as a plain string.
var certPem = fs.readFileSync('app-cert.pem', 'ascii');
var keyPem = fs.readFileSync('app-key-noenc.pem', 'ascii');
var caCert = fs.readFileSync('apple-worldwide-certificate-authority.cer', 'ascii');
var options = { key: keyPem, cert: certPem, ca: [ caCert ] };

function connectAPN( next ) {
    var stream = tls.connect(2195, 'gateway.sandbox.push.apple.com', options, function() {
        // connected
        next( !stream.authorized, stream );
    });
}

3. Write Push Notification

After the secure connection is established, you can simply write push notifications to the socket as binary data. A push notification is addressed to a device with a 32-byte push token that must be acquired by your iPhone application and sent to your backend somehow.

An easy format is a simple hexadecimal string, so we first define a helper function that converts the hexadecimal string to a binary buffer on the server side.

function hextobin(hexstr) {
   var buf = new Buffer(hexstr.length / 2);  // declare locally instead of leaking a global
   for(var i = 0; i < hexstr.length/2 ; i++) {
      buf[i] = (parseInt(hexstr[i * 2], 16) << 4) + (parseInt(hexstr[i * 2 + 1], 16));
   }
   return buf;
 }

Then define the data you want to send. The push payload is a serialized JSON string that has one mandatory property, ‘aps’. The JSON may additionally contain application-specific custom properties.

var pushnd = { aps: { alert:'This is a test' }};
// Push token from iPhone app. 32 bytes as hexadecimal string
var hextoken = '85ab4a0cf2 ... 238adf';

Now we can construct the actual binary push PDU (Protocol Data Unit). Note that the payload length is the byte length of the UTF-8 encoded string, not the number of characters. This would also be a good place to check the maximum payload length (256 bytes).

var payload = JSON.stringify(pushnd);
var payloadlen = Buffer.byteLength(payload, 'utf-8');
var tokenlen = 32;
var buffer = new Buffer(1 +  4 + 4 + 2 + tokenlen + 2 + payloadlen);
var i = 0;
buffer[i++] = 1; // command
var msgid = 0xbeefcace; // message identifier, can be left 0
buffer[i++] = msgid >> 24 & 0xFF;
buffer[i++] = msgid >> 16 & 0xFF;
buffer[i++] = msgid >> 8 & 0xFF;
buffer[i++] = msgid & 0xFF;

// expiry in epoch seconds (1 hour)
var seconds = Math.round(new Date().getTime() / 1000) + 1*60*60;
buffer[i++] = seconds >> 24 & 0xFF;
buffer[i++] = seconds >> 16 & 0xFF;
buffer[i++] = seconds >> 8 & 0xFF;
buffer[i++] = seconds & 0xFF;

buffer[i++] = tokenlen >> 8 & 0xFF; // token length
buffer[i++] = tokenlen & 0xFF;
var token = hextobin(hextoken);
token.copy(buffer, i, 0, tokenlen)
i += tokenlen;
buffer[i++] = payloadlen >> 8 & 0xFF; // payload length
buffer[i++] = payloadlen & 0xFF;

var payload = Buffer(payload);
payload.copy(buffer, i, 0, payloadlen);

stream.write(buffer);  // write push notification

And that’s it.
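When debugging the byte layout, it can help to build the same frame independently and compare it with what the Node.js code writes. The following is a minimal sketch in Python, assuming the format described above (command 1, message id, expiry, token length, token, payload length, payload, all big-endian). The function name build_apns_frame, its parameters and the all-'ab' token are made up for illustration; they are not part of any library.

```python
import binascii
import json
import struct

MAX_PAYLOAD = 256  # assumed payload limit of the binary interface, in bytes

def build_apns_frame(hextoken, payload_dict, msgid=0, expiry=0):
    """Pack one notification (hypothetical helper, for cross-checking only)."""
    token = binascii.unhexlify(hextoken)  # hex string -> raw bytes
    # json.dumps escapes non-ASCII characters by default, so encoding is safe
    payload = json.dumps(payload_dict, separators=(',', ':')).encode('utf-8')
    if len(token) != 32:
        raise ValueError('push token must be 32 bytes')
    if len(payload) > MAX_PAYLOAD:
        raise ValueError('payload too long: %d bytes' % len(payload))
    # '>' big-endian; B command, I message id, I expiry, H token length
    header = struct.pack('>BIIH', 1, msgid, expiry, len(token))
    return header + token + struct.pack('>H', len(payload)) + payload

# Fixed expiry to keep the result reproducible; real code uses epoch seconds.
frame = build_apns_frame('ab' * 32, {'aps': {'alert': 'This is a test'}},
                         msgid=0xbeefcace, expiry=3600)
print(binascii.hexlify(frame[:11]))
```

The first 11 bytes are the fixed header, so a hex dump of them is usually enough to spot a mis-shifted message id or expiry.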

4. Handling Error Messages

Apple does not return anything from the socket unless there was an error. In that case the Apple server sends you a single binary error message with a reason code (the offending message is identified by the message id you set in the push message) and closes the connection immediately after that.

Parsing the error message is simple. No stream encoding has been set, so the data argument is a Buffer instance.

stream.on('data', function(data) {
   var command = data[0] & 0x0FF;  // always 8
   var status = data[1] & 0x0FF;  // error code
   var msgid = ((data[2] << 24) + (data[3] << 16) + (data[4] << 8 ) + (data[5])) >>> 0;  // force unsigned 32-bit
   console.log(command+':'+status+':'+msgid);
 });

This implementation assumes that all data (6 bytes) is received in a single event. In theory Node.js might deliver the data in smaller pieces.
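One way to cope with data arriving in pieces is to buffer it until a full 6-byte message is available. The idea is sketched here in Python rather than Node.js, and the class name ErrorResponseParser and its feed method are hypothetical names chosen for this illustration. The layout (command, status, 4-byte message id) is the one parsed above.

```python
import struct

class ErrorResponseParser(object):
    """Collects chunks until one complete 6-byte error message is available."""
    SIZE = 6  # command (1 byte) + status (1 byte) + message id (4 bytes)

    def __init__(self):
        self._pending = b''

    def feed(self, chunk):
        """Add a chunk; return (command, status, msgid) or None if incomplete."""
        self._pending += chunk
        if len(self._pending) < self.SIZE:
            return None
        frame = self._pending[:self.SIZE]
        self._pending = self._pending[self.SIZE:]
        return struct.unpack('>BBI', frame)  # big-endian, unsigned message id

parser = ErrorResponseParser()
first = parser.feed(b'\x08\x08')           # only two bytes so far
second = parser.feed(b'\xbe\xef\xca\xce')  # the rest of the message
print(first, second)
```

The same approach carries over to the Node.js handler: append each data event to a pending buffer and parse only once six bytes have accumulated.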

5. Reading Apple Feedback notifications

Apple requires that you read feedback notifications daily, so you know which push tokens have expired or belong to devices where the app was uninstalled. See the blog post Polling Apple Push Notification feedback service with Node.js for details.


Simple Reverse Geocoding with CouchDB

December 2, 2010

Real-world reverse geocoding searches require a minimum of three parameters: two for the location (lat, lon) and a filter. The filter can be a keyword, type of location, name or something else. A common use case is to find the nearest restaurants or the closest address. This kind of query is simple for a relational database (though not necessarily easy to shard), and the problem has been solved cleanly in many of them (PostGIS, MySQL Spatial extensions, ...).

Geocoding is trickier to implement in a typical NoSQL database that supports only one-dimensional key range queries. Geohashes are the classical solution, but in my experience they are too inaccurate for dense data. A simple method that works quite well is geoboxes, where the earth is divided into a grid that is used as an index lookup table. Every location maps to a box that can be addressed with a unique id.

This is an experiment in implementing simple geobox-based geocoding on CouchDB from scratch. I assume you’re already familiar with CouchDB basics. The examples here are written in Python with the couchdb-python client library.

1. Preparations

First we need a geobox function that quantizes location coordinates to a box. Boxes cover the earth as a grid. Latitude and longitude are quantized to the coordinates that represent the geobox center at the desired resolution.

from decimal import Decimal as D
D1 = D('1')
def geobox(lat, lng, res):
  def _q(c):
      return (c / res).quantize(D1) * res
  return (_q(lat), _q(lng))

Based on this function, we define a function that computes the geobox and its neighbors and returns a list of strings.

import math
def geonbr(lat, lon, res):
   blat,blon = geobox(lat, lon, res)
   boxes = [(dlat, dlon)
            for dlon in [blon - res, blon, blon + res]
            for dlat in [blat - res, blat, blat + res]]
   def _bf(box):
       (dlat, dlon) = box
       return math.fabs(dlon - lon) < float(res)*0.8 \
              and math.fabs(dlat - lat) < float(res)*0.8
   return filter(_bf, boxes)

def geoboxes(lat, lon, res):
   # Decimals must be converted to strings before they can be joined
   return list(set(['#'.join([str(dlat), str(dlon)])
                    for (dlat, dlon) in geonbr(lat, lon, res)]))

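The constant 0.8 defines how close to the geobox border the location can be before we include the neighbor box in the list. A neighbor box is included when the location is within 0.8 x res of its center, i.e. within 0.3 x res of the shared border.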
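To get a feel for the effect of this constant, here is a small self-contained sketch that makes the margin a parameter. The function name boxes_near and the margin argument are hypothetical. Unlike geonbr above, this variant compares Decimals throughout, so results can differ for locations lying exactly on the threshold.

```python
from decimal import Decimal as D

def boxes_near(lat, lon, res, margin=D('0.8')):
    """Ids of the boxes whose center is within margin * res of the point."""
    def q(c):
        # Round to the nearest multiple of res
        return (c / res).quantize(D('1')) * res
    blat, blon = q(lat), q(lon)
    limit = res * margin
    return sorted('%s#%s' % (la, lo)
                  for la in (blat - res, blat, blat + res)
                  for lo in (blon - res, blon, blon + res)
                  if abs(la - lat) < limit and abs(lo - lon) < limit)

res = D('0.05')
center = boxes_near(D('60.20'), D('25.00'), res)    # exactly at a box center
corner = boxes_near(D('60.224'), D('25.026'), res)  # close to a box corner
print(center)  # ['60.20#25.00']
print(corner)  # ['60.20#25.00', '60.20#25.05', '60.25#25.00', '60.25#25.05']
```

A location at a box center maps to one box only, while a location near a corner maps to four. A larger margin makes more documents land in several boxes, which grows the index but makes searches near borders more complete.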

For example, calling geoboxes with lat, lon (32.1234, -74.23233) yields the following geoboxes. Numbers are handled as Decimal instances to avoid float rounding problems.

>>> from decimal import Decimal as D
>>> geoboxes(D('32.1234'), D('-74.23233'), D('0.05'))
['32.15#-74.20', '32.10#-74.20', '32.15#-74.25', '32.10#-74.25']

2. Data Import

Data can be anything with a location and some keyword, so let’s use real-world places. The place name will be our searchable term in this example.

The Geonames.org geocoding service makes its data available to everyone. Find the country you want here, then download and unpack the selected data file. I used ‘FI.zip’.

http://download.geonames.org/export/dump/

The data is tab-delimited, one record per line. We need to define a few helper functions for reading and importing it.

from decimal import Decimal as D

def place_dict(entry):
   return {'_id': entry[0],
      'name': entry[1].encode('utf-8'),
      'areas': entry[17].encode('utf-8').split('/'),
      'loc': {
      'lat': entry[4],
      'lon': entry[5],
    },
    'gboxes': geoboxes(D(entry[4]), D(entry[5]), D('0.05'))  # list of geobox id strings
 }

def readnlines(f, n):
    while True:
      l = f.readline()
      if not l:
         break
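      batch = [l] + [f.readline() for _ in range(1, n)]
      # readline() returns '' at end of file; drop those from the last batch
      yield [line for line in batch if line]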

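place_dict converts a line from the Geonames dump file to a JSON document for CouchDB. readnlines is just a helper for making updates in batches. The geobox resolution is 0.05 degrees, which makes geoboxes roughly 5 km on a side at the equator (they get narrower in the east-west direction at higher latitudes).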
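The physical size of a box depends on latitude. The following back-of-the-envelope sketch estimates it, assuming a spherical earth and roughly 111.32 km per degree of latitude. The function name box_size_km and the constant are illustrative choices, not part of the import code.

```python
import math

KM_PER_DEGREE = 111.32  # approximate length of one degree of latitude

def box_size_km(lat_deg, res_deg):
    """Return (north-south, east-west) size of a geobox in kilometers."""
    height = res_deg * KM_PER_DEGREE
    # Meridians converge toward the poles, so boxes get narrower
    width = height * math.cos(math.radians(lat_deg))
    return height, width

height, width = box_size_km(60.2, 0.05)
print('%.1f km x %.1f km' % (height, width))
```

Around Helsinki (latitude about 60.2) this gives boxes about 5.6 km tall but only about 2.8 km wide, which is worth keeping in mind when choosing the resolution.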

Then just load the data into the database. First we create the database on the server, then open the UTF-8 encoded file and write it in batches to CouchDB.

>>> import codecs
>>> import couchdb
>>> s = couchdb.Server()
>>> places = s.create('places')
>>> f = codecs.open('/home/user/Downloads/FI.txt', encoding='utf-8')
>>> for batch in readnlines(f, 100):
...   batch = [l.split('\t') for l in batch]
...   places.update([place_dict(entry) for entry in batch])
[(True, '631542', '1-239590f242b46d45b33516687c0b1df3'), ...

This takes a few moments. You can follow the progress on CouchDB Futon: http://localhost:5984/_utils/index.html

place_dict does not validate the content, so the import might stop at a broken line in the dump file. In that case you need to filter out the offending lines and rerun.
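Filtering can also be done up front. Here is a minimal sketch of such a check, assuming that a usable line has at least 18 tab-separated fields (place_dict reads index 17) and that fields 4 and 5 parse as finite decimals. The function name is_importable is hypothetical.

```python
from decimal import Decimal, InvalidOperation

def is_importable(fields):
    """True if a split line has the fields that place_dict relies on."""
    if len(fields) < 18 or not fields[0].strip():
        return False
    try:
        lat, lon = Decimal(fields[4]), Decimal(fields[5])
    except InvalidOperation:
        return False  # latitude or longitude is not a number
    return lat.is_finite() and lon.is_finite()

# Made-up sample lines in the same column layout as the dump file
good = ['631542', 'Example', '', '', '60.1', '25.0'] + [''] * 11 + ['Europe/Helsinki', '2010-01-01']
bad = good[:4] + ['not-a-number'] + good[5:]
print(is_importable(good), is_importable(bad), is_importable(['']))
```

In the import loop this could be applied as a filter on the batch before calling place_dict, for example `[place_dict(e) for e in batch if is_importable(e)]`.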

Query a few places by id and verify that the data really is there and has the right format:

>>> places['638155']
<Document '638155'@'1-174cbb83a2794c33c40645ddf681fc76'
{'gboxes': ['66.85#25.75', '66.90#25.75'],
'loc': {'lat': '66.88333', 'lon': '25.7534'}, 'name': 'Saittajoki',
'areas': ['Europe', 'Helsinki']}>

3. Define View

CouchDB views (i.e. queries) are defined by storing a design document in the database. The couchdb Python API provides a simple way to update design documents.

The query we need is defined in CouchDB by the following script:

function(d) {
  if(d.name) {
    var i;
    for (i = 0; i < d.gboxes.length; i += 1) {
      emit([d.gboxes[i], d.name], null);
    }
  }
}

This view builds a composite key index from the geobox string and the place name.
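To see what ends up in the index, the map function can be mimicked in plain Python. This is only an emulation for illustration (CouchDB runs the JavaScript version), and map_place is a made-up name. The sample document is the one shown in the data import section.

```python
def map_place(doc):
    """Python stand-in for the JavaScript map function; yields (key, value)."""
    if doc.get('name'):
        for box in doc['gboxes']:
            yield [box, doc['name']], None

doc = {'_id': '638155', 'name': 'Saittajoki',
       'gboxes': ['66.85#25.75', '66.90#25.75']}
keys = sorted(key for key, _ in map_place(doc))
print(keys)  # [['66.85#25.75', 'Saittajoki'], ['66.90#25.75', 'Saittajoki']]
```

Each document contributes one index row per geobox it belongs to, which is why the same place can show up under several boxes in query results.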

Load the view into CouchDB. Note that the JavaScript function is not validated until the next query, and you will get strange error messages if it does not parse or execute correctly. Be careful!

>>> from couchdb.design import ViewDefinition
>>> viewfunc = 'function(d) { if(d.name) { var i; for (i = 0; i < d.gboxes.length; i += 1) { emit([d.gboxes[i], d.name], null); }}}'
>>> view = ViewDefinition('places', 'around', viewfunc)
>>> view.sync(places)

4. Materialize View

CouchDB indexes views on the first query, and the first query will take a long time in this case. This is because CouchDB does not update the index on insert, so after a bulk import the index build takes some time. Monitor the progress on the Futon status page (http://localhost:5984/_utils/status.html).

Start indexing with a simple query.

>>> list(places.view('places/around', limit=1))
[<Row id='123456', key=['65.00#25.05', u'Some Place'], value=None>]

Note that we call ‘list’ on the query to force execution. The view member function returns a lazily evaluated result object. The limit is one, so a single entry is returned to verify success.

5. Making Queries

Now we can search places by name and location. To do that, let’s first compute the geoboxes for a location.

>>> geoboxes(D('60.198765'), D('25.016443'), D('0.05'))
['60.15#25.05', '60.20#25.00', '60.20#25.05', '60.15#25.00']

To search all locations in a single geobox, use a query like this:

list(places.view('places/around', startkey=['60.20#25.05'],
                                  endkey=['60.20#25.05',{}]))

To search by place name in a geobox, just include the search term in both the start and end keys. The search term in the endkey is appended with a high Unicode character to define the upper bound.

list(places.view('places/around', startkey=['60.20#25.05', 'Ha'],
                                  endkey=['60.20#25.05', 'Ha'+u'\u777d']))

Define a simple helper function:

def around(box, s):
   return list(places.view('places/around', startkey=[box, s],
                                            endkey=[box, s+u'\u777d']))
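The start/end key trick can be tried out without a database. The sketch below emulates the range check with plain Python list comparison. Note the assumption: Python compares strings by code point, while CouchDB uses Unicode collation for view keys, so ordering can differ in details such as case and accents. The names prefix_bounds and matches are hypothetical.

```python
HIGH = u'\u777d'  # the sentinel character used in the queries above

def prefix_bounds(box, prefix):
    """Start and end keys covering names in box that begin with prefix."""
    return [box, prefix], [box, prefix + HIGH]

def matches(key, box, prefix):
    """Emulate the view's key range check with Python list ordering."""
    startkey, endkey = prefix_bounds(box, prefix)
    return startkey <= key <= endkey

box = '60.20#25.05'
print(matches([box, u'Haakoninlahti'], box, u'Ha'))            # True
print(matches([box, u'Herttoniemi'], box, u'Ha'))              # False, wrong prefix
print(matches(['60.20#25.00', u'Haakoninlahti'], box, u'Ha'))  # False, other box
print(matches([box, u'Ha\u9999'], box, u'Ha'))                 # False, above the sentinel
```

The last case shows the limit of the trick: a name whose next character sorts above the sentinel falls outside the range, so the sentinel should sort after every character that can occur in the data.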

Now, to search for all places around a location that start with a search term (here ‘H’), call the around function for each geobox of that location.

>>> l = geoboxes(D('60.19'), D('25.01'), D('0.05'))
>>> for gb in l:
...     around(gb, 'H')
...
[<Row id='659403', key=['60.15#25.05', 'Haakoninlahti'], value=None>, <Row id='658086', key=['60.15#25.05', 'Hevossalmi'], value=None>]
[<Row id='659403', key=['60.20#25.00', 'Haakoninlahti'], value=None>, <Row id='6545255', key=['60.20#25.00', 'Herttoniemenranta'], value=None>, <Row id='658132', key=['60.20#25.00', 'Herttoniemi'], value=None>, <Row id='651476', key=['60.20#25.00', u'H\xf6gholmen'], value=None>, <Row id='6514261', key=['60.20#25.00', 'Hotel Avion'], value=None>, <Row id='6528458', key=['60.20#25.00', 'Hotel Fenno'], value=None>, <Row id='798734', key=['60.20#25.00', 'Hylkysaari'], value=None>]
[<Row id='659403', key=['60.20#25.05', 'Haakoninlahti'], value=None>, <Row id='6545255', key=['60.20#25.05', 'Herttoniemenranta'], value=None>, <Row id='658132', key=['60.20#25.05', 'Herttoniemi'], value=None>, <Row id='658086', key=['60.20#25.05', 'Hevossalmi'], value=None>]
[<Row id='659403', key=['60.15#25.00', 'Haakoninlahti'], value=None>, <Row id='651476', key=['60.15#25.00', u'H\xf6gholmen'], value=None>, <Row id='6514261', key=['60.15#25.00', 'Hotel Avion'], value=None>, <Row id='6528458', key=['60.15#25.00', 'Hotel Fenno'], value=None>, <Row id='798734', key=['60.15#25.00', 'Hylkysaari'], value=None>]

Our geobox resolution (0.05) guarantees a minimum search radius of 2.5 km and a maximum of 7.5 km. To improve results, we could use several resolutions, more boxes, or always search the location's own box and the 8 boxes around it.

Note the duplicates, which you have to filter out in memory. Now it’s a simple thing to fetch the interesting places and compute whatever presentation you want to give to the user.

>>> q = places.view('_all_docs', keys=['659403', '658086'], include_docs=True)
>>> for row in q:
...     print row.doc
...
<Document '659403'@'1-5f7fe8f63ae034ea9562c20a8c9b6ae7' {'gboxes': ['60.15#25.05', '60.20#25.00', '60.20#25.05', '60.15#25.00'], 'loc': {'lat': '60.16694', 'lon': '60.16694'}, 'name': 'Haakoninlahti', 'areas': ['Europe', 'Helsinki']}>
<Document '658086'@'1-ecaf156721b392411f025a3b00e27d62' {'gboxes': ['60.20#25.05', '60.15#25.05'], 'loc': {'lat': '60.16167', 'lon': '60.16167'}, 'name': 'Hevossalmi', 'areas': ['Europe', 'Helsinki']}>
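The in-memory duplicate filtering mentioned above could look something like the following sketch. It assumes only that each result row exposes an id attribute, as the Row objects printed earlier do. A namedtuple stands in for the real rows here, and unique_ids is a hypothetical helper name.

```python
from collections import namedtuple

# Stand-in for the client library's row objects; only .id is used
Row = namedtuple('Row', ['id', 'key'])

def unique_ids(result_lists):
    """Collect document ids from several result lists, first occurrence wins."""
    seen = set()
    ordered = []
    for rows in result_lists:
        for row in rows:
            if row.id not in seen:
                seen.add(row.id)
                ordered.append(row.id)
    return ordered

results = [
    [Row('659403', ['60.15#25.05', 'Haakoninlahti']),
     Row('658086', ['60.15#25.05', 'Hevossalmi'])],
    [Row('659403', ['60.20#25.05', 'Haakoninlahti']),
     Row('6545255', ['60.20#25.05', 'Herttoniemenranta'])],
]
ids = unique_ids(results)
print(ids)  # ['659403', '658086', '6545255']
```

The resulting id list can then be passed as the keys argument of the _all_docs query shown above.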

