Elasticsearch geohash grid vector tiles with Node.js

October 24, 2020

Although there's an ongoing effort to add vector tiles support in Kibana, I couldn't find an easy way to make my Node.js application expose its geohash aggregation results as vector tiles. In this post I'm outlining my solution for encoding geohash aggregation buckets as Mapbox Vector Tiles. I'm only discussing parts of the code, but the entire solution is available from GitHub or npm.

The figure below summarizes what we are dealing with. For each map tile we will query Elasticsearch and apply a geohash aggregation, taking into account that some geohashes extend beyond the tile boundaries. We then need to calculate the geographic coordinates for each geohash and convert them to pixel-based tile coordinates. Finally, these coordinates need to be encoded following the Mapbox Vector Tile (MVT) spec.

[Figure: tile]

Calculating the query bounding box

The bounding box of our query needs to fully encompass all geohashes that intersect with our tile. The tile bounding box can be calculated from x, y, and zoom using the global-mercator package. For the top left and bottom right corners of the bounding box we then calculate the geohashes they are located in, taking into account the geohash precision we will use in the aggregation. This can be done with the encode() function from the ngeohash package. From these geohashes we can calculate the bounds of our query.

const globalMercator = require("global-mercator");
const ngeohash = require("ngeohash");

const [ minLon, minLat, maxLon, maxLat ] = globalMercator.googleToBBox(xyz);

const topLeft = ngeohash.encode(maxLat, minLon, precision);
const bottomRight = ngeohash.encode(minLat, maxLon, precision);

const [ , minLon2, maxLat2, ] = ngeohash.decode_bbox(topLeft);
const [ minLat2, , , maxLon2 ] = ngeohash.decode_bbox(bottomRight);
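
Because geohash cells form a regular longitude/latitude grid, the same outward snapping can also be worked out arithmetically. The sketch below is a dependency-free illustration, not the code used in the package. `geohashCellSize` and `snapToGeohashGrid` are hypothetical helper names, and the sketch assumes the standard geohash layout in which each character carries 5 bits that alternate between longitude and latitude, starting with longitude.

```javascript
// Hypothetical helpers for illustration; not part of ngeohash or global-mercator.

// Size (in degrees) and number of geohash cells at a given precision.
// Each character carries 5 bits, alternating lon/lat and starting with lon.
function geohashCellSize(precision) {
    const bits = 5 * precision;
    const columns = Math.pow(2, Math.ceil(bits / 2));
    const rows = Math.pow(2, Math.floor(bits / 2));
    return { width: 360 / columns, height: 180 / rows, columns, rows };
}

// Expand [minLon, minLat, maxLon, maxLat] outward to geohash cell boundaries.
function snapToGeohashGrid([ minLon, minLat, maxLon, maxLat ], precision) {
    const { width, height, columns, rows } = geohashCellSize(precision);
    // Index of the cell containing a coordinate, clamped to the valid range
    const column = lon => Math.min(columns - 1, Math.max(0, Math.floor((lon + 180) / width)));
    const row = lat => Math.min(rows - 1, Math.max(0, Math.floor((lat + 90) / height)));
    return [
        column(minLon) * width - 180,
        row(minLat) * height - 90,
        (column(maxLon) + 1) * width - 180,
        (row(maxLat) + 1) * height - 90
    ];
}

// At precision 1 the grid is 8 x 4 cells of 45 x 45 degrees each
console.log(snapToGeohashGrid([ 10, 10, 20, 20 ], 1)); // [ 0, 0, 45, 45 ]
```

The four returned values play the same role as minLon2, minLat2, maxLon2 and maxLat2 above.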

This envelope can be used as a filter in the Elasticsearch query:

{
    "geo_bounding_box": {
        "location": {
            "top_left": [ minLon2, maxLat2 ],
            "bottom_right": [ maxLon2, minLat2 ]
        }
    }
}
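
To show how the filter fits together with the aggregation, here is a minimal sketch of a complete search body. The helper name `buildTileQuery` and the aggregation name `grid` are placeholders of my choosing; `location` is the geo_point field from the filter above.

```javascript
// Hypothetical helper: builds the search body for one tile.
function buildTileQuery(bounds, precision) {
    const { minLon2, minLat2, maxLon2, maxLat2 } = bounds;
    return {
        size: 0, // only the aggregation buckets are needed, not the hits
        query: {
            bool: {
                filter: {
                    geo_bounding_box: {
                        location: {
                            top_left: [ minLon2, maxLat2 ],     // [ lon, lat ]
                            bottom_right: [ maxLon2, minLat2 ]  // [ lon, lat ]
                        }
                    }
                }
            }
        },
        aggs: {
            grid: {
                geohash_grid: {
                    field: "location",
                    precision: precision
                }
            }
        }
    };
}

const body = buildTileQuery({ minLon2: 0, minLat2: 0, maxLon2: 45, maxLat2: 45 }, 3);
console.log(JSON.stringify(body.aggs));
```

With this body, the buckets would be found under aggregations.grid.buckets in the response, each with a key (the geohash) and a doc_count.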

Encoding geohash buckets into a vector tile

Vector tiles are encoded as Google protocol buffers (PBF), a compact binary format for structured data serialization. In this case I'm using the pbf library from Mapbox to generate protocol buffers. First we need to compile a JavaScript module from the vector_tile.proto file, which defines the protocol buffer message. This file is part of the MVT spec.

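const fs = require("fs");
const path = require("path");
const Pbf = require("pbf");
const Compile = require("pbf/compile");
// Assumption: the schema parser is protocol-buffers-schema, as used in the pbf README
const schema = require("protocol-buffers-schema");

// Parse the .proto definition from the MVT spec and compile it into read/write functions
const proto = fs.readFileSync(path.join(__dirname, "vector_tile.proto"));
const proto_schema = schema.parse(proto);
const mvt = Compile(proto_schema);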

The tricky part of creating the object to write to the protocol buffer is encoding the geometries. In vector tiles, geometries are referenced in a pixel-based coordinate system with the origin in the upper left corner. Geometries are encoded as a sequence of 32-bit integers, with each integer being either a command or a parameter. Command integers consist of a command ID (MoveTo = 1, LineTo = 2, or ClosePath = 7) in the least significant 3 bits, and a command count in the remaining 29 bits. Parameters are zigzag encoded, so that small negative deltas also map to small unsigned integers.

function commandInteger(id, count) {
    return (id & 0x7) | (count << 3);
}

function parameterInteger(value) {
    return (value << 1) ^ (value >> 31);
}
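
A few worked values make the bit layout easier to check. The two functions are repeated here so that the snippet runs on its own; `decodeParameter` is an extra helper added only for this illustration.

```javascript
// Command IDs as defined in the MVT spec
const MOVETO = 1;
const LINETO = 2;
const CLOSEPATH = 7;

function commandInteger(id, count) {
    return (id & 0x7) | (count << 3);
}

function parameterInteger(value) {
    return (value << 1) ^ (value >> 31);
}

// Inverse of the zigzag encoding (illustration only)
function decodeParameter(encoded) {
    return (encoded >>> 1) ^ -(encoded & 1);
}

console.log(commandInteger(MOVETO, 1));    // 9
console.log(commandInteger(LINETO, 3));    // 26
console.log(commandInteger(CLOSEPATH, 1)); // 15

// Zigzag interleaves negative and positive values
console.log([ 0, -1, 1, -2, 2 ].map(parameterInteger)); // [ 0, 1, 2, 3, 4 ]
console.log(decodeParameter(parameterInteger(-25)));    // -25
```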

So for each bucket we first need to convert the geohash key to geographic coordinates, and then to pixel coordinates:

buckets.forEach(bucket => {
    const hash = bucket.key;
    const [ minLat, minLon, maxLat, maxLon ] = ngeohash.decode_bbox(hash);
    const feature = new PolygonFeature();
    feature.addPolygon([
        self.coordsToPixels(minLon, maxLat),
        self.coordsToPixels(maxLon, maxLat),
        self.coordsToPixels(maxLon, minLat),
        self.coordsToPixels(minLon, minLat)
    ]);
    layer.addFeature(feature);
});

For the conversion from geographic coordinates to pixel coordinates we can make use of the handy pointToTileFraction() function from the global-mercator package.

const tileSize = 4096;

coordsToPixels(lon, lat) {
    const fraction = globalMercator.pointToTileFraction([lon, lat], xyz[2], false);
    const x = Math.floor((fraction[0] - xyz[0]) * tileSize);
    const y = Math.floor((fraction[1] - xyz[1]) * tileSize);
    return [ x, y ];
};
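
For readers who want to see what happens inside that conversion, here is a dependency-free sketch using the usual spherical Web Mercator formulas. This is my own stand-in for illustration, not the global-mercator implementation; `lonLatToTileFraction` is a hypothetical name, and the tile is passed in explicitly as [x, y, zoom].

```javascript
const tileSize = 4096;

// Fractional XYZ tile position of a lon/lat pair (spherical Web Mercator).
// Hypothetical helper standing in for pointToTileFraction().
function lonLatToTileFraction(lon, lat, zoom) {
    const scale = Math.pow(2, zoom);
    const sin = Math.sin(lat * Math.PI / 180);
    const x = scale * (lon / 360 + 0.5);
    const y = scale * (0.5 - 0.25 * Math.log((1 + sin) / (1 - sin)) / Math.PI);
    return [ x, y ];
}

// Pixel coordinates relative to the upper left corner of the given tile
function coordsToPixels(lon, lat, [ tileX, tileY, zoom ]) {
    const [ fx, fy ] = lonLatToTileFraction(lon, lat, zoom);
    return [
        Math.floor((fx - tileX) * tileSize),
        Math.floor((fy - tileY) * tileSize)
    ];
}

console.log(coordsToPixels(0, 0, [ 0, 0, 0 ]));    // [ 2048, 2048 ]
console.log(coordsToPixels(0, 0, [ 1, 1, 1 ]));    // [ 0, 0 ]
console.log(coordsToPixels(-180, 0, [ 1, 1, 1 ])); // [ -4096, 0 ]
```

The last call shows that a point outside the tile simply yields coordinates outside the 0 to 4096 range, which is what happens for geohashes that extend beyond the tile boundaries.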

We now have pixel-based coordinates, which we can encode as a sequence of integers. Our cursor starts at [0, 0], and from that position we MoveTo the first corner coordinate. For each subsequent corner coordinate we calculate a delta relative to the previous position and add it as a parameter of a LineTo command. After the last corner we append a ClosePath command, which closes the ring back to its first point without moving the cursor. The cursor's final position, the last corner, is where we will start from when the next polygon is added.

this.geometry = [];
this.cursor = [0, 0];

addPolygon(coords) {
    const [ first, ...rest ] = coords;
    // MoveTo the first corner, relative to the current cursor position
    let delta = [ first[0] - this.cursor[0], first[1] - this.cursor[1] ];
    this.geometry = this.geometry.concat(commandInteger(MOVETO, 1));
    this.geometry = this.geometry.concat(encodePoint(delta));
    // The cursor tracks the absolute position, not the delta
    this.cursor = first;
    // LineTo each remaining corner, again encoded as deltas
    this.geometry = this.geometry.concat(commandInteger(LINETO, rest.length));
    rest.forEach(xy => {
        delta = [ xy[0] - this.cursor[0], xy[1] - this.cursor[1] ];
        this.cursor = xy;
        this.geometry = this.geometry.concat(encodePoint(delta));
    });
    // ClosePath closes the ring and leaves the cursor where it is
    this.geometry = this.geometry.concat(commandInteger(CLOSEPATH, 1));
}
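
As a sanity check, the same steps can be run as a standalone function against the example polygon from the MVT spec, which should encode to 9 6 12 18 10 12 24 44 15. This is a minimal sketch: `encodeRing` is a hypothetical function, and `encodePoint` here is my assumed stand-in for the helper used above (it zigzag encodes both components of a delta).

```javascript
const MOVETO = 1, LINETO = 2, CLOSEPATH = 7;

const commandInteger = (id, count) => (id & 0x7) | (count << 3);
const parameterInteger = value => (value << 1) ^ (value >> 31);

// Assumed behavior of encodePoint(): zigzag encode dx and dy
const encodePoint = ([ dx, dy ]) => [ parameterInteger(dx), parameterInteger(dy) ];

// Encodes one ring starting from `cursor`; returns the integers and the new cursor
function encodeRing(coords, cursor = [ 0, 0 ]) {
    const geometry = [];
    coords.forEach((xy, i) => {
        if (i === 0) geometry.push(commandInteger(MOVETO, 1));
        if (i === 1) geometry.push(commandInteger(LINETO, coords.length - 1));
        geometry.push(...encodePoint([ xy[0] - cursor[0], xy[1] - cursor[1] ]));
        cursor = xy;
    });
    // ClosePath has no parameters and does not move the cursor
    geometry.push(commandInteger(CLOSEPATH, 1));
    return { geometry, cursor };
}

const ring = encodeRing([ [ 3, 6 ], [ 8, 12 ], [ 20, 34 ] ]);
console.log(ring.geometry.join(" ")); // 9 6 12 18 10 12 24 44 15
console.log(ring.cursor);             // [ 20, 34 ]
```

A second ring would be encoded with encodeRing(nextCoords, ring.cursor), so that its MoveTo is relative to [20, 34] and not to the origin.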

Once our tile object is ready, we can write it to a protocol buffer using the mvt object we compiled earlier:

generateBuffer() {
    const pbf = new Pbf();
    mvt.Tile.write(this, pbf);
    return Buffer.from(pbf.finish());
}
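
The object handed to the writer has to mirror the messages in vector_tile.proto. The following is a rough sketch of that shape as a plain object, assuming the compiled writer reads properties named after the proto fields and accepts the numeric enum value for the geometry type. `createTile` and the layer name "geohash" are placeholders.

```javascript
// Geometry type values from the GeomType enum in vector_tile.proto
const GeomType = { UNKNOWN: 0, POINT: 1, LINESTRING: 2, POLYGON: 3 };

// Hypothetical helper: wraps features in a single-layer tile object
function createTile(layerName, features) {
    return {
        layers: [ {
            version: 2,     // version of the MVT spec the layer follows
            name: layerName,
            extent: 4096,   // width and height of the tile in pixel units
            keys: [],       // attribute keys shared by all features
            values: [],     // attribute values shared by all features
            features: features
        } ]
    };
}

const tile = createTile("geohash", [ {
    type: GeomType.POLYGON,
    tags: [],
    // A 10 x 10 pixel square: MoveTo(0,0), LineTo(+10,0), (0,+10), (-10,0), ClosePath
    geometry: [ 9, 0, 0, 26, 20, 0, 0, 20, 19, 0, 15 ]
} ]);

console.log(tile.layers[0].features.length); // 1
```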

Encoding attributes

I have not touched on feature attributes yet and will not go into detail here. In short, attributes are encoded as a series of tags on each feature. Tags come in pairs of indexes: the first index of each pair points to a key and the second to a value. The keys and values themselves are stored once at the layer level.
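
To make the indexing concrete, here is a minimal sketch of how bucket properties could be turned into tags. `toValue` and `encodeAttributes` are hypothetical helpers, not the package's actual code. The value wrapper property names follow the Value message in vector_tile.proto, and only strings, booleans and numbers are handled.

```javascript
// Wraps a JavaScript value in an object shaped like the Value message
function toValue(value) {
    if (typeof value === "string") return { string_value: value };
    if (typeof value === "boolean") return { bool_value: value };
    if (Number.isInteger(value)) {
        return value < 0 ? { sint_value: value } : { uint_value: value };
    }
    return { double_value: value };
}

// Adds the properties to the layer's keys and values (reusing existing
// entries) and returns the feature's tags as [ keyIndex, valueIndex, ... ]
function encodeAttributes(properties, layer) {
    const tags = [];
    Object.entries(properties).forEach(([ key, value ]) => {
        let keyIndex = layer.keys.indexOf(key);
        if (keyIndex === -1) keyIndex = layer.keys.push(key) - 1;

        const encoded = toValue(value);
        const signature = JSON.stringify(encoded);
        let valueIndex = layer.values.findIndex(v => JSON.stringify(v) === signature);
        if (valueIndex === -1) valueIndex = layer.values.push(encoded) - 1;

        tags.push(keyIndex, valueIndex);
    });
    return tags;
}

const layer = { keys: [], values: [] };
console.log(encodeAttributes({ key: "u15", doc_count: 42 }, layer)); // [ 0, 0, 1, 1 ]
console.log(encodeAttributes({ key: "u17", doc_count: 42 }, layer)); // [ 0, 2, 1, 1 ]
console.log(layer.keys);                                             // [ 'key', 'doc_count' ]
```

Because both buckets share the doc_count of 42, the second feature reuses value index 1 and does not store the number again.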

Visualizing vector tiles

The repository includes example code featuring an Elasticsearch Docker container, a simple tile server, and a visualization using Mapbox GL JS. The tile server generates vector tiles on the fly based on the geohash precision included in the request URL. Features can be encoded as polygons but also as points. In the example below the results from a coarser aggregation are encoded as polygons, and combined with the results from a more granular aggregation encoded as points.

[Figure: geohash map]