Express parameter callbacks

Handy little feature I didn’t know about in Express:
Using app.param([name], callback) you can bind a callback directly to a route parameter, letting you move common preprocessing/validation out of each handler that uses the parameter and into a single function, without having to call it explicitly each time.
You can pass in an array of names (call next() inside the callback to move on to the next one), and a param callback only runs once per request-response cycle, regardless of how many matched routes use the parameter.
The callbacks are local to the router they are defined on, so you can handle things (or not) differently based on the context.
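For instance, a minimal sketch (the userId parameter and the loadUser lookup are hypothetical):

const express = require('express');
const app = express();

// Hypothetical lookup, standing in for your real data access
function loadUser(id, callback) {
  const users = { '1': { id: '1', name: 'Ann' } };
  callback(null, users[id]);
}

// Runs once per request whenever :userId appears in a matched route
app.param('userId', (req, res, next, id) => {
  loadUser(id, (err, user) => {
    if (err) return next(err);          // hand off to error handling
    if (!user) return res.sendStatus(404);
    req.user = user;                    // available to downstream handlers
    next();
  });
});

// Handlers can now assume req.user is populated and valid
app.get('/users/:userId', (req, res) => res.json(req.user));

app.listen(3000);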
Neat!

Developing a node app in docker

Problem

We want to rapidly develop our node app inside a docker container: installing modules, making code changes, and seeing instant results. The problem is that while the official node image has a handy onbuild variant which grabs your package.json and installs everything you need, it also means rebuilding the image every time a dependency changes.

Solution

Use this image, which places the node_modules folder one level above the app directory. Node's module resolution still finds it there, it isn't overwritten by the docker mount, and you can still mount and install your own node modules on the fly.
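The trick looks roughly like this (a sketch of the approach rather than that image's actual Dockerfile; tags and paths are assumptions):

FROM node:6

# Install dependencies one level above the app directory...
WORKDIR /usr/src
COPY package.json .
RUN npm install

# ...so mounting your source over /usr/src/app doesn't hide node_modules.
# Node's module resolution walks up parent directories and still finds it.
WORKDIR /usr/src/app
CMD ["npm", "start"]

Run it with your code mounted, e.g. docker run -v $(pwd):/usr/src/app <image>, and edits on the host show up instantly in the container.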

NGINX Timer Resolution

Using ngx.time or ngx.now is encouraged over Lua's built-in time functions because they return a cached time rather than performing a syscall. But how often is the cache updated?

After a bit of a dig, it turns out there’s no absolute answer, because the cache is actually updated when a kernel event fires:

It can be set manually using nginx's timer_resolution (http://nginx.org/en/docs/ngx_core_module.html#timer_resolution), but this is not recommended, because it either causes too many syscalls if set too low, or unnecessary time lag if set too high.

https://groups.google.com/d/msg/openresty-en/OdNzfh8rlzk/EQHF-hlhlQ0J
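If you ever need a fresher timestamp at a specific point, OpenResty also lets you refresh the cache explicitly. A minimal sketch:

-- cheap: reads the cached time, no syscall
local cached = ngx.now()

-- forcibly update the time cache (one syscall), then read it again
ngx.update_time()
local fresh = ngx.now()

ngx.log(ngx.INFO, "cache drift: ", fresh - cached, "s")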

Openresty Redis ZUNIONSTORE gotcha

Problem

ZUNIONSTORE merges multiple sorted sets into one and stores the result under the key specified. Since the number of keys can vary and there are more parameters after the keys, it requires the numkeys parameter to be specified before the keys.

If you're using the default aggregate function (SUM) this is fine: you can simply store the names of the sorted sets to be merged in a table, use the size of the table as numkeys, and unpack() the table to pass all of the key names to redis.

The problem occurs when you want to change the aggregate function, adding more parameters after unpack(), which changes its behaviour:

http://www.lua.org/pil/5.1.html
‘Lua always adjusts the number of results from a function to the circumstances of the call. When we call a function as a statement, Lua discards all of its results. When we use a call as an expression, Lua keeps only the first result. We get all results only when the call is the last (or the only) expression in a list of expressions.’

So unpack will only pass the first sorted set key name.
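A quick demonstration of the truncation (Lua 5.1, where unpack is a global):

local names = { 'set:a', 'set:b', 'set:c' }

print(unpack(names))                     -- set:a   set:b   set:c
print(unpack(names), 'AGGREGATE', 'MAX') -- set:a   AGGREGATE   MAX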

Solution

The workaround is to append the aggregate arguments to the list of sorted set key names, and deduct them from the number of keys passed:

table.insert(setNames, 'AGGREGATE')
table.insert(setNames, 'MAX')
local ok, err = red:zunionstore('destinationKey', #setNames - 2, unpack(setNames))

Remote Redis: Spiped vs Stunnel

Redis is fast, there’s no doubt about that. Unfortunately for us, connecting to Redis has an overhead, and the method you connect with can have a huge impact.

Connecting locally

Our options for connecting locally are Unix sockets or TCP sockets, so let’s start by comparing them directly:

Socket vs TCP:

As we can see, the higher overhead of TCP connections limits the throughput. By pipelining multiple requests through single connections, we can reduce the TCP setup overhead and get performance approaching that of sockets:

Socket vs TCP, pipeline of 1000:
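For reference, the comparison boils down to invocations along these lines (socket path and host are assumptions):

# local TCP vs unix socket
redis-benchmark -q -h 127.0.0.1 -p 6379
redis-benchmark -q -s /var/run/redis/redis.sock

# the same again, pipelining 1000 commands per request
redis-benchmark -q -h 127.0.0.1 -p 6379 -P 1000
redis-benchmark -q -s /var/run/redis/redis.sock -P 1000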

Connecting over the network

When we connect over the network, we have no choice but to use TCP sockets, and since redis has no network security of its own, we need to secure our connections ourselves.

Our options for secure connections are stunnel and spiped, so let's test them both out.
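For context, a spiped tunnel is a pair of daemons like this (hosts and key path are assumptions):

# on the redis server: decrypt incoming traffic and forward it to redis
spiped -d -s '[0.0.0.0]:6380' -t '[127.0.0.1]:6379' -k /etc/spiped/redis.key

# on the client: encrypt local connections and send them over the network
spiped -e -s '[127.0.0.1]:6379' -t '[redis-server]:6380' -k /etc/spiped/redis.key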

Spiped vs stunnel:

As we can see, spiped seems to be hitting some kind of bottleneck, capping throughput regardless of the test performed. The problem here appears to be that spiped pads messages:

[spiped] can significantly increase bandwidth usage for interactive sessions: It sends data in packets of 1024 bytes, and pads smaller messages up to this length, so a 1 byte write could be expanded to 1024 bytes if it cannot be coalesced with adjacent bytes.

So when we’re doing a large number of small requests with redis-benchmark, each small request is padded out to make it much larger, maxing out our bandwidth:

As with unix sockets vs TCP, this improves when we use pipelining, as less bandwidth is wasted on padding:

Spiped vs stunnel, pipeline 1000:

There’s still a gap, but it’s much narrower now.

Conclusion

So what’s the solution? If you can, have your application on the same server as Redis, so that you can use Unix sockets for performance.
If you have to run over the network, bear in mind the overhead of spiped when sending large numbers of small requests.
Pipelining can have a huge impact, performing better over the network than non-pipelined requests do locally. The catch is that not every application can neatly bundle its requests into pipelined chunks, so your results may vary depending on use case.

All tests were performed between two Kimsufi KS-5 dedicated servers, with a 100Mbit link.

Redis notes

I decided to tidy up the redis docs, and wrote some notes for myself along the way:

Redis as pure cache:

Setting maxmemory and maxmemory-policy to allkeys-lru will make redis evict keys automatically as memory fills up, least recently used first, without any need for manually setting EXPIRE. Perfect when redis is used purely for caching.
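In redis.conf that's just the following (the limit itself is an assumption):

maxmemory 100mb
maxmemory-policy allkeys-lru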

Lexicographical sorted sets:

Elements stored with the same score in a sorted set can be retrieved lexicographically, which is powerful for string searching. If you need to normalise a string while retaining the original, you can store them together, e.g. 'banana:Banana' to ignore case while searching but preserve the case of the result.
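A quick sketch in redis-cli (the key name is an assumption); note that all members share the same score:

ZADD names 0 "apple:Apple" 0 "banana:Banana" 0 "cherry:Cherry"
ZRANGEBYLEX names [ban (bao   # everything from 'ban' (inclusive) to 'bao' (exclusive): "banana:Banana"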

Distributed Locks

Getting distributed locks safely is more complicated than it first appears, with a few edge cases that may cause locks to not be released, etc. Redlock has been written as a general solution, and has a large number of implementations in different languages.

Redis-cli

  • Only returns extra info when stdout is a tty; raw output for everything else
  • Can be set to repeat commands using -r <count> -i <delay>
  • --stat produces continuous stats
  • Can scan for big keys with --bigkeys (safe to use in production)
  • Supports pub/sub directly
  • Can echo all redis commands using MONITOR
  • Can show redis latency and intrinsic latency
  • Can grab an RDB file from the server
  • Can simulate LRU load with 80/20 access rates
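A few of the above as invocations (a sketch):

redis-cli --stat                  # rolling keyspace/memory/client stats
redis-cli --bigkeys               # sample for the biggest keys, production-safe
redis-cli --latency               # round-trip latency to the server
redis-cli --intrinsic-latency 10  # latency of the host itself, for 10 seconds
redis-cli --rdb /tmp/dump.rdb     # pull an RDB snapshot from the server
redis-cli --lru-test 1000000      # simulate an 80/20 cache workload
redis-cli -r 5 -i 1 dbsize        # repeat DBSIZE 5 times, 1 second apart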

Replication

  • Slaves can chain (slave -> slave replication; a sub-slave replicates the master's data, not the intermediate slave's local writes)
  • Master can use diskless replication, sending the RDB to slaves directly from memory
  • Master can be set to reject writes unless a certain number of slaves are available
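The last two map to redis.conf directives along these lines (values are assumptions):

repl-diskless-sync yes      # master streams the RDB from memory
min-slaves-to-write 2       # reject writes with fewer than 2 connected slaves...
min-slaves-max-lag 10       # ...or if they lag by more than 10 seconds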

Sentinel

  • Clients can subscribe to sentinel pub/sub for events
  • Sentinels never forget seen sentinels
  • Slaves can be given a promotion priority, to make them more or less likely to become master

Transactions

  • DISCARD cancels the current queue
  • WATCH will cancel EXEC if the watched key has changed since the WATCH command was issued.
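A minimal optimistic-locking sketch (the key name is an assumption):

WATCH balance        # if balance changes after this, the EXEC is cancelled
MULTI
DECRBY balance 10
EXEC                 # returns nil instead of applying if balance changed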

Other

  • SUNION can take a long time for large/many sets.
  • The Lua debugger can be used to step through lua scripts line by line.
  • Total memory used can exceed maxmemory briefly, could be by a large amount but only if setting a large key.
  • If you are storing a lot of values under individual keys, split each key apart and use the first part as a hash key instead -> more memory efficient. ('test1234' -> 'test1' '234' <value>; sketched after this list)
  • Publishing ignores database numbers
  • Subscribing supports pattern matching
  • Clients may receive duplicated messages if they have multiple subscriptions
  • Keyspace notifications can report all commands affecting a key, all keys receiving lpush, and all keys expiring in db 0.
  • Expired-key events only fire when redis actually expires the key, not at the exact time it should expire
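The key-splitting trick from the list above, concretely; the saving comes from redis storing small hashes in a compact encoding:

SET test1234 somevalue    # plain: one top-level key per value

HSET test1 234 somevalue  # split: bucket 'test1', field '234'
HGET test1 234            # reads address bucket + field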

Redis Cluster vs Redis Replication

While researching Redis Cluster I found a large number of tutorials on the subject that confused Replication and Cluster, with people setting up ‘replication’ using cluster but no slaves, or building a ‘cluster’ only consisting of master-slave databases with no cluster config.

So to clear things up:

Replication

Replication involves a master server which serves reads and writes, and duplicates all data to one or more slave servers (which serve reads but not writes). Slaves can be used to replace the master in case of failure, to spread read load, or to perform backups of the database without loading the master.

Cluster

Clusters are used when you have more data than RAM in a single machine: the data is automatically split (based on the key) across multiple databases, increasing the amount of data you can store. Clients requesting a key from any cluster node will be redirected to the node holding the key, and are expected to learn the locations of keys to reduce the number of redirects.

Replication + Cluster

Redis Cluster supports replication by adding slaves to existing nodes; if a master becomes unreachable, its slave will be promoted to master.

Sentinel

Last but not least, Redis Sentinel can be used to manage replicated servers (not clustered, see below). Clients connect to a Sentinel and ask for a master or slave to communicate with; the sentinels handle health checks of the masters/slaves, and will automatically promote a slave if a master is unreachable. You need at least 3 sentinels running so that they can agree on the reachability of nodes, and to ensure the sentinels themselves aren't a single point of failure.
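A minimal sentinel.conf for a replicated pair might look like this (names, addresses, and timings are assumptions):

# 'mymaster' is considered down once 2 sentinels agree it is unreachable
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000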

Cluster handles its own promotion and does not need Sentinel in front of it.

Installing Luasec part 2: Failed loading manifest

Problem

Tried to install LuaSec on a new machine recently and got the following error:

luarocks install luasec
Warning: Failed searching manifest: Failed loading manifest: Failed fetching manifest for http://luarocks.org/repositories/rocks - Error fetching file: Failed downloading http://luarocks.org/repositories/rocks/manifest - URL redirected to unsupported protocol - install luasec to get HTTPS support.

So all I need to do to install LuaSec is install LuaSec first. Brilliant!

Solution

One solution, buried in https://github.com/luarocks/luarocks-site/issues/6, is to specify the server directly:

luarocks install --only-server=http://rocks.moonscript.org luasec

Stuck installing debuginfo in Ubuntu

Problem:

To run systemtap you need debuginfo, but installing the linux image source fails with:

apt-get source linux-image-4.4.0-53-generic-dbgsym
Reading package lists… Done
Picking ‘linux’ as source package instead of ‘linux-image-4.4.0-53-generic-dbgsym’

It then fails to find ‘linux’.

Solution:

The solution is to uncomment the ‘deb-src’ lines in /etc/apt/sources.list (example below), run apt-get update again, and then:

sudo apt-get build-dep --no-install-recommends linux-image-$(uname -r)
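For reference, the deb-src line to uncomment looks like this (mirror and release are assumptions):

deb-src http://archive.ubuntu.com/ubuntu xenial main restricted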

error: ‘struct module’ has no member named ‘symtab’

Problem:

Running systemtap fails with: error: ‘struct module’ has no member named ‘symtab’

Solution:

This is caused by systemtap 2.9 not supporting the kernels shipped with Ubuntu 16.04+ (whose struct module no longer has a symtab member), and can be solved by upgrading to systemtap 3.0+ by compiling from source.