100% Coverage: too much and not enough

Code coverage percentage is a controversial subject: some will tell you that you get diminishing returns after 70-80% and it’s not worth bothering, others will say that 100% is not enough. The underlying problem is often the approach taken to testing:

100% coverage as a goal is counterproductive

  • Code coverage tools tell you how much of your code has been tested, not how well that code has been tested.
  • Code coverage only applies to the code you have written, and doesn’t highlight the features you’ve forgotten to implement.
  • Achieving high code coverage with low quality is relatively simple, but only results in the code being run, not being tested. It’s possible to hit 100% code coverage with 0 assertions!

If features drive your testing, you are more likely to find missing code paths, and to write assertions for the features. If coverage drives your testing, you are likely to write low quality, brittle tests that satisfy the coverage tool but lack assertions and fail to catch omissions in your logic. This is a problem because:

  • Your customer doesn’t care about your coverage: hitting 100% doesn’t help them if your code is still full of bugs. Tests should focus on surfacing bugs, not satisfying tool output.
  • You lose insight into the areas that need to be more thoroughly tested: when 100% of your code is covered by low quality tests, your code coverage tool becomes useless.

100% unit test coverage != 100% test coverage

When looking at the test pyramid, it’s easy to assume that if you’re aiming for 100% test coverage, that means you should be hitting 100% coverage with your unit tests, ~60% with component tests, ~30% with integration tests etc. This misses the point of the test pyramid: each layer performs a different function to the layer below, and should fill in the gaps that the layer below wasn’t able to cover.

Unit testing glue-code with tonnes of mocks is possible, but the effort to value ratio is pretty bad: you’ll end up duplicating the same tests at a higher level anyway, and then have more tests to fix when you change that code later.

Not all code is equal

Some code is much more important than other code. Your focus should be on testing the important code thoroughly, not on wasting effort setting up mocks for the less important code.

Some projects have more risk than others. Are you writing banking software for millions of paying users, or a side project for fun? Scale your testing time appropriately.


Use your product requirements to drive testing. Use the risk factors of your product to determine how thorough your tests need to be. Use your code coverage tool as a tool, to tell you where best to spend your limited amount of time, not as a goal to give you a false sense of security.

Controllers are the gatekeepers of your API

Controllers safely separate your API from the outside world; they parse and sanitise data that comes in, and they filter the data you return.
For this reason, you shouldn’t do any input validation past your controllers: if a required value doesn’t exist by that point then something has gone wrong, and you should fix the code instead of adding hundreds of validations.

If your endpoint requires an account ID with your request and the user doesn’t pass it in, that’s a user error and you should handle it by validating the input and rejecting the request.
If your internal code requires an account ID and gets called without it, however, that’s an error in your code, and your code should fail, rather than specifically checking for it and handling it in every function where it may occur.

For the same reason, controllers should abstract the request information from the rest of the code, and should be the last point in your code that you ever see the request object. Internal code beyond the controllers should have no concept of a request, only the values passed in.
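
A rough sketch of that split, assuming an Express-style `req`/`res` pair. The names `accountController`, `getAccountSummary` and `fakeResponse` are made up for illustration, and the alphanumeric rule is just an example validation: the controller validates and unwraps the request, and the internal function only ever sees plain values.

```javascript
// Internal code: knows nothing about requests, only plain values.
// It does not re-validate accountId; a missing value here is a bug
// in the calling code and is allowed to fail loudly.
function getAccountSummary(accountId) {
  return { accountId, label: `Account ${accountId.toUpperCase()}` };
}

// Controller: the only place that touches req/res. It validates user
// input, rejects bad requests, and passes plain values inwards.
function accountController(req, res) {
  const accountId = req.params && req.params.accountId;
  if (typeof accountId !== 'string' || !/^[a-z0-9]+$/i.test(accountId)) {
    return res.status(400).json({ error: 'accountId is required and must be alphanumeric' });
  }
  return res.status(200).json(getAccountSummary(accountId));
}

// Minimal stand-in for an Express-style response object, for demonstration only.
function fakeResponse() {
  return {
    statusCode: null,
    body: null,
    status(code) { this.statusCode = code; return this; },
    json(payload) { this.body = payload; return this; },
  };
}

const accepted = accountController({ params: { accountId: 'abc123' } }, fakeResponse());
const rejected = accountController({ params: {} }, fakeResponse());
console.log(accepted.statusCode, rejected.statusCode);
```

Calling `getAccountSummary(undefined)` directly throws a TypeError, which is the intended behaviour: that call can only come from your own code, so it should be fixed at the call site.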

Express parameter callbacks

Handy little feature I didn’t know about in Express:
Using app.param(name, callback) you can bind callbacks directly to route parameters, allowing you to move common preprocessing/validation out of each function that uses the parameter, and into a single function (without having to call it explicitly each time).
You can pass in an array of names, using next() to jump to the next parameter, and the callback is only called once per request regardless of how many route handlers the parameter appears in.
The callbacks are local to the router they are defined on, so you can handle things (or not) differently based on the context.
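
Something like the following, as a sketch. `app.param(name, callback)` with a `(req, res, next, value)` callback is the Express API; `registerAccountParam`, `loadAccount`, `req.account` and the `fakeApp` stand-in are hypothetical and only exist so the example runs without Express installed.

```javascript
// Registers a parameter callback on any Express app or router.
// loadAccount is an assumed lookup function supplied by the caller.
function registerAccountParam(app, loadAccount) {
  app.param('accountId', (req, res, next, accountId) => {
    const account = loadAccount(accountId);
    if (!account) {
      // Passing an error to next() skips the route handlers.
      return next(new Error(`Unknown account: ${accountId}`));
    }
    req.account = account; // available to every route using :accountId
    return next();
  });
}

// Tiny stand-in for an app that just records the registered callbacks.
const fakeApp = {
  paramCallbacks: {},
  param(name, callback) { this.paramCallbacks[name] = callback; },
};

const accounts = { 42: { id: '42', owner: 'alice' } };
registerAccountParam(fakeApp, (id) => accounts[id]);

// Simulate Express invoking the callback for a request to /accounts/42.
const req = {};
let nextArg;
fakeApp.paramCallbacks.accountId(req, {}, (err) => { nextArg = err; }, '42');
console.log(req.account, nextArg);
```

With a real app you would call `registerAccountParam(app, loadAccount)` once, and every route containing `:accountId` would then find `req.account` already populated.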

Developing a node app in docker


We want to rapidly develop our node app inside a docker container, being able to install modules, make code changes, and see instant results. The problem is that while the official node image supports a handy onbuild feature which will grab the package.json and install everything we need, this also means having to rebuild the image every time a dependency changes.


Use this image, which places the node_modules folder one level higher, meaning it isn’t overwritten by the docker mount, and you can still mount and install your own node modules on the fly.


An easier but less obvious way to solve the problem is to specify your /usr/src/app/node_modules folder as a volume with no mapping to the host. This preserves the container copy and allows you to keep your local copy.


volumes:
  - /usr/src/app/node_modules

When you’re deploying the image and need to copy the entire app in, you can use the .dockerignore file to prevent your host node_modules from being loaded into the build context, improving build time.

NGINX Timer Resolution

Using ngx.time or ngx.now is encouraged over using Lua’s built-in functions because they use the cached time rather than performing a syscall, but how often is the cache updated?

After a bit of a dig, it turns out there’s no absolute answer, because the cache is actually updated when a kernel event fires.

It can be set manually using nginx’s timer_resolution, but this is not recommended because it either causes too many syscalls if set too low, or unnecessary time lag if set too high.


Openresty Redis ZUNIONSTORE gotcha


ZUNIONSTORE merges multiple sorted sets into one, and stores the result under the key specified. Since the number of keys can vary and there are more parameters after the keys, it requires the numkeys parameter to be specified before the keys.

If using the default aggregate function (SUM) this is fine, as you can simply store the sorted set names to be merged in a table, use the size of the table as numkeys, and unpack() the table to pass all of the key names to redis.

The problem occurs when you want to change the aggregate function, adding more parameters after unpack(), which changes its behaviour:

‘Lua always adjusts the number of results from a function to the circumstances of the call. When we call a function as a statement, Lua discards all of its results. When we use a call as an expression, Lua keeps only the first result. We get all results only when the call is the last (or the only) expression in a list of expressions.’

So once anything follows unpack() in the argument list, it will only pass the first sorted set key name.


The workaround is to append the aggregate option to the list of sorted set key names, so that unpack() is the last argument again, and subtract the two extra entries from the number of keys passed:

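-- setNames is assumed to already hold the sorted set key names to merge
-- AGGREGATE and MAX are appended so that unpack() stays the last argument
table.insert(setNames, 'AGGREGATE')
table.insert(setNames, 'MAX')
-- numkeys must exclude the two option entries appended above
local ok, err = red:zunionstore('destinationKey', #setNames - 2, unpack(setNames))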
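
For comparison, here is the flat argument list the workaround ends up sending, sketched in JavaScript (where a spread can sit anywhere in the list, so the problem doesn’t arise). `zunionstoreArgs` is a hypothetical helper, not part of any Redis client; it only builds the arguments and doesn’t talk to Redis.

```javascript
// Builds the flat argument list for:
//   ZUNIONSTORE destination numkeys key [key ...]
//     [WEIGHTS weight [weight ...]] [AGGREGATE SUM|MIN|MAX]
// numkeys counts only the sorted set keys, not the trailing options.
// WEIGHTS is left out of this sketch.
function zunionstoreArgs(destination, setNames, aggregate) {
  const args = [destination, setNames.length, ...setNames];
  if (aggregate) {
    args.push('AGGREGATE', aggregate);
  }
  return args;
}

const args = zunionstoreArgs('destinationKey', ['set:a', 'set:b', 'set:c'], 'MAX');
console.log(args.join(' '));
// prints "destinationKey 3 set:a set:b set:c AGGREGATE MAX"
```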

Remote Redis: Spiped vs Stunnel

Redis is fast, there’s no doubt about that. Unfortunately for us, connecting to Redis has an overhead, and the method you connect with can have a huge impact.

Connecting locally

Our options for connecting locally are Unix sockets or TCP sockets, so let’s start by comparing them directly:

Socket vs TCP:

As we can see, the higher overhead of TCP connections limits the throughput. By pipelining multiple requests through a single connection, we can amortise the per-request overhead and get performance approaching that of Unix sockets:

Socket vs tcp with pipeline of 1000:

Connecting over the network:

When we connect over the network, we have no choice but to use TCP sockets, and since redis has no network security, we need to secure our connections.

Our options for secure connections are stunnel and spiped; let’s test them both out.

Spiped vs stunnel:

As we can see, spiped seems to be hitting some kind of bottleneck, limiting the numbers regardless of the tests performed. The problem here appears to be that spiped pads messages:

[spiped] can significantly increase bandwidth usage for interactive sessions: It sends data in packets of 1024 bytes, and pads smaller messages up to this length, so a 1 byte write could be expanded to 1024 bytes if it cannot be coalesced with adjacent bytes.

So when we’re doing a large number of small requests with redis-benchmark, each small request is padded out to make it much larger, maxing out our bandwidth.
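
A back-of-the-envelope sketch of that padding, based only on the 1024-byte figure quoted above. It assumes no coalescing and ignores any per-packet header or authentication overhead, and the payload sizes are hypothetical values chosen to show the trend, not measurements.

```javascript
const PACKET_SIZE = 1024; // packet size taken from the spiped quote above

// Bytes sent for a write of `bytes` when it cannot be coalesced with
// adjacent writes. Simplified model: padding only, no headers.
function paddedSize(bytes) {
  return Math.ceil(bytes / PACKET_SIZE) * PACKET_SIZE;
}

// Hypothetical payload sizes, from a single byte up to more than one packet.
for (const bytes of [1, 40, 1024, 1500]) {
  const padded = paddedSize(bytes);
  console.log(`${bytes} B -> ${padded} B (${(padded / bytes).toFixed(1)}x)`);
}
```

The smaller the request, the worse the ratio, which is why a benchmark made up of tiny commands suffers the most.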

Like with Unix sockets vs TCP, this improves when we use pipelining, as less bandwidth is wasted on padding:

Spiped vs stunnel, pipeline 1000:

There’s still a gap, but it’s much narrower now.


So what’s the solution? If you can, have your application on the same server as Redis, so that you can use Unix sockets for performance.
If you have to run over the network, bear in mind the overhead of spiped when sending large numbers of small requests.
Pipelining can have a huge impact, performing better over the network than non-pipelined requests do locally. The issue is that not every application can neatly bundle all requests into pipelined chunks, so your results may vary depending on use case.

All tests were performed between two Kimsufi ks-5 dedicated servers, with a 100 Mbit/s link.

Redis notes

I decided to tidy up the redis docs and I wrote some notes for myself on the way:

Redis as pure cache:

Setting maxmemory, and setting maxmemory-policy to ‘allkeys-lru’, will make redis evict keys automatically once the memory limit is reached, starting with the least recently used, without any need for manually setting EXPIRE. Perfect when used just for caching.

Lexicographical sorted sets:

Elements stored with the same score in a sorted set are ordered lexicographically and can be retrieved by lexicographical range, which is powerful for string searching. If you need to normalise a string while retaining the original, you can store them together, e.g. ‘banana:Banana’ to ignore case while searching but preserve the case of the result.
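
The ‘normalised:original’ trick can be sketched without a Redis server. Everything below is an in-memory stand-in: `lexEntry` and `searchByPrefix` are hypothetical helpers, a sorted array plays the role of the sorted set, and the originals are assumed not to contain ‘:’.

```javascript
// Stores "normalised:original" so that searches ignore case but
// results keep their original casing.
function lexEntry(original) {
  return `${original.toLowerCase()}:${original}`;
}

// Stand-in for a lexicographic range query over members that share
// one score. Uses plain JS string comparison, which matches byte
// order for ASCII input only.
function searchByPrefix(entries, prefix) {
  const needle = prefix.toLowerCase();
  return [...entries]
    .sort()
    .filter((entry) => entry.startsWith(needle))
    .map((entry) => entry.slice(entry.indexOf(':') + 1));
}

const entries = ['Banana', 'bandana', 'Apple', 'BANJO'].map(lexEntry);
console.log(searchByPrefix(entries, 'BAN'));
// prints [ 'Banana', 'bandana', 'BANJO' ]
```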

Distributed Locks

Getting distributed locks safely is more complicated than it first appears, with a few edge cases that may cause locks to not be released, etc. Redlock has been written as a general solution, and has a large number of implementations in different languages.


redis-cli:

  • Only returns extra info for tty, raw for all others
  • Can be set to repeat commands using -r <count> -i <delay>
  • --stat produces continuous stats
  • Can scan for big keys with --bigkeys (can be used in production)
  • Supports pub/sub directly
  • Can echo all redis commands using MONITOR
  • Can show redis latency and intrinsic latency
  • Can grab RDB from server
  • Can simulate LRU load with 80/20 access rates


Replication:

  • Slaves can chain (slave -> slave replication, doesn’t replicate local slave writes)
  • Master can use diskless replication, sending the RDB directly to the slave from memory.
  • Master can be set to reject writes unless a certain number of slaves are available.


Sentinel:

  • Clients can subscribe to sentinel pub/sub for events.
  • Sentinels never forget seen sentinels
  • Slaves can be given promotion priority to avoid or prefer them becoming masters.


Transactions:

  • DISCARD cancels the current queue
  • WATCH will cancel EXEC if the watched key has changed since the WATCH command was issued.


  • SUNION can take a long time for large/many sets.
  • The Lua debugger can be used to step through lua scripts line by line.
  • Total memory used can exceed maxmemory briefly, could be by a large amount but only if setting a large key.
  • If you are storing a lot of small values under separate keys, split each key apart and use the first part as a hash key and the rest as the field -> more memory efficient. (‘test1234’ -> ‘test1’ ‘234’ <value>)
  • Publishing ignores database numbers
  • Subscribing supports pattern matching
  • Clients may receive duplicated messages if they have multiple subscriptions
  • Keyspace notifications can report all commands affecting a key, all keys receiving lpush, and all keys expiring in db 0.
  • Expired events only fire when the key is actually removed by redis, not at the exact time it was due to expire.
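
The key-splitting note above (‘test1234’ -> ‘test1’ ‘234’) can be sketched like this. `splitKey`, `setValue` and `getValue` are hypothetical helpers, the three-character field length is an assumption taken from that one example, and nested Maps stand in for Redis hashes so it runs on its own.

```javascript
// Splits a flat key into [hashKey, field]. fieldLength (how many
// trailing characters become the field) is an assumed tuning knob.
function splitKey(key, fieldLength = 3) {
  if (key.length <= fieldLength) {
    throw new RangeError(`key "${key}" is too short to split`);
  }
  const cut = key.length - fieldLength;
  return [key.slice(0, cut), key.slice(cut)];
}

// In-memory stand-in for HSET/HGET: Map of hash key -> Map of field -> value.
const store = new Map();

function setValue(key, value) {
  const [hashKey, field] = splitKey(key);
  if (!store.has(hashKey)) store.set(hashKey, new Map());
  store.get(hashKey).set(field, value);
}

function getValue(key) {
  const [hashKey, field] = splitKey(key);
  const hash = store.get(hashKey);
  return hash ? hash.get(field) : undefined;
}

setValue('test1234', 'a');
setValue('test1999', 'b');
console.log(splitKey('test1234'), store.size, getValue('test1999'));
```

Both values land in the single hash ‘test1’, so a thousand such keys become one top-level key with a thousand small fields.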

Redis Cluster vs Redis Replication

While researching Redis Cluster I found a large number of tutorials on the subject that confused Replication and Cluster, with people setting up ‘replication’ using cluster but no slaves, or building a ‘cluster’ only consisting of master-slave databases with no cluster config.

So to clear things up:


Replication involves a master server which serves reads and writes, and duplicates all data to one or more slave servers (which serve reads but not writes). Slaves can be used to replace a master in case of failure, spread read request load, or to perform backups of the database to reduce load on the master.


Clusters are used when you have more data than RAM in a single machine: the data is automatically split (based on the key) across multiple databases, increasing the amount of data you can store. Clients requesting a key from any cluster node will be redirected to the node holding the key, and are expected to learn the locations of keys to reduce the number of redirects.
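
The key-based split works on hash slots: Redis Cluster takes CRC16 of the key modulo 16384, and each master owns a range of slots. Below is a simplified sketch that ignores {hash tag} handling. `nodeForSlot` is a hypothetical even split for illustration; real clusters assign slot ranges explicitly.

```javascript
const HASH_SLOTS = 16384;

// CRC16/XMODEM (polynomial 0x1021, initial value 0), computed bit by bit.
function crc16(bytes) {
  let crc = 0;
  for (const byte of bytes) {
    crc ^= byte << 8;
    for (let bit = 0; bit < 8; bit++) {
      crc = (crc & 0x8000) ? ((crc << 1) ^ 0x1021) : (crc << 1);
      crc &= 0xffff; // keep the register at 16 bits
    }
  }
  return crc;
}

// Simplified: hashes the whole key and ignores {hash tag} handling.
function keySlot(key) {
  return crc16(Buffer.from(key, 'utf8')) % HASH_SLOTS;
}

// Hypothetical even split of the slot space across nodeCount masters.
function nodeForSlot(slot, nodeCount) {
  return Math.floor(slot / Math.ceil(HASH_SLOTS / nodeCount));
}

const slot = keySlot('foo');
console.log(slot, nodeForSlot(slot, 3));
// prints "12182 2"
```

A cluster-aware client does the same calculation locally and caches which node owns which slots, which is how it avoids most redirects.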

Replication + Cluster

Redis Cluster supports replication by adding slaves to existing nodes; if a master becomes unreachable then its slave will be promoted to master.


Last but not least, Redis Sentinel can be used to manage replicated servers (not clustered, see below). Clients connect to a Sentinel and request a master or slave to communicate with; the sentinels handle health checks of the masters/slaves, and will automatically promote a slave if a master is unreachable. You need to have at least 3 sentinels running so that they can agree on reachability of nodes, and to ensure the sentinels aren’t a single point of failure.

Cluster handles its own promotion and does not need Sentinel in front of it.

Installing Luasec part 2: Failed loading manifest


Tried to install LuaSec on a new machine recently and got the following error:

luarocks install luasec
Warning: Failed searching manifest: Failed loading manifest: Failed fetching manifest for http://luarocks.org/repositories/rocks - Error fetching file: Failed downloading http://luarocks.org/repositories/rocks/manifest - URL redirected to unsupported protocol - install luasec to get HTTPS support.

So all I need to do to install LuaSec is install LuaSec first, brilliant!


One solution is buried here https://github.com/luarocks/luarocks-site/issues/6

and is to specify the server directly:

luarocks install --only-server=http://rocks.moonscript.org luasec