median blog

Avoid testing just to say you test

I recently spent time looking at the tests in a mature Node code base. There are around three thousand across unit and integration tests. Because of the way these tests were written, they could only be run sequentially, leading to a run time of twenty minutes.

I spent time migrating the tests so they could run in parallel, bringing the execution time down to under a minute. There are still a handful of tests to go, but along the way I found some interesting scenarios I thought I would share.

The code base is private, so I will talk about scenarios in a general way. I hope the lack of specifics does not hinder the message.

Tests that only work for an exact set of objects

By far the most common scenario was this one. In good form, the tests set up a series of objects, performed an action and made some assertions about the results. One of the major barriers to these tests passing when run in parallel was that they relied very heavily on the exact set of objects set up before the test.

To give a concrete example, several tests around filtering data passed in series but failed in parallel when more data was in the database. In these cases the filtering did not in fact work as intended but happened to work on the small set of data provided.
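A rough sketch of the difference, using made-up names and an in-memory array as a stand-in for the database (none of this comes from the private code base). The idea is to assert on the rows the test created itself, rather than on the exact contents of the whole table:

```typescript
// Hypothetical record shape, for illustration only.
interface Customer {
  id: string;
  status: "active" | "deleted";
}

// Hypothetical function under test: keep only rows with the given status.
function filterByStatus(rows: Customer[], status: Customer["status"]): Customer[] {
  return rows.filter((row) => row.status === status);
}

// In-memory stand-in for a table shared by tests running in parallel.
const table: Customer[] = [
  // Rows another test may have inserted at the same time.
  { id: "other-1", status: "active" },
  { id: "other-2", status: "deleted" },
  // Rows this test created.
  { id: "mine-active", status: "active" },
  { id: "mine-deleted", status: "deleted" },
];

const result = filterByStatus(table, "active");
const returnedIds = new Set(result.map((row) => row.id));

// Brittle: asserting result.length === 1 only holds when the table contains
// nothing but this test's rows.

// More robust: every returned row matches the filter, the row this test
// expects is present, and the row it expects to be excluded is absent.
const ownRowsFilteredCorrectly =
  result.every((row) => row.status === "active") &&
  returnedIds.has("mine-active") &&
  !returnedIds.has("mine-deleted");
```

Checking that every returned row matches the filter is also what tends to expose a filter that only appeared to work on a small data set.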

Tests that reuse unique IDs

Many of the tests that failed in parallel failed because the setup stage was shared between tests. While this is not inherently a bad idea, the setup used hard-coded values for things like unique IDs. This caused errors when setting up two tests in parallel due to uniqueness constraints in the database.

Often these unique IDs were chosen to be somewhat human readable: 'deleted-customer', 'new-customer' and so on. Moving those labels into variable names and using random values like UUIDs for the IDs themselves kept the readability while allowing the tests to act independently.
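A minimal sketch of that change, assuming a hypothetical makeCustomer fixture helper (the record shape is invented for illustration). randomUUID comes from Node's built-in crypto module:

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical record shape, for illustration only.
interface Customer {
  id: string;
  label: string;
}

// Hypothetical fixture helper: the label keeps the ID readable in logs,
// while the UUID suffix makes it unique for every call.
function makeCustomer(label: string): Customer {
  return { id: `${label}-${randomUUID()}`, label };
}

// The readable name now lives in the variable, not in a hard-coded ID.
const deletedCustomer = makeCustomer("deleted-customer");
const newCustomer = makeCustomer("new-customer");
```

Two tests that both call makeCustomer("new-customer") get different IDs, so they no longer collide on a uniqueness constraint.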

Shallow tests

I am not sure if there is a better term for this one, but the scenario here is tests that ultimately just test functions in another library. In this case, they tested that calls to the ORM behave in the way the ORM describes.

I can see some limited value in these tests, since it is possible to accidentally use a find_one instead of a find_many, but in many cases they feel like a waste.