
Works on my machine… and also everywhere else: local build and testing environments as code

Whether it’s a build environment for compiling and testing the application, or a test environment that brings in other parts of the stack for integration or end-to-end testing, developers waste an extraordinary amount of time setting up and maintaining environments.

Thanks to Docker and a tool called batect, I’ve successfully eliminated this waste on a number of teams.

How much wasted time are we talking about?

In a recent survey of my colleagues, I found that they can spend anywhere from two hours to two weeks getting up and running when joining a new project, and up to a person-week each month maintaining test environments. And then there’s the cost of dealing with flaky tests, with that flakiness frequently caused by inconsistent environments. Even Google is not immune to these issues - in 2016, they found that 84% of new test failures were due to flakiness.

What is the solution?

There are a number of existing solutions out there, but each of them has drawbacks:

  • Documenting the desired setup requires discipline when starting a project and when making changes - not only do you have to remember to update the documentation, but your colleagues also need to update their environment each time a change is made. And manually running through all of the setup steps can be very time-consuming.
     
  • Scripting the desired setup (with either a shell script or something like Ansible) doesn’t guarantee that no further changes are made after the script is run (or that the script is even run in the first place). And other configuration not controlled by the script can interfere with the desired state - each machine is still a snowflake.
     
  • Using virtual machines (usually with something like Vagrant) is very resource intensive, and the performance impact of running tasks inside a VM can be significant. It’s often not practical to run these VMs on CI.
     
  • Shared test and integration environments make testing and debugging incredibly painful. Not only is the cycle time from code change to running in a test environment far too long, but you have to be careful not to step on your colleagues’ toes as they use the environment as well (and vice versa).
     
  • Vanilla Docker and Docker Compose don’t provide a natural way for a developer to express these environments, and they place a lot of the configuration and setup effort on developers rather than taking care of things for them. Their command-line interfaces also aren’t optimised for this scenario.

So, how can we do better?

We can borrow some ideas from our production environments, namely Docker and the principle of infrastructure as code, and combine them with a tool purpose-built for this scenario: batect.

Applying the concept of infrastructure as code to our local development and testing environments gives us a number of benefits:

  • It is easier to distribute and version our configuration alongside the application as it is expressed as executable code, not documentation.
     
  • The cost of changing an environment is significantly reduced - the team no longer needs to spend time communicating about a change and applying it to all of their environments, as it’s done for them automatically.

Choosing Docker gives us a way to run our environments that is performant, isolated, already familiar to many developers, and free of the overhead of virtual machines. We can also take advantage of the many existing images, and either use them as-is or as a base for a customised setup.

And batect makes it easy to define these environments and orchestrate the setup steps they require, in a fast and repeatable way that works both on developer machines and on CI. It is designed with this use case in mind, with a command-line interface and configuration file syntax tailored to it.

How do these things come together?

Let’s run through two examples, starting with a simple one: building and unit testing our application. (If you prefer to watch things in action, take a look at these screencasts of building a Java application and running the unit tests for a Ruby app.)

[Architecture diagram: build and unit test environment]

First, we need to pull or build the image that defines our build environment (if we haven’t already), which will include things like our build tools and runtimes. (For a JVM-based application, this might be the JVM; for a Ruby application, the Ruby runtime and Bundler.)

Next, we need to create a container, mounting our code into it so that it is visible to our tools. Then we run the build and unit tests inside that container, and then destroy the container once everything is finished.
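To make those steps concrete, here is a minimal Python sketch that assembles the equivalent `docker run` invocation. It only builds the argument list, it is not how batect itself works, and the function name, the image name and the `/code` mount point are all assumptions made up for illustration.

```python
import shlex


def build_docker_run_command(image, project_dir, command, mount_point="/code"):
    """Return the argument list for a one-off build container.

    All names here are hypothetical; only the docker flags are real.
    """
    return [
        "docker", "run",
        "--rm",                                       # destroy the container once the command exits
        "--volume", f"{project_dir}:{mount_point}",   # mount our code so the tools can see it
        "--workdir", mount_point,                     # run the command from the mounted code
        image,                                        # the image that defines the build environment
        *command,                                     # the build and unit test command
    ]


# Example with a hypothetical image and build command.
argv = build_docker_run_command(
    "my-build-env:latest", "/home/me/app", ["./gradlew", "build"]
)
print(shlex.join(argv))
```

Passing the resulting list to `subprocess.run` would start a fresh container for every build, which is what keeps the environment clean from one run to the next.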

You might be wondering why we bother to create a fresh container every time, rather than just reusing the container from the last build and test run. The reason is that by creating a new container each time, we are guaranteed to get a known, clean and consistent environment every time with no possibility of configuration drift. The overhead of doing so is minimal, especially when we consider the benefits, and it also means we pick up any configuration changes straight away - for example, if we’ve switched to a different image or changed the Dockerfile that defines our environment, we’ll start using it from the next build. This means your colleagues might not even notice that you’ve switched from one version of a component to another, or even switched from NPM to Yarn.

Let’s take a look at another example: running end-to-end tests on our application. (Again, if you prefer watching things, take a look at this screencast of batect in action running the journey tests for the Java sample application.)

Let’s assume we’ve built a shopping cart service that relies on a pricing service maintained by another team and also has its own datastore to keep track of customers’ shopping carts. If we want to test this end-to-end, we need four components: the shopping cart service, a real or fake implementation of the pricing service, a real or fake datastore, and some kind of test driver. (If we were building a UI, this test driver might be headless Chrome and Selenium, or if we were building an API, this test driver might be something as simple as JUnit or RSpec.)

Orchestrating tests like these involves a number of steps. We need to:

  • Build the shopping cart service (#1 in the diagram below)
     
  • Build or pull the images for each of those four components (#2)
     
  • Start each component in the correct order - we don’t want to start the shopping cart service until both the pricing service and datastore are up and ready to receive requests
     
  • Run the tests (#3)
     
  • Finally: clean everything up.
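The ordering and clean-up steps above can be sketched in a few lines of Python. This is an illustration of the idea rather than batect’s implementation: the component names, the `start` and `stop` callbacks, and the assumption that `start` blocks until a component is ready (or, for the test driver, until the tests finish) are all hypothetical.

```python
from graphlib import TopologicalSorter

# Each component maps to the components that must be ready before it starts.
# The names are hypothetical and mirror the shopping cart example.
DEPENDENCIES = {
    "pricing-service": set(),
    "datastore": set(),
    "shopping-cart-service": {"pricing-service", "datastore"},
    "test-driver": {"shopping-cart-service"},
}


def startup_order(dependencies):
    """Return the components ordered so that dependencies come first."""
    return list(TopologicalSorter(dependencies).static_order())


def run_end_to_end(dependencies, start, stop):
    """Start everything in dependency order, then always clean up.

    `start` and `stop` are caller-supplied callables taking a component name.
    """
    started = []
    try:
        for name in startup_order(dependencies):
            start(name)
            started.append(name)
    finally:
        # Clean up in reverse order, even if a start or the tests failed.
        for name in reversed(started):
            stop(name)


print(startup_order(DEPENDENCIES))
```

The `try`/`finally` is the important part: clean-up runs whether the tests pass, fail or never get started, so no containers are left behind to interfere with the next run.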

[Architecture diagram: end-to-end test environment, with steps #1 to #4 marked]

The combination of this approach and Docker gives us two other great benefits. Firstly, we can use Docker’s networking features to run everything in an isolated network (#4 in the diagram above), removing the possibility of port conflicts and giving us an easy way to address each component on the network. Secondly, we can quickly and easily spin up a local test environment and run integration and end-to-end tests on developer machines and on CI as well.
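As a rough sketch of the isolated network idea, the Python below assembles the Docker commands for creating a network and attaching each component to it under a name the others can use as a hostname. The network name, aliases and image names are hypothetical, and the sketch only builds the argument lists rather than running them.

```python
def network_commands(network, components):
    """Return docker commands that run each component on an isolated network.

    `components` maps a network alias (hypothetical names) to an image.
    """
    commands = [["docker", "network", "create", network]]
    for alias, image in components.items():
        commands.append([
            "docker", "run",
            "--detach",                  # run in the background
            "--network", network,        # attach to the isolated network
            "--network-alias", alias,    # other containers can reach it by this name
            image,
        ])
    return commands


# Hypothetical images for the fake pricing service and the datastore.
for command in network_commands(
    "cart-tests",
    {
        "pricing-service": "example/pricing-fake:latest",
        "datastore": "example/datastore:latest",
    },
):
    print(" ".join(command))
```

Note that no ports are published to the host, so two developers (or two CI jobs on the same agent) can run the same environment side by side without port conflicts, and the shopping cart service can simply be configured to talk to `pricing-service` and `datastore` by name.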

This all sounds great, and it’s a technique I’ve used on a number of different projects across a number of different clients - whether it’s a Golang backend or an Android app, setting up your development and testing environments in this way frees up developers to focus on the most important thing: creating value for users.

But - you knew there was a ‘but’ coming - the existing tooling out there doesn’t support this use case particularly well, as we discussed earlier. So, I built a tool called batect that is designed with this use case and its users (developers) in mind.

If you want to start using batect in your own projects right away, take a look at one of the sample projects, which you can use to learn more about the tool or as a starting point for your own setup.