
Ephemeral test environments: Sometimes, you just need to kill your darlings

What happens when you need to scale your tests? A shared testing environment is usually the first port of call: it gives teams a single place to push code and creates a shared context for everyone. However, as things get more complex (more developers, additional components and a greater number of microservices), a single test environment is no longer fit for purpose. This is where ephemeral testing environments come in.

 

What is an ephemeral test environment?

 

An ephemeral test environment is an isolated, short-lived infrastructure stack where code can be tested quickly and easily. Such environments are isolated in the sense that they’re completely independent of every other part of your infrastructure (so they have no external dependencies), and short-lived in that they don’t need to live any longer than a discrete task requires: they can simply be spun up when needed and destroyed afterwards. They simplify the process and automation associated with your existing orchestration, so it’s easy to both create and destroy them.
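
The create-and-destroy lifecycle described above can be sketched in Python as a context manager. This is only an illustration, not something taken from the article: the `ephemeral_environment` name and the `create` and `destroy` callbacks are hypothetical placeholders for whatever your own orchestration exposes.

```python
from contextlib import contextmanager
from typing import Callable, Iterator
import uuid


@contextmanager
def ephemeral_environment(
    create: Callable[[str], None],
    destroy: Callable[[str], None],
) -> Iterator[str]:
    """Create an isolated environment, hand back its name, and always destroy it."""
    # A unique name keeps this stack independent from every other one.
    name = f"env-{uuid.uuid4().hex[:8]}"
    create(name)
    try:
        yield name
    finally:
        # Teardown runs even if the tests inside the block raise.
        destroy(name)


if __name__ == "__main__":
    # A set stands in for the real infrastructure (an assumption for the demo).
    live: set[str] = set()
    with ephemeral_environment(live.add, live.discard) as env:
        print(f"testing in {env}, live stacks: {len(live)}")
    print(f"after the block, live stacks: {len(live)}")
```

Tying teardown to the end of the block means the environment cannot outlive the task it was created for, which is the property the definition above relies on.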

Why are ephemeral environments needed?

 

“As you continue to expand and things get older and you have more developers coming on,” explains Thoughtworks’ Effy Elden, “you start to run into tensions with having a single test environment that everyone deploys to before production because you deploy something there to test it.” 

 

In these instances, Elden goes on to say, the test environment becomes a bottleneck.

 

“Each team needs the environment to be exactly like production except for their component, which becomes impractical.”

 

Occasionally, organizations try to overcome this bottleneck by creating multiple test environments, giving each team its own dedicated environment where, Elden says, “they can deploy what they're working on and test it and share context without breaking for other teams.”

 

However, this creates a whole new set of challenges. “You then have the problem of ‘well, I've deployed the new version to my team's test environment and I've deployed it to production. But the other teams have an old version — do I deploy my new thing to their version? And do I remember to do that? Do I care?’” 

 

This means “either each team is responsible for keeping all parts of their environment updated with the actual status of prod or each team releasing something new has to ensure that it funnels back down to all these other test environments.”

 

While it’s possible to use automation to keep services up to date, this comes with a risk that when a team does something new — like, say, building a microservice in Kubernetes — the infrastructure automation you put in place can no longer keep up. 

It requires a change to how we think about things... there's a trade-off at some point in the amount of pain that your teams are going through as a result of having highly shared environments that everyone's trying to keep alive instead of self-service.
Effy Elden
Security Specialist, Thoughtworks

Ephemeral test environments in practice

 

Elden emphasizes that when we talk about ephemeral testing environments, we really do mean ephemeral. Describing a client engagement in which ephemeral environments were introduced, they stress that “we were really trying to encourage short lived… you spin it up for this particular Jira card, or to test this bug or do this change. It's not ‘create an ephemeral environment for my team’, because the longer they live, the more they will become out of sync and the more they can drift.”
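
One way to keep environments genuinely short-lived is to give each one a time-to-live and tear down anything that outlasts it. The Python sketch below is a minimal illustration under assumptions of our own: the `Environment` record, the 24-hour default and the ticket identifiers are all hypothetical, not details of the client platform Elden describes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Environment:
    name: str
    ticket: str  # the single card, bug or change this environment exists for
    created_at: datetime
    ttl: timedelta = timedelta(hours=24)  # assumed default, not from the article

    def expired(self, now: datetime) -> bool:
        return now >= self.created_at + self.ttl


def environments_to_destroy(envs: list[Environment], now: datetime) -> list[Environment]:
    """Return every environment that has outlived its time-to-live."""
    return [env for env in envs if env.expired(now)]


if __name__ == "__main__":
    now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
    envs = [
        Environment("env-a", "CARD-101", now - timedelta(hours=30)),
        Environment("env-b", "CARD-102", now - timedelta(hours=2)),
    ]
    for env in environments_to_destroy(envs, now):
        print(f"destroy {env.name} (opened for {env.ticket})")
```

Recording the ticket alongside the environment keeps the scope tied to one piece of work rather than to a team, and a scheduled job could call `environments_to_destroy` to limit drift.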

 

In other words, it’s important to completely rethink your approach. “It requires a change to how we think about things. And I think there's a trade-off at some point in the amount of pain that your teams are going through as a result of having highly shared environments that everyone's trying to keep alive instead of self-service.”

 

The importance of automation and self-service

 

A further critical element of ephemeral environments is having the appropriate level of automation in place so teams can spin up testing environments quickly and easily. Elden gives the example of an automation platform Thoughtworks developed for a client built specifically to help ease this burden.

 

The manual process, Elden explains, typically took a few weeks and would usually take place every six months or so. The result would usually be something “very slow, full of errors,” they say. With the automation platform, this was massively reduced: spinning up a new environment went from weeks to just minutes.

 

“You could click a button and have one ready to go 37 minutes later. It really helped transform the release process from being booking your release slot in and having to go to test and hope it works there,” to something where “you can have confidence that when you click release it's going to deploy to test and tests will complete successfully.”

 

Elden also explains how this particular automation platform was built. “It was a very tactical approach of building some light automation to take all the manual bits and tie them together,” they say. Specifically, “a stack of lambda functions would spin up a Kubernetes cluster and deploy things to a shared Kubernetes cluster. It'd also spin up a CloudFormation stack that had an instance that had all the legacy monolith stuff on it. An orchestrator built into CI/CD which handled all those different things in parallel.”
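
The idea of an orchestrator that handles several provisioning steps in parallel can be sketched in a few lines of Python. This is not the client's implementation, which used Lambda functions and CloudFormation; the two step functions here are hypothetical stand-ins that only return a status string, and a thread pool is our own substitute for the real orchestration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


# Hypothetical stand-ins for the real provisioning steps.
def provision_cluster_workloads(env: str) -> str:
    return f"{env}: cluster workloads ready"


def provision_legacy_stack(env: str) -> str:
    return f"{env}: legacy stack ready"


def provision_environment(env: str, steps: list[Callable[[str], str]]) -> list[str]:
    """Run every provisioning step concurrently and collect results in step order."""
    with ThreadPoolExecutor(max_workers=max(1, len(steps))) as pool:
        futures = [pool.submit(step, env) for step in steps]
        # result() re-raises any exception from a step, so one failure fails the run.
        return [future.result() for future in futures]


if __name__ == "__main__":
    steps = [provision_cluster_workloads, provision_legacy_stack]
    for line in provision_environment("env-1234", steps):
        print(line)
```

Because the steps are independent, the total wait is roughly that of the slowest step rather than the sum of all of them.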

When are ephemeral test environments not appropriate?

 

Although ephemeral test environments can be very effective in certain circumstances, this doesn’t mean they’re always the right option. “If you're building something shiny and new that isn’t heavily coupled and has a very clean domain boundary you might have no need,” Elden says. 

 

They’re more appropriate for “anything that touches on brown fields or touches on tech debt.” If you’re trying to split out a monolith, for example, “you're going to need to have a pretty heavy cloud test environment as opposed to being able to run things locally.”

 

Ephemeral environments and continuous previews

 

Ephemeral testing environments share many similarities with another concept that’s starting to be more widely discussed in the industry: continuous previews. 

 

Elden says the two concepts have a “strong overlap,” but emphasizes that it’s worth distinguishing between them. Continuous previews are, they say, “having a shared spot for upcoming functionality and features to be visible and tested.” This doesn’t just mean automated testing but also human testing: continuous previews, when built on ephemeral testing environments, allow QAs (and others) to try out new things very quickly.

 

Elden explains how they work: “I make a pull request with a new feature on a branch and you automatically get a version of that deployed online at a separate URL. I can then go and actually see the new app, and anyone else can click in and view it and it's running in a shared context. And then, when I actually preview the changes for myself… you get that preview deployment.”
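
To make the "separate URL" part concrete, here is a minimal Python sketch of how a pipeline might derive a unique preview address from a pull request. The naming scheme and the `preview.example.com` domain are assumptions for illustration only; real preview tooling will have its own conventions.

```python
import re


def preview_url(pr_number: int, branch: str, base_domain: str = "preview.example.com") -> str:
    """Build a unique, DNS-safe address for a pull request's preview deployment."""
    # Lowercase the branch and collapse anything that is not a letter or digit into "-".
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    # DNS labels are limited to 63 characters, so truncate the combined label.
    label = f"pr-{pr_number}-{slug}"[:63].rstrip("-")
    return f"https://{label}.{base_domain}"


if __name__ == "__main__":
    print(preview_url(42, "feature/New-Checkout"))
    # prints https://pr-42-feature-new-checkout.preview.example.com
```

Including the pull request number keeps addresses unique even when two branches reduce to the same slug.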

 

There are significant benefits to such a technique. “It makes it really easy for someone else to be able to look at the work you're doing without having to pull down your code and run it locally.” This could help team members who aren’t particularly technical to try out new features before they go live. In turn, that means more efficiency in the development and testing process and, ultimately, higher quality software.

Simplifying testing when tackling complex development challenges

 

Ephemeral testing environments, and related techniques like continuous previews, can provide significant productivity and quality benefits. True, they do require a shift in mindset: the idea of creating environments that are, for all intents and purposes, highly disposable might seem strange. However, from a software perspective, such a way of working can clearly deliver a long-term impact. And in an industry that is today obsessed with viewing AI as the only route to improvement, leveraging other techniques and approaches might well be just as critical as adopting what’s receiving the most attention and hype.

Disclaimer: The statements and opinions expressed in this article are those of the author(s) and do not necessarily reflect the positions of Thoughtworks.
