There is a good chance that your organization has automated tests. However, you might still face the following problems:
- Bugs keep coming back
- Developers are not confident making changes to your system
- Developers are afraid to refactor the system
- Teams are horribly slow to add more features
If you do, then keep reading! There is a good chance that your investment in test automation has not delivered its value. There are many factors that can stop the organization from reaping its benefits. In this article, I'll be writing about some of these that I have encountered on various projects.
Consumers of Test Results
Symptoms to look out for:
The development team is not looking at the results.
Stop any work on the automation suite. Figure out why nobody consumes it and then fix those issues first. There is nothing worse than building the wrong features.
Creation of Test Data
How data should be used:
- Tests need to be independent and repeatable. So they should create and delete their own data.
- It is easy to set up objects for any of your testing needs, even if the data structure is complex.
- All team members are able to create their own data when they need it.
Symptoms to look out for:
- Your tests rely on a specific state in the database or other services, but have no control over it.
- Tests can’t be run in other environments without setting up data manually.
- Teams are used to tests failing and assume they are due to broken test scripts rather than a broken application.
Invest in making data easily available for anyone to use. It should support various combinations and maturity levels. Data is often required to drive development, manual and automated testing, showcasing new features, etc. It is an investment for the whole team.
Maintaining your solution is the difficult part. Frequent data dumps from production can be used to get relevant test data. This approach is not so useful if new features are created where there is no data in production yet. This is the time to come up with a reusable solution. Ideally, you advocate for building an application that makes it easier to build this solution.
Wotif has invested in a data creation solution and presented its benefits at a local meetup. They built it with the concept of data-as-a-service in mind: a UI and an API abstracted away a fast-moving code base. It could reuse existing data in scenarios where that was possible, and it could create data on the fly, at the cost of slower response times. You might say that this was possible in their special circumstances but not in your world. Let me give you more context on their complexity: their business is finding and selling hotel rooms. These rooms need to be set up in different hotels, in various sizes, with existing bookings at different times, and with various prices and specials. Any developer knows how quickly that data model becomes complex. They were able to do it, which means you can too!
It is easy to imagine that most of your company could benefit from such a solution, especially when the ecosystem is complex.
Metrics for Test Automation
What it should measure:
Metrics measure valuable outcomes. They are used like a temperature gauge - information on which you can base your actions.
Symptoms to look out for:
There are many symptoms of using incorrect metrics. It is important to understand whether a metric will help you make the right decisions. One example is test coverage of manual test cases - this is common in organizations that rely mostly on manual testing practices. It is easy to measure, but it drives test automation towards a UI-heavy automation suite. It will provide great test coverage in the short term, but will heavily increase feedback time and maintenance cost in the long term.
Here are just a few examples of metrics that I find valuable:
Feedback time (how long does it take to get test results?):
This matters as it drives people's behavior. In my experience, if the test suite takes too long to run, then developers stop waiting for the results. They don’t run the tests as often, which means that they check in their code without getting feedback from the full test suite. There is no single duration to aim for. Instead, look for a trend over the period of the project. Watch out for sudden increases. They usually indicate that someone is forgetting to stub out a DB or adding UI tests without following test pyramid principles. In general, it is enough to care about this metric and to challenge any test that costs a lot of time.
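One way to watch for sudden increases is to compare each run against the median of the runs just before it. The sketch below assumes you can export suite durations in seconds from your CI tool. The function name, the `window` size and the `factor` are hypothetical tuning knobs, not recommended values.

```python
from statistics import median


def find_sudden_increases(durations_s, window=5, factor=1.5):
    """Return indexes of runs that are much slower than the recent median.

    durations_s: suite durations in seconds, oldest run first.
    window: how many previous runs form the baseline (assumed value).
    factor: how much slower than the baseline counts as "sudden" (assumed value).
    """
    flagged = []
    for i in range(window, len(durations_s)):
        baseline = median(durations_s[i - window:i])
        if durations_s[i] > baseline * factor:
            flagged.append(i)
    return flagged


# Example: the suite hovers around five minutes, then jumps to almost nine.
runs = [300, 310, 305, 295, 300, 302, 520, 530]
print(find_sudden_increases(runs))  # indexes of the runs worth challenging
```

A flagged run is only a prompt to look at what was added around that time, such as a new UI test or a missing stub.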
For a standard web application, it is reasonable to ask for decent code coverage. If I had to give you a number, I’d say somewhere around 80%. So once the full automation suite has been run, at least 80% of your lines of code have been executed. Very simple code (getters/setters) does not need to be tested if it does not contain any logic. So you should leave some room for developers to make their own decisions and not force them into useless tests. Hence you should accept less than 100% code coverage.
This metric is ideal if it is used like a temperature gauge. If there are components in your system with significantly lower test coverage than 80%, then that should be a trigger for further investigation. The actions from this metric should target the core problem. To do this, don’t ask your team to increase the code coverage. Try to understand what is not tested. Why is this not tested? Does it need to be tested? It’s usually harder to get more coverage on lower levels. Maybe the team likes stubbing a bit too much and forgets to add a few integration tests. On the other hand, a few UI tests can get you great coverage very quickly, which you don’t want to rely on either. So derive your actions from this analysis with your team. The metric should then react.
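Used as a gauge, the coverage number only needs to point at components worth a conversation. The sketch below assumes you already have line coverage per component as percentages. The component names and the `margin` that defines "significantly lower" are hypothetical.

```python
def components_to_investigate(coverage_by_component, target=80.0, margin=15.0):
    """List components whose coverage is well below the target, lowest first.

    target: the rough coverage goal from the text (around 80%).
    margin: how far below the target counts as "significantly lower" (assumed).
    """
    threshold = target - margin
    low = {
        name: pct
        for name, pct in coverage_by_component.items()
        if pct < threshold
    }
    return sorted(low.items(), key=lambda item: item[1])


# Hypothetical coverage report for four components.
report = {
    "billing": 85.0,
    "search": 52.5,
    "checkout": 78.0,
    "legacy-import": 31.0,
}
for name, pct in components_to_investigate(report):
    print(f"{name}: {pct}% - ask what is untested and whether it matters")
```

The output is a list of questions for the team, not a list of targets to hit.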
In terms of measurement, the test pyramid is used to compare the number of tests on one layer with the number of tests in the layers around it. The pyramid shape will form if you have more tests below the current layer. If you have different shapes forming, then that will help you target a certain type of test to drive towards the pyramid shape.
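The comparison between neighbouring layers can be sketched in a few lines, assuming you can count tests per layer. The layer names and counts here are hypothetical.

```python
def layers_breaking_pyramid(counts_bottom_up):
    """Return (upper, lower) layer pairs where the upper layer is not smaller.

    counts_bottom_up: list of (layer_name, test_count), lowest layer first.
    """
    offenders = []
    pairs = zip(counts_bottom_up, counts_bottom_up[1:])
    for (lower_name, lower_count), (upper_name, upper_count) in pairs:
        # A pyramid needs more tests on the layer below than on the one above.
        if upper_count >= lower_count:
            offenders.append((upper_name, lower_name))
    return offenders


# Hypothetical suite with more UI tests than integration tests.
suite = [("unit", 400), ("integration", 60), ("ui", 120)]
print(layers_breaking_pyramid(suite))
```

For the example suite the check reports the UI layer as outgrowing the integration layer, which suggests either adding integration tests or trimming UI tests.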
- There are many more metrics. Luckily, at the time of writing this article, Jon Jeffries has published a great article on the topic, if you are keen for more.
What they should allow you to do:
On average, each of the following tasks takes only a few minutes:
- Identifying the lines of code that caused a problem in any environment along the path to production
- Finding the code commit that caused the issue
- The person who made the commit usually remembers what might have caused the issue, so any fix is a quick task. A re-test validates the fix in a matter of minutes or hours.
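When changes are small and frequent, finding the offending commit can be a simple binary search over the commit history. This is the same idea that `git bisect` automates. The sketch below assumes the oldest commit in the list is known to pass and the newest is known to fail. The `passes` callback is a hypothetical stand-in for checking out a commit and running the tests.

```python
def first_bad_commit(commits, passes):
    """Find the first failing commit by binary search.

    commits: commit identifiers, oldest first; commits[0] is assumed to be
             good and commits[-1] is assumed to be bad.
    passes:  callable that returns True if the tests pass for a commit.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if passes(commits[mid]):
            good = mid  # the problem was introduced after mid
        else:
            bad = mid   # the problem was introduced at or before mid
    return commits[bad]


# Hypothetical history where everything from "d4" onwards is broken.
history = ["a1", "b2", "c3", "d4", "e5", "f6"]
print(first_bad_commit(history, lambda c: history.index(c) < 3))
```

The fewer commits there are between two test runs, the fewer steps this search needs, which is another argument for integrating often.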
Symptoms to look out for:
I have seen integrated environments that were shared by multiple teams. Such an environment was the first one used for automated and manual testing, so a failure could have been caused by any of these teams. This had a huge impact on the value of test automation, as the results were not trusted. Environments are rare and their quality is low. There is no trust in them. Your environments are set up and/or your deployments are performed manually each time.
It’s all about controlling change. The more change there is, the harder it gets to identify the root cause for any failure. So you need to break down your path to production, make lots of incremental changes and deploy often in order to be able to control change.
Continuous Integration and Deployment
- Get a fresh copy of the project on your local developer machine and run the tests. They should all be green. Then make local changes and run the suite again. If there are failures, then they are clearly due to your local changes. You can now fix your local code before you impact anyone else. Finding the root cause and fixing it is now very easy, since you have limited all changes to yourself. Sounds too easy, right? Let’s apply the same to the next step.
- Check your code in to an environment that is redeployed and tested with every check-in: your CI environment. This will highlight any issues with your code when it is deployed automatically to a more production-like environment. For example, anything that only works because of manual changes on your local machine would fail here. This principle seems simple, but it is very powerful.
- Continue to apply this principle by integrating and testing more components until production. I’d recommend isolating the following changes:
- Making changes to the environment (e.g. security patches, software updates)
- Integrating code from other teams of your organization
- Integrating external components that you have no control over
It is not required to run the same tests repeatedly at each step. Be selective. Focus on tests that provide coverage around the area of change. By doing that, we are building trust in the system as we have tested each change on the path to production.
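Being selective can be as simple as mapping areas of the code base to the tests that cover them. The sketch below assumes such a mapping is maintained by hand or generated by your build tool. The path prefixes and test names are hypothetical.

```python
def select_tests(changed_paths, tests_by_area):
    """Pick the tests that cover the areas touched by a change.

    changed_paths: file paths that changed in this step of the pipeline.
    tests_by_area: mapping of path prefix -> list of test names (assumed
                   to exist in your project).
    """
    selected = set()
    for path in changed_paths:
        for area_prefix, tests in tests_by_area.items():
            if path.startswith(area_prefix):
                selected.update(tests)
    return sorted(selected)


# Hypothetical mapping and change set.
tests_by_area = {
    "payments/": ["test_refunds", "test_checkout"],
    "search/": ["test_ranking"],
    "infra/": ["test_smoke", "test_checkout"],
}
print(select_tests(["payments/api.py", "infra/nginx.conf"], tests_by_area))
```

A change that touches only the environment would then trigger the tests mapped to that area instead of the whole suite.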
When you should write tests:
Test automation has been part of the development of this product from the start. Existing tests represent a testing pyramid. New tests are written as new features are being built.
Symptoms to look out for:
- You started investing in test automation once the manual testing effort was getting too slow and too expensive. Your software does not favor test automation. It is expensive and hard to retro-fit effective test automation.
- Excessive UI tests have been written to get enough coverage.
Get some basic coverage by automating crucial user journeys via the UI. Follow the 80/20 rule. Delete duplicate UI tests that only cover minor functionality. This hurts - but it’s the only way to get some return on your investment. It will quickly give you a safety net that is actually usable. There may be some hook-in points for writing integration tests, which gives you some leverage for more coverage in that area. After that, stop writing more tests for this system.
Going forward though, you can make sure that any new components are written with test automation in mind. New code has to be structured to support automated tests on all levels. This will be a good foundation for a solid test suite in the future.
Do you have other recommendations or areas to watch out for? Please comment below.
Disclaimer: The statements and opinions expressed in this article are those of the author(s) and do not necessarily reflect the positions of Thoughtworks.