ThoughtWorks
Continuous Delivery Technology

No more flaky tests on the Go team

Pavan Sudarshan

Published: Sep 25, 2012

I jokingly say, “If you do not have a flaky functional test build, you are not doing anything real.” I’ve spent a good part of my professional career writing functional tests, and I have talked to a lot of teams at ThoughtWorks to understand their functional testing issues. The most common issue across all these teams is the infamous flaky functional test build.

On the Go team, we had a severe problem with a flaky functional test build about 3 years back. It was so bad that we had completely lost trust in our build. A red build did not mean anything any more. Before a release, we would spend about 3 days looking at all the failures and fixing them. This completely defeated the purpose of having a CI build that runs these tests. Ideally, we would run all the tests on every code commit and make sure everything is green.

In every single release, the functional tests caught bugs, so they were useful. Unfortunately, we were not paying attention to them. We knew we had to fix the flakiness so that we could isolate the real problems, and that the only way to fix it was to go back to basics. We made the following changes in the team:

  1. Stop calling your build flaky – Random Success
  2. Acceptance test builds can never be red – Quarantine
  3. Budget time in the release plan to fix tests – Plan
  4. Refactor to make sure that no duplication is tolerated – Engineer
  5. Understand the nature of flakiness – Learning

1. Stop calling your build flaky

Have you ever released to production when, say, your search functionality “sometimes works”? Do you think your customer will be happy when she cannot reliably book a movie ticket using your website?  

When you do not tolerate flakiness in production code, how can you tolerate it in functional tests?

The first thing we wanted to address was not an implementation issue at all. Instead, it was the mindset of the people on the ground. After all, they had written the code themselves. How could they not be sure whether a test was failing for the right reasons?

To deal with this, we came up with the concept of Random Success. Most teams say that a test “failed randomly”. We decided that this was the wrong thing to say. If a test fails and then passes without any code change, then it “Passed Randomly”. We do not trust such tests. Treat any test as flaky, even one that is currently passing, if it has failed and then passed without a code change.
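
As a rough illustration of the “Passed Randomly” rule, the sketch below scans a build history and flags any test that both failed and passed on the same revision. The function name, the tuple layout and the sample data are assumptions made for this example; they are not the Go team's actual tooling.

```python
from collections import defaultdict


def find_random_successes(runs):
    """Return the sorted names of tests that both failed and passed
    on the same revision, i.e. without any code change in between.

    `runs` is an iterable of (test_name, revision, passed) tuples.
    This record layout is hypothetical.
    """
    outcomes = defaultdict(set)
    for test_name, revision, passed in runs:
        outcomes[(test_name, revision)].add(passed)
    return sorted(
        {name for (name, _rev), seen in outcomes.items() if seen == {True, False}}
    )


# Hypothetical build history.
runs = [
    ("test_login", "abc123", False),
    ("test_login", "abc123", True),    # same revision, different result
    ("test_search", "abc123", True),
    ("test_search", "def456", False),  # different revision: may be a real failure
    ("test_checkout", "def456", True),
]

print(find_random_successes(runs))  # ['test_login']
```

Note that `test_search` is not flagged: its results differ across revisions, so a code change could explain the failure.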

2. Acceptance test builds can never be red

Let's say you have identified a flaky test. In fact, you might have identified, say, 23 of them. Obviously, you cannot fix all of them at once. That would take a lot of time.

In our case, we didn’t want to delete them, but at the same time, we did not want to infect our build with the disease of redness. A good middle ground was to Quarantine the identified tests.

Quarantined tests do not get run as a part of your builds. They are there so that they can be fixed later.

Over a period of time, our focus was to mercilessly identify and quarantine flaky tests. Let's say we ended up with just 40% of the original suite. That would be 40% that we could trust. We knew that if one of these tests failed, we needed to stop the line and fix the issue.
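
The post does not say how quarantine was implemented on the Go team. One minimal way to sketch the idea is with Python's standard `unittest.skip` decorator; the `quarantine` helper and the test names below are hypothetical.

```python
import unittest


def quarantine(reason):
    """Mark a test as quarantined: it stays in the code base but is
    skipped by the build until someone fixes it. Hypothetical helper
    built on the standard unittest.skip decorator."""
    return unittest.skip(f"quarantined: {reason}")


class CheckoutTests(unittest.TestCase):
    def test_trusted(self):
        self.assertEqual(1 + 1, 2)

    @quarantine("passed randomly on the same revision")
    def test_untrusted(self):
        self.fail("unreliable until fixed")


def run_suite():
    """Run the suite and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CheckoutTests)
    result = unittest.TestResult()
    suite.run(result)
    return result


result = run_suite()
# The quarantined test is reported as skipped, so the build stays green.
print(len(result.skipped), result.wasSuccessful())
```

Because skipped tests are reported separately, the size of the quarantine stays visible in every build.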

3. Budget time in the release plan to fix tests

A quarantine that is, say, 6 months old is completely useless. Just like quarantined patients need extra care, these tests need engineer love. Remember, real work was put into them. You do not want it to go to waste.

So we planned and set expectations with our stakeholders about the capacity that was required to fix our quarantined automation. The key here was that we told them how strategic it was for us to have reliable automation. That was our prerequisite to take our release cycle from months to weeks.

We prioritized so that whatever our QA felt was a must-have was not left in quarantine at all. We call these must-have tests “Bread and Butter tests”: they make sure that we earn our bread and butter!

Then, we wanted to fix the tests whose problems would not take too long to figure out. Low-hanging fruit, if you want to call them that.

4. Refactor to make sure that no duplication is tolerated

You will be surprised to see how many functional test issues boil down to bad discipline: random waits, duplication, massive inheritance hierarchies, weird object mothers, fixture objects that are highly complicated to construct, and so on.

In fact, we found out that a lot of our issues were because of a mix of bugs in our test code and sloppy duplication.

A lot of our bugs were caused by the heavily asynchronous nature of our application: background processes, batch processing and a lot of Ajax. We ended up developing a WaitUtils library to make sure we never have blind sleeps and instead always have targeted waits.
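
The WaitUtils library itself is not shown in this post. The sketch below only illustrates the difference between a blind sleep and a targeted wait: instead of sleeping for a fixed time and hoping the background work has finished, the test polls a condition and fails loudly if the condition never becomes true. The `wait_until` function and the `FakeJob` class are hypothetical.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.1, description="condition"):
    """Poll `condition` until it returns a truthy value.

    Raises TimeoutError with a descriptive message if the condition is
    still false once `timeout` seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"timed out after {timeout}s waiting for {description}"
            )
        time.sleep(interval)


class FakeJob:
    """Stand-in for a background process that finishes after a few polls."""

    def __init__(self, polls_until_done):
        self.remaining = polls_until_done

    def is_done(self):
        self.remaining -= 1
        return self.remaining <= 0


job = FakeJob(polls_until_done=3)
# Targeted wait: returns as soon as the job reports done.
wait_until(job.is_done, timeout=2.0, interval=0.01, description="job to finish")
```

A targeted wait returns as soon as the condition holds, so it is usually faster than a blind sleep on a quick machine and more reliable on a slow one.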

The team started having dev huddles, discussions and design reviews. When a developer was about to write a new page object, she would talk to the others on the team about the usability of its API and similar design questions.

All of this ensured that we would never repeat the same mistake twice since we followed DRY very strictly.

5. Understand the nature of flakiness

Most flakiness is due to our own mistakes. Some causes are easy to find, others are not, just like any bugs you would find in your production code. Think about it: in spite of having a strict QA process for production code, some bugs might still be lurking around. Why would there not be bugs in your functional test code?

Though the whole process took about 6 months, over a period of time our team got very good at identifying root causes and fixing them. In the last 2 years, we have had a rock solid acceptance suite. We have automation that takes about 9 hours to run sequentially and yet we know that when it fails, it fails for a reason.

Now, it only takes about 40 minutes to run a 9 hour long test suite! How we do that is a topic for its own blog post, I guess.

Further Reading:

* Martin Fowler has a bliki entry on flaky tests

* Jez Humble and Badri did a talk on creating maintainable acceptance tests at Agile 2012: slides | video

© 2021 ThoughtWorks, Inc.