Example #1: The curious case of too many defects
This team claimed to be following Scrum but was struggling with lots of defects surfacing late in the release. Closer scrutiny revealed several problems, one of which was the way a story was accepted as dev-complete. A developer would simply declare a story dev-complete and move on to the next one without any verification by an analyst. When a tester took the story up for testing a couple of days later, she would find it incomplete or buggy and start raising defects in the defect tracking system.
I explained the value of fast feedback and recommended that they adopt the practice of dev-box testing. Basically, once a developer (or dev pair) feels she is done with a story, she can’t simply move on to the next one. She has to convince an analyst (BA or QA or both) that it is done. This is achieved by means of a ten to fifteen minute pair-testing session on the developer’s computer (hence the name “dev-box testing”). Only the main scenarios are exercised, not every boundary condition or nook and cranny. Any shortcomings aren’t officially recorded as defects at this stage; the developer just notes them on a sticky note or something similar and starts fixing them immediately. The story is accepted as dev-complete only after the analyst is satisfied with dev-box testing. This eliminates a lot of the back-and-forth that would otherwise occur between developers and testers.
I thought dev-box testing was low-hanging fruit and expected it to be an easy win with the client. However, I had just opened a can of organizational worms:
The testers were worried they would lose track of the defects found during dev-box testing if they didn’t record them in the defect tracking system. I said that this wasn’t a cause for concern if they were doing TDD right. For every issue raised during dev-box testing, developers are expected to first write a failing unit test and then write the code to fix it. The tests would then form part of a regression suite. However, it seemed like they didn’t follow TDD with this level of rigor. Besides, they were used to achieving traceability through devices external to the codebase (defect tracking systems, test case documents) and they were uncomfortable with the idea of code-based traceability.
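As a minimal sketch of what such code-based traceability could look like, here is a regression test written for one issue found during dev-box testing. Everything in it is hypothetical and for illustration only (the `discounted_total` function, the `DISCOUNTS` table and the issue itself); none of it comes from the client's codebase. The assumed issue is that lowercase discount codes were silently ignored.

```python
import unittest

# Hypothetical story under test: apply a discount code to an order total.
# Assumed issue from the sticky note: lowercase codes such as "save10"
# were silently ignored.
DISCOUNTS = {"SAVE10": 0.10}


def discounted_total(total, code):
    """Return the total after applying the discount for `code`, if any."""
    # The fix: normalize the code before the lookup. Before the fix the
    # lookup used `code` as typed, so the first test below failed.
    rate = DISCOUNTS.get(code.strip().upper(), 0.0)
    return round(total * (1 - rate), 2)


class DevBoxRegressionTests(unittest.TestCase):
    def test_lowercase_code_is_honoured(self):
        # Written first, as a failing test, to capture the issue.
        self.assertEqual(discounted_total(200.0, "save10"), 180.0)

    def test_unknown_code_leaves_total_unchanged(self):
        # Guards against the fix accidentally discounting everything.
        self.assertEqual(discounted_total(200.0, "BOGUS"), 200.0)
```

Saved in a test module, this runs with `python -m unittest`. The test name and its comment take over the role of the defect record: the issue stays visible in the codebase and is rechecked on every run of the suite.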
I argued that in the spirit of “working software over comprehensive documentation”, they should be more eager to find and fix defects as soon as possible rather than documenting them. However, they had been long indoctrinated by a “process quality” group (legacy of CMM etc.) about the need for evidence of process compliance. Recorded defects with fields like “phase in which detected” and “phase in which introduced” provided excellent evidence. “Dev-box testing” would be viewed as non-compliant by the auditors from the process quality group. I would have to ask for an increase in the scope of my engagement to influence the process quality group to change its ways.
Org structure and metrics
It turned out that the developers in the team belonged to the engineering organization while the testers belonged to the QA organization. The dreaded IT matrix! What’s worse, the QA org had a KPI along the lines of number of defects reported per tester per week. No wonder the testers weren’t too keen on dev-box testing. Again, I would have to ask for an increase in the scope of my engagement to influence how the governance team designed KPIs, and there was the question of who would pay for the increased scope.
Example #2: Green dashboards and red-faced users
This was at an insurance company. A development team had just completed another release of a claims processing application. The operations team was now in charge of keeping it up and running. However, the number of users (claim adjudicators) was far in excess of the load that the application had been tested against. It kept freezing and crashing. Operations would manually restart it whenever a user raised a ticket, but users lost their in-progress work each time. The issue was escalated to the operations team manager, who had no jurisdiction outside her team. So she got her team to introduce automated poll-and-restart and thereby maintain availability (by their definition) at 99%. It didn’t help the users much as they still lost work-in-progress. The real problem was a lack of IT accountability for the greater outcome, which was effective processing of claims. Development and operations were just two different activity-oriented teams.
Takeaway: In this case again, the organization’s structure (separate dev and ops teams) and its lack of outcome-orientation present organizational barriers to agility.
What it takes
Continuous delivery needs fairly self-sufficient, outcome-oriented, cross-functional teams. However, when Enterprise IT looks at staffing from the perspective of maximizing utilization, it fears that cross-functional teams will lead to underutilization of specialists. This is a false economy when optimizing for overall responsiveness, but the mindset persists nevertheless.
Besides, it is harder to staff cross-functional teams for typical project durations. We need to move away from the project model of IT execution for this and other reasons. In any case, the project as a vehicle of IT execution has, by and large, failed to live up to its promise of predictable delivery. It is better to chase value than to chase predictability, i.e. value/outcome-orientation over plan conformance. Note that this is only an org-level formulation of the good old “responding to change over following a plan”.
Moving away from projects in turn demands a change in the way Enterprise IT funding works. If funds can only be sanctioned against detailed plans, then we can’t help being plan-driven. Finally, in order to let go of our obsession with predictability, we’ll have to reconsider our mental model of software development and come to terms with it as a design process rather than a production process.
The last three paragraphs are big topics in themselves and therefore I have devoted an entire book to them. It is called Agile IT Organization Design and it explores the subject of this article along with practical suggestions to rework your IT for scalable agility and digital success.