
Reflections on the Future of Software Engineering Retreat

The last twelve months have seen dramatic shifts in software development, both as a discipline and as a job. While it remains unclear where many of these changes will lead, it’s important that we reflect on where we are and where we might be going: as technology leaders, we still need a strategy across people, process and talent that will set us up for success today and tomorrow.

You will often hear me say that plans are for when you know what you need to do and what the result will be. A strategy, by contrast, needs to be adaptive to the environment; only once you can prove your hypothesis can you commit to a long-term plan. Unfortunately, given where we are, there are no long-term plans, only strategies with hypotheses to help us learn and drive towards success in the face of uncertainty.

In this environment, I really value being able to talk through ideas and potential solutions with people I trust, especially when I know they’re facing similar challenges and asking the same kinds of questions. At the start of February I met with industry leaders and Thoughtworks colleagues in Utah to do just that. It was a provocative and inspiring couple of days; the unconference format was an acknowledgement that there are no clear answers at present, and it meant everyone could learn from one another.

Below are what I see as the key takeaways from the event. To keep focus, I’ve used the classic simplification of people, process and technology. However, it’s worth noting that my own model for thinking about AI adoption at Thoughtworks and for our clients goes beyond this to include core differentiation, the services and products you provide to customers and the ability to sense and respond to market conditions. (I’ll try to discuss this in more depth in a future piece.)

I won’t cover every discussion that was held as I, unfortunately, couldn’t be in every room. However, what follows provides an overview of what really resonated with me.

People


We need to address cognitive load


One of the most important insights to emerge from discussions counters an assumption some may have about the use of AI: it’s increasing cognitive load rather than reducing it.

This is partly a question about developer experience. While we’ve typically regarded developer experience and productivity as directly correlated, many of the current AI coding assistants are forcing us to rethink that assumption. Developers using AI tools are finding that while they’re capable of producing a lot more, faster, they’re becoming fatigued and even burned out by these new workflows and modes of interaction.

This requires serious attention. As Margaret Storey, Professor of Computer Science at the University of Victoria in Canada, wrote recently, developers “need to recognize that velocity without understanding is not sustainable” and so “should establish cognitive debt mitigation strategies.” This might include things like regular human checkpoints and proven methods of sharing knowledge.


At a fundamental level, we need to both evolve our understanding of what engineering work actually entails and rethink how we measure value. Developers could never consistently code for eight solid hours a day without getting exhausted; even if AI takes over code generation, juggling a handful of concurrent problems with AI, and maintaining the context each one requires, isn’t sustainable or conducive to good work. It’s no longer really just about writing code: frequent design and architecture work can lead to significant decision fatigue.

I’ve certainly experienced this as a leader; as things move faster, decision fatigue becomes a greater issue. I find I no longer have the time to do that all-important noodling on a problem that can’t be replaced by AI, unless I intentionally make that time. Will the same be true for our software engineers in future?


The staff engineer role is changing


One consequence is that the staff engineer role will change in the next year. At the retreat we talked about a greater focus on the ‘middle loop’ of supervisory work. This sits somewhere between writing code and release management, and its objectives are agent orchestration and governance.

This will no doubt be interesting work. However, for software professionals who enjoy working closely with code, the shift could be challenging. This is partly because of the inevitable cognitive load and decision fatigue, but also simply because many developers really like writing code. How we support our colleagues through that transition, and how their skills may be redeployed, is an open question we will need to attend to this year.

It’s not just the staff engineer role that the group believes will change. The relationship between product management and software engineering will likely need to be renegotiated, too. How exactly this plays out remains to be seen.

The response to these issues of organizational design will likely lie in how we decide to construct our workflows in the future: where do we need the human in the loop? What decisions require human oversight? Where do agents or swarms of agents excel?

Rachel Laycock, Thoughtworks

Process


Testing and code reviews


One of the most interesting conversations I was part of centered on code reviews and testing. Specifically, the group discussed what happens to code reviews in a world where a huge amount of code is generated by AI.


A key insight, and one I’ve been thinking about for a while, is the possibility of tiering software in terms of importance and criticality. In other words, we may be able to skip code reviews where code quality is deemed less important, and instead focus our attention on the areas where it really matters.

This has the potential to be a powerful idea; it also fits nicely with the issues we discussed around cognitive debt: focusing our review efforts where they’re most crucial ensures knowledge can still be effectively shared within teams while also limiting the amount of complexity we’re contending with. In turn, this will allow us to be more intentional about where the human is active in a workflow and what expertise is required in which loop. A few attendees mentioned they were already doing the work of categorizing risk, suggesting this is very much an important avenue of exploration.

Of course, if we move code reviews elsewhere, there is still the question of what takes their place, especially when handling significant changes. Part of this is about how we ensure knowledge sharing across a team, or even how we mentor junior or new team members. It’s also, though, about maintainability and architecture decision-making; while the code review has played a key role in those things in the past, we may need to (re)turn to pairing and trunk-based development. Of critical importance will be robust guardrails in the CI/CD pipeline, enabling thorough functional and cross-functional testing.
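
To make the idea of risk-tiered review gating concrete, here is a minimal sketch. Everything in it is a hypothetical assumption for illustration: the tier names, the diff-size threshold and the ChangeRequest fields are not an established practice, just one way such a policy might be expressed.

```python
from dataclasses import dataclass

# Hypothetical criticality tiers; a real team would define its own.
REVIEW_POLICY = {
    "critical": "human_review_required",        # e.g. payments, auth
    "standard": "human_review_on_large_diff",
    "low":      "automated_checks_only",        # e.g. internal tooling
}

@dataclass
class ChangeRequest:
    service: str
    tier: str           # "critical" | "standard" | "low"
    lines_changed: int

def review_requirement(change: ChangeRequest, large_diff_threshold: int = 200) -> str:
    """Decide how much human attention a change needs, based on its risk tier."""
    policy = REVIEW_POLICY[change.tier]
    if policy == "human_review_on_large_diff":
        # Mid-tier changes only need a human when the diff is substantial.
        return ("human_review_required"
                if change.lines_changed > large_diff_threshold
                else "automated_checks_only")
    return policy
```

The point of a sketch like this is that human attention stays on the critical paths, while routine changes flow through CI/CD guardrails alone.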


The key question for me is identifying precisely what we care about as software engineers. Once we do that we can then identify the best mechanisms for putting that care into action.


Agent topologies


One of the concepts to emerge from discussions was ‘agent topologies’. Reflecting on Team Topologies, a framework for optimizing the flow of value through an organization’s structure, the group considered the impact of agents on organizational structures and how the component parts might work together.

These are, in part, governance questions, but they’re also questions about organizational design. Agents need to aid collaboration and flows rather than disrupt or complicate them. That’s something leaders and technologists will need to take responsibility for. It’s also something I will be intentionally exploring and watching closely as we see the impacts of agents at scale in organizations.

Technology


Programming languages 


What exactly does AI mean for programming languages? The conversations on this largely focused on how the features of given programming languages may prove to be helpful in constraining code that’s generated by AI. In practice, then, this means terse and safe languages could prove particularly valuable and apt for this moment.


There was even discussion about the possibility of completely new languages appearing that are more appropriate for agents. This led to debate about whether such languages would even need to be human-readable, given we can now use AI to help us understand code in any language. We also circled around how such code might be tested, discussing formal methods and property-based testing. Such approaches focus much more on verification than on the specifics of the code, which aligns with a direction in which we care less about the code itself and more about agent behavior and outcomes. They would allow us to check that agents did the right thing in the right way by projecting a representation of the code that humans can understand, regardless of the programming language.
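
To make the property-based testing idea concrete, here is a minimal sketch in Python. It uses the standard library’s random module rather than a dedicated framework like Hypothesis, and the function under test (imagined here as agent-generated) is purely illustrative. The key point is that the checks describe properties of the behavior, never the specifics of the implementation.

```python
import random

def dedupe(items):
    """Stand-in for an agent-generated function: remove duplicates, keep order."""
    return list(dict.fromkeys(items))

def check_properties(trials: int = 200) -> bool:
    """Verify behavioral properties over randomly generated inputs,
    without ever inspecting how dedupe is implemented."""
    for _ in range(trials):
        data = [random.randint(0, 9) for _ in range(random.randint(0, 30))]
        result = dedupe(data)
        assert len(result) == len(set(result))           # no duplicates remain
        assert set(result) == set(data)                  # no elements lost or invented
        assert sorted(result, key=data.index) == result  # first-occurrence order kept
    return True
```

A verifier like this would hold regardless of how, or in what language, an agent chose to implement the function; only its observable behavior matters.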


Perhaps the most significant point was the possibility that source code would ultimately disappear from view. If and when that might happen was, however, an open question.


Self-healing systems


Finally, I want to flag the discussions we had around the potential of self-healing systems, a topic I'm really interested in and excited about.


We first distinguished between self-healing and self-improving systems. Self-improving, we agreed, is about systems that can autonomously optimize themselves in terms of performance or security. Here, we discussed the possibility of using agent swarms as chaos monkeys to stress-test a system and push it to find new ways to adapt in unfamiliar or challenging conditions.

Self-healing, meanwhile, is the ability of systems to return to a given state; this led us to explore how production issues are handled in an age of agents. Most agreed that, in this context, changing code should always be a last resort.

One of the most interesting and perhaps immediately applicable ideas was the concept of an ‘agent subconscious’, in which agents are informed by a comprehensive knowledge graph of post mortems and incident data. This particularly excites me because I’ve seen many production issues solved by the latent knowledge of those in leadership positions. The constant challenge comes from what happens when those people aren’t available or involved.


Given the dynamism of agentic tools and the ever-increasing complexity of distributed systems, it is also going to be critical to think through how we record changes; in other words, what would a single source of truth look like? 


One idea that I like a lot is using agents to pull together a superset ledger of all changes across all systems. If a strong ledger exists for every change in every system, from application logic to infrastructure to database changes, together with the ability to roll back, we could then use generative AI to aggregate this data into something consumable for both agents and humans.
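
As a rough sketch of what one entry in such a superset ledger might look like, here is a minimal append-only version. The ChangeEvent fields, the change kinds and the rollback reference are all hypothetical assumptions for illustration, not a proposed standard.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ChangeEvent:
    system: str          # e.g. "orders-service", "payments-db"
    kind: str            # "code" | "infrastructure" | "database"
    description: str
    rollback_ref: str    # pointer to whatever undoes this change
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

class ChangeLedger:
    """Append-only superset ledger aggregating changes across all systems."""

    def __init__(self):
        self._events: list[ChangeEvent] = []

    def record(self, event: ChangeEvent) -> None:
        self._events.append(event)

    def history(self, system: str) -> list[ChangeEvent]:
        """All recorded changes for one system, oldest first."""
        return [e for e in self._events if e.system == system]
```

A ledger shaped like this could then be summarized by a generative model into something consumable for both agents and humans, with each entry carrying its own rollback pointer.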

Some final thoughts 


The quality of the conversations and insights at the retreat was outstanding; it was a real privilege to be a part of it. I’m sure they will continue to generate reflections, experiments and ideas for a long time to come. As I dive deeper into these topics, flesh out hypotheses and conduct real-world experiments, I will share more learnings.

A starting point is important, but it’s what comes next that really matters.
