In recent months, we’ve seen clear evidence that using GenAI to understand legacy codebases can significantly accelerate comprehension of large and complex systems. Tools such as Cursor, Claude Code, Copilot, Windsurf, Aider, Cody, Swimm, Unblocked and PocketFlow-Tutorial-Codebase-Knowledge help developers surface business rules, summarize logic and identify dependencies. Used alongside open frameworks and direct LLM prompting, they dramatically reduce the time needed to understand legacy codebases.
Our experience across multiple clients shows that GenAI-assisted understanding of legacy systems is now a practical default rather than an experiment. Setup effort varies, particularly for advanced approaches such as GraphRAG, and tends to scale with the size and complexity of the codebase being analyzed. Despite this, the impact on productivity is consistent and substantial. GenAI has become an essential part of how we explore and understand legacy systems.
In the past few months, the use of GenAI to understand legacy codebases has made real progress. Mainstream tools such as GitHub Copilot are being promoted as able to help modernize legacy codebases. Tools such as Sourcegraph's Cody are making it easier for developers to navigate and understand entire codebases. These tools use a range of GenAI techniques to provide contextual help, simplifying work with complex legacy systems. On top of that, specialized frameworks like S3LLM are showing how LLMs can handle large-scale scientific software (such as that written in Fortran or Pascal), bringing GenAI-enhanced understanding to codebases outside of traditional enterprise IT. We think this technique will continue to gain traction given the sheer amount of legacy software in the world.
Generative AI (GenAI) and large language models (LLMs) can help developers write and understand code. Help with understanding code is especially useful in the case of legacy codebases with poor, out-of-date or misleading documentation. Since we last wrote about this, techniques and products for using GenAI to understand legacy codebases have evolved further, and we've successfully used some of them in practice, notably to assist reverse engineering efforts for mainframe modernization. A particularly promising technique we've used is a retrieval-augmented generation (RAG) approach where the information retrieval is done on a knowledge graph of the codebase. The knowledge graph can preserve structural information about the codebase beyond what an LLM could derive from the textual code alone. This is particularly helpful in legacy codebases that are less self-descriptive and cohesive. The graph can also be enriched with existing and AI-generated documentation, external dependencies, business domain knowledge or whatever else is available that can make the AI's job easier, which offers a further opportunity to improve code understanding.
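To make the idea of retrieval over a knowledge graph more concrete, here is a minimal sketch, not the approach used on any of the engagements described here. It assumes a toy Python source string standing in for a legacy codebase, uses a call graph as the only kind of structural information, and stops at assembling the context that would be passed to an LLM. The names `SOURCE`, `build_call_graph`, `retrieve` and `build_context` are hypothetical and exist only for this illustration.

```python
import ast
from collections import defaultdict

# Toy stand-in for a legacy codebase (assumption: real systems would be
# parsed with a language-specific parser, e.g. for COBOL or Fortran).
SOURCE = '''
def load_rates():
    return {"basic": 0.2}

def compute_tax(amount):
    rates = load_rates()
    return amount * rates["basic"]

def print_invoice(amount):
    total = amount + compute_tax(amount)
    print(total)
'''


def build_call_graph(source):
    """Map each top-level function to the set of local functions it calls."""
    tree = ast.parse(source)
    defined = {n.name for n in tree.body if isinstance(n, ast.FunctionDef)}
    calls = defaultdict(set)
    for func in tree.body:
        if not isinstance(func, ast.FunctionDef):
            continue
        calls[func.name]  # make sure functions without calls are still nodes
        for node in ast.walk(func):
            if (isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Name)
                    and node.func.id in defined):
                calls[func.name].add(node.func.id)
    return dict(calls)


def retrieve(graph, start, depth=1):
    """Return `start` plus its callers and callees within `depth` hops."""
    callers = defaultdict(set)
    for caller, callees in graph.items():
        for callee in callees:
            callers[callee].add(caller)
    seen = {start}
    frontier = {start}
    for _ in range(depth):
        neighbours = set()
        for name in frontier:
            neighbours |= graph.get(name, set()) | callers.get(name, set())
        frontier = neighbours - seen
        seen |= frontier
    return seen


def build_context(source, names):
    """Collect the source text of the selected functions for an LLM prompt."""
    tree = ast.parse(source)
    parts = [ast.get_source_segment(source, n) for n in tree.body
             if isinstance(n, ast.FunctionDef) and n.name in names]
    return "\n\n".join(parts)


graph = build_call_graph(SOURCE)
related = retrieve(graph, "compute_tax", depth=1)
context = build_context(SOURCE, related)
```

Asking about `compute_tax` pulls in both its callee (`load_rates`) and its caller (`print_invoice`), which is the kind of structural neighbourhood that plain text search over the code could miss. In a fuller setup, graph nodes could also carry documentation or domain notes, and `context` would be combined with the user's question before being sent to a model.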
Generative AI (GenAI) and large language models (LLMs) can help developers both write and understand code. In practical application, this is so far mostly limited to smaller code snippets, but more products and technology developments are emerging for using GenAI to understand legacy codebases. This is particularly useful in the case of legacy codebases that aren’t well documented or where the documentation is outdated or misleading. For example, Driver AI and bloop use RAG approaches that combine language intelligence and code search with LLMs to help users find their way around a codebase. Emerging models with ever larger context windows will also help to make these techniques more viable for sizable codebases. Another promising application of GenAI for legacy code is in the space of mainframe modernization, where bottlenecks often form around reverse engineers who need to understand the existing codebase and turn that understanding into requirements for the modernization project. Using GenAI to assist those reverse engineers can help them get their work done faster.