Date: 2025-06-20

It’s a scenario that sends a chill down the spine of any CTO: a business-critical application is running, but the source code is gone. Perhaps a vendor relationship soured, leaving you with a functional but opaque binary. Or, imagine you do have access to the source code, but it’s so messy that both humans and AI have a hard time understanding and describing what the codebase actually does.

One of our usual approaches to accelerating legacy modernization with AI is to speed up the reverse engineering first: we feed the existing code to the AI and let it help us create a comprehensive description of the application’s functionality, which can then be used in forward engineering.

But what if we don’t have the code, or it’s so messy that it’s useless? How can Generative AI accelerate reverse engineering in this case?

We recently explored this question at Thoughtworks through an experiment in "blackbox reverse engineering." We set out to find out whether, by combining AI-driven browsing with data capture techniques, we could create a rich functional specification of a legacy system and use it to build a modern replacement from a clean slate.

This is the story of how we did it, the hurdles we faced and the powerful lessons we learned about the future of legacy modernization.