Date: 2025-09-09

As software-defined vehicles (SDVs) have risen in prominence across the automotive market, the industry is having to rethink its approach to architecture. We’re starting to see a shift from an electrical design built around many distributed electronic control units (ECUs) to what’s called a zonal architecture.

To be sure, this change has many benefits: improved data flow, greater efficiency in software updates and over-the-air (OTA) functionality, to name just a few. However, it also requires significant codebase modernization. This is particularly challenging because much of the code on which vehicle software runs is written in C++, a language notorious for being hard to read.

That’s not the only challenge, though. Modernizing C++ in this context doesn’t just involve refactoring and optimizing existing code; that code also needs to be integrated with new software layers to ensure compatibility with modern hardware and asynchronous communication protocols, such as CAN or SOME/IP.

Getting this right is crucial for the automotive industry. In this blog post, I’ll explore how generative AI tools may be able to help us modernize tricky C++ code quickly. Such tools might just be critical if automotive manufacturers are to realize the full potential of the SDV quickly and remain competitive in a rapidly evolving landscape.

Modernizing C++ using an AI modernization tool

At Thoughtworks, we’ve developed a generative AI tool specifically designed to help teams modernize legacy code: CodeConcise. CodeConcise was initially developed to accelerate the reverse engineering of large COBOL codebases. However, it’s extensible and can be adapted to many other programming languages, including C++. We wanted to explore how we might use CodeConcise in an automotive and SDV context in the hope it would simplify and accelerate the modernization process. We started our experimentation on Autoware Universe, an open source software stack for self-driving vehicles developed by the Autoware Foundation, built on the Robot Operating System 2 (ROS 2).

How CodeConcise works

Before we go further, it’s worth diving into how CodeConcise actually works. It consists of two parts: the analysis of the code it’s given, and the UI, which includes, among other things, a chat interface. So, before we can interact with the codebase using a chatbot or look at the capabilities that have been extracted by the LLM, we need to make it aware of our specific codebase.

CodeConcise makes use of retrieval-augmented generation (RAG) backed by a knowledge graph, which is ingested into a Neo4j database. This ingestion is the first phase: the code is inserted into the knowledge graph in a structured way so that, in a later phase, the LLM is able to understand the various parts of the codebase and how they fit together.
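
To make the idea of ingestion more concrete, here is a minimal sketch of how code chunks and their relationships might be modeled and turned into Cypher statements for Neo4j. It is not CodeConcise’s actual implementation: the `Node` and `KnowledgeGraph` classes, the `Class`/`Method` labels, the `CONTAINS` relationship and the `planning::RoutePlanner` names are all assumptions made for illustration. The sketch only builds the statements; sending them to a database is left out.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    label: str  # hypothetical node label, e.g. "Class" or "Method"
    name: str   # qualified name of the code chunk


@dataclass
class KnowledgeGraph:
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # (source, relation, target) triples

    def add_edge(self, source: Node, relation: str, target: Node) -> None:
        """Register both endpoints and the relationship between them."""
        self.nodes.update((source, target))
        self.edges.add((source, relation, target))

    def to_cypher(self):
        """Yield (statement, parameters) pairs.

        MERGE creates a node or relationship only if it does not exist yet,
        so ingesting the same codebase twice does not duplicate anything.
        Labels and relationship types cannot be passed as Cypher parameters,
        so they are interpolated and must come from trusted code.
        """
        for node in sorted(self.nodes, key=lambda n: (n.label, n.name)):
            yield f"MERGE (n:{node.label} {{name: $name}})", {"name": node.name}
        for src, rel, dst in sorted(
            self.edges, key=lambda e: (e[0].name, e[1], e[2].name)
        ):
            yield (
                f"MATCH (a:{src.label} {{name: $src}}), "
                f"(b:{dst.label} {{name: $dst}}) "
                f"MERGE (a)-[:{rel}]->(b)",
                {"src": src.name, "dst": dst.name},
            )


# Hypothetical example: one class that contains one method.
graph = KnowledgeGraph()
planner = Node("Class", "planning::RoutePlanner")
plan = Node("Method", "planning::RoutePlanner::plan")
graph.add_edge(planner, "CONTAINS", plan)

statements = list(graph.to_cypher())
for statement, params in statements:
    print(statement, params)
```

Each pair could then be passed to a Neo4j session, keeping the values in parameters rather than in the query string.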

We saw that other language integrations made use of static code parsing to extract and ingest chunks. We started by breaking the code down into classes, then into methods, and then by extracting namespaces. Eventually, we moved away from static parsing to Clang, a compiler frontend, using the LLVM libclang Python bindings. This helped address the complexity of C++ and allowed us to implement various relationships between logical code chunks.

Once we did this, we were no longer limited to parsing the code; we could also make use of the syntax tree and insert parts of it into our graph.
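
As a rough illustration of what walking the syntax tree can look like, the sketch below collects namespaces, classes and methods along with the containment relationships between them. It is an assumption-laden example rather than the code we used: `collect_chunks`, `parse_with_libclang`, the node labels and the `CONTAINS` relationship are hypothetical names. The walker only relies on `kind.name`, `spelling` and `get_children()`, which libclang cursors provide, so it is exercised here with stand-in nodes and runs without libclang installed.

```python
from types import SimpleNamespace

# libclang cursor kinds treated as chunks, mapped to hypothetical node labels.
CHUNK_KINDS = {
    "NAMESPACE": "Namespace",
    "CLASS_DECL": "Class",
    "CXX_METHOD": "Method",
}


def collect_chunks(cursor, parent=None, scope=""):
    """Walk the tree depth-first and return (nodes, edges).

    nodes: list of (label, qualified_name)
    edges: list of (parent_name, "CONTAINS", child_name)
    """
    nodes, edges = [], []
    for child in cursor.get_children():
        label = CHUNK_KINDS.get(child.kind.name)
        if label is None:
            # Not a chunk itself, but it may still contain chunks.
            sub_nodes, sub_edges = collect_chunks(child, parent, scope)
        else:
            name = f"{scope}::{child.spelling}" if scope else child.spelling
            nodes.append((label, name))
            if parent is not None:
                edges.append((parent, "CONTAINS", name))
            sub_nodes, sub_edges = collect_chunks(child, name, name)
        nodes.extend(sub_nodes)
        edges.extend(sub_edges)
    return nodes, edges


def parse_with_libclang(path, args=("-std=c++17",)):
    """Return the root cursor for a C++ file (needs the clang Python bindings)."""
    from clang import cindex  # imported lazily so the walker runs without it

    translation_unit = cindex.Index.create().parse(path, args=list(args))
    return translation_unit.cursor


def fake(kind, spelling, children=()):
    """Build a stand-in node with the same attributes the walker reads."""
    return SimpleNamespace(
        kind=SimpleNamespace(name=kind),
        spelling=spelling,
        get_children=lambda: list(children),
    )


# Stand-in for: namespace planning { class RoutePlanner { void plan(); }; }
root = fake("TRANSLATION_UNIT", "demo.cpp", [
    fake("NAMESPACE", "planning", [
        fake("CLASS_DECL", "RoutePlanner", [fake("CXX_METHOD", "plan")]),
    ]),
])
nodes, edges = collect_chunks(root)
print(nodes)
print(edges)
```

With a real translation unit, the root cursor also exposes declarations from every included header, so an actual ingestion would likely filter cursors by source file and handle cases this sketch ignores, such as anonymous namespaces and methods defined outside their class.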



The image below shows what the graph could look like after the ingestion:

