
Bringing generative AI to bear on legacy modernization in insurance

Erik Dörnenburg on cutting-edge techniques to accelerate transformation

Legacy modernization is challenging in many different industries. But in sectors such as insurance, where core systems are often decades old and organizations face particular regulatory pressures, the challenge is especially great. 


However, perhaps somewhat surprisingly, generative AI can be very helpful. Although it’s generally known for its creative capabilities, such as generating new text or visual content, it can also be brought to bear on legacy modernization in the insurance industry, as Thoughtworks Europe CTO Erik Dörnenburg explained to Jonas Piela on a recent episode of the German Digital Insurance podcast.

For a broader exploration of using generative AI in legacy modernization, listen to Erik talk on Thoughtworks’ Pragmatism in Practice podcast.

Modernization challenges in insurance


“What we see in particular is that… the systems that are still not modernized today are built with technologies for which it is becoming difficult to find people who know them,” Dörnenburg says. In other words, in sectors like insurance, where systems are particularly dated and built using technologies that even experienced software professionals know very little about, there’s an increasing knowledge gap. This means that, ironically, as updating these dated systems becomes more important, it becomes harder and harder to do.


“We often have documents that are decades old and it is no longer clear to anyone whether the documents actually correspond to the code in production, or there are no documents at all that describe it. Or even worse: the documents contradict each other, the documents say something that you can see from a quick look at the code: that can't be the case at all.”


It’s not just a question of knowledge, though — the sheer scale of these systems only adds to the complexity of what teams attempting modernization must attend to. 


“A million lines of code, if I were to print them out, would be a stack of paper 1.50 meters high… I can't just look through it,” Dörnenburg says. “It's not sorted alphabetically like in a dictionary — the things that belong together are not always together in the code.”


“It’s a bit like finding the needle in the haystack,” he continues. “When they say there's an error or there's a special case for calculating this policy, this special case may be somewhere on meters of printed paper.”

How generative AI can help — with retrieval augmented generation


These challenges — sorting through vast and complex amounts of code and documentation to understand a legacy system — are where generative AI can help. However, it isn’t simply a case of feeding your codebase to a large language model and expecting it to deliver answers to questions and prompts.


Instead, generative AI needs to be augmented with techniques that can provide more contextual information and detail about a given system. One of these techniques that Thoughtworks has found to be particularly useful is graph-based retrieval-augmented generation.


Dörnenburg explains why: “In large language models, there is a lot of talk about training. You talk about all the training data that comes in and then the model can play back what it has learned over and over again, but all the data has to be in the model beforehand.” 


The challenge is threefold: “training these models is incredibly expensive, you need incredibly large amounts of data, and yet it often leads to so-called hallucinations, where the model presents things as facts that are simply factually incorrect.


“What retrieval augmented generation does is the following: instead of just giving the question... to the model, the amount of data that can be passed on as a prompt is filled with information that comes from other sources.


“You don't use the models to retrieve the relevant documents at the beginning, but rather use classic technologies that also have a direct understanding of COBOL program code or Java program code, for example, in order to find… additional relevant information. This all happens beforehand — I combine it into a large prompt and then I send it to a standard GPT model as normal.”
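The pipeline Dörnenburg describes (classic retrieval first, then one large prompt sent to a standard model) can be sketched in a few lines of Python. This is a minimal illustration: the keyword-based retrieval, the helper names and the COBOL snippets are hypothetical stand-ins, and real tooling would use analyzers that actually understand COBOL or Java source.

```python
# Sketch of retrieval-augmented generation (RAG) over a legacy codebase.
# Retrieval uses classic keyword matching, not an LLM; the assembled prompt
# is then sent to a standard GPT-style model. All names are illustrative.

def retrieve_relevant_snippets(codebase: dict, question: str, top_k: int = 2) -> list:
    """Score each source file by how often it mentions the question's key words."""
    words = {w.lower() for w in question.split() if len(w) > 3}
    scored = []
    for path, source in codebase.items():
        lowered = source.lower()
        score = sum(lowered.count(w) for w in words)
        if score > 0:
            scored.append((score, path, source))
    scored.sort(reverse=True)  # highest-scoring files first
    return [f"// {path}\n{source}" for _, path, source in scored[:top_k]]

def build_prompt(codebase: dict, question: str) -> str:
    """Fill the prompt with pre-retrieved context, then append the question."""
    context = "\n\n".join(retrieve_relevant_snippets(codebase, question))
    return (
        "You are assisting with the modernization of a legacy system.\n"
        "Relevant code, retrieved beforehand with classic tooling:\n\n"
        f"{context}\n\n"
        f"Question: {question}"
    )

# Hypothetical two-file 'codebase' standing in for millions of lines of COBOL.
codebase = {
    "POLICY.CBL": "COMPUTE PREMIUM = BASE-RATE * RISK-FACTOR.",
    "CLAIMS.CBL": "PERFORM VALIDATE-CLAIM THRU VALIDATE-CLAIM-EXIT.",
}
prompt = build_prompt(codebase, "How is the premium calculated for a policy?")
# The prompt now contains only POLICY.CBL, the file relevant to the question.
```

In practice the retrieval step is where the engineering effort goes; the finished prompt can then be sent, unchanged, to any standard GPT-style model.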


Dörnenburg emphasizes that training your own models is rarely a good idea. It “doesn't make economic sense and the result doesn't justify it,” he says.

More context isn’t always better


It can be tempting to assume that the way to improve the prompt for a large language model is simply to add more context: the more data and information we provide, the better the results will be.


However, Dörnenburg points out that this isn’t actually true. “It actually looks like ‘more is better’ is no longer the case at this point… it really depends on how I select the documents beforehand.”


So when it comes to legacy modernization, pre-constructing context using technologies other than AI — such as knowledge graphs — is what ensures the best results and the greatest accuracy in helping you understand a legacy codebase.


“We… prepare all of this code analysis in advance. That means we create what are known as knowledge graphs from it — not graphs like those you know from insurance, with X and Y axes, but the graphs that we have in computer science, with edges and nodes that are connected to each other. Like a social network.”


He explains further: “The program code is in different files, or has different constructs; these files and constructs are the nodes, and the edges between the nodes say: this part implements this, or depends on this, or provides basic functionality for it, and so on — and we prepare it in this way. That is what I would say is state of the art… preparing the code so that you can use it before you start using a large language model or a GPT model.”
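The structure he describes can be sketched as a small graph type. The node and relation names below are illustrative assumptions; a real implementation would populate the graph from static analysis of the legacy source rather than by hand.

```python
# Sketch of a code knowledge graph: nodes are files or program constructs,
# edges are labeled relationships between them. Names are illustrative; a
# real implementation would build this from static analysis of the source.
from collections import defaultdict

class CodeKnowledgeGraph:
    def __init__(self):
        # node -> list of (relation, target) pairs
        self.edges = defaultdict(list)

    def add_edge(self, source: str, relation: str, target: str) -> None:
        self.edges[source].append((relation, target))

    def neighborhood(self, node: str, depth: int = 1) -> set:
        """Everything reachable from `node` within `depth` hops: the related
        context handed to the language model instead of the whole codebase."""
        seen, frontier = {node}, [node]
        for _ in range(depth):
            next_frontier = []
            for current in frontier:
                for _relation, target in self.edges[current]:
                    if target not in seen:
                        seen.add(target)
                        next_frontier.append(target)
            frontier = next_frontier
        return seen

g = CodeKnowledgeGraph()
g.add_edge("PREMIUM-CALC", "implemented in", "POLICY.CBL")
g.add_edge("PREMIUM-CALC", "depends on", "RISK-TABLES")
g.add_edge("RISK-TABLES", "defined in", "TABLES.CPY")

related = g.neighborhood("PREMIUM-CALC", depth=2)
# related == {"PREMIUM-CALC", "POLICY.CBL", "RISK-TABLES", "TABLES.CPY"}
```

Given a question about premium calculation, you would look up the matching node and hand its neighborhood to the model, rather than the whole codebase.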

[Generative AI] makes modernization faster. It does not 'do' modernization at all — it simply helps the experts to be able to carry out modernization more efficiently and effectively.
Erik Dörnenburg
CTO, Thoughtworks Europe

The need for human oversight and judgment


It’s important to note that the idea of bringing generative AI to bear on legacy modernization certainly doesn’t mean automatic modernization. 


“It is not a simple solution where you say that someone who has no idea about the topic can now use LLMs to convert mainframe software into other software. It is a tool for specialists,” Dörnenburg points out. “We do not say that it ‘does modernization’. Instead, we always say… it’s modernization acceleration. It makes modernization faster. It does not do modernization at all, it simply helps the experts to be able to carry out modernization more efficiently and effectively.”


The implication here is that professionals still need to remain at the heart of the process. Indeed, ensuring there is still human skill, judgment and knowledge in the legacy modernization process also means that we don’t need to ask the AI technology to be perfect every time (which it never will be).


“The reason why the use of generative AI is so successful in legacy modernization is because I am confronting modern software developers with the old technology, but what the LLM spits out… is not program code that has to be perfect. It just has to spit out enough information for the programmers to understand what the old code does, so that they can then program it with new code.”


The future is smaller, not bigger


When looking ahead to the future, it’s possible that the direction of travel is smaller, not bigger. Dörnenburg notes that the emergence of small language models could come to shape the way we think about building and running AI models in the years to come. 


Small language models are essentially AI models that don’t need to run in the cloud with huge amounts of computing power. They can run at what’s called the edge — in other words, on individual devices like phones or laptops situated at the ‘edge’ of a network. The benefit is that this not only requires fewer computational resources, it also means data can be used without having to be moved to a central location. That’s good news from a security and privacy perspective — which will be particularly beneficial for a highly regulated industry like insurance.


“Small language models are now making their way into the consumer sector. Apple has done a lot… saying that models should run on iPhones. Microsoft and Google are doing the same,” Dörnenburg says. However, the applications can extend beyond consumer tech. “We can also transfer that and say we're making a small language model that might not run on an iPhone, but on a large cloud server that I can rent myself.”


Although the use of generative AI in legacy modernization is only in its early stages, when used intelligently, with real sensitivity to what the technology can and cannot do, its impact can be huge. For the insurance industry — which has long struggled to take advantage of innovation — it could be a real game-changer.

Disclaimer: The statements and opinions expressed in this article are those of the author(s) and do not necessarily reflect the positions of Thoughtworks.
