Brief summary
The need for high quality information at speed has never been greater thanks to competition and the impact of the global pandemic. Here, our podcast team explores how data science is helping the enterprise respond: What new tools and techniques show promise? When does bias become a problem in data sets? What can DevOps teach data scientists about how to work?
Podcast transcript
Zhamak Dehghani:
Hello, everyone. Welcome to another episode of the Thoughtworks Technology Podcast. I'm Zhamak, one of your regular hosts, and joining me here today is Mike, my other host. Hi Mike.
Mike Mason:
Hello, Zhamak. Hi.
Zhamak Dehghani:
And we have two of our awesome colleagues who work extensively in the field of data and ML, Danilo and David. Welcome.
Danilo Sato:
So I'm Danilo Sato and I've been with Thoughtworks for 12 and a half years, and I've done a lot of things in the company, from software engineering to infrastructure and DevOps. But in the last five or six years, I got involved with leading our global data community and growing our capabilities around data engineering and data science. So that's my field of interest, and it brings all my background into that space. And I'm in London. It's about 9:20 PM now.
David Colls:
Great. I'm David Colls and I'm based in Melbourne Australia where I lead the Thoughtworks Australia data service line. It's about 6:00 in the morning here. Unfortunately, we are about 24 hours off being back in lockdown as a result of COVID-19 cases in Melbourne. So I'm looking forward to another six weeks of working from home now, but I'd count myself pretty lucky in the scheme of things. My time at Thoughtworks, I've been here about nine years and my background prior to Thoughtworks was in engineering and scientific software for simulations. But over time at Thoughtworks, I've been involved in everything from organizational transformation to prototyping and smart TV applications. In time, I find myself leading the data service line with that combination of skills and experience. And so we do data engineering work and ML product development work here in the data service line.
Zhamak Dehghani:
Thank you for waking up so early so we could catch Danilo before he goes to bed.
Mike Mason:
And the nice thing — I know this won't help folks listening in very much — is that we can all see each other on video as we record this, and it looks like David's had a haircut fairly recently. So he's not going to get the COVID lockdown long hair too soon.
David Colls:
I've also been having my haircuts at home, Mike.
Mike Mason:
Oh, you have? All right.
David Colls:
I'm glad it looks good on the Zoom call.
Mike Mason:
Oh, there you go.
Zhamak Dehghani:
So maybe to kick things off: we had a theme on our Technology Radar, which we published in late May, called ‘data perspectives maturing and expanding’. Often themes are representative of the conversations we have and the numerous blips that end up on the Tech Radar when we generate that particular publication. And this particular theme was really a catch-all theme for a lot of blips and discussions, from ever growing and maturing data engineering tools, techniques and architecture, to machine learning, new ways of training models, and full life cycle management of machine learning. So it was quite a wide spectrum of blips falling under this theme. We thought we'd bring you on today to share your perspective on some of these blips and share some examples of where you see them in action. So it's going to be a conversation that covers a lot of different, I guess, diverse topics under this umbrella.
And I think this is particularly important because we're going through this very special time with the COVID crisis, where data has become such an integral part of solutions: every norm just doesn't make sense anymore, and we can't trust our guts. So I think this topic is also most relevant to what's happening right now in the world, where businesses are struggling to predict what will come next and trusting your gut isn't really a workable method anymore, so we have to use data more and more. As well as looking at the injustice that has happened: how can we use data in a just fashion, and how can we remove injustice, I guess, and bias from our data? So I think the context and the time are very relevant for this conversation.
David Colls:
I think we've been seeing a whole range of changes with our clients in response to different patterns of customer behavior and societal freedoms or restrictions as it might be, and also the demand for services and goods as a result of people's changing circumstances. We've seen everything from the total collapse of demand for airline travel, for instance, and that promotes one kind of data driven response, all the way through to sustained surges in demand for companies that already supply goods or services remotely. For instance, Zoom video conferencing.
Danilo Sato:
We've seen this in some of the government work we've done in the UK as well. Some of the departments got a spike in demand, and there was a big need for knowing more about what's happening, how they should respond, and how they should allocate their staff to deal with the crisis. Our team working in the data space there helped them basically bring that up — give them more real-time information about what's happening so they can make more informed and more timely decisions.
Zhamak Dehghani:
I agree. I think the real-timeliness is an important one. What we've noticed as well with a lot of executives: the dashboards that were driving the business, and business decisions that made sense to be monthly, or quarterly, or weekly, now need to be daily and near real time. So that's certainly a change. And I think healthcare is just an obvious space where data is being used in so many different ways, from the rapid response through telemedicine and virtual care that generates a ton of information you can then post-process, to using it for population analysis to see how the virus is spreading. I geek out on the nightly COVID data for the county that I live in. And there is a ton of information being generated to use for everyday life decisions — send your child to preschool or not, just as one.
Danilo Sato:
I think it also put a spotlight on how to think about data, and the importance of data quality and the meaning of the data. People became more aware of what the data actually means: Is it the total number of cases? How often is it updated? It brought some more real-world understanding, I think, for everyone — not just us in the tech space, but everyone — to be more aware of the importance of having that data and really understanding what it means.
David Colls:
And things like leading indicators and lagging indicators.
Danilo Sato:
Correct.
David Colls:
And how to manage in a situation where it might take two weeks before you see the impact of a change, from a primary health perspective. But I think, as you've commented, the fact that things have changed so quickly has been a big impact wherever you are on the spectrum. We've had some clients who had more call center traffic in the first month after lockdown started than they had in the whole of 2019. And so then a data challenge actually becomes a people challenge as well: how do you redeploy your people internally to best serve that shifting pattern of demand? And a lot of that's high-touch interaction, novel scenarios. It's not established transactions that are efficient to process.
It's actually a whole series of customized services that you need people to deliver. And then the focus on data solutions becomes more about augmenting people and enabling them to do more high value tasks more efficiently and improve the productivity of high value tasks rather than totally automating low value tasks. And then that interaction of people and data solutions means you need to be able to iterate very rapidly to get insight and work effectively.
Zhamak Dehghani:
That's a great segue, I think, to talk about CD4ML — talking about rapid changes and iterative changes, and being able to put out an end-to-end solution and change it when that solution depends on data and on ML. It's an interesting topic. We've had CD4ML as a blip on the Radar for quite a while, about a year I think, and this time around we put it in the Trial ring, which for us is quite a strong endorsement — basically the thing to do. So I was wondering if you can unpack that for the audience: what it is, why we use it, where we use it, and give us some examples.
Danilo Sato:
Sure. I'll give it a try. First of all, why do we care about that? There are a lot of clients we go to, or industry problems, where we see people trying to build these data solutions or train machine learning models. And sometimes it works on the data scientist's machine, or they have a lab environment where they manage to get hold of some kind of subset of the data and they train something. But then it's really hard to get that into operations and put it in production. And CD4ML is our approach to tackle that problem. Continuous delivery is something that we at Thoughtworks have been doing for a long time, even before applying it to this specific type of system: the idea of building software systems that we can reliably release with high quality and at pace, consistently, but also all the automation and the supporting processes and tools to enable that process to run smoothly.
And I think when we think about CD4ML, when we're building an ML system, it's not just about the code that you're trying to push; now we've got to deal with changes in the data and the data sets, and also how we manage all the models and the training process. So it adds a lot of complexity to the life cycle of that process. But when we think about continuous delivery, a lot of the principles that we apply to software still apply to machine learning systems. We want to release things frequently. We want to work in batches as small as we can, we're not going to call something done until it's released, we want to try to version control everything, and we're going to build quality into the product that we're building. So all those principles are still applicable.
What changes is that there are more moving pieces, and the nature of the thing we're tracking is also slightly different, so we need to change the process and the approach that we take a little bit. But continuous delivery for machine learning is basically trying to bring those principles, and that discipline, to these types of systems: the data moving around, the data being available, and then the training process for these machine learning models — how those models get promoted, the process you put around them to ensure the quality is there in the model, accuracy, and things like that. And then also the actual process of releasing them and how you promote them to production. Do you replace the existing one? Do you compare with existing models? So that whole discipline of doing that in a reliable way — the quality and the automation — is what it's all about.
Mike Mason:
Because it does seem like even something like version control and repeatability — that's something we've been doing in software for a long time. And even with software builds, you need to take some care to make sure that you're using the same libraries each time you build the software. Otherwise, you end up with something that's not repeatable. So in this case, we're talking also about tracking the data sets that we're using for training models. That seems hard, though, right? Aren't we talking about gigantic data sets? Do you just check it into GitHub, or what happens there?
Danilo Sato:
Potentially. We're not checking it into GitHub, but this is a good problem. We're not going to be able to use the same tools we use to track code, but there are lots of tools for how we version data sets or track changes in the data in a way that's actually implementable and not going to overwhelm the tool. So there's a new set of tools, lots of new tools, coming out to try to solve those problems, because it's not just a data set. I think the data is one aspect, but then also, what is the machine learning training process on top of that data? Because in that training process, data scientists might want to split that data.
They might want to validate against different data sets. They might actually be creating more data, because they engineer new features on top of it. They want to improve the accuracy of the model, and they can generate new features. Now they might be adding more data, but they might be writing some code as well. And when you're using the model, you want to make sure that you calculate the feature the same way that the data scientist envisioned when they were training it. So there are a lot of these kinds of complications that get bundled up into it and make it complicated.
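To make the tracking idea concrete for readers, here is a minimal, hypothetical sketch of recording which dataset, feature code and model artifact belong to one training run, using content hashes. The file names are invented; in practice teams often reach for tools such as DVC or MLflow rather than rolling their own.

```python
# Hypothetical sketch: tie a model artifact back to the exact data and feature code
# that produced it, so a training run can be traced and reproduced later.
import hashlib
import json
from pathlib import Path

def file_sha256(path: str) -> str:
    """Content hash of a file, used as a lightweight version identifier."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def record_training_run(dataset_path: str, feature_code_path: str, model_path: str,
                        manifest_path: str = "training_run.json") -> None:
    # The manifest links model, data and feature-engineering code together.
    manifest = {
        "dataset_sha256": file_sha256(dataset_path),
        "feature_code_sha256": file_sha256(feature_code_path),
        "model_sha256": file_sha256(model_path),
    }
    Path(manifest_path).write_text(json.dumps(manifest, indent=2))

# Example (hypothetical file names):
# record_training_run("train.csv", "features.py", "model.pkl")
```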
Mike Mason:
And forgive me for a very naive question, but if you train a model the same way twice, do you get a binary, identical model at the end of it that you could compare and say, "Yes, I built that correctly because it was the same as the last time?"
David Colls:
Yeah, if you track all the inputs.
Mike Mason:
It's a lot of inputs though, right?
David Colls:
Yes. It's a lot of inputs.
Danilo Sato:
You need to track all the inputs. There are also some things that are non-deterministic in the model training process itself. So you might have things like, for instance, random number generator seeds. Maybe you train with the same model and the same input, but because the random number generator is in a different state, it would give you a different model, even though it was the same input. If we can make it reproducible to the point where we can compare input and output and have a stable thing that doesn't change, then we try to do that. But when we think about quality and testing for these kinds of systems, there are other ways to assess the quality of the model beyond just making sure that, given the same input, it will give you the same output.
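As a minimal illustration of the seed point, here is a sketch of pinning the sources of randomness so two training runs over the same data produce the same predictions. It assumes scikit-learn; other frameworks have their own seed settings, and the data here is synthetic.

```python
# A minimal sketch of making a training run repeatable by pinning randomness.
import random
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

SEED = 42
random.seed(SEED)      # Python's own RNG
np.random.seed(SEED)   # NumPy RNG used by many libraries

X, y = make_classification(n_samples=1_000, random_state=SEED)

def train():
    # random_state fixes the model's internal randomness (bootstrap sampling, etc.)
    return RandomForestClassifier(n_estimators=50, random_state=SEED).fit(X, y)

m1, m2 = train(), train()
# With the same inputs and the same seeds, the two models make identical predictions.
assert (m1.predict(X) == m2.predict(X)).all()
```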
Zhamak Dehghani:
I think that's an interesting one, because while we're adapting the principles of continuous delivery to a completely new realm and those principles are applicable, the techniques, approaches and tooling change. As you just mentioned, the non-deterministic nature of the models would actually require a different kind of testing, right? So I would assume now you're using a different kind of statistical-based testing rather than logical, "given this input, I'm expecting this behavior and this output". So that might change as well. Would you talk to that a little bit?
David Colls:
Again, that's actually another really big driver for CD4ML as well: how do we determine that a solution is good enough for a particular purpose? How do we actually answer that question? As Danilo talked about, there are different modes of deploying. Champion-challenger, for instance: if we have an incumbent model, how do we know that a new model that we've trained is better than that incumbent? That's one way of looking at it. Another way of looking at it is: are we ready to put this model in front of customers, and what threshold do we have to meet before we're ready to put this model in front of customers?
And often that's primarily a business discussion rather than a machine learning discussion, but machine learning becomes a forcing function for the business to define what good looks like in a particular scenario, because the machine learning model isn't going to be able to improve unless it has a clear signal for what good looks like and what better means in that scenario. And so CD4ML, like CD for software, pushes that quality and definition-of-done conversation to the left. And so with CD4ML, we might start by establishing a pipeline for the simplest model — fraud is a typical example.
The simplest model might predict that every transaction is not fraud, but can we actually put all of the framework in place to be able to deploy that model into some environment where we can assess whether its performance is good enough, or better than another model?
And then can we iterate on the inputs, on the hyperparameters, on the feature engineering, for training or for inference? All of those sources of change — can we iterate on those until we get to a model that is better than the incumbent and good enough to put in front of customers?
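For readers who want to see the "simplest model first" idea concretely, here is a rough sketch: a baseline that predicts "not fraud" for everything, and a promotion gate that only replaces the incumbent if a challenger beats it on a business-relevant metric. The data is synthetic and the metric and threshold are illustrative, not a prescription.

```python
# Sketch of a baseline-vs-challenger gate, in the spirit of CD4ML pipeline checks.
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced data standing in for transactions (class 1 = "fraud").
X, y = make_classification(n_samples=5_000, weights=[0.97], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

incumbent = DummyClassifier(strategy="constant", constant=0).fit(X_train, y_train)  # "never fraud"
challenger = LogisticRegression(max_iter=1_000).fit(X_train, y_train)

def score(model):
    # Recall on the fraud class: how much fraud we actually catch.
    return recall_score(y_test, model.predict(X_test))

if score(challenger) > score(incumbent):
    print("Promote challenger")   # in CD4ML this would be an automated pipeline gate
else:
    print("Keep incumbent")
```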
Zhamak Dehghani:
It seems like it can be. It is such an interesting space to be in, because you're working at the intersection of lots of disciplines that we had before. What you just described, David, reminded me of TDD, test-driven development: you write your test, the simplest test, and then write even a dummy implementation for the body of that function. And then over time you iterate and get it to pass the test. Right? So CD4ML seems like bringing all of those practices, but then changing them to apply to machine learning, which has a different nature.
Danilo Sato:
Yeah. There's an aspect of quality as well, around assessing whether any bias was introduced while you were training the model. Right? So one thing I like to tell people is, "The quality of the model is only as good as the quality of the data that you're using to train it." So if the model is coming up with bad or crappy answers, you probably don't have enough data, or the data is not good enough for the training.
And if you don't have good data, if the bias is already in the data, the model is going to be really good at picking that up and using it to make a decision. Right? So there are a lot of tools also coming up in that space to try to help us understand what the model is picking up as important features, or things in the datasets that are making the model lean towards one decision or another.
And also to try to assess the model against different slices of the data. So if you're worried about some type of bias — let's say racial bias, or gender bias, or things like that — we can use some of that to say, "Okay, let me assess the model against this slice of the data. How does it perform in that subset? Maybe it's really good in this subset, but it performs really badly in the other one."
So you need to have all these multiple lenses to assess the quality before you're happy to put it in front of customers and move to production. And as David said, a lot of these decisions might not be for the data scientists or the engineering team to make. It's really about getting the business on board with understanding what is happening, understanding what's driving the changes, and also being involved during this promotion process to get to production.
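A small sketch of the slice-based evaluation Danilo describes: compute the same metric per subgroup of a sensitive attribute and flag large gaps. The column names, example data and the 0.1 gap threshold are invented purely for illustration.

```python
# Sketch: evaluate one metric per data slice and flag suspicious gaps between slices.
import pandas as pd
from sklearn.metrics import accuracy_score

def accuracy_by_slice(df: pd.DataFrame, slice_col: str,
                      y_true: str = "label", y_pred: str = "prediction") -> pd.Series:
    """Accuracy computed separately for each value of `slice_col`."""
    return df.groupby(slice_col).apply(
        lambda g: accuracy_score(g[y_true], g[y_pred])
    )

# Hypothetical evaluation results:
results = pd.DataFrame({
    "label":      [1, 0, 1, 1, 0, 0, 1, 0],
    "prediction": [1, 0, 0, 1, 0, 1, 1, 0],
    "group":      ["a", "a", "a", "a", "b", "b", "b", "b"],
})
per_slice = accuracy_by_slice(results, "group")
print(per_slice)
if per_slice.max() - per_slice.min() > 0.1:
    print("Warning: model performs noticeably worse on some slices")
```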
Mike Mason:
I saw a paper in the past week that said — I'm not going to use all the right words here, because this is not really my space — but essentially the paper said, if you use a machine learning model to create a strategy, and there is an unethical but efficient strategy, the machine learning model will find it and exploit it. And the way I interpret that is that unless you put some guardrails around using machine learning in any context, it is going to find unethical, possibly biased ways of exploiting patterns in the data.
Danilo Sato:
Absolutely. And it's not doing that on purpose. It's trying to find patterns, right? If something is in the data and it's easy to identify, the model will pick it up quicker and amplify it.
David Colls:
And yeah, you can quantify that in different ways in terms of the model's output. But you can also, as we did with a client recently, have conversations about what data is fair to use in the model. So, what data should the model be able to see to make a decision? In this instance we were asking, "Well, what extent of historical data should influence a decision in the present?" A lot of fairness and ethical questions typically come down to that, given that the data models are trained on has a bias in terms of what was collected, and in terms of what decisions were made by people prior to attempting to use a trained model to make those decisions.
There's inherent bias all the way through that process, so we need to make some active decisions about what to include and exclude in terms of the data that the model sees. But then we can also look at its decisions. And one of the other things we find with a CD4ML mindset is that machine learning is typically framed as: how often do we get the right answer?
And so we might quote accuracy as a measure of model performance. But when it comes to a business scenario, we often care more about the cost of the different types of mistakes we might make. There might be highly asymmetric costs for making different types of mistakes, and it can be hard for, let's say, a data science team working in isolation to accurately assign values to those costs.
They actually come down to a business decision and a need for actually a diverse team to look at that and consider from many different perspectives the asymmetric cost potentially of offering finance to someone in a difficult position, or refusing finance to someone in a difficult position as well. What are the costs in each of those cases if you would like to use an ML model to make that decision?
And so those are things that are hard to do in isolation with a set of data and an optimizer. Those are the things that really need to understand the context of where the data has come from and the context into which the decision's going to be supplied.
Zhamak Dehghani:
That's really interesting because... I wonder how many companies are actually having that rich end-to-end conversation you just referred to, right? Data scientists with folks that are sourcing the data, with folks that are using the data, folks whose business is going to be impacted by the models.
Because what I see is that a lot of companies are still struggling to bootstrap the fundamental platforms and pieces of engineering that need to be in place: just collecting and gathering the data, and assuring that the data itself has some level of quality, some level of integrity, some level of freshness that is acceptable. Let alone having a conversation about, "Is this data biased or not?" and, "Is this model now making the right decisions within the context it's seen?"
So we're definitely elevating the conversation, but I think the reality is that a lot of companies are still stuck in just bootstrapping and the mechanical work that needs to happen — the pipelines and the engineering part of it. We talk about CD4ML, but there are a lot of buzzwords in the industry. We seem to have loved DevOps, and then we invented a whole lot of other Ops vocabulary adjacent to that: DevSecOps and DevSecBizOps. And now we have DataOps and MLOps. So I thought maybe we could unpack what DataOps or MLOps means to you, and how that maps to CD4ML.
Danilo Sato:
The way I interpret it, at least as an industry trend, is that it's about bringing the thinking together. If we go back to the DevOps idea, right? We want to bring people that don't work together closer together, so they appreciate each other better and we can build something more together. So it's not software developers throwing code over the wall for operations people to run. DataOps and MLOps are kind of the same. In some cases it's still like that, right? The data science team might be building models somewhere, and they're like, "It's not my job to deploy this to production." There's another team — either a machine learning engineering team or some other engineering team — that's going to be responsible for doing that.
And we see the same thing, as you said, in big companies doing data platforms: they might have full data engineering teams whose whole job, if you ask them, is, "Oh, I'm the ingestion team. I'm just responsible for moving data from A to B." But those teams are a little bit closer to the operational side, usually, I feel. The data engineers potentially worry more about the operational characteristics of the pipelines they're building, because there are usually elements of scalability and things like that that they're already aware of.
So the operational thinking is already there. But they're still siloed in terms of how that data is going to be used, or how it's going to connect to the business value at the end. So there are still some of these silos, and I feel like these industry trends that are coming up are trying to address that — to show people that actually we cannot disconnect things completely. Yes, there are areas of specialty, but at the end of the day everything needs to work together and we need the whole thing. It would be like data-sec-dev-ML-ops; it should be one big thing with everything, because that's eventually what we want. Right?
Mike Mason:
It's a cool business.
Danilo Sato:
Yeah.
Mike Mason:
Biz, tech, money, ops, ML, dev, data [crosstalk 00:10:02]-
David Colls:
Again, just like the DevOps mindset, I guess, with all those sources of change and that iterative approach — the development of the model, or the development of an ML solution, doesn't finish on the first release. And often we're using techniques like active learning, for instance, to continue to label data to feed back into the next version of the model. And so the solution isn't "how do we push a model into production?" It's actually: how do we design a user interface, or a user experience, to deliver that to the customer?
But also how do we potentially develop a labeling, or an active learning interface for internal teams, or for labelers, to continue to label the data. And so-
Danilo Sato:
Improve the model, right?
David Colls:
Yeah, it's actually a whole ecosystem of technology components that need to work in synchrony, in an operational sense. And so it becomes a very big concern. And it should be — it's not a matter of building the operational infrastructure and then starting to deploy models on it. It's actually an iterative process that cuts across all those concerns over and over again.
Danilo Sato:
Yeah. I think with CD4ML, our approach — we've been doing this in software for a long time, so for Thoughtworkers it's kind of second nature, thinking about things in that sense. Yes, of course we want to release the new and improved version, but that's not how most of the industry thinks of it, or how some of them operate.
So I think CD4ML would be our approach to how Thoughtworks does MLOps or DataOps. The way we're trying to think about it, it is going to be a continuous improvement process, and maybe the first release is not going to be the best model, but we want to get some feedback, we want to see how the user experience of using that model is going to be, and what kind of impact it's actually going to have.
And sometimes it's hard to get the data scientists out of trying to keep perfecting the model as much as possible, right? There's a tension there in always trying to come up with a better model — we can always get better. But I think we want to improve the process so we can keep getting better and better and better. Right? We don't want to find a final stopping point.
Mike Mason:
So I'm curious about something. We're talking about machine learning models' ability to respond to changes in the environment, and all that kind of thing. Do we have a sense of how well ML models responded to the COVID crisis? Because we saw a lot of empty shelves at supermarkets that were caused by problems in the supply chain.
And my understanding is that that was more about the fact that supply chains are so well optimized that if anybody buys a few extra groceries every week, there's no slack in the supply chain, so you end up with empty shelves. Right? So I don't think that that was a failure of ML models. But I'm curious, do we have any stories of how those models responded to a gigantic change in sort of world situation?
David Colls:
Yeah. I think there was a good example from Amazon in the States where the ability to deliver within a certain timeframe became the number one predictor of whether a customer would buy from a supplier. Whereas previously, it was based on a range of other factors. And I think, yeah, we sort of summed it up as we train computers to behave like people. And then people started behaving differently and the computers, in some cases haven't been able to catch up.
And so this is, again, where an iterative process can help. But if you don't have enough training data, if people's behavior continues to change, you get these non-stationary changes, as they're called, in the data. Then we either have to find ways to train on smaller data sets, or more weakly labeled data sets, or be able to dial up and down the ability of a model to influence a process, if that makes sense.
So how much we hand over to automation, and how much becomes augmentation. And so those are all capabilities that speak to having a continuous delivery ability.
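As a concrete illustration of "dialling automation up and down", here is a small, hypothetical sketch: decisions the model is confident about are automated, and the rest are routed to a person. The threshold is the knob you turn as conditions change; all names and values are illustrative.

```python
# Sketch of routing between automation and human augmentation by model confidence.
from dataclasses import dataclass

@dataclass
class Decision:
    outcome: str      # what the model (or person) decided
    automated: bool   # whether a human was involved

def decide(score: float, automation_threshold: float = 0.9) -> Decision:
    """`score` is the model's confidence that the case can be handled automatically."""
    if score >= automation_threshold:
        return Decision(outcome="auto-approved", automated=True)
    return Decision(outcome="sent to human review", automated=False)

# Lowering the threshold hands more over to automation; raising it keeps people in the loop.
print(decide(0.95))                               # automated
print(decide(0.95, automation_threshold=0.99))    # augmentation: a person decides
```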
Danilo Sato:
Yeah. Your example about toilet paper — for grocery or retailers, I think a lot of that supply chain planning happens based on demand forecasts, right? They have these kinds of models. They're trying to predict how much they're going to sell, and they buy based on that. And those models are usually... I mean, a lot of them are not really sophisticated — like a regression model.
So they learn from past data, which is how much we sold in the past. And I think no one had the COVID training data sets to show how that might change when something like this happens. So I think the models wouldn't be able to predict that. But it also highlights, I guess, the importance of having all the controls around them. Right? How do you still keep the supply chain running? There are going to have to be a lot of people, and all the systems around them to support it — like, how do we smooth out this demand?
Because now every store is going to want to order all the toilet paper all at once, and they're not going to be able to fulfill that all at once. So they have to basically build some compensating systems around those models until the models can actually see a different data set, pick up the new patterns and start planning again. But yeah, I think that was a bit unpredictable.
Zhamak Dehghani:
So, we talked about CD4ML and maybe the last mile of that... I mean, there's no last mile because it's all a loop. But towards the training and the testing, and all of that. Before that happens, the availability of data at scale — historical data at scale, from a rich set of domains, within the bounds of the organization or external to the organization — is a foundational piece of it, right?
And for years and years we've had various kinds of technology or architectures to make the data accessible. We had data warehouses, and then we've had data lakes and data hubs. And recently we added data mesh to the list. That's one of the items on the Tech Radar this time.
Which brings attention to, I guess, the convergence of engineering practices, distributed architecture practices, and now data, right? It was interesting when InfoQ called out data mesh as an architectural trend this year — that's something that had never happened before, because we haven't applied or thought about architectural patterns for data the same way that we apply them to operational systems like microservices or monoliths and so on. We've never done that to data management systems.
Data management systems were often operating at maybe a different level. So that's another area we could talk a little bit about: this whole decentralization approach to data architectures, so that data can be more readily available to the models.
David Colls:
Yeah. I think we've talked a bit about moving fast and being able to adapt, and I think the key thing that a central architecture can't do is enable teams to work autonomously and respond to changing conditions in their circumstances. We need teams to be able to own their solutions end to end. And if they now include ML models, that means, as we've discussed, owning all the sources of change that feed into those ML models — not necessarily building all the technology bits, but having self-service access to all of those technology bits that enable them to do that. And then, as you were saying, when you have a large, rich, multi-domain organization with many domains of rich data sets, a lot of the value often comes from combining those rich data sets from different parts of the organization to provide more predictive signals than would be possible within a single domain.
And so how those teams collaborate in a peer-to-peer way is the big question we're trying to answer, without having to go through a central team that determines what data is available, what schema is made available, what level of quality it will be made available at, and how far down the roadmap that data will actually be usable into the future. Rather, we need teams to be able to collaborate peer to peer, mediated by a technology platform that enforces the governance framework that's been determined — teams collaborating peer to peer to execute on their plans and achieve their objectives in an autonomous but coordinated way.
Mike Mason:
Because arguably, that sort of is the reason that people haven't been talking about data architecture so much, because the strategy was to put all your data in one place, whether that was an enterprise data warehouse or a data lake — the data lake was just changing the underlying tech for shoving all your data in one place — and the approach to making it valuable was, "Well, it's all there. So that's where you do your correlation and your creation of value by joining disparate datasets."
I really want to call out that with more of a peer-to-peer model, that is absolutely challenging decades of thinking, because the lakes didn't really change very much — is that fair?
Danilo Sato:
Yeah. It enables us to collect more data.
Mike Mason:
Yeah, the data lake enabled us, okay, it's all in the Lake and we're starting to store it in a more raw format. So we get some benefits from doing that. But fundamentally we haven't changed the strategy of putting it all in one place and then hoping that good things happen.
Zhamak Dehghani:
I completely agree. And I think maybe there are evolutionary changes there, like how the data gets in and gets out, and whether there are streams involved and more real time involved, but fundamentally it's been that: moving data from one place to another place and then putting it into one place to give access to other folks. And I think because of the sheer volume of the data and the technical, bare-metal challenges we've had in terms of just serving it, moving it and storing it, a lot of the architectural thinking has gone to that technical layer — the grid architectures or the distributed file architectures. From my naive end-user perspective, those are physical-layer architectural concerns that need to happen. But the logical-layer architecture concern — which thinks about the actual user, like the autonomous teams David was mentioning, and how you lay out a logical decomposition on top of that physical layer to give this sense of autonomy — that's the part that we've missed so far. It's been one logical component, which is the lake or the warehouse.
Danilo Sato:
Yeah. One of the things from that data mesh approach that's really good is bringing that data product thinking — think of data as a product, with the team owning the quality of the data. Because that's another problem: once we move it somewhere else, it's not my data anymore. And then someone else moves it somewhere else, and three layers later no one understands the data anymore, the quality is bad, we complain, and we have to build projects to try to improve the quality. With the data mesh approach, it's more like thinking of data as a product. You have a product team that will be responsible for that quality. They advertise to everyone what the data is and the ways you can access it: is it through a stream — we publish some events here that you can consume — or, if you need a historical data set, here are some other ways you can consume it. Making that accessible and easy to explore.
It's really good. It's a merge of systems architecture thinking coming into the data space that, as I said, wasn't there before. A lot of data architecture was more about the structuring of the data — the schema modeling, the data modeling. But I feel architectural thinking asks: how can we split responsibilities within the data domains as well? How do they align to our core domains of the business? We've got transactional systems and data products that live together in those domains and expose those data sets. It's a much richer kind of ecosystem that you have to design for. And I think that's evolving; it's converging as well. That's one of the trends that we're seeing.
Mike Mason:
I have a question about something with data mesh, if it's okay to dwell on the topic for a bit longer. So with data as a product, you're creating a data set that others can consume. One of the traditional problems with data is that people start using it and then they expect it to be the same forever, and they get upset when you change it or mess with it or do anything different. Does data mesh speak to that in any way, and have a strategy for helping people not get quite so stuck on the data that they're receiving?
Zhamak Dehghani:
In that conversation we just had around data as a product, one of the attributes of this data being a product is providing some level of contract, and supporting that contract moving forward is baked into the interface of that data. We often have a lot of conversations around, "What is a data product? What are these nodes on this mesh that represent the data?" And it's not just the data itself. It's data, code, and then interfaces to get that data: code that is actually processing, generating and maintaining that data, interfaces to get access to it, and ultimately the data is the content that we are sharing. And those interfaces can and should apply the lessons that we learned designing services and designing APIs — not necessarily to abstract the data behind another layer, but to support forward compatibility through indirection, for example.
So having schemas as part of those interfaces that define what the schema is, what the version is. Or the URL that gets you to that underlying file. So you may still provide file-based access or SQL database table access to the end user, but you will provide that through an interface that gives you a level of indirection: this URL will take you to all the information about the schema, what version of the data it is, and where you can get it. And if that version changes, then that URL will change — it will take you somewhere else. So you have the ability to gracefully change the data and gracefully retire the old versions if you don't want to support them, which is essentially what we did in the microservices world, or the services world.
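To make the indirection idea concrete, here is a minimal, hypothetical sketch of a data product descriptor: a stable entry point that tells consumers the current schema version and where that version of the data actually lives, so nobody is coupled to a physical file or table. Every name and URL below is invented for illustration.

```python
# Sketch of a data product "descriptor" providing indirection between consumers
# and the physical location/version of the data.
from dataclasses import dataclass

@dataclass
class DataProductDescriptor:
    name: str
    schema_version: str
    schema_url: str     # where the schema for this version is published
    data_url: str       # where this version of the data can be fetched

CURRENT = DataProductDescriptor(
    name="orders.daily-summary",
    schema_version="2.1.0",
    schema_url="https://example.internal/data-products/orders/schema/2.1.0",
    data_url="s3://example-bucket/orders/daily-summary/v2/",
)

def resolve(product_name: str) -> DataProductDescriptor:
    """Consumers always call this; when a new version ships, only the descriptor changes."""
    assert product_name == CURRENT.name
    return CURRENT

print(resolve("orders.daily-summary"))
```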
David Colls:
Said it better than I could. Obviously we can't insist on data that doesn't change; we need to be able to work with data that changes. So the argument that the data shouldn't change — we can't work with that. And if we accept that things change, well, we've got established patterns for the services provided by microservices changing; the services provided by a data product will also change over time, and we have established patterns for tracking that. I guess data mesh calls out a number of different decompositions of continuous delivery: CD for the infrastructure, CD for the software, CD for the data and CD for the ML models as well. And so for your CD for data, consider that as being able to track all the changes that the data has been through and being able to reproduce that if required.
And so this allows us to deal with changing data. And much like we can have contract testing for services, we can have contract testing for data. It becomes slightly less deterministic — we might be looking at distributions — but at the same time, we can set expectations around what you expect of a data set as a consumer, or as a producer of that data set as a product. And that can feed into the definition of the data as a product.
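Here is a rough sketch of what a distribution-style contract test for a data set might look like: instead of exact values, the expectations are structural checks plus tolerances on rates and ranges. Column names and bounds are invented; libraries such as Great Expectations cover this ground more fully in practice.

```python
# Sketch of a consumer-side contract check on a batch of data:
# structural expectations plus distributional tolerances, not exact matches.
import pandas as pd

def check_orders_contract(df: pd.DataFrame) -> list[str]:
    failures = []
    # Structural expectation: required columns are present.
    for col in ("order_id", "amount", "country"):
        if col not in df.columns:
            failures.append(f"missing column: {col}")
            return failures
    # Distributional expectations: tolerances rather than exact values.
    if df["amount"].isna().mean() > 0.01:
        failures.append("more than 1% of amounts are null")
    if not (5.0 <= df["amount"].mean() <= 500.0):
        failures.append("mean order amount outside expected range")
    return failures

batch = pd.DataFrame({
    "order_id": [1, 2, 3],
    "amount": [20.0, 35.5, 12.0],
    "country": ["UK", "AU", "BR"],
})
print(check_orders_contract(batch) or "contract satisfied")
```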
Zhamak Dehghani:
I'm really excited about this whole new category of tests that we would write to decide where you are on that spectrum of integrity or quality. It's not as binary as perhaps services and capabilities; as you mentioned, it's more of a distribution analysis, and it's a spectrum. These are things that still need to be evolved and, I think, defined within the data mesh architecture. I think we've figured out the principles and some of the architectural patterns, but we still have to figure out some of the details. For example, similarly to SLOs, or service level objectives — which usually give very clear objectives, measures or guarantees of quality for services: uptime and response time and delays and that sort of thing — when it comes to data, what does it mean to have a set of, I don't know, whatever the name is for SLOs, something that represents SLOs for datasets, so that we can give some confidence to the user and guarantee those?
David Colls:
Yeah, it's a somewhat unique challenge in that if the data characteristics change, the solution will keep working — it's just that it will produce more of those wrong answers, and that's less and less clear. That's right. Unless it crosses a discontinuity like that, and suddenly there's no toilet paper on the shelves, or suddenly a large number of pallets of toilet paper turn up unexpectedly.
Zhamak Dehghani:
I think, in that space of decentralization or distribution, we have another item on the radar called decentralized identity, which I was fascinated by. There isn't a whole lot of applied content yet — it's early days — but it might be an interesting one to cover, and how it's applicable here.
Danilo Sato:
The blip is interesting because it's talking about identity specifically: how can we have a decentralized identity, for either people or organizations, when you're in a decentralized architecture and there's no single source of truth? That's something a lot of traditional data management approaches insist on — we need to find a single source of truth for things — but in a distributed architecture maybe there isn't a single source of truth, yet we still want a way to interoperate with each other and federate those identities. So I think the blip is talking about some of the upcoming standards in the industry to try to formalize how that works. But I think that also brings up that other challenge we see with the data mesh kind of approach.
When trying to model these domains — not just identity, but lots of business concepts — they might have multiple representations within the organization. A good example is a customer: it's probably going to be represented across the whole of whatever your business is. You're going to have multiple views of that customer. And again, the industry keeps saying we need a single view of customer, but in a federated kind of architecture you want to support the different things you want to do with the customer, while still being able to bring that together somehow and make sense of how that richness has evolved across the organization.
Again, it's a push and pull between: do we model that in a central place and make that the single source — and then everyone is going to have to come here to get the customer information, and we're going to broker it and be the owners of that — or do we allow this richness to exist across the architecture, but come up with a way to federate that concept, so that if someone else later wants to join them, they can? In the support domain, maybe the customer view is the complaints we've got. But from the transactional side, these are the sales we've made with that customer, these are the products they like to buy, and things like that. You're allowed to have these multiple views of that core entity throughout the architecture, while at the same time being able to connect those dots and still join them when you need them.
Zhamak Dehghani:
Yeah. I think this decentralization has two interesting aspects to it. One is what you just described: these identities are spread across different domains, with maybe just some fields or different profiles or metadata associated with them in whatever domain they're in. In a marketing system, you have some information about your customers; in your order management, you have a different sort of metadata about your customer. And ultimately, I'm sure David would love to join all of this data to actually create a feature set, to train models to get toilet paper to them — we're going to abuse this toilet paper example, I'm so sorry, it's just stuck.
But then the other interesting aspect of it, I think, is that the entities themselves — the customers or the patients, those core entities — own and generate those identities that live across multiple domains and multiple organizations. And then they have a cryptographic way to authenticate that this is them, to prove who they are, which opens a whole other set of possibilities for sharing data across organizations — like your health data that you can share and move from one provider to another.
David Colls:
And even within organizations as well, when they're big enough. As Danilo said, we've had emergent, distributed identity — decentralized identity in the first sense that you described — as a result of different domains establishing their own customer management systems. And then we have this push to somehow centralize, to create a common identity, the single view of customer. Shouldn't we put that into the customer's hands? As you said, in the second interpretation of decentralized identity: does the customer want to be identified as the same customer holding multiple products with different parts of the organization, for instance, or do they want to keep them separate? And this also comes back to one of those questions about the cost of different types of mistakes for predictive models in a single-view-of-customer scenario.
There are different costs for accidentally over-matching and accidentally under-matching customers in that scenario. So if you have decentralized identity as a deliberate architectural construct that puts that power back in the hands of the customer, but in a way that makes it easy for organizations to provide services to the different identities that a customer chooses to adopt, that will be a real enabler for dealing with a lot of the complexity we're seeing as a result of the emergent decentralized identity phenomenon.
Mike Mason:
I'm curious, on a related note, where we're at with the ability to share data but then rescind that later. So there's this notion that I want to be able to share some data with you, but only until I decide I don't want to share it with you anymore — and to my naive mind, once you've given someone some data, well, they've kind of got it, right? Are we approaching mechanisms to be able to take that back and wrest control back again?
Danilo Sato:
I think so. Some of the regulations — GDPR, and I forget the name of the one in California, CCPA — are trying to give power back to consumers: the right to be forgotten, the right to know what data you have about me, and, if I choose to, to ask that you forget about it. So there is at least some progress, I think, being made on the regulation side. But I agree with you — as a technologist, I'm skeptical. Unless we have some guarantee or evidence that the data has actually been removed, you can never be sure that they deleted all the information they had, and a lot of systems are not even built to be able to do that. It might be a big effort just to be able to provide that feature.
Mike Mason:
Yeah, it's not like an episode of Mission Impossible where the tape will self-destruct in 10 seconds. You can't really do that with data, so it's more of a policy and regulation thing than a technical one at this stage.
Danilo Sato:
Yeah, and I think companies are starting to, especially with this regulation, build that discipline and build the tools to be able to do that. But how well the implementation goes remains to be seen, and as I said, it's probably going to require effort to enable systems to have that as a feature.
Zhamak Dehghani:
Yeah, I mean, until data sovereignty becomes a real thing — where you own your identity and your data, you generate the keys that represent you, and you can track and destroy them — I think it's just [inaudible 00:54:04] regulation and some band-aid to enforce those regulations when somebody pulls the trigger. It's not a by-default state of the system. And I think when we get to that point, we'll have a very different model of economy where things look very different. If we own our data, and if we can pull that data back or share it by choice, then we wouldn't have a free internet that we pay for with ads and purchases — but I think that's a different episode to talk about.
David Colls:
Yeah, I think, it's being driven by regulations at the moment, but as Danilo said, I think we're seeing more provision in the technology to be able to surgically delete primary data products and all the derived data products within an organization as well.
Danilo Sato:
I don't want to derail the conversation too much, but one pet peeve of mine: a lot of the implementation is that now every website you go to asks you about cookies and to give consent, and there are like five pop-ups before you can get to do anything. From a user experience perspective, that's horrible, and it basically trains people to ignore them and just accept, which goes back to what it was before. So we put in a lot of effort, but we didn't think about the customer experience of giving consent about how the data is going to be used; we implemented it in such a bad way that we trained people to ignore it. That's a really bad outcome, even though the original intent might have been good.
Mike Mason:
I mean, imagine being a company that has spent millions of euros implementing GDPR only to have every single person just click through it and not read it, and for it to actually be a piece of legislation that goes nowhere. I'm supportive of things like GDPR, by the way, but you can see how organizations would be upset at the expense they had gone to for apparently little gain.
Zhamak Dehghani:
Perfect. So, just to recap, we talked about the whole value stream of data gathering, collection and sharing; then autonomous teams, both using the data as well as putting the data into action through machine learning; and then a continuous and iterative way of getting those machine learning models into production, observing them and using them. When we think about this full value stream, to me it feels like one continuum of interoperable processes and systems. But when we look at the industry and look at the tools, it feels like we either have a very data-centric view of the world — it's all about pipelines and the data — or we have a very machine learning and AI-centric view of the world, which is: we get the data, we put it in a feature store, everything we use comes from there, and we generate APIs or another set of data.
There are some shifts that feel like they move this lens across the spectrum. For example, we have a blip on the Radar, BigQuery ML, that says, "Well, on this spectrum, let's bring the machine learning to the data" — so we just run machine learning within the context of SQL, where the data exists. I wonder how you see that. What's your point of view on that spectrum, the shifts and tooling you've seen in the industry, and do you have any perspective on BigQuery ML, which is an example of that on our Radar?
Danilo Sato:
I think that, specifically on BigQuery ML, it's a way to... I see a lot of tools making machine learning more accessible, and that's a good example, right? Maybe you're a data analyst trying to learn more about data science and create models, and you're very used to working in SQL, with BigQuery as a good tool to handle large data sets — so they're trying to make it more accessible by bringing the machine learning tool set to an environment that's more familiar to you. There are a lot of tools — I don't know if we have it on the Radar, but people talk about AutoML — trying to build platforms and systems that maybe do some level of the training or the feature engineering for you, and then you assess the model, to make it more palatable for people to start experimenting and building models. So those definitely help us bridge that gap between the two and make things more accessible.
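For readers unfamiliar with the "bring ML to where the data is" idea, here is a rough sketch of what it looks like with BigQuery ML: the model is trained and queried with SQL statements that run next to the data, issued here via the Python client. The dataset, table and column names are hypothetical, and the details assume a project and credentials are already configured.

```python
# Sketch: training and using a model inside the data warehouse with BigQuery ML.
from google.cloud import bigquery

client = bigquery.Client()  # assumes credentials and a default project are configured

train_sql = """
CREATE OR REPLACE MODEL `my_dataset.churn_model`
OPTIONS (model_type = 'logistic_reg', input_label_cols = ['churned']) AS
SELECT tenure_months, monthly_spend, churned
FROM `my_dataset.customers`
"""
client.query(train_sql).result()  # training runs inside BigQuery; no data leaves the warehouse

predict_sql = """
SELECT customer_id, predicted_churned
FROM ML.PREDICT(MODEL `my_dataset.churn_model`,
                (SELECT customer_id, tenure_months, monthly_spend
                 FROM `my_dataset.customers`))
"""
for row in client.query(predict_sql).result():
    print(row.customer_id, row.predicted_churned)
```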
David Colls:
Yeah, we've seen a range of different tools being used in different scenarios as well. DataRobot is another tool that we've seen deployed. It has some components of AutoML, in that you present training data, and some components of CD4ML, in that it will then allow you to deploy a model, track the performance of that model in production, and potentially retrain if required. But in terms of the discipline of continuing to evolve solutions in a customized way, you still need to be able to do that engineering across the whole stack. So the features that feed into DataRobot may need some significant re-engineering if you were to make a slight change in the way the model does its predictions, or how that's presented to customers, for instance. But at the same time, it exposes a lot of capabilities to a wider group of users within the organization.
And so I think the challenge becomes the balance between how repeatable this use case is, what the risks of getting it wrong are, and where we sit on — to go back to an old software metaphor — the utility-versus-strategic dichotomy. Is this a utility capability, being able to do some forecasting or some natural language processing in a way that makes analysts in an organization more productive — a utility thing like email or docs? Or is it a strategic thing that's a business differentiator, in which case you're not necessarily going to get an off-the-shelf solution, but you actually need to be able to line up all of your organizational capabilities behind a bespoke technology development effort.
Zhamak Dehghani:
It looks like we are back where we started: CD4ML, or CD for data. So you see, it's all one continuum, and if there are tools that try to abstract complexity and make some of the steps of this process more accessible, as long as they don't violate the principles of continuous delivery, that's something we really embrace.
David Colls:
Absolutely. We frequently use off-the-shelf machine learning as a service, or pre-trained models, in that first iteration of an ML solution. That can be a really easy way to get started, but it doesn't provide that strategic differentiation, so it depends on where you're going to sit on that spectrum. You can validate that it's technically achievable and start to get some sense of the value of deploying a model without a lot of sophisticated engineering behind it, and then iteratively improve the performance of that model and the differentiation, as required.
Danilo Sato:
To your point about convergence, absolutely. Even after the model's in production, we want to have the monitoring and observability in place: how is it actually performing against real data? And that creates more data that you have to manage as well. So it's really hard to disassociate those two things — data and ML, or AI. They are very intertwined, and at least from a strategy point of view for any company, I'd say you need to think about them together and how they connect. There are areas of specialty, of course, and people might be more interested in one area than the other, but don't treat them as silos and totally separate, because they are intertwined.
Zhamak Dehghani:
I absolutely echo what you just said there. I see these platform wars within companies: "Oh, this is the data platform." "No, this is the ML platform." And you're right — while there are specialties and specialized capabilities, there's an underlying set of common platform capabilities that needs to cater to that continuum of the iterative data-to-ML-and-back-to-data flow. Perfect. So, we could talk to you for hours and hours, but it's very late for Danilo and David is just about to start his day, so we're going to have to wrap up our discussion, maybe with some thoughts or reflections from each of you. What do you expect to see in the industry? What sort of change do you predict? Anything you'd like to leave the audience with.
David Colls:
I'd like to see more of this ML work move out of just the technical solution and into more of a product design mindset: considering what product features would benefit from leveraging ML, or what whole product classes would benefit from leveraging ML, and then how to approach building those digital products in a safe way. Safe innovation — being able to anticipate the failure modes and the biases that may be introduced in a solution, but also being able to make small incremental changes, with all the technology foundations that support that, to bring these products to market more effectively. In the way that we've developed that maturity in basic digital product development, I'd like to see us bring that to digital products enabled by data and machine learning as well.
Danilo Sato:
Yeah, for me, what I'd like to see is maturing of the tools to support all these things we talked about today, running all the way from how we manage data, to how we train the models, to how we assess the quality and get those into production. I feel like there's a Cambrian explosion of tools now, where everyone is trying to solve the problem again. So the problem is there and there are lots of contenders, but I don't think a clear standard or winner has emerged yet. I'd want us to move forward as an industry: exploring all of this is very good, because that's what gives us more experience and perspective, but I'd like that standard body of knowledge to emerge and maybe the tools to mature to a point where we can implement this more easily, without having to go back to the principles and try to apply the theory in a new environment. Ideally, the tooling would be mature enough that we can just get things going without having to build everything from scratch.
Zhamak Dehghani:
Yeah, and I think one key attribute of that tooling is ecosystem thinking: there's not going to be one system to solve it all, so how can we have standardization, or technology that allows that ecosystem play between different tools? Well, thank you so much, both of you, for joining us. Thanks Mike, for asking all the good questions.
David Colls:
Thank you.
Danilo Sato:
Thank you for inviting us. It was a good discussion.
Mike Mason:
Yeah, thank you very much for having us.