Brief summary
In the past decade, NoSQL has gone from being an interesting experiment to becoming business critical. We catch up with Martin Fowler and Pramod Sadalage, co-authors of NoSQL Distilled, to understand why the database technology took off, where it has proven its capabilities in the enterprise, and how thinking around issues such as persistence models has evolved.
Full transcript
Rebecca Parsons: Hello, everyone. Welcome to the Thoughtworks Technology Podcast. My name is Rebecca Parsons. I'm one of your co-hosts, and I'm here with Zhamak Dehghani. Hello, Zhamak.
Zhamak Dehghani: Hi. Rebecca. Hello, everyone.
Rebecca: We are joined today by two guests, Martin Fowler and Pramod Sadalage, both of whom have been with us before. Today, they're going to be talking about a book that was published just under 10 years ago called NoSQL Distilled. Martin, Pramod, thank you for being here and welcome.
Martin Fowler: Happy to be here.
Pramod Sadalage: Thank you, Rebecca and Zhamak.
Rebecca: Let's start with something simple. What really does NoSQL mean? What is the scope of what you're talking about in this NoSQL Distilled book?
Martin: Well, starting with what NoSQL means. Originally, it was no more than a hashtag for a meetup. A bunch of people wanting to get together set up a meeting, wanted some short hashtag for it, and they picked #NoSQL. But rapidly, that turned into a whole movement of people, around the late 2000s, exploring storing data in something other than a relational database.
I can't remember who was at the very first meetup, but it was a number of people, including people associated with databases, some that we're still familiar with and some that died a death. Then other people got in on the scene as well and said, "Hey, we're interested in non-relational databases and stuff." That was the whole NoSQL thing. There was controversy at the time about whether it meant no SQL, as in a rejection of SQL, or N-O SQL, meaning not only SQL, which doesn't really matter, because most people who talked about a NoSQL database were explicitly meaning something other than relational.
What we were interested in doing with the book was just trying to explain what we saw going on in that space, particularly from the background of people who were actually quite comfortable with relational technology. Pramod knows more about databases than most people who've ever lived, I think. I've always been comfortable with relational databases, but we're also aware of the limitations, and the fact that sometimes there was an alternative we could consider. What we wanted to do was provide a brief guide to what that space looked like at that time.
Pramod: That's a good summary, Martin. There were also a bunch of projects we were doing at that time, if I remember correctly. We were exploring; MongoDB had just come around, there was GraphDB, Neo4j, and we had some projects and we were thinking, "What are the design trade-offs of using one or the other? What situations would drive you to use something like a document database or something like a graph database, and how would you make those choices?" Would you just give up on relational databases and pick, say, some document database, or is there some use case where I would use both?
One would probably be for storing financial transactions, and the other probably to store some content that just shows up on the webpage, and things like that. That's where it started. Then the whole notion of NoSQL is just half of the title. The other half is the Polyglot Persistence angle: should I just stick with one type of database, or use more than one in a given enterprise or in a given application? That's what we were trying to explore and show. At that time, there were four major types: key-value, document, column-family and graph databases. We explored those four different types of databases, how you would use them and when you would use them, and things like that.
Martin: I would say the Polyglot Persistence is really the key point. The view that was pretty much commonly held before then was that whatever data you have, you stick it in a relational database: your company's bought Oracle or DB2 or whatever, so stick everything in there. The idea instead is that you should think about what would be the best data store for the problem. What's the right data model? The relational model is a great data model for lots of data, but not for all of it. What is the right fit of data model? Then, what's my best access pattern?
Particularly, at that time, as we were shifting into a world where instead of a single big server handling all our data, we were having this idea of lots and lots of servers often in unreliable situations, a much more distributed situation. Relational databases certainly at that time were not built to go across a distributed unreliable network. Many of these NoSQL databases were explicitly designed for that kind of situation.
It requires people to think much more about what we want to do with data and also the fact that even within a single project scope, you might have some data you might want to store some way and some data you might want to store another way, because of how you want to model it and how you want to access it, how available it has to be, what your consistency requirements are, et cetera.
Rebecca: Well, that's one of the things that I always felt when you look at the way persistence, in general, has matured or evolved. There were object databases, but instead of object databases actually becoming legitimate contenders in their own right, basically everybody just said, "Okay, let's write an object-relational mapper because, of course, our data still has to end up in a relational database." It feels like something shifted that thinking away from, "Okay, I'll just put a layer between how I want to represent my data and the relational model so I can continue to use that."
What changed was that need for massively distributed databases and the credibility, if you will, that came from many of the internet giants saying, "Actually, no. I'm not going to use the relational model to store my stuff because I can't get the level of scale." How much do you think the credibility that came along with the Googles and the Amazons looking at a different model mattered? Do you think that was why this came to be, or was it really just the technological limitations of not being able to get that scale across the networks, et cetera?
Pramod: I would say there were maybe two different things that came around at the same time. One was the Googles and the Amazons and Yahoos and related companies putting it on paper, showing how these things can be done: the whole Hadoop and MapReduce and all the other related stuff, the papers that were written about their file systems and things like that. There was also the CAP theorem, famously articulated around that time, which made you think about which two of consistency, availability and partition tolerance you care about, and things like that.
The other, I would say, was a subtle design revolution in some ways: the notion of don't reach into other people's databases. That was one of the reasons why object databases and things didn't really take off: they were all talking to the same thing, and people were trying to see how other systems could access those databases and things like that. In contrast to the shared-database style of enterprise integration that Gregor Hohpe talks about in his book, the concept was that everything can have its own data store and integrate via APIs, not by talking to a database. Once you take on that philosophy, it liberates you from the concept of having to make this database readable for everyone.
Instead, it could just be the API that is readable for everyone, and I can use whatever I want behind the scenes. It could be an object database, or a graph database, or something else. As long as I can provide the data using an API, then I'm free to manage that data however I want to. I think the big companies, the internet giants, actually exploited that concept to get to a point where they could give out data using an API instead of other services reaching into the databases directly.
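A minimal sketch of the idea Pramod describes here, assuming a hypothetical order service written in Python with Flask; the endpoint, fields and in-memory store are invented for illustration, not anything from the episode:

```python
# Sketch: a service that owns its own data store and shares data only through
# an API, never by letting other teams reach into its database directly.
from flask import Flask, jsonify, abort

app = Flask(__name__)

# Internal storage detail: could be a document store, a graph database, or,
# as here, just an in-memory dict. Consumers never see or depend on it.
_orders = {"1001": {"order_id": "1001", "status": "shipped", "total": 42.50}}

@app.route("/orders/<order_id>")
def get_order(order_id):
    order = _orders.get(order_id)
    if order is None:
        abort(404)
    return jsonify(order)  # the API contract is what consumers integrate with

if __name__ == "__main__":
    app.run(port=5000)
```

The storage behind the endpoint can change without consumers noticing, which is exactly the freedom being described.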
Because I think one of the powers of a relational database is that if anybody actually wants to explore your database, they can, without knowing the metadata, without knowing your schema beforehand. They can literally go explore and find out things on their own, and that's a power of the relational database, but at the same time, it also stops you from doing other things at the data layer. Martin?
Martin: Yes, when you say that, it reminds me that I was at a Foo Camp workshop in the mid-2000s and Jeff Bezos was there. He was less godlike then than he is now, of course, but I remember him distinctly saying something like, "Oh, for 80%, 90% of what we do at Amazon, DBM is fine, we don't need Oracle." DBM, for those who don't know, was a really basic key-value store that has come with Unix operating systems since about the year dot.
If you think about it, for a lot of what organizations like that do, a key-value store is actually, most of the time, what you want. If you want to look up an item in the catalog, you go to your key-value store, get the key, go to the item in the catalog. Want to find out about an order? Get the order number, look it up in a key-value store. In fact, one of our best examples of using NoSQL databases is with a big Amazon-like retailer in Europe, where we used Mongo and it was very effective in that situation. Because, again, a key-value store in that situation works really well for the problem you have to hand.
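A rough illustration of Martin's DBM point, using Python's standard dbm module; the keys and the catalog payload are made-up examples, but the shape of the lookup is the whole idea:

```python
# A plain key-value store is often all you need for "get the item for this key".
import dbm
import json

with dbm.open("catalog", "c") as db:   # "c" creates the file if it doesn't exist
    # Store a catalog entry under its SKU...
    db["sku-42"] = json.dumps({"name": "Kettle", "price": 29.99})
    # ...and getting it back is a single lookup by key, no query planning needed.
    item = json.loads(db["sku-42"])
    print(item["name"], item["price"])
```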
I also think it's no coincidence, looking from the outside, that one of the things Amazon is well known for doing was, of course, hiding its data storage behind services and going with APIs and service-level integration rather than using the database as an integration mechanism. In fact, one of the things that pretty much all of our senior technologists at Thoughtworks have been battling against the whole time I've been here is the problems that occur when people use a database as an integration mechanism. Shared databases are really a terrible integration route, but unfortunately, that's a route a lot of organizations went down during the '90s and 2000s.
Zhamak: Perhaps another underlying trend that hasn't stopped or slowed down has been digitalization. Every touchpoint, every process, everything that we do is turning those interactions into data, and data of a very diverse nature as well. The scale and diversity of the data has evolved in a way that we can't just put everything in a relational structure, and we started exploring with nested trees and documents and graphs and relationships and time series.
This has grown into so many different fragments of non-relational expressions of data. I really liked, Martin, how you separated the two concerns there. One was the storage, how we are storing and modeling the information in different modes, the Polyglot nature of storage. Then also the multimodal access, how we are actually accessing the data.
If I may refer back to the title that you chose, NoSQL: I think we continue to grow different modes or modalities of storage and modeling of the data, but it seems like we're all also, at the same time, converging back on SQL as an interface. I wonder what your observations are. Is SQL just this evergreen way of accessing data, and what does that mean? What is that telling us?
Pramod: At least the way I think about it, it's telling us that SQL is this ubiquitous language. Beyond the whole notion of storing data in a relational model, SQL was always a query language that was very ubiquitous and that anybody could learn. In one of my talks, I showed my daughter typing on a keyboard and I said she was typing "select stuff from some table", because if nothing else, people can at least do a select from a table without much training.
There are so many tools that support SQL, for reporting, analytics and a bunch of other things. What's happening is that a lot of database providers, either relational or non-relational, are trying to provide that interface so that it's easy to use, while underneath, the storage may be something else. CQL, the Cassandra Query Language, is not necessarily standard SQL, but it tries to mimic SQL so that it's easy for people to pick up, while the storage is still the column-family storage that it has.
Similarly, there are lots of other products that have come about that try to mimic the ease of use of SQL while still providing the flexibility of storage and the flexibility of distributing your data at the same time. I think they're trying to make the transition easier for a developer or a data analyst moving to a different product or technology, while at the same time giving them options on storage and things like that.
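As a small, hypothetical sketch of that point, using the DataStax Python driver for Cassandra: the query reads almost like SQL, while the storage underneath is still Cassandra's column-family model. The host, keyspace, table and column names are assumptions for illustration.

```python
# SQL-looking CQL over a column-family store (assumes a local cluster with a
# "shop" keyspace and an "orders" table).
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])
session = cluster.connect("shop")

# CQL deliberately mimics SQL, so this reads like a relational query...
rows = session.execute(
    "SELECT order_id, status, total FROM orders WHERE customer_id = %s",
    ("cust-1",),
)
# ...even though the data is stored and partitioned very differently underneath.
for row in rows:
    print(row.order_id, row.status, row.total)

cluster.shutdown()
```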
Zhamak: SQL is simple, but how far will it take you? At the end of the day, it's an algebraic language; we're just running algebra. There's this transition that we're seeing from mathematics to computation, from writing these kinds of statements that can get so nested and hairy and hard to understand, to writing algorithms and programs that process the data. SQL just seems to be a nice tool to use, but it won't be the only tool, because with that simplicity comes a certain level of limitation as well.
Martin: I have mixed feelings about this, actually. I think SQL is very good at handling a certain shape of query, but it breaks down really badly as you begin to move outside its area, particularly when you start having nested queries and things like that. SQL can get horrendously complicated. People who are good programmers struggle with complicated SQL expressions. One of the things that really reminds me of this is that I've been doing a bit of data analytics using the R programming system. R has this library that is pretty vital to using R properly, called dplyr, D-P-L-Y-R, which basically allows you to build pipelines of operations on tables.
Some of those operations are familiar to any programmer who does list processing: it has filters and maps and reduces, but it also allows you to do joins. You can construct really powerful expressions by using pipelines in this way. I find it way easier to work with than SQL for more complicated cases. If I'm doing a simple filter-and-project operation, yes, SQL can work quite nicely, and maybe one level of grouping, that's not too bad, but when you do anything more complicated, then things start breaking apart, and the pipeline approach begins to be a lot more attractive.
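dplyr is an R library, but the same pipeline style can be sketched in Python with pandas method chaining; this is an analogue of the approach Martin describes, not what he used, and the data and column names are invented.

```python
# filter -> join -> group -> aggregate written as a readable chain of steps,
# rather than one nested SQL statement.
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1, 2, 3],
    "customer_id": ["a", "a", "b"],
    "total": [10.0, 25.0, 40.0],
})
customers = pd.DataFrame({"customer_id": ["a", "b"], "country": ["DE", "FR"]})

summary = (
    orders[orders["total"] > 15]                  # filter (like WHERE)
    .merge(customers, on="customer_id")           # join
    .groupby("country", as_index=False)["total"]  # group
    .sum()                                        # aggregate
)
print(summary)
```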
What's true in both those cases is that you're operating on relational data. If you're operating on what are effectively tables, then the mechanisms, and the way you can combine data from different places by using joins, is a really nice mental model to operate with. Of course, when you're not working with something that fits nicely into tables, like hierarchic structures, then suddenly things again start flying apart. One thing that relational databases have never been very good at is dealing with hierarchies, like parts breakdowns and things like that. They struggle because the model doesn't fit the data.
I think a large part of this is recognizing where the model fits the data, as well as whether you're using SQL or some other mechanism in order to assemble that data. Of course, we have other query languages now, like GraphQL and things of that kind. And how much data access these days is key-value lookup? Because that's still one of our best ways to get at a piece of data.
Rebecca: The book was published in late 2012/2013. It's been around for a while and the persistence landscape has changed quite a bit. What do you think is different now, and how might that affect the trade-offs people are making about where to persist a particular piece of information, and how to go about retrieving it and using it?
Pramod: Back in maybe 2010 to '12, before the book was actually written, we were dealing with this stuff. Cloud providers had relational databases as a service, but not very mature ones. At the same time, we had the NoSQL database providers, also not in the cloud, I might add. The choices were vast and there were many trade-offs to talk about, like: do I install this myself? Do I run this myself? Is it in a data center, my own data center, or do I run it on a cloud? Similarly, on the relational side in the cloud, there were also lots of questions about whether it was mature enough, and things like that.
Since then, the cloud providers have come up with a bunch of options, either in the relational world or in the non-relational world. If you look at Azure, there's Cosmos, available as a column store or a graph store and that kind of stuff. On the AWS side, we have Aurora and Redshift and Neptune, different types of databases available, while the newer database providers have also come up with database as a service. If you want a graph, you can use Aura, I think that's what it's called, now available as a database as a service, and the same thing with MongoDB Atlas.
Similarly, for larger data sets nowadays, we have Snowflake available and things like that. On the one hand, this increases the number of choices, like I say. At the same time, I think people are also thinking about the trade-offs in a much different way: I can get the same scalability and things like that, and it comes as a service, so the default choice now, if you're doing something new, is to go to a cloud provider. Whether you want SQL or NoSQL, relational or non-relational, that's a bit of a design choice, and things like that.
Like Martin was saying, what do I want to store? Is it a key-value store, or do I just want to store a document? Or just use S3 as a storage layer and put a processing engine on top? All these choices are making it easier to think about what it is that we want to do. The other thing available nowadays is much more resilient SQL stores; CockroachDB is a very good example of that. Even in AWS, you could use MySQL with hyperscale or, I think, globally available tables and that kind of stuff, where you can say, if I use this database, I want this table to be globally available, and they take care of all the distribution and all that kind of stuff.
That gives many more choices, even if you stick with a given storage pattern, or a given relational or non-relational model. I would say the choices have increased enormously, but at the same time, people are thinking within the same cloud provider. If I'm using AWS for all of my needs, I'm probably sticking within the AWS ecosystem to get all my work done. That probably reduces the number of choices, but it also makes the design choices a little easier.
Zhamak: With that diversity, of course, there's complexity in making decisions: you have more choices, and you can be paralyzed making the right choice. I wonder what's going on with the shareability of data and interoperability. There's one decision, which we talked about with microservices, that you choose the storage of your choice to put the data in, one that suits the structure and model of your data, and then expose APIs.
The APIs would allow sharing the data; you don't have to integrate through the database. I wonder, are we getting better at exposing and externalizing data and sharing data across these very diverse models? Or are we actually in trouble, and will we end up converging on one model because it's just so hard to share different models of data? What has your experience been on the impact of diversity and Polyglot storage on data sharing, and some of the trends perhaps that we're seeing?
Pramod: Sure. Nowadays, whenever you think about platforms, the people that are building platforms are talking about these kinds of things. There's usually a lot of thought on: what is my data catalog? How do I want to build a data catalog? What are my API standards? How am I going to share the data? Is it JSON, Avro or some other format? There's a lot of upfront thought being put into how do I want to share data, and what are the standards around that.
Like Martin famously says, just because it's a schemaless database doesn't mean there is no schema. The schema is still there; you have to figure out how to share that schema and come up with standards around it. I see, in the architecture world, a lot of thought being put into how do I share this? How do I create standards? How do I expose this data? Things like that.
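A tiny sketch of that point in Python, with an invented document shape: even against a schemaless store, the code that reads a document embodies a schema, so it pays to make that expectation explicit and shareable.

```python
# "Schemaless doesn't mean no schema": the reader expects certain fields, so
# the schema lives in the application. The document shape here is invented.
from dataclasses import dataclass

@dataclass
class Order:
    order_id: str
    status: str
    total: float

def parse_order(doc: dict) -> Order:
    # This function *is* the implicit schema; publishing it (or an equivalent
    # JSON Schema / Avro definition) is how other teams learn the contract.
    return Order(
        order_id=doc["order_id"],
        status=doc.get("status", "unknown"),
        total=float(doc["total"]),
    )

print(parse_order({"order_id": "1001", "total": "42.5"}))
```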
I think there is work going on. At the same time, some of it is not mature, and it's very easy to fall into the trap of: I build my own stuff, and then other teams have no idea how to figure out what data I'm giving out, what the format is, what the aggregates are and at what level they sit, and things like that. There is still a lot of work to be done.
I think at the same time, people are putting in the effort to create data catalogs, make it easier to share data, make it available, or even create events when things happen so that other consumers can consume the events without having to go and ask. Stuff like that is happening.
Zhamak: It looks like there is a trend: as we move toward using data beyond operational or transactional applications, using data for analytics and training machine learning models, data sharing becomes more and more important beyond your operational RESTful APIs. As you said, enabling the discoverability of the data and thinking upfront about standardized ways of sharing data. Some of these standards are being adopted, like Parquet and Avro and formats like that, but there is definitely space where we can do better.
Pramod: Yes. Especially the duality that the operational data store stores the data and does things, but it's like a state machine, and what analytics needs is every change in state. It doesn't need the end state; it needs every change in state. How do I give out that data? Do I keep it inside my API and, when someone asks, give them an object and all its state changes, or do I tell someone, "Hey, this object changed to this state and I only keep the latest state"? That's a decision someone has to make, and that decision then leads down the path of: does the API keep the history, or does the API emit the history as events and always keep just the current copy?
That, I think, is a good point to bring out about design trade-offs: where do I make that decision? That decision probably leads you to think about the concept of object history and event history being maintained in the operational store, or the operational system offloading that to some other place, with that system keeping track of everything else associated with that object.
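A minimal sketch of that design choice in Python, with invented event and record shapes: the operational store keeps only the latest state and emits each change as an event, leaving the history to whoever consumes the events.

```python
# Keep only the current state operationally, but emit every state change as an
# event so an analytics consumer can reconstruct the full history.
from datetime import datetime, timezone

current_state = {}   # operational view: latest state per order
event_log = []       # what analytics wants: every change of state

def change_status(order_id: str, new_status: str) -> None:
    event = {
        "order_id": order_id,
        "status": new_status,
        "at": datetime.now(timezone.utc).isoformat(),
    }
    event_log.append(event)               # history goes to the event stream
    current_state[order_id] = new_status  # operational store keeps only "now"

change_status("1001", "placed")
change_status("1001", "shipped")
print(current_state)   # {'1001': 'shipped'}
print(len(event_log))  # 2 -- every change of state, not just the end state
```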
Rebecca: One of the other things that became much more of a topic of conversation, at least as we looked at these different persistence models, was this whole notion of eventual consistency. With relational databases, you didn't have to talk about eventual consistency. The data was either there or not. I wonder if you can talk a little bit about how these more nuanced conversations about eventual consistency have gone, and in particular, where you have some aspects of your application that, in fact, do have to have hard transactional boundaries, whereas others can deal more readily with a state of eventual consistency. Have we gotten better at dealing with that, or are people still afraid of this idea of eventual consistency?
Pramod: There are still discussions that happen. When we talk about eventual consistency, people are afraid of taking the choice back to the product people or back to the business and saying, "Hey, for five minutes this may not be consistent." Some teams are afraid of making that statement. Having said that, nowadays you can see higher usage of search engines, for example Elasticsearch or Solr-backed search engines, and people are okay that if I create something and immediately search, it may not show up, and things like that.
There is gradually a little bit of acceptance, I would say, of the fact that stuff may not show up straight away, and things like that. One of the good things I heard somewhere is that even humans deal with eventual consistency all the time. Not just computer systems; humans deal with it all the time. It's easier to convince people with human examples instead of just talking about system examples. That's what I think we should be doing as architects to convince people: humans are also eventually consistent, ultimately, and I think that is a good thing.
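To make the search example concrete, here is a hedged sketch in Python: after writing a document, a client polls an eventually consistent index until the write becomes visible or it gives up. The `index_document` and `search_index` functions are hypothetical stand-ins for whatever search engine is in use.

```python
# Living with eventual consistency: after a write, the search index may lag,
# so the reader retries briefly instead of assuming immediate visibility.
import time

def wait_until_visible(search_index, doc_id: str, attempts: int = 5, delay: float = 0.5) -> bool:
    """Poll the eventually consistent index until the document shows up."""
    for _ in range(attempts):
        if search_index(doc_id):   # not found yet? expected for a short while
            return True
        time.sleep(delay)          # back off and try again
    return False                   # caller decides how to handle the staleness

# Usage with any real engine: index_document(doc); wait_until_visible(search_index, doc_id)
```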
Martin: Yes. I've always argued that the trade-off between consistency and responsiveness, because it's usually that, is a business decision. You run into many situations where you've got to say, "Okay, how does the business want to respond? Do you want to allow a little inconsistency so that you can take that booking of a hotel room, even though it might be the last hotel room and it gets double booked? Do you want to accept it anyway and assume you'll deal with it later, or do you not?"
These are often business choices, not technical choices, and we have to deal with that as we deal with everything else: better communication with the business side. In many ways, eventual consistency has always been a feature of businesses; it's just not been acknowledged so much because it's been something outside the realm of the data. Now, as IT spreads itself more deeply into every part of a business's operation, we can't ignore that trade-off.
Zhamak: I know in your book you briefly touched on different modes of storage and different modes of modeling data. One of those is graph databases, which are very close to my heart. I love graph modeling. Can you share what your experience has been with graph databases and their applications, the ways of using and articulating graphs, and guide the audience on where to apply them?
Pramod: Sure. Graph databases are, in some ways, a little harder to understand. How do you model? Is everything a node, or is everything an edge? Should I put properties on an edge, or is the property a node? It's a tricky thing to model and it takes some learning; the learning curve for graph modeling is a little difficult. In the beginning, you may say, oh, all the higher-level entities are nodes and all the relationships between those things are edges.
Then at some point later, you figure out that, oh, I need properties on these edges. Then people start putting properties on the edges themselves, which is a little cumbersome and doesn't carry enough queryability on the graph side. Then eventually, as you mature, you start thinking about, oh, do those properties on the edges need to become nodes? Things like that.
I would encourage people to do one or two sample applications or POCs to figure out how much maturity you have in the modeling. Many times, once you get to a mature state, everything starts looking like a graph. That's the trick with graph databases: everything starts looking like a graph. At that point, you literally have to stop yourself: I don't want to convert my financial application into a graph database.
I think there are many ways. One of the good ways I generally think about it is: do I care about the interactions between the entities? Do I care about how these entities interrelate in their interactions? Customer A made a payment to customer B, and customer B also got a payment from customer C, and they were for products X, Y, Z, whatever. If you care about the interactions happening here, like what kind of product customer A bought, what kind of product customer B bought, and what the relationships are, maybe it's timing, maybe it's location. If you are interested in those relationships, it's good to think about them in graph terms.
If you don't care about that, if you only care about how many customers bought product A, then you probably don't need a graph. You have to think about those relationships between the entities, and whether you care about those relationships. Then I think graphs become more powerful, because you can traverse the graph, you can query the relationships and things like that.
As for relational databases, even though they are called relational, this is one of the things we talk about in the book and even in talks: the relations are not explicit. You cannot traverse relationships in a relational database; they have to be created, and you have to join tables to travel along them. You can't make queries on those relationships as such. Think about that when you are trying to decide on a graph database.
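A small, hypothetical sketch of the payments example as a graph query, written in Cypher via the Neo4j Python driver; the connection details, labels, relationship type and properties are assumptions for illustration.

```python
# Traverse the payment relationships directly, instead of reconstructing them
# with joins at query time.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

query = """
MATCH (a:Customer {id: $customer_id})-[p:PAID]->(b:Customer)
RETURN b.id AS payee, p.amount AS amount, p.at AS at
"""

with driver.session() as session:
    for record in session.run(query, customer_id="cust-a"):
        print(record["payee"], record["amount"], record["at"])

driver.close()
```

Here the relationship itself carries data (amount, time), which is exactly the kind of question that gets awkward when the relationship only exists implicitly through joins.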
Rebecca: Since the book was published, as we've talked about a little, additional cloud database technologies have come out. For somebody just getting into this, what's your advice on how to make sense of the landscape? Where should people start in thinking about getting a handle on the diversity of approaches for persistence and for data modeling and for querying? Where do you begin?
Pramod: Oh, that's a tough question, Rebecca. I would start at the basics. I think in the book we also mention this at the beginning: think about your domain and your design. What are your aggregates? How are you going to read these aggregates? How are you going to write these aggregates, and things like that? Sometimes that will lead you to: do I need a key-value store, do I need a document store, do I need a column store, do I need a graph, or do I need a time series database, and things like that.
Once you answer that question, then you should probably move on to what product to use for that particular storage model. If you arrive at the answer that I need a document database, then you can say, oh, do I need MongoDB? Or do I need Azure Cosmos? Or do I need ArangoDB, or whatever, right? I would approach it in those steps. Architecturally, what am I saving, or what am I trying to process? How am I going to query that persisted object?
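As a hedged illustration of that aggregate-first thinking: an order, with its line items nested, stored and read back as one document. This uses pymongo against a local MongoDB purely as an example product; the database, collection and field names are invented.

```python
# The whole order, line items included, is written and read as a single
# document: the aggregate is the unit you read and write together.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
orders = client.shop.orders

order = {
    "_id": "1001",
    "customer_id": "cust-a",
    "lines": [
        {"sku": "sku-42", "qty": 1, "price": 29.99},
        {"sku": "sku-7", "qty": 2, "price": 5.00},
    ],
}
orders.replace_one({"_id": "1001"}, order, upsert=True)  # write the whole aggregate
print(orders.find_one({"_id": "1001"}))                  # read it back in one go
```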
Otherwise, sometimes people miss that. They say, oh, I can use a key-value store here, and later on they say, oh, how do I index this value inside the key so I can query on the value part? Then they have probably misread the whole aggregate concept, and they need to rethink what kind of aggregate they need. Figure out what kind of model you need first, and then go from there. Many times, people are already on some cloud provider, so maybe start your search within that cloud provider, and if it doesn't give you what you want, then expand further. Martin, do you have a better perspective?
Martin: No, I would say much the same: understand what the shape of the data is and whether that leads you into one of the structures that you mentioned. Understand what the access patterns are: how many people are reading it, how many people are writing it, what kind of demands you have on it. That affects both your choice of the model and, obviously, the technologies.
Understand the technologies on the platform that you're on. If you're on a particular cloud platform, understand what's there and make sure you use the best of what you have. Again, if you hide your data storage behind APIs, you've got a relatively good range of flexibility that you can use to cope with changes, should you need to make changes later on, and also to use different data models for different purposes.
You can store data in, say, an aggregate-style database, like a document database, but then expose some of the data that's in the documents as tabular data for when people need to manipulate the tabular side, or in order to find out which documents they need to look at. There's a lot of power in tables; people like tables. You can see the way people will relentlessly fit into spreadsheets things that should never go inside spreadsheets, just because tables are a natural way people can see things. That's one of the advantages that the relational model has: the table is a way in which people naturally think to look at things.
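A rough sketch of that last idea in Python: keep the aggregate as a document, but derive a flattened tabular view for the people who think in tables. The document shape is invented; pandas' json_normalize does the flattening.

```python
# Documents stored as aggregates, with a table derived for tabular consumers:
# one row per order line, with order- and customer-level fields repeated.
import pandas as pd

documents = [
    {"order_id": "1001", "customer": {"id": "a", "country": "DE"},
     "lines": [{"sku": "sku-42", "qty": 1}, {"sku": "sku-7", "qty": 2}]},
    {"order_id": "1002", "customer": {"id": "b", "country": "FR"},
     "lines": [{"sku": "sku-42", "qty": 3}]},
]

table = pd.json_normalize(
    documents,
    record_path="lines",
    meta=["order_id", ["customer", "country"]],
)
print(table)
```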
Rebecca: Well, thank you both, Pramod, Martin, for joining us again. Thank you, Zhamak, for the insightful questions. I hope you've all enjoyed this discussion as much as we've enjoyed having this discussion. Thank you all.