Rebecca Parsons: Hello everyone. My name is Rebecca Parsons, and I'm one of your recurring co-hosts on the Thoughtworks Technology Podcast. I'm here today with Neal Ford, who is here not as a co-host but as a guest, as well as Mark Richards. We have a podcast from a while ago where I spoke to the two of them about their book, The Fundamentals of Software Architecture. They're here today to talk to us about their forthcoming book, Architecture: The Hard Parts.
Neal Ford: Wow, the guest chairs in the studio are much more comfortable than the podcaster chairs, so this is really a treat. Thanks for having us.
Mark Richards: Thank you so much, Rebecca.
Rebecca: The first question that came to my mind when I heard Neal start talking about this book is first, the title, The Hard Parts. You just finished a really comprehensive book on the fundamentals of software architecture. What more could there be? What prompted you all to do yet another book talking about architecture, and why scope it to the hard parts?
Neal: It's a great question, and a great obvious first question. As we were writing the fundamentals book we kept running into these problems that we would start diving into and then realized, "Oh no, this is a rabbit hole. This is going to take forever to get into all of the different possibilities and trade-offs in this." We started putting those aside and saying we'll do that in some future book or something. Mark is actually credited with coming up with this name because he said at one point, "Hey, I know what we should call the next book. It's Architecture: The Hard Parts," because it's all these difficult problems that we've set aside from the fundamentals book.
When we got the fundamentals book done, we started looking at this pile of difficult children that we had built up over time and tried to figure out, what is making these things so difficult? What is the thing that makes architecture hard? If we're going to call it the hard parts. We realized that all of these things that we're having such a hard time talking about in a simple, fundamental way is because there were so many trade-offs associated with them. It wasn't a simple linear decision path, it was you get to here and you have to make this trade-off decision, and here and make this trade-off decision.
That's when we realized that, oh this is not just a collection of things in distributed architectures. It's a book about trade-off analysis and that's the subtitle of the book, is Trade-Off Analysis for Distributed Architectures. That's where we thought we could add some actual insight into distributed architectures is, how do you do trade-off analysis in modern architectures? There are a lot of older techniques that don't work for various reasons, and a lot of teams struggle with this, particularly in microservices. There's some really fundamental decisions that you have to make that are nuanced and have a lot of trade-offs. We wanted to show people how to do that kind of trade-off analysis.
Mark: The other interesting thing, Rebecca, about this was we kept adding more and more information onto The Fundamentals of Software Architecture and that pretty quickly got to around 425 or close to 430 pages. We wanted to talk about all these other aspects of architecture in that book, but there just simply wasn't room. One of the things we realized for example, let's talk about distributed transactions, and it wouldn't really do justice just to spend maybe one or two pages on it.
This book really allowed us to take all of those things we weren't able to talk about in the fundamentals book and really talk about them, but more importantly go really deep into each of these topics, and so it's such an excellent follow-on book as a next step to say, "Okay, now that I understand the fundamentals, let's go deeper." [chuckles]
Neal: In fact, in the fundamentals book we had a chapter on microservices, which is hard enough because there are entire books written about microservices, and Software Architecture: The Hard Parts, in many ways, is about just two aspects of microservice architecture. It is digging in, layer upon layer upon layer, to these complex problems. It'd be hard to do in a single book.
Rebecca: Can you talk a little bit more about how the way you approach trade-offs has changed over time? I can imagine an intellectual framework that said, "This trade-off analysis, to a certain level of abstraction, is the same one that we have been dealing with forever." How is it that this trade-off analysis really has evolved over time?
Neal: There were a lot of attempts to make formal trade-off analysis methods in architecture, ATAM and CBAM and a bunch of other- interestingly mostly 4 letter acronyms rather than 3 letter acronyms. I think because they're that much more complex than the 3 letter acronym things. [laughs] There were a lot of attempts to do that, but they were all very heavyweight, very bureaucratic, very time intensive and a lot of meetings for people that you could never actually really get to meet together, but the aspirational goal was--
The problem you have in modern architectures is not so much the trade-offs that you were trying to arrive at, which were these holistic enterprise architecture level trade-offs across an entire organization. That's what those big, heavyweight frameworks are for. We were talking about on the ground trade-offs for, huh, I have this microservice architecture, should I do orchestration or choreography as a communication style? I've had conversations with a lot of architects who say, "Oh, you should always use choreography for everything." As an architect, the always word makes my Spidey sense go off that, oh, no always is, oh, that's sketchy.
That's a great example of the kind of trade-off you have to make, but the fundamental problem we have is that- and this is a quote from Mark, you can't Google architecture. When you're a developer, you get really good at Googling answers to questions about configuration and why this is not working and error stacks and which part of the stack to paste into Google to get the best results, but when you're an architect, first of all, what are the chances that the exact combination of forces that you're trying to decide about has ever existed on planet Earth before with the combination of technologies and legacy and all that other stuff?
Even if it has, what are the chances somebody has solved your problem and written a blog post about it that you can find in a search on Google? Every trade-off analysis has to be ad hoc now. We actually lay out an incredibly simple framework, and I can tell it to you right now: figure out how the interesting things are tangled up together, figure out how changing one of those would affect that tangle, and build an analysis from that. Rather than just give you that very simple formula, the entire book is really taking the two major generic problems in distributed architectures, granularity of services and communication, and using this trade-off analysis to do a deep dive into all the different facets of those things.
The wiring of your services, what size should the services be, and how should they call one another? Of course, part of that wiring is a whole bunch of considerations about data and data relationships and schemas, because in modern distributed architectures, of course, you have data as part of a bounded context, which makes a lot of those data concerns projected into architecture.
Mark: One of the other things I think as well, Rebecca, that does distinguish this from the older trade-off methods and really introduces a new approach is we really emphasize the fact that in each chapter we build trade-off tables to show the trade-offs of one or the other. However, this is unlike a traditional scorecard that a lot of people use to say, "Well, this one scores higher than this one so that must be a best practice, or that must be the one I want to choose." Rather, we introduce those kinds of trade-offs, but then put them into a particular context.
Granularity is a great example, Neal, that you just mentioned, because in that chapter we go pretty deep on how big a service should be. There, to give you an example, we talk about disintegration drivers. In other words, all those various drivers and justifications and factors for actually breaking apart a service. Now, people get pretty excited about that part of the book, especially since, like our fundamentals book, we've vetted a lot of this material through conferences and trainings and such to really get good feedback.
People get very excited about those disintegration drivers and say, "Thank you, thank you so much." Code volatility or whether it be scalability or whether it be security, these are great reasons to break apart a service. We actually take it one step further and say, "Yes, but," there's the other side of the coin. There's the granularity integrators that say, well maybe you should keep a service bigger, more coarse grained.
What we really try to strive for in that particular chapter about service granularity is the balance between these two kinds of forces, disintegration drivers and justifications and integration drivers and justifications, and really put it back on the architect, the reader to say, "Thank you. Now I have these opposing forces and I can determine which is more important to me. To isolate code volatility, to make it easier to test and deploy, or transactionality," because we can't have both. Those are examples of how we really changed the game with respect to how to see, identify and actually determine what trade-off should we accept?
Rebecca: I remember there was a very short period of time-- Thoughtworks has this internal software dev mailing list, and at the same time, there was this long thread on, we'd been brought in because they've got 87 microservices and we need to have fewer. There was at the same time another thread, we've been brought in because they have three microservices and they need to have more.
[chuckles] Of course, the specific numbers don't matter, but there are, as you point out, pressures to say, "Actually, no, I need to have a smaller number. I need the granularity to be larger because of this, this or this," or, "Wait a minute, this is far too constrained because of how big these things are, so I do need to disentangle them." Unfortunately in our industry, we keep wanting-- Yes, the perfect number is 12. Obviously, the perfect number does not exist, but it is great that you can talk about, "Okay, here are these different forces, these different drivers. Now, for my organization, for this application at this particular point in time, where should that balance be?"
This is, I think something- the perfect granularity A, doesn't exist, but B, is specific to a point in time. There is nothing that says, even after you very carefully made these trade-offs in nine months' time, you might have to make a different trade-off because these factors have shifted.
Neal: In fact, the common answer to your question is, it depends. What we're trying to do in this book is answer the follow-up question depends on what. Because every time it depends comes up, you can ask the following question, "Well, what does it depend on?" That really is, I think one of the insights that we got from the first book, one of the things in the first book that we tried to do was look at things that seem to apply universally across everything we touched, and we codified those as these two laws. First one is everything in software architecture is a trade-off.
Of course, we doubled down on that in this book, but it really is amazing how much, when you really start pulling on that thread of trade-offs in architecture things, that how much insight you get into how much these things really are trade-offs. Once you've accepted that there are trade-offs, how do you figure out exactly what those trade-offs are? We do a lot of qualitative analysis in this book because it's hard to do quantitative analysis in architecture because you have to have things that you can quantitatively compare, which is really difficult, certainly across architectures, but even within one, but you can do qualitative analysis.
You can say, "If we switched from orchestration to choreography in this case, will this make scalability better or worse?" That is the key to let you do ad hoc trade-off analysis for given situations, for all the forces that are driving you forward in this situation, versus another one, versus another one, because there are no silver bullets in the software architecture world. There is no generic advice that's useful so you have to dig in to- depends on what to get to useful things.
Mark: I think there's also a unique aspect to further answer your question, Rebecca, about how we treat trade-offs in the book and trade-off analysis. Like most things, everything seems to have an anti-pattern or pitfalls associated with it and trade-off analysis is not exempt from that. There are pitfalls and anti-patterns in analyzing trade-offs. As far as I know, ours is the only book that actually addresses those particular anti-patterns and trade-offs. For example, the out-of-context problem is very common when analyzing trade-offs, and that scorecard piece I talked about before. Looking at pros and cons and seeing that there's 20 pros and only 2 cons, so we should do this.
That is a pitfall within analyzing trade-offs and making decisions because it's out of context; as Neal was saying, it comes back to "depends on what." Over-evangelism of particular solutions is another trade-off anti-pattern: we get too excited about something and our brains just don't see the negatives because we really like this particular platform or this particular technique or tool. Those are the kinds of things that we talk about in the book. There are quite a few of those anti-patterns within trade-off analysis that we cover. I think that also does distinguish how we see and view trade-off analysis today versus yesterday.
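The out-of-context scorecard pitfall Mark describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the factor names, the weight of 50, and both scoring functions are hypothetical and do not come from the book. It shows how a plain count of 20 pros against 2 cons can point the opposite way once one factor is weighted for a specific context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Factor:
    name: str
    is_pro: bool


def naive_score(factors):
    """Context-free scorecard: number of pros minus number of cons."""
    return sum(1 if f.is_pro else -1 for f in factors)


def contextual_score(factors, weights):
    """Weight each factor by its importance in this context (default weight 1)."""
    return sum((1 if f.is_pro else -1) * weights.get(f.name, 1) for f in factors)


# Hypothetical option with 20 small pros and 2 cons.
factors = [Factor(f"minor benefit {i}", True) for i in range(1, 21)]
factors += [
    Factor("loses transactionality", False),
    Factor("harder error handling", False),
]

# Assumed context: transactionality is close to a hard requirement here.
weights = {"loses transactionality": 50}

print(naive_score(factors))                # 18
print(contextual_score(factors, weights))  # -31
```

The numbers themselves carry no meaning; the point is that the same list of pros and cons gives a different answer once the context says which factor dominates.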
Neal: Well, that's one of the things that we very purposely did in the book is the first 14 chapters are written purely in the third person, because we're describing architecture and how to do trade-off analysis. The last chapter, chapter 15 is written in the second person and it's entitled build your own trade-off analysis. We use the 14 chapters before as examples of how to do trade-off analysis, but then give you a bunch of concrete advice as the reader for how to apply the things that we're talking about to your own problems, how to build these qualitative lists, how to build these matrices, and that sort of stuff.
We try to use the examples in the book as a way to get you, the reader, to the point where you can do your own, because the real value of an architect is being able to do trade-off analysis within your architecture, not for some abstract architecture that exists on Google or on the pages of books somewhere.
Rebecca: We've spent a lot of time talking about one of those two challenges of distributed architectures that you mentioned, which was granularity. What can you say about the communication challenges? Obviously, an architecture diagram that has only boxes is not terribly interesting. You need to know how the boxes interact with each other, but what can you tell me about some of the specific challenges you looked at, with respect to communications?
Neal: We relied on a book that was written back in 1993, called What Every Programmer Should Know About Object-Oriented Design. One of the great insights in that book was the separation of coupling into static and dynamic coupling. Static is the way things are wired together, and dynamic is how things call one another. These are two fundamentally different ways of thinking about architecture. Think about, for example, synchronous versus asynchronous communication.
If you have two independent services that have independent operational architecture characteristics like independent scalability, well, if I make a synchronous call from one of those to the other, I tangle up those characteristics for the course of that call. I don't have independent scalability anymore during that synchronous call. Once that call is over, they're back to independence again, but that entangles them temporarily. Whereas an asynchronous call doesn't have that same effect because you can use asynchronous message queues as buffers and have multiple listeners, et cetera.
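The difference Neal describes can be sketched with Python's standard library. This is a minimal illustration, not a real service setup: `handle`, the order IDs, and the listener count are all hypothetical, and an in-process `queue.Queue` stands in for a message broker. In the synchronous version the caller waits on every call; in the asynchronous version the caller only enqueues messages, and several listeners drain the buffer at their own pace.

```python
import queue
import threading


def handle(order_id):
    """Stand-in for the work done by the downstream service (hypothetical)."""
    return f"processed-{order_id}"


def call_synchronously(order_ids):
    # The caller blocks on each call, so it can go no faster than the callee.
    return [handle(order_id) for order_id in order_ids]


def call_asynchronously(order_ids, listener_count=3):
    buffer = queue.Queue()  # in-process stand-in for a message queue
    results = []
    lock = threading.Lock()

    def listener():
        while True:
            order_id = buffer.get()
            if order_id is None:  # sentinel: no more messages for this listener
                break
            outcome = handle(order_id)
            with lock:
                results.append(outcome)

    listeners = [threading.Thread(target=listener) for _ in range(listener_count)]
    for thread in listeners:
        thread.start()
    for order_id in order_ids:
        buffer.put(order_id)  # the caller only enqueues; it never waits on handle()
    for _ in listeners:
        buffer.put(None)  # one sentinel per listener
    # Joining here is only so the example can return results for inspection.
    for thread in listeners:
        thread.join()
    return sorted(results)


print(call_synchronously(range(5)))
print(call_asynchronously(range(5)))
```

Both calls produce the same results for this input; what differs is who waits on whom while the work is done.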
There's a fundamental difference there in the way you think about things in architecture, whether it's synchronous versus asynchronous. Communication also has a bunch of decision points within architecture. This is something that we struggled with for a long time. It was how to reconcile all these different forces in architecture. One of the very late insights in this book was thinking about these things as a conjoined three-dimensional space that you can't think about them independently.
The insight here was, okay, thinking about transactionality in microservices, you can't just treat that a la carte, because if you change transactionality from eventual consistency to atomic, then it changes the trade-offs in your workflow. Should it be orchestration or choreography? Because now you've got a different set of trade-offs in your workflow. You have a different set of trade-offs for a lot of things. Flipping those binary switches changes everything, because all these things are entangled, and this is one of our insights.
If you're trying to decide something about communication in these architectures, you have to look at how all the things are entangled, and you can actually build matrices to say, "If I flip this, what impact does this have over here?" That lets you do trade-off analysis without having to build every one of them from scratch.
Mark: Isn't it funny, Neal, do you remember back in the day at conferences you had a very famous analogy that people loved about exactly what you were just talking about, and it's actually basically like flying a helicopter. Do you remember that?
Neal: Oh, yes.
Mark: The different controls and axes all impact one another. It's just a perfect fit to this entanglement of all of the aspects of eventual consistency or choreography, and then correspondingly communication. It's almost like the three controls of a helicopter. [laughs] It's funny, Rebecca, when you first asked the question about communication, I had a different mindset, I actually was not thinking about interservice communication protocols. I was thinking about how we communicate our ideas to others. I'm sitting here going "Well, there's a lot of that in our book, but that's more in our first book," but I said- so it's interesting, that association.
However I do think we do touch on that, because it comes back to that trade-off analysis and justification. That requires collaboration and communication with all levels of the organization, all different kinds of stakeholders. Whether it be in the database area, whether it be in operations, whether it be in release management, any of those aspects. That's what I was tying the communication piece and when Neal started talking about sync async I'm like, "Neal, what in the world are you talking about?" Then I realized "Oh, communication protocols." [laughs]
Neal: One of the things that we do in our book is, the saga is the typical name for a transactional workflow within a distributed architecture like this. We have both the figurative saga but we have a literal saga inside our book as well. One of the things that we struggle with because a lot of this material is pretty abstract- and you've written a book about architecture, it's really easy to get really abstract and hard to make it concrete. We realized that this book has a lot of very abstract stuff.
Part of the benefit of teaching this material a lot, we've been teaching this material continuously even through the pandemic, we've been doing online classes and that kind of stuff is that it lets you test out to see how to make things more concrete. During one of our training classes, we came upon this idea of creating this- you're familiar with this idea of architectural katas, these little made up problems that you can solve architectures around. We came up with this increasingly elaborate [laughs] kata around this mythical company called Penultimate Electronics, and the Sysops Squad which is basically, they run around and you hire them to come fix your TV or your computer or whatever.
Throughout the Architecture: The Hard Parts book we have embedded inside every chapter Sysops Squad stories. What that allowed us to do is really have two different levels of abstraction in the book, because we have the exposition abstraction, which is all the technical details you need in exposition, but then it drops into the Sysops Squad and they have a very concrete example they're trying to solve, because they've got a particular problem at the beginning of the book that they're trying to solve. They have concrete problems.
They're applying the things that we've talked about in exposition to solve real problems. This is a problem we faced early on where this book could either be 200 pages, 450 pages or 25,000 pages, because if it was just purely abstract it would be 200 pages and be unreadable, because nobody would be able to read more than two chapters without their heads exploding. At 450 pages, it allows us to selectively pick examples and exemplars to illustrate some of the things without going to every possible branch of the trade-offs. Then the 25,000 page book is let's take every possibility for every choice and follow them to their fullest extent, which again nobody would ever want to read.
This is the classic "show, don't tell" that allowed us to be selective about being abstract but also show concrete implementations for these specific problems as well. I think it worked out really nicely because in my mind it struck a good balance between the abstract and the implementation. Hopefully the readers get involved in the Sysops Squad story and hope- wish them well.
Mark: When Neal is talking about story, it actually is a story, complete with dialogue and characters. One of the advantages, I guess, that both Neal and I have from being quite a few years in the industry (mine going on 36; I think Neal is a little more than that, but close) is the fact that we, through our consulting experience, have seen a lot of scenarios. It was amazing: as we were writing this dialogue, I was essentially pulling from actual meetings and actual experiences that I had all the way back in my consulting career. It made writing that dialogue a lot easier, and also it made it more real.
Rebecca: All the names are made up, to protect the innocent.
Neal: I expect many of those stories were cathartic for Mark to write down. Speaking of names, one of the very intentional things that we did in this book, because we realized that there's a long history of accidental misogyny in technical books, is that every single character in this book has a gender-neutral name, very much on purpose. They also have alliterative names for their role. All the architects have A names. We have Austin, Addison. Logan is the lead architect; that's an L name. And there's not a single gendered pronoun in the entire book.
You will appreciate this approach, Rebecca. I actually built a fitness function that would scan the entire manuscript, looking for gendered pronouns to make sure that we didn't accidentally add one at some point during the writing process. We were very careful about that, because it's time. One of the things that we wanted to do is make sure that you could write a book like that and not use any gendered pronouns and not make it sound and read in a stilted way. I don't think it does at all. I don't think you notice the absence of this thing. That was actually a little bit of a writing experiment that turned out really super well.
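A fitness function of the kind Neal mentions could look roughly like the Python sketch below. It is a guess at the shape of such a check, not the tool that was actually used: the pronoun list, the function names, and the sample text are all assumptions made for illustration.

```python
import re

# Assumed pronoun list; the list actually checked for the book is not given here.
GENDERED_PRONOUNS = ("he", "him", "his", "himself", "she", "her", "hers", "herself")
PATTERN = re.compile(r"\b(" + "|".join(GENDERED_PRONOUNS) + r")\b", re.IGNORECASE)


def find_gendered_pronouns(text):
    """Return a (line_number, word) pair for each gendered pronoun in the text."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits


def manuscript_passes(text):
    """The fitness function proper: pass only when no gendered pronoun is found."""
    return not find_gendered_pronouns(text)


# Hypothetical sample text; the second line should be flagged twice.
sample = "Logan reviewed the plan.\nThen she asked Addison about his design."
print(find_gendered_pronouns(sample))  # [(2, 'she'), (2, 'his')]
print(manuscript_passes("Logan asked Addison about the design."))  # True
```

Run over every chapter file as part of a build, a check like this would fail as soon as a pronoun slipped in, which matches the "scan the entire manuscript" behavior described above.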
There are a couple of other little Easter eggs, in the Sysops Squad story stuff. One of them is the gendered pronouns and the names, but another is we wanted to come up with a way to separate each of the Sysops Squad stories from the main text. We came up with this idea of using timestamps. Every Sysops Squad story starts with a timestamp with a date and a time, but not a year. There's a slightly jumbled chronology of those timestamps. If you pay close attention to them, the very first timestamp of the book happens on September 21st. The last one happens on June 20th.
Of course, that is the equinox. The book begins on the first day of the year where night is longer than day, and it ends on the day of the year where you have the longest daylight. There's a little bit of a time-based thing going on there too. There's one other little Easter egg there that nobody will pick up on. The very first timestamp in the book is for 1300 hours. If you remember, the book 1984 very strikingly starts out with saying, "It was a cold day in April and the clock struck 13," which indicated something was very wrong in that world. The very first timestamp is 1300 hours in the-- [laughs] There were a lot of little things like that scattered throughout the book.
Rebecca: Neal has asserted in the past that it takes two years to write a book. [laughs]
Neal: It always does. Well now you see why, because you can't stop fiddling with stuff like that.
Rebecca: One might also comment on the fact that perhaps you're a little bit of a science fiction geek, but far be it from me to make that sound like a bad thing, because I'm a self-proclaimed geek. [chuckles]
Neal: Well I think I captured my core demographic well with that pretty easily. [chuckles]
Rebecca: One of the things that I particularly enjoyed as we were working on the evolutionary architecture book were some of the things that just- we discovered, the architectural quantum, that whole notion came after we had been talking about this for a long time. Can you tell about some of the discoveries you all made in this particular book?
Mark: One of the ones that stands out most to me among these innovative ideas was what we came up with around the concept of transactional sagas. There is a core transactional saga pattern. What Neal quickly realized was there are dimensions, as a matter of fact three of them, that really form a transactional saga, which mathematically points to eight possible combinations. I think, wouldn't you agree, Neal, that was probably one of the more innovative aspects within the book, one of many.
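The arithmetic behind "eight possible combinations" is just three binary choices, 2 x 2 x 2. The Python sketch below enumerates them and, for any one combination, lists the three neighbors reached by flipping a single dimension. The dimension labels are assumptions drawn from the aspects mentioned in this conversation (synchronous versus asynchronous, atomic versus eventual consistency, orchestration versus choreography); the function names are hypothetical.

```python
from itertools import product

# Three binary dimensions, labeled here by assumption.
DIMENSIONS = {
    "communication": ("synchronous", "asynchronous"),
    "consistency": ("atomic", "eventual"),
    "coordination": ("orchestration", "choreography"),
}


def saga_combinations():
    """Every combination of one choice per dimension: 2 * 2 * 2 = 8."""
    names = list(DIMENSIONS)
    return [dict(zip(names, values)) for values in product(*DIMENSIONS.values())]


def neighbors(combo):
    """Combinations reachable by flipping exactly one dimension of combo."""
    result = []
    for name, options in DIMENSIONS.items():
        flipped = dict(combo)
        flipped[name] = options[1] if combo[name] == options[0] else options[0]
        result.append(flipped)
    return result


combos = saga_combinations()
print(len(combos))  # 8
for neighbor in neighbors(combos[0]):
    print(neighbor)
```

Listing the neighbors of a combination is one way to set up the "if I flip this, what does it affect" question; the qualitative answer for each flip still has to come from analysis of the specific architecture.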
Neal: I remember like a lightning bolt when it finally came to me, I was listening to Mark talk about workflows for probably the 50th or 60th time in the training class that we do together. It finally hit in my head that you can't change this without changing the other stuff and realize that these are conjoined in a three-dimensional space. We have an actual three-dimensional diagram there that if you move the ball here, it affects the other parts of the space. That was a really interesting insight.
What was interesting about that to me was I had gotten really close to that insight almost a year before, and then abandoned it as a dead-end and went off in another direction and then eventually came back around to that. It had been percolating in my head for a really long time. Finally listening to Mark talk about it yet one more time was the trigger that finally made it congeal in my head.
Mark: One of the other innovations there, which we were finally able to document in a book, was the various component-based decomposition patterns, which I started to develop almost six years ago for decomposing monolithic applications into microservices. That chapter, I think originally, Neal, wasn't it about 65 pages long? It was, and I think still is, the largest chapter in the book. We considered breaking it up, but it's one cohesive thing. These are very incremental, controlled approaches to basically breaking apart a monolithic application.
I was so happy to finally get that documented somewhere after the years and years of refining those patterns, but also showing these patterns at conferences and really getting feedback and being able also through regular consulting, to actually exercise these patterns and fine-tune them. I think that's another major innovation or innovative part of the book that really addresses that breaking things apart in part one. As a matter of fact, a piece of trivia here, that Neal and I actually-- Actually, Neal saw this during the review was the book really is separated into two parts, unlike our first book, which was three.
It's really about part one, which is decomposition, breaking things apart. Part two is really about putting them back together, interconnection and communication. What Neal observed, which was really cool, drumroll, and I'll let Neal tell this really cool aspect, little trivia.
Neal: The final book manuscript ended up being 450 pages, and part two began on page 223. We almost exactly split it down the middle, with absolutely no effort whatsoever to do that. In fact, that first delineation between pulling things apart versus putting things back together was purely an exercise from the very first time I did a talk called Software Architecture: The Hard Parts. I had all these examples and thought, I need to put these together in some sort of cohesive talk; how are these things related to one another?
There were two buckets of, well these are kind of structural and pulling things apart, and these are kind of communications. I came up with that. It actually started as a three-part thing, but then it quickly collapsed into the two parts that are in the book, but it's just amazing and shocking to those of us who wrote the thing that it came out almost exactly half and half.
Rebecca: What was the third part that was there for a little while, the third bucket?
Neal: The third, it was pulling things apart, putting them back together with appropriate patterns, I think was the third part. It was simple parts. That's right. It was simple parts. Exactly. There's ways to put things together without overcomplicating them, but that quickly becomes just putting things back together properly. [laughs]
Mark: The approximate projected page count of that was almost 225 pages. This would have been a book that would be too heavy to carry around. Like Neal said, we would have just gone all out and gone to 25,000 pages.
Neal: Well, this book, we expected, was going to be shorter than the fundamentals book, but it ended up being 20 pages longer because there's a lot of stuff, it turns out, as you start pulling on that thread of structure and communication, particularly as you start overlaying the data stuff. That was the other great insight that Mark and I had as we finished up the fundamentals book: look, in modern architectures you can't treat data architecture and software architecture as separate things anymore. It was a mistake to do it before, but you really can't do it now.
That's why we included Pramod and Zhamak as co-authors of this book, to talk about the twin ideas of operational and analytical data. We won't talk a lot about that subject matter here because there's going to be a separate podcast with them just talking about the data, but that was the other insight that we had: you can't really do a book about distributed architectures now without considering the impact that data has, both on structure and on communication.
Rebecca: Absolutely. Data used to be that thing off to the side and you just can't think about it that way anymore. I almost hate to ask this question, but what's the next book?
Neal: Well, that we don't know yet, because one of the main ways that we generate new ideas like this is speaking at a lot of conferences, and of course we haven't been able to do that a lot. The Hard Parts came as a very natural extension because we had this massive pile of things, and we didn't exhaust the pile by the way. Mark just out of the camera range that we can see on Zoom now has a giant pile of topics right behind him on distributed architectures and implementation patterns for distributed architectures and things like that, but we've also got a ton of material on enterprise architecture and some topics like that.
That's going to be one of the next tasks that we do, is get together and figure out exactly what the next thing is going to be. We have thought about doing software architecture, the harder parts, and then of course, that would lead to eventually software architecture, the hardest parts, would be an entire book about the non-technical parts of software architecture, soft skills, organization, meetings, because that it turns out is the hardest part of software architecture, but we'd have to write the harder part to get to the logical hardest part. We don't know what exactly that's going to be.
Rebecca: Yes, let's see, two years, that might be a while.
Neal: Exactly. [laughs] We did actually manage to get this one done in slightly less than two years, though. We accelerated a little bit because it was February 2020 to November 2021. Almost two years, just under, but we'd started working on it before the fundamentals book came out, so we cheated a little.
Rebecca: Well and both of your travel schedules were a wee bit disrupted during that time, that might have something to do with being able to do it a bit more quickly. A pleasure as always, Neal and Mark, to talk about software architecture. Thank you so much for spending the time with us again, and when will the book be available?
Neal: By the time this podcast is out, it went to press literally yesterday as we record this, and so the ebook will be available by early October, and the physical book will be available early November.
Neal: Available at all fine booksellers near you. As I tell people, it's an O'Reilly book so you should get two in case you lose one. They make great Christmas gifts and birthday gifts and anniversary gifts. Nothing says loving your spouse like a nice O'Reilly book.
Rebecca: You were born a marketeer, Neal. Thank you all for joining us on the Thoughtworks Technology Podcast.