Brief summary
Generative AI appears to be making an impact in a huge range of fields, but one that we're particularly interested in at Thoughtworks is its use in software development.
In recent months, there's been a lot of talk in the industry around questions like whether AI might boost developer productivity and whether it can be used for pair programming. In this episode of the Technology Podcast we try to get beneath the hype to explore the reality of generative AI and software development — how is it actually being used today? What works? And what doesn't?
To dive deeper into all this, Chief of AI Mike Mason and Global Lead for AI-assisted Software Delivery Birgitta Böckeler join hosts Prem Chandrasekaran and Neal Ford, discussing everything from the current tooling to the way GenAI is shaping developer practices and workflows.
Episode transcript
Prem Chandrasekaran: Welcome, everyone. My name is Prem Chandrasekaran. I am one of your regular co-hosts on the Thoughtworks Technology Podcast. I also have Neal Ford with me. Neal, do you want to introduce yourself?
Neal Ford: Indeed, I do. Thanks, Prem. Welcome, everybody, to the podcast. I'm Neal Ford, one of your other regular hosts. Boy, you're going to hear a lot of familiar voices on this podcast because our guests today are normally hosts but they're in the much more luxurious guest chairs today in our palatial podcasting lounge. Today, we are joined by Mike Mason and Birgitta Böckeler. I'll let them introduce themselves.
Birgitta Böckeler: Yes. Lots of nepotism always on our podcast, right!? [chuckles] When hosts become guests. Yes. Hi, everybody. My name is Birgitta Boeckeler. I'm a Technical Principal with Thoughtworks in Berlin, Germany.
Mike Mason: My name's Mike Mason. I am the Chief AI Officer for Thoughtworks.
Neal: That is particularly apropos for this podcast because our topic today is AI-assisted software development. Mike seems like a perfectly good guest for that.
Prem: Definitely so is Birgitta.
Birgitta: Yes. I should have probably mentioned, yes, that I also have a global role at the moment at Thoughtworks where I work in Mike's team to look into exactly this. Mike is the Chief AI Officer and I'm on his team looking specifically into everything related to AI. Of course, the flavor of the day, GenAI, and how to use it for software delivery.
Prem: Thank you, Birgitta. That was great. Let's get started, and be honest: what's the latest and greatest here, and why is AI-assisted coding such a big deal today? Or is it even a big deal?
Birgitta: Yes. What's different about this? I think, for me, it's two main things. I like to always look a little bit at other times when we are trying to make ourselves more effective at software delivery through code generation because one of the most obvious and, so far, most widely used forms of using GenAI for software delivery at the moment is code generation, coding assistance in the IDE with tools like GitHub Copilot, Codeium, Tabnine, Codey, all of those many names. [chuckles]
When I think about more traditional code generators, where we formally describe the structure of a language and then have a code generator generate code for us, one of the obvious differences is that that's a very formal, structured way of doing it.
It always takes some work to describe a higher abstraction of a language and then build a code generator, whereas when we use a large language model to help us write code, to generate or suggest code for us, it's a lot more unstructured and informal. We actually write something in natural language, either a code comment or a function name, and then that gets translated. We don't have to create the whole structure around that. It's a lot more on the fly, and it's a lot closer to how we actually think as humans.
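To make that concrete, here is a minimal sketch of the kind of suggestion an inline assistant typically offers from nothing more than a comment and a method signature. The code shown is only illustrative: what actually gets suggested varies by tool, model, and surrounding context, and the Order type is invented for the example.

```java
// What the developer types is the comment and the method signature.
// A coding assistant will typically propose a body like the one below;
// the exact suggestion depends on the tool, the model, and the file's context.
import java.util.List;

public class OrderStats {

    // Return the total price of all orders, ignoring cancelled ones.
    public static double totalActiveOrderValue(List<Order> orders) {
        return orders.stream()
                .filter(order -> !order.isCancelled())
                .mapToDouble(Order::getPrice)
                .sum();
    }

    // A hypothetical order type, included only to keep the example self-contained.
    public record Order(double price, boolean cancelled) {
        public boolean isCancelled() { return cancelled; }
        public double getPrice() { return price; }
    }
}
```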
The other thing that's different, I think, and where rather big potential lies, is that we're not trying to further raise the abstraction level and have fewer details to deal with. We always tried that with code generators, with low-code no-code. We always try to raise the abstraction level, but then we lose some control and, not always, but in a lot of cases, [chuckles] have to go down the stack again to get more control when we need something more custom.
Now, with GenAI coding assistants, we're not really trying to raise the abstraction level. We're going to the side and using this assistance on all of the levels. We're using large language models to help us create low-code no-code applications. We're using them to create Java code, to create Spring code, to-- I don't know what other abstraction levels-- to generate DSL code for us.
I think that's the thing that's new, and that's also why it's now different for us as developers to figure out how to use it. It's not the experience of "I do something and all the rest is taken care of for me by the machines." It's something that we have as an assistant, but we still have to be in control, and we still have to figure out what to do with the suggestions.
Mike: I think I've described it to people in some ways as kind of autocomplete on steroids. It's like a super powerful autocomplete with lots of context available to it that very often does the right thing. It will write a block of code, 5 to 10 lines of code, and it's roughly the code that I was thinking about writing; the super-powered autocomplete has done that for me.
It's very accessible. You can just switch on one of these tools and suddenly you're getting better suggestions in the IDE, and they look very similar to the single-line autocomplete that you might have seen before in an IDE. It's very accessible, very familiar to developers.
The other thing that I think is worth pointing out is because it's in the IDE, you can stay in your development flow. If I'm trying to figure out a particular algorithm that probably there's a half-right answer somewhere on Stack Overflow for that or somewhere on a search engine, I have to flip out of my IDE and go to that whereas with one of these tools, I can describe what I'm trying to do either in a comment or a method name like Birgitta said and then the autogenerated code will very often have that algorithm in it that I was searching for and help me get that within the IDE without breaking my flow.
Not just that, but it does it in a very context-specific way and adapts the textbook algorithm to my current code structure and variable names and all that kind of thing. Again, that's actually making it more powerful than going and looking up a dry academic example of something and then needing to figure out how it's going to work in my code base.
Birgitta: Yes. I think that example also shows the other thing that is, of course, totally different from traditional code generators. That was a limited analogy because, of course, this is about a lot more than code generation. It's also about information discovery and lots of other things that I think we'll get to talk about in the next half hour or so.
Neal: I have an analogy and I want to run this by you. I want to run this by a couple of experts because I'm trotting out this analogy as to what impact this is going to have on developers and this is very closely related to what Mike was talking about.
In the 1970s, accountants spent most of their day recalculating paper spreadsheets by hand with calculators or adding machines. Spreadsheets, of course, completely eliminated that, and that had two instant effects. One is you could build way more complex spreadsheets, and accountants became a lot more productive. That didn't necessarily make them better accountants; it just got rid of some of the busy work they had as accountants. The other side effect is that by 1980, if you were an accountant who didn't know how to use a spreadsheet, you had a hard time getting a job.
My analogy for that is: I'm a developer and I know SQL more or less, but now I'm trying to put together this complex, inner-joined thing. I try one version and there's an error message, so I Google the error message and get it to change, hey, progress. I fiddle around with that for 45 minutes, versus handing it off to a GenAI, saying, "Generate this SQL for me," and just executing it and getting on with my work. That, I think, is the productivity boost. It's not going to make you a different programmer; it's just going to eliminate a lot of the little busy work like that along the way. What do you think of that analogy? Is that accurate or is it too far?
Mike: [laughs] Well, I think it depends on what timescale you're thinking about because I think it's accurate. I think it may not be far enough. Then, there's all sorts of subtleties to that. The SQL query thing's interesting because you can really cause your DBA some pain if you mess up a SQL query or you try to run something that isn't using an index or whatever.
There's some important stuff to think about beyond just did I get the right answer to the business outcome that I was trying to get, which I'm translating into a SQL query and it seems to run and I seem to get the right answer from that. There's an entire question about how you incorporate that into your technique of building software. Like, how are we testing our SQL queries is a really good question that comes up here immediately because how do I know it's right?
I'm going to show my Microsoft-isms here — do I just hit F5 and run the thing a few times and see whether it seems to be working or do I have a more robust mechanism for ensuring that the stuff I've written is correct? Obviously, we would advocate for automated unit testing and stuff like that so that we know that that stuff is right and we've got all those safety nets, but I think-- That's a question and then--
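As one way to picture the safety net Mike is describing, here is a minimal sketch of an automated test around a generated SQL query, assuming JUnit 5 and the H2 in-memory database are available; the schema and the query are invented for the example rather than anything discussed on the podcast.

```java
// A minimal safety net for a generated SQL query, instead of running it a few
// times and eyeballing the output. Assumes JUnit 5 and H2 on the classpath;
// the schema and the query are illustrative.
import org.junit.jupiter.api.Test;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertTrue;

class GeneratedQueryTest {

    @Test
    void joinReturnsOnlyCustomersWhoHaveOrders() throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:querytest");
             Statement st = conn.createStatement()) {
            st.execute("CREATE TABLE customers (id INT PRIMARY KEY, name VARCHAR(50))");
            st.execute("CREATE TABLE orders (id INT PRIMARY KEY, customer_id INT)");
            st.execute("INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace')");
            st.execute("INSERT INTO orders VALUES (10, 1)");

            // Imagine this string came back from a coding assistant.
            String generatedSql =
                    "SELECT c.name FROM customers c JOIN orders o ON o.customer_id = c.id";

            try (ResultSet rs = st.executeQuery(generatedSql)) {
                assertTrue(rs.next());
                assertEquals("Ada", rs.getString("name"));
                assertFalse(rs.next()); // Grace has no orders, so no second row
            }
        }
    }
}
```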
Neal: Let me interject just a second. It's the same with an accountant: when you put a formula in a field, you don't just trust that it's exact; you go back and verify and test. There's a little bit of an analog there, but I agree the analogy is much shallower than the capabilities we have in front of us, and the things we can do now are much more varied. Sorry, I interrupted you halfway through your thought, so I [crosstalk]...
Mike: No, no — Birgitta’s got her thoughts on it, I want to hear her thoughts more than mine! [Laughter]
Birgitta: Yes! I struggle with the analogy a little bit, I think, because this accountant spreadsheet thing is exactly in that orthodoxy I was describing before, of raising the abstraction level of somebody's work with software. You then have a repeatable, deterministic type of thing. Whereas I think this is opening up a messier type of assistance, where we still have to be in the driver's seat and have to understand whether the formulas that are being generated actually work or not.
So I think, yeah, maybe I see where you're going with it, but because it's also a software analogy, it's like I struggle with it because it's the same thing, this, like, "Oh, we're just raising the abstraction level and have this machine fill in the details for us." I think this is a different kind of-- What's the word? Yes. It's almost like a little bit of a paradigm shift for us to think about how to use a piece of software because it's actually--
In a way, they're badly behaved software. They're not really behaving like other software that we're using, so as users, we also have to approach them differently, and they don't always work. It's going to be interesting how we have to change our mindset about how to approach software that helps us with tasks.
Neal: Let me ask you a slight follow-up question, because you've mentioned abstraction, and I think that's a good way to think about this: it's raising the abstraction, but in a different way. One of the common pieces of advice we've always had is to understand one abstraction below the abstraction you're working in, but here we can't understand a lot of what's underneath. Even the experts are not exactly sure how some of these models produce what they produce. That's a difference in this abstraction boost: it's a lot more opaque in some ways.
Birgitta: That's another dimension of that messiness. Yes [chuckles].
Mike: It is opaque, but the stuff that it produces should not be opaque because the SQL or the JavaScript or whatever code that you are generating today, at least, we as developers need to understand that resulting code because that is our responsibility. It's our responsibility to ensure correctness, lack of security holes, performance characteristics, all that other stuff. The code that gets spit out is-- What's the opposite of opaque? It's transparent to us, I guess. Transparent. We can understand that because that's part of our job to understand how this stuff works.
The fact that the AI is opaque, I think, is quite interesting, because it leads to a non-obvious result: better-structured and better-documented code bases produce better results when you use a GenAI coding assistant with them, because the AI has more things to hook into to understand the structure of your code and your solution.
That's interesting. There are probably things to be learned about how you prompt this stuff: whether you write a comment and then get it to autocomplete for you, or whether just a method name, a camel-case method name, or however you're doing it, is enough to get a suggestion, or whether you just move your cursor to the right line in the file and say, "Please suggest something now," because you can do that as well. That might be a different prompting style.
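A small sketch of the two prompting styles Mike mentions, name-driven and comment-driven; the method bodies stand in for what an assistant might suggest, and all of the names, including the Customer type, are hypothetical.

```java
// Two ways of "prompting" the inline assistance, sketched side by side.
// The bodies represent typical suggestions; real output varies by tool and context.
import java.util.ArrayList;
import java.util.List;

public class PromptingStyles {

    // A hypothetical domain type, only here to keep the sketch self-contained.
    public record Customer(String email, boolean hasOverdueInvoice) {}

    // Style 1: a descriptive, camel-case method name is often prompt enough.
    public static List<String> emailsOfCustomersWithOverdueInvoices(List<Customer> customers) {
        return customers.stream()
                .filter(Customer::hasOverdueInvoice)
                .map(Customer::email)
                .toList();
    }

    // Style 2: a comment spells out intent that a name alone cannot carry.
    // Collapse consecutive duplicate entries, keeping the first of each run.
    public static List<String> collapseRuns(List<String> values) {
        List<String> result = new ArrayList<>();
        for (String value : values) {
            if (result.isEmpty() || !result.get(result.size() - 1).equals(value)) {
                result.add(value);
            }
        }
        return result;
    }
}
```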
Prem: It looks like this is quite rich in terms of the conversation that we are having. I do want us to move on and help our listeners visualize how these things manifest themselves. You're talking about AI-assisted coding but what are the most popular ways in which you interact with these tools?
Birgitta: Yes. I think it's a really interesting space right now because it's very fast-moving. I have a virtual whiteboard somewhere where I keep dumping the new tools that I hear mentioned [chuckles] somewhere, and the list of coding assistants for your IDE is just growing and growing. It's very hard to keep up, and the same goes for the new features that keep coming out.
Obviously, I think one that is very popular and very well-known is GitHub Copilot, and we're also using that on a lot of our accounts. Then, there's Tabnine, Codeium, Cursor. There's Codeium with an E, Codium without an E, Codey with an E, Cody without an E. Also, very creative naming apparently [chuckles] in the products as they're all coming up very fast. They have to come up with names fast.
Yes. It's an interesting time, and for the developer IDE experience as well, I think. We've seen that in the past when the JetBrains IDEs came up; that was a huge boost in productivity compared to the other IDEs that were around. Most of these coding assistants at the moment have two core features. One is inline assistance while you're typing, doing the autocomplete on steroids that Mike was talking about.
Then, most of them also have a chat component where you basically have a large language model chatbot in your IDE and can ask it questions. Often, those chatbots then also have context of-- part of your code base. For example, the open files that you have.
Now, a lot of other features are coming up. A very common one is something like "ask your code base." You can actually ask the chatbot something about your code base, like, "Where did I implement X, Y, Z?" [chuckles] or stuff like that. The tool vendors are trying to do some indexing, turning your code base into something searchable that can be used to enhance the question to the large language model in the background. That's what a lot of them are doing.
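A deliberately simplified sketch of what that indexing idea can look like: chunk the code base, score the chunks against the question, and prepend the best matches to the prompt. Real tools use embeddings and syntax-aware chunking; the plain keyword scoring below is only to make the shape of the approach concrete, and every name in it is made up.

```java
// A simplified "ask your code base" pipeline: split files into chunks, score
// each chunk against the question, and paste the best matches into the prompt
// sent to the model. Real assistants use embeddings and syntax-aware chunking.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Stream;

public class AskYourCodebase {

    record Chunk(Path file, String text) {}

    // Crude relevance score: how many long-ish words from the question appear in the chunk.
    static long score(Chunk chunk, String question) {
        String lower = chunk.text().toLowerCase();
        return Stream.of(question.toLowerCase().split("\\W+"))
                .filter(word -> word.length() > 3)
                .filter(lower::contains)
                .count();
    }

    static String buildPrompt(Path repoRoot, String question) throws IOException {
        try (Stream<Path> files = Files.walk(repoRoot)) {
            List<Chunk> chunks = files
                    .filter(p -> p.toString().endsWith(".java"))
                    .map(p -> {
                        try {
                            return new Chunk(p, Files.readString(p));
                        } catch (IOException e) {
                            return new Chunk(p, "");
                        }
                    })
                    .toList();

            // Keep the three most relevant chunks as context for the model.
            String context = chunks.stream()
                    .sorted(Comparator.comparingLong((Chunk c) -> score(c, question)).reversed())
                    .limit(3)
                    .map(c -> "// " + c.file() + "\n" + c.text())
                    .reduce("", (a, b) -> a + "\n" + b);

            return "Answer using only this code:\n" + context + "\n\nQuestion: " + question;
        }
    }
}
```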
Then, there's especially this IDE called Cursor, which is still quite new, actually. It's early days, so I think it's maybe [chuckles] not ready for productive use yet, but the ideas they have for the user experience are really interesting. Also, for prompting the inline assistance, you can use a little chat component. Then, when you ask a question about your code base, they also have the ability to point more specifically at the parts of the code base that you want answers about, or you can say, "I want you to also consider the documentation of the following library that I'm using," and it indexes that as well. It also has some interesting prompting approaches to help you debug something.
I've actually used this auto-debug feature with it once, and it goes into a chain-of-thought prompting loop where the model is trying to "reason" about what's going on in your error message and then says, "Ah, okay. There seems to be something wrong in your pom.xml file, so let me look for that place in your pom.xml. I couldn't find it, so I'm going to look into the pom.xml in the other submodules." It goes through that.
The one time I did that, it didn't actually work really well in terms [chuckles] of actually finding the right things and all of that, and it went into an endless loop at the end that I had to cancel, but I do think that this is a valid approach to let a large language model help me debug something. There are some really interesting ideas that are coming up there. Yes. It's interesting days. [Chuckles]
Prem: Yes. It looks like you're saying it can generate code. It can also help you navigate code. It can help you find some things that you don't know about and things of that sort. Mike, I think you are trying to say something or maybe there are other things as well.
Mike: Well, I just actually wanted to add something about using cloud-based services versus local deployment. Most of these services are software-as-a-service type things where you pay for a subscription and when you are getting help or prompting the model, you're actually sharing some of your code base and some of your workflow data with the provider of that tool.
Now, most of us are fairly comfortable with that because lots of organizations are using GitHub Enterprise anyway, and so they trust, say, GitHub with their source code, so by implication, they must trust that organization with using a copilot-style tool.
There are companies out there who are much more averse to using cloud services and products. We actually helped a large company do an on-premise deployment using an open-source large language model for code generation and a fairly lightweight IDE plugin for VS Code. I just wanted to point out that there are options for doing this stuff that don't rely on sending your data and your code to a third party, if that's something that's going to be difficult for your organization.
Prem: Yes. I've tried a few of these as well. I've usually used them to generate small snippets of code, and they seem to work relatively okay. I say, "Okay, generate me a stack," or, "Generate me a push method on a stack," and those very simple things seem to work. Is it possible for these tools to be used in an even more holistic way, where I want to generate an entire application, for example? Have you got any experience or thoughts on that?
Mike: I think that's one of those things that's going to evolve. Right now, I would call them good for method-level code generation, so 5, 10, 15, 20 lines of code maybe with good context, I think you can get some good outputs from that. We are starting to see small apps being built from prompts and there's various-- you can see various different systems online and videos of people doing this stuff.
I'm a little bit skeptical still. I feel like maybe we are being shown cherry-picked successes the 1 time in 10 that it worked and produced something good for us. I think we all know if you look at the trajectory of these AI systems and of technology in general, everything gets better. Something that half works today is going to work really great in six months' time.
I do think we should be thinking about what happens when the unit of code that can be produced starts to get bigger. Maybe the question to ask is if you can create a unit of code, I don't know, 500 lines big across several classes, or files, and the AI can do that in a consistent way and keep it coherent, what would you want to use that for? Does it have an impact on the kinds of things that we choose to do architecturally because AI can produce something of a certain size?
Birgitta: More promptable architecture.
Neal: See, I think architecture is the place where it's going to take the longest, because it's all about doing trade-off analysis in the current context. Where I can see whole applications being generated is things like simple CRUD applications, or designing things that are really busy work, like print preview dialogs and stuff that nobody wants to spend the time designing. I think it's the kind of busy work that frees up the humans to do the stuff that only a human can really do at this point. I think the scope will get bigger and bigger over time, but I'm skeptical that it'll ever be able to create really sophisticated software.
Birgitta: Again, I think something to think about is the combination of large language models with other things. It always comes back to the abstraction levels. I was talking about low-code no-code in the beginning. Those are currently the most impressive demos of using GenAI to create applications, because low-code no-code already has this huge platform under the covers that does a lot of this stuff.
Then, when you prompt it to create a low-code no-code application, you also have the constraints that you would have when you build a low-code no-code application. It needs to be a specific use case, straightforward use case, a preview dialogue, you were just saying, [chuckles] or something like that. Still, this combination I find really interesting, not to think about, "Oh, but large language models have these limitations," but what if you combine them with other things?
For example, one of the coding assistants out there right now is by a company called Sourcegraph and they have an existing product that does code search. That is a product that really unders-- where the product already understands the structure of the code. It understands the abstract syntax tree and all of that. Now, they are combining that with the large language model-based coding assistant.
Usually, the language models don't really understand the code. They just see the tokens, the patterns. Now, combining that with a code search that actually understands the structure gives it extra power. It's just one example, but I think it will be interesting to see how you not just take the model but combine it with other things, and then make the overall result better.
Mike: I think the low-code no-code is a really interesting example because the demo is impressive because you just give it a spec in English for an application and it spins the thing up. The question is always about what is that low-code no-code platform capable of doing and what trade-offs are we making by using it? We've talked about that stuff in the past. I won't belabor the point, but usually, if a low-code platform has picked an option for you in, for example, the way the UI works or the way stuff is stored in the database, you just have to go with that. You just have to go with whatever the platform has chosen.
Now, you've also got [chuckles] the added complication that potentially people who don't really know how to use the low-code platform are specifying apps in English or other languages and then getting an AI to generate them. That person might now have absolutely no knowledge of what that low-code platform really does under the covers. One of the things we've always advocated for or pointed out is that somebody in the organization needs to understand what the low-code platform is doing, or what the abstraction is doing, just as somebody on your dev team probably needs to know what the heck Spring is doing under the covers if you're using that as an abstraction.
It's not like this is a new problem, but there's this alluring capability of, "Oh, we can just have business users specifying departmental applications in English, and then a generative AI system creates them." I don't think we're getting away from the problem of needing to really understand what that system is doing, of having at least one person in the organization who knows what's there.
Then, the other question would be: if you want to change something about that, what do you do? Do you change the English that you use to specify the system and rerun the AI? If you keep the English constant and you run the AI twice, do you get the same application at the end, because these things are non-deterministic? I don't know; maybe if you turn the temperature down or give it a seed number, you get the same output. There are all sorts of actually important questions there if you really wanted to do this stuff for real, rather than just having a flashy demo that makes everybody go, "Ooh, yes."
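As a sketch of that determinism question, here is what pinning the temperature to zero and passing a seed might look like against an OpenAI-style chat completions endpoint. The endpoint, model name, and request fields are assumptions rather than anything specified in the conversation, and even with a seed, providers generally describe repeatability as best-effort rather than guaranteed.

```java
// A sketch of asking for repeatable output: temperature 0 plus a fixed seed,
// assuming an OpenAI-style chat completions API. Even then, seeded output is
// typically best-effort reproducibility, not a guarantee.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RepeatableGeneration {

    public static String generate(String spec) throws Exception {
        // Very crude quote escaping, just enough for the sketch.
        String body = """
            {
              "model": "gpt-4o-mini",
              "temperature": 0,
              "seed": 42,
              "messages": [{"role": "user", "content": "%s"}]
            }
            """.formatted(spec.replace("\"", "\\\""));

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.openai.com/v1/chat/completions"))
                .header("Authorization", "Bearer " + System.getenv("OPENAI_API_KEY"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();
    }
}
```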
Neal: Well, to use Birgitta's language about this, I think the reason it works well for low-code environments is because it's a very limited abstraction, but in some ways, the C programming language is an abstraction, too. It's just a lot less limited, conceivably.
I think there's an interesting thing here: the number of possible variables going up, and how fast it can cope with that. C is pretty much a low-level, almost assembly-like language, versus some very constrained DSL; the abstraction gap between those things is really high. I'm curious how fast these tools can ascend or descend that abstraction stack.
Prem: Here is a thing that I've been wondering. This whole point of abstraction that all of us have raised. Now, here's the reality of it. Even when I'm working in a more conventional environment, let's say using the Java programming language or maybe the Spring framework, there are portions of it that I may not intimately understand. That's okay because it helps me get the job done in arguably a smaller amount of time than when I'm not using it.
Wouldn't that same thing apply here? The place where I'm taking this is, okay, I've got a certain level of things that I need to do. For example, it needs to be syntactically correct. Okay. I got that. It needs to be correct from the perspective of meeting the requirements that I have. Those requirements might be, okay, it works. It seems to work when I've got three rows in the database, or will it work when I've got 3 million rows in the database?
Okay. Now, does it move us to a point where I really need to get good at expressing those test scenarios as opposed to actually coding the thing? Does it shift the balance of power, then? Now, if I'm able to express all of those acceptance criteria really, really well, then do I really care? I don't know how assembly language works. I really don't. I don't care. Does it move us to that level, or is it too early to say?
Birgitta: One difference there, I think, would be, to take maybe not the assembly example but the Spring example: when I write code and I use Spring annotations, I am basically putting myself into the hands of the creators of the Spring framework. They built something that deterministically always works the same way, something that they thought about and that works.
Of course, sometimes, I need to understand in detail what the annotation does, and sometimes I don't. If I get suggestions from a large language model that are based on all of the code that is out there on the internet, that sometimes works [chuckles] and sometimes doesn't, that's like a different quality.
I think that's the messiness. I mean, in a way you could say if you have a good quality assurance approach, if you're really confident in your tests and you let this get generated, then maybe it's a little bit more similar. It still needs to be extensible code, because as long as we humans still have to extend the code, it also needs to be somehow readable and all of that. Yes. Again, it doesn't quite map, because the traditional abstraction layers are really well-defined, deterministic, and all of that. Here there's all of this messiness in the space; it's always slightly different.
Mike: Well, what you just said there was really interesting, Birgitta, because it still needs to be read and evolved by humans. Actually, that's one of the most remarkable things about GenAI creating code today, because code, we all know this, is actually for communicating amongst programmers and for me communicating with myself in the future, because I've forgotten what the heck the thing was supposed to do.
The fact that AI can generate human-readable and human-evolvable code is remarkable and very useful to us. I do wonder if we move to a mode of specifying functionality and ascertaining correctness through good acceptance testing, whether that still needs to be true because if you have a block of impenetrable AI-generated code that is super optimized, but no human could read it, but you have all these tests around the outside to guarantee that it is doing what you want it to do, does it matter that it's not human readable anymore?
Prem: Yes. That's exactly what I was thinking. If I've got this suite of fitness functions and it basically says, okay, this is what it does, I don't really care. I don't want to see what lies underneath, because it seems to obey all of these fitness functions that I've written against it. Great. That's all that matters. Then, if there is a bug, I say, "Here is an additional fitness function that you need to adhere to." As long as it does that, I'm like, "Yes. Bring it on. Write the most complex code. Write the ugliest, most unreadable code." I don't care. It's obeying all of my acceptance criteria.
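A sketch of what "it only has to obey my fitness functions" could look like in practice: tests that pin down behavior and a rough performance budget while treating the implementation as a black box. The Deduplicator interface, its stand-in implementation, and the JUnit 5 assertions are illustrative assumptions, not anything prescribed in the conversation.

```java
// Fitness functions around a black box: the tests constrain behavior and a
// rough performance budget, and say nothing about how the code works inside.
import org.junit.jupiter.api.Test;

import java.time.Duration;
import java.util.List;
import java.util.stream.IntStream;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertTimeoutPreemptively;

class DeduplicatorFitnessTest {

    interface Deduplicator {
        List<Integer> dedupe(List<Integer> input);
    }

    // Stand-in for whatever the assistant generated; only the contract matters here.
    private final Deduplicator generated = input -> input.stream().distinct().toList();

    @Test
    void removesDuplicatesAndKeepsFirstOccurrenceOrder() {
        assertEquals(List.of(3, 1, 2), generated.dedupe(List.of(3, 1, 3, 2, 1)));
    }

    @Test
    void staysWithinThePerformanceBudgetOnLargeInput() {
        List<Integer> big = IntStream.range(0, 1_000_000).map(i -> i % 1000).boxed().toList();
        assertTimeoutPreemptively(Duration.ofSeconds(1), () -> { generated.dedupe(big); });
    }
}
```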
Birgitta: Then, the key thing is that you wrote those fitness functions. There's a nice blog post by Michael Feathers about whether you should generate both your tests and your code, [chuckles] because if you also use these tools to generate your tests-- I wouldn't say never use them to generate your tests. I think that's a bit too extreme. As always, it depends, but you definitely then have to be a lot more diligent when you generate your tests, because you at least want to be sure that you have good test coverage, good test scenarios, have covered all of your bases. Maybe you do things like-- What's it called? The kind of testing where you generate all of those different types of--
Mike: Fuzz testing or--?
Neal: Mutation testing.
Birgitta: Mutation testing. [Chuckles] Thank you. [Laughs] Yes. Maybe something like that, property-based testing, all of those things. Maybe new ways to think about our testing approach, maybe invest even more in testing. Then, again, we have to see how it balances out: do we actually get a net positive at the end when we invest even more in testing? [chuckles]
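As a small illustration of the property-based style Birgitta gestures at, here is a sketch using the jqwik library as one example; the sort function under test and the two properties are invented for the example.

```java
// Property-based testing sketch with jqwik: instead of hand-picked cases, the
// library generates many random inputs and checks that the stated properties
// hold for all of them. The sort function stands in for generated code.
import net.jqwik.api.ForAll;
import net.jqwik.api.Property;

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SortProperties {

    // Stand-in for an implementation we want to constrain without reading it.
    static List<Integer> sort(List<Integer> input) {
        List<Integer> copy = new ArrayList<>(input);
        Collections.sort(copy);
        return copy;
    }

    @Property
    boolean outputIsOrdered(@ForAll List<Integer> input) {
        List<Integer> result = sort(input);
        for (int i = 1; i < result.size(); i++) {
            if (result.get(i - 1) > result.get(i)) {
                return false;
            }
        }
        return true;
    }

    @Property
    boolean outputHasExactlyTheSameElements(@ForAll List<Integer> input) {
        List<Integer> result = sort(input);
        return result.size() == input.size()
                && input.stream().allMatch(
                        x -> Collections.frequency(result, x) == Collections.frequency(input, x));
    }
}
```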
Neal: Well, this is a slight digression, but the Clojure community has a fascinating way of testing stuff. They do generative, statistically based testing. Rather than write unit tests, they write this test suite and have the system generate all these possible cases to see if it is statistically producing the right stuff.
I can definitely see that approach being used against something like-- something that's been generated that is maybe a little bit opaque but still does the stuff we want it to do, and we don't really care how it's getting the job done as long as it's correct, but then, using some statistics and things like that to determine that it is, in fact, doing the right thing.
Prem: That leads us to the next topic, a practice that is very popular, at least among Thoughtworkers, in terms of how we approach problems: test-driven development. A lot of us like doing that. How does this affect that flow, where I'm writing a test, then writing a bit of production code, seeing the test pass, and then doing this again and again until I can't think of anything more? How does that get affected?
Birgitta: One of the things that you'll notice when you use these coding assistants is that they often suggest quite a few lines of code to you. Sometimes it's frustrating because they only go line by line, but sometimes they give you 15 lines at once. Usually, when we do TDD, really in a ping-pong style, we go step by step: what's the smallest next thing that we want to work, where we want the test to be green?
Now, there's this partner that we work with who just doesn't understand that. [chuckles] Without going too much into detail in the podcast conversation, we have a bunch of memos out on Martin Fowler's website about this. If you look under the GenAI tag on the website, you'll find a little write-up by our colleague Paul Sobocinski about the experience of doing TDD in this ping-pong style, and sometimes going, "Okay, I adjust my test, and then I actually delete my full implementation and have it regenerated," because every time I add a new assertion, that's like a new part of the prompt.
I just delete my whole function, and then the coding assistant will pick up my test in the other file and will potentially give me a fuller implementation of what I had. The write-up contains little things like that, and he goes through the red-green-refactor cycle and where a coding assistant is helpful, or sometimes not helpful, in that. [chuckles]
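A sketch of that ping-pong loop: the accumulated tests act as the strongest part of the prompt, and the implementation is deleted back to a stub before being regenerated. All the class and method names are hypothetical, and the "generated" body shown is just a stand-in; in a real project the two classes would live in separate files.

```java
// TDD ping-pong with an assistant, sketched as one snippet. The tests carry
// the intent; the implementation below them is the part that gets deleted to
// a stub and regenerated with the open test file as context.
import org.junit.jupiter.api.Test;

import java.util.ArrayList;
import java.util.List;

import static org.junit.jupiter.api.Assertions.assertEquals;

class ShoppingCartTest {

    // Each new assertion added here effectively becomes part of the "prompt"
    // the next time the implementation is regenerated.
    @Test
    void emptyCartCostsNothing() {
        assertEquals(0, new ShoppingCart().totalInCents());
    }

    @Test
    void totalIsTheSumOfItemPrices() {
        ShoppingCart cart = new ShoppingCart();
        cart.add(300);
        cart.add(200);
        assertEquals(500, cart.totalInCents());
    }
}

// After adding an assertion, this body gets deleted back to a stub and the
// assistant is asked to fill it in again, picking the tests up as context.
class ShoppingCart {
    private final List<Integer> pricesInCents = new ArrayList<>();

    void add(int priceInCents) {
        pricesInCents.add(priceInCents);
    }

    int totalInCents() {
        return pricesInCents.stream().mapToInt(Integer::intValue).sum();
    }
}
```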
Neal: It sounds like you've tiptoed right up to the inevitable question that people ask Thoughtworkers about GenAI. Is it a pair programmer? [chuckles]
Birgitta: Yes. GitHub Copilot's tagline is even "your AI pair programmer," which annoys me a little bit, [chuckles] I have to admit, like I don't--
Neal: More than a little bit for a lot of-- [chuckles]
Birgitta: The product's really great, but that part annoys me. Yes. It's a little frustrating to us, I think, because we're big proponents of pair programming, and it's very often misunderstood as this thing where, when there's two people, one person will know the syntax when the other person doesn't. I'm exaggerating, but as if it's about filling knowledge gaps or something, but it's actually--
Of course, a tool like GitHub Copilot or the others can help with some of that stuff, sometimes even better than a human, maybe. Actually, the point of the pair programming practice is to make the team better, not just the individual coder. It's about having the context of what's going on: knowing this person was in the story kickoff discussion and this person wasn't, or maybe one person is out sick tomorrow and the other person has the context. Or it's also about--
It's often one of the only spaces left in remote working where we actually informally collaborate [chuckles] with each other over code, where we can have bonding and stuff like that. There are so many things that the robot cannot help us with, so it's a bit frustrating that equating those two things is doing a disservice to what the practice actually does.
Neal: It's inevitable, though, don't you think? [chuckles] That they're all going to pitch themselves this way? [chuckles]
Mike: Well, I think it depends what you are trying to get out of it. If you are clear about what you're trying to use the tool for, and about the things that the tool cannot do, all of the stuff that Birgitta just talked about, the tool can be useful. There's all this discussion right now about whether AI is actually intelligent or whether it's just pattern matching, spitting stuff out, and all this kind of stuff.
Some part of me doesn't care, because it's useful. I've got this useful tool and it's providing utility to me and, no, you can't have my license back. Sorry. I want that. I'm going to keep using it. Thanks very much. Some of the stuff that these tools can do is a super useful contribution to the development process: looking over your code and the changes that you've just made and determining whether they have any security flaws, asking questions like that about the code that you've just written, or reminding you, "Hey, did you run a performance test on this? It looks like you've got an n + 1 select loop going on there. Are you sure that that's what you intended to do?"
Those kinds of things are useful and would otherwise require time from more senior teammates looking over your code to remind you to do that kind of stuff. I'm all for using these things when it frees us up to think about things that are more in the human domain: is my architecture going in the direction that I want it to? How do these system components relate to each other? Do I really understand the requirement that's being asked of me? Is this feature conflicting with another feature that we did last week or one that's coming down the road? Those are things that are much harder for an AI to help you with, but the nuts and bolts of the task of coding, it seems, are getting more and more accessible to AI.
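For the n + 1 select example Mike mentions, here is the shape of the problem and the usual fix in a plain JDBC sketch; the tables, queries, and class names are invented for illustration.

```java
// The n + 1 select problem in plain JDBC: one query for the customers, then
// one more query per customer inside the loop. A reviewer, human or AI, would
// suggest collapsing it into a single join.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public class OrderLoader {

    // n + 1 queries: easy to write, painful on 3 million rows.
    static List<String> orderSummariesSlow(Connection conn) throws Exception {
        List<String> summaries = new ArrayList<>();
        try (Statement st = conn.createStatement();
             ResultSet customers = st.executeQuery("SELECT id, name FROM customers")) {
            while (customers.next()) {
                try (PreparedStatement ps =
                         conn.prepareStatement("SELECT COUNT(*) FROM orders WHERE customer_id = ?")) {
                    ps.setInt(1, customers.getInt("id"));
                    try (ResultSet count = ps.executeQuery()) {
                        count.next();
                        summaries.add(customers.getString("name") + ": " + count.getInt(1));
                    }
                }
            }
        }
        return summaries;
    }

    // One query with a join and a group by does the same work in a single round trip.
    static List<String> orderSummariesFast(Connection conn) throws Exception {
        List<String> summaries = new ArrayList<>();
        String sql = "SELECT c.name, COUNT(o.id) AS order_count FROM customers c "
                   + "LEFT JOIN orders o ON o.customer_id = c.id GROUP BY c.name";
        try (Statement st = conn.createStatement(); ResultSet rs = st.executeQuery(sql)) {
            while (rs.next()) {
                summaries.add(rs.getString("name") + ": " + rs.getInt("order_count"));
            }
        }
        return summaries;
    }
}
```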
Prem: Makes sense. Makes sense. Birgitta and Mike, we could keep going on this, but we do have to wrap up. Any parting thoughts from you folks in terms of what we can expect from this, where we should or shouldn't use it, or anything else that people who are trying to get started should be aware of?
Mike: The bee in my bonnet on this is that everybody should be trying this stuff out. The capabilities of generative AI change literally every week. The only way for you to really understand what this can do and whether it can work for you is to try it and actually try it in a whole bunch of different ways.
I would always advocate for responsible experimentation. Don't go to work and install one of these tools without your employer's [chuckles] permission, without them understanding that you're doing that, because that can lead to very bad outcomes. But if, say, you have an open-source project that you are working on, or some personal code or something like that, most of these tools have a free trial available. You can get started with them and start understanding what they do.
That would be my very strong recommendation to anyone: just try this stuff. It's not perfect. It's not going to replace you as a programmer. Don't worry about that, but like the accountants and the spreadsheets, you, as a programmer, should understand how you can use this tool in your craft. There are lots of people, I get the impression, making excuses for not trying it. I guarantee you, once you try this stuff, your eyes will be opened, and I think you will like incorporating these tools.
Prem: Birgitta?
Birgitta: Yes. First of all, plus one to that, definitely. As I was mentioning before, this is different from other software that we've used. You can't just go to a training where somebody tells you, "These are the rules of the tool. This is how it works," and then apply those rules. There are lots of things where you can't really explain why something is happening. [chuckles] You have to get a feeling for it, also for things like, "Oh, sometimes going in small steps helps." Yes, you really have to try it out. Even though I'm in a full-time role right now looking into these things, I find that I have to train myself to remember to try things out, because I'm just not used to it.
I think, if this is really going to be part of our lives in the future, we all have to cognitively change a little bit how we think about things. I recently read an article about this where the author mentioned how we all collectively, cognitively changed when internet search came around: from training ourselves to remember facts to training ourselves to know how to find the facts. She was saying that she expects something similar to happen with this. We just don't know yet how it's going to go.
Then, the other thing is what I mentioned before, to think about the combinations, to not just think about, "Oh, these are the limitations of the models," but to think creatively about how you can combine it with other things.
Neal: Great. That's a great way to wrap up. Thank you, Birgitta and Mike, for helping us sip from the firehose of information about AI. This is not the last podcast we're going to do about AI in general and AI-assisted software development, but we wanted to dip our toe into the ocean of information before it became too overwhelming. Thank you so much for joining us, and thanks, Prem, for hosting with me today.
Prem: Thank you, folks, and we will see you next time.
[Music]