Exploring the Potential of ChatGPT as a Coding Partner (part 2)

Ryszard Balcerzak

Published: May 09, 2023

This is the second part of my blog post, where I’m evaluating ChatGPT's strengths and weaknesses as a programming partner.

In part one, my objective was to assess ChatGPT's capability to model a basic domain concept and to see how it responds to changes in requirements. Although seeing a solution generated in just a matter of seconds was impressive, I was disappointed with the limited interaction it provided; ultimately, our collaboration was one-sided. In this second part, I’ll build upon those findings and examine ChatGPT’s potential to suggest solutions that can be applied in real-life scenarios, particularly those involving a greater number of invariants and complexities. The goal is not necessarily to create more advanced solutions, but rather to identify practical and effective approaches that can be implemented in the real world.

Use case 2: Handling invariants that span across multiple aggregates

In this scenario, I wanted to examine how the model could handle an invariant that crossed the boundary of a single aggregate.

The goal was to model the user's account and enforce the uniqueness of the e-mail address.

R: Can you use C# to model a user account?

Here are the requirements:

User account should have three properties: UserName, Password and Email.

Email should be unique across all user accounts.

ChatGPT: Yes, you can use C# to model a user account with the properties UserName, Password, and Email. To ensure the uniqueness of the email property, you can add a validation check in the setter of the Email property, which will throw an exception if the email already exists in the database.

Here is an example of a C# class that models a user account:

public class UserAccount
{
    private static List<string> _emailList = new List<string>();
    public string UserName { get; set; }
    private string _password;
    public string Password
    {
        get { return _password; }
        set
        {
            if (value.Length < 8)
                throw new Exception("Password must be at least 8 characters long");
            _password = value;
        }
    }
    private string _email;
    public string Email
    {
        get { return _email; }
        set
        {
            if (_emailList.Contains(value))
                throw new Exception("Email already exists");
            _emailList.Add(value);
            _email = value;
        }
    }
}

The suggestion it made was to employ a private static field to keep track of all email addresses, which certainly isn’t the best solution. Additionally, it made an assumption about password length, which I hadn’t specified in the requirements. In the first step, I wanted to clarify the picture and eliminate the unnecessary rule regarding the password.

R: Let's not make any assumptions about the password just yet.

ChatGPT: Okay, here is an updated example without any password validation:

public class UserAccount
{
    private static List<string> _emailList = new List<string>();
    public string UserName { get; set; }
    public string Password { get; set; }
    private string _email;
    public string Email
    {
        get { return _email; }
        set
        {
            if (_emailList.Contains(value))
                throw new Exception("Email already exists");
            _emailList.Add(value);
            _email = value;
        }
    }
}

Then it was time to concentrate on the static field. There were some issues with this approach, but I began with the most obvious one, addressing how to manage the pre-existing email addresses.
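
Two of those issues can be made concrete with a short sketch. It assumes ChatGPT's `UserAccount` exactly as shown above; the `StaticListDemo` class, its method names, and the `example.com` addresses are hypothetical additions for illustration only.

```csharp
using System;
using System.Collections.Generic;

// ChatGPT's class from above, unchanged.
public class UserAccount
{
    private static List<string> _emailList = new List<string>();
    public string UserName { get; set; }
    public string Password { get; set; }
    private string _email;
    public string Email
    {
        get { return _email; }
        set
        {
            if (_emailList.Contains(value))
                throw new Exception("Email already exists");
            _emailList.Add(value);
            _email = value;
        }
    }
}

// Hypothetical demo helpers, not part of the conversation.
public static class StaticListDemo
{
    // Fresh addresses on every call, because the static list never forgets.
    private static string NewAddress()
    {
        return Guid.NewGuid() + "@example.com";
    }

    // Assigning an account the address it already has is rejected as a duplicate.
    public static bool ReassigningOwnEmailThrows()
    {
        var address = NewAddress();
        var account = new UserAccount { Email = address };
        try { account.Email = address; return false; }
        catch (Exception) { return true; }
    }

    // After an account moves to a new address, the old one stays reserved.
    public static bool OldEmailStaysReserved()
    {
        var oldAddress = NewAddress();
        var first = new UserAccount { Email = oldAddress };
        first.Email = NewAddress();
        try
        {
            var second = new UserAccount { Email = oldAddress };
            return second.Email == null;
        }
        catch (Exception) { return true; }
    }
}
```

Both methods return `true` with the class as written: the setter cannot tell an account's own address from someone else's, and nothing ever removes an address from the list.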

R: How would you populate _emailList field with existing email addresses?

ChatGPT: The `_emailList` field is a static field, so it is shared by all instances of the `UserAccount` class. You can populate the `_emailList` field with existing email addresses using the following methods:

1. Hard-code the existing email addresses into the list:

private static List<string> _emailList = new List<string> { 
"user1@email.com", "user2@email.com", ... };

2. Load the existing email addresses from a database or a file:

private static List<string> _emailList = new List<string>();
static UserAccount()
{
    // Load existing email addresses from a database or a file
    var existingEmails = LoadEmailsFromDBOrFile();
    foreach (var email in existingEmails)
    {
        _emailList.Add(email);
    }
}
private static IEnumerable<string> LoadEmailsFromDBOrFile()
{
    // Load email addresses from a database or a file and return them as a list
    // ...
}

Two solutions were proposed. Both had drawbacks. Hardcoding the email addresses was completely unrealistic, so I turned my attention to the second option. However, that approach resulted in a breach of the domain model's isolation.

R: Doing it this way we allow UserAccount instances to interfere with the outside world making the class not properly isolated. Moreover since we are implicitly dependent on a database it's hard to test this class.

ChatGPT: Yes, you are correct. Storing the email addresses in a static list and loading them from a database or a file is not the best approach, as it creates tight coupling between the `UserAccount` class and the outside world, making it difficult to test and maintain.

A better approach would be to inject the repository into the `UserAccount` class, so that it can check the uniqueness of the email addresses without being tightly coupled to the database or a file. Here's an example of how you could do this:

public class UserAccount
{
    private readonly IEmailRepository _emailRepository;
    public UserAccount(IEmailRepository emailRepository)
    {
        _emailRepository = emailRepository;
    }
    public string UserName { get; set; }
    public string Password { get; set; }
    private string _email;
    public string Email
    {
        get { return _email; }
        set
        {
            if (_emailRepository.Exists(value))
                throw new Exception("Email already exists");
            _email = value;
        }
    }
}
public interface IEmailRepository
{
    bool Exists(string email);
}
public class EmailRepository : IEmailRepository
{
    private readonly List<string> _emails = new List<string>();
    public bool Exists(string email)
    {
        return _emails.Contains(email);
    }
}

ChatGPT: With this approach, you can create a mock repository for testing, and use a real repository for production. The `UserAccount` class is now isolated from the outside world and can be tested more easily.

The revised code was just a minor enhancement over the previous proposal. We simply made the implicit direct dependence on the database explicit. It remains a dependency though. Implementing an interface in this scenario does not provide a useful abstraction, as the database isn’t a domain concept in our example.
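
A minimal sketch of what "it remains a dependency" means in practice, assuming the `UserAccount` and `IEmailRepository` from ChatGPT's answer above. `AlwaysTakenRepository` and `InjectionDemo` are hypothetical names added for illustration; they do not come from the conversation.

```csharp
using System;

public interface IEmailRepository
{
    bool Exists(string email);
}

// ChatGPT's version with the injected repository, unchanged.
public class UserAccount
{
    private readonly IEmailRepository _emailRepository;
    public UserAccount(IEmailRepository emailRepository)
    {
        _emailRepository = emailRepository;
    }
    public string UserName { get; set; }
    public string Password { get; set; }
    private string _email;
    public string Email
    {
        get { return _email; }
        set
        {
            if (_emailRepository.Exists(value))
                throw new Exception("Email already exists");
            _email = value;
        }
    }
}

// Hypothetical stub standing in for the database: it reports every address
// as taken and counts how often it is asked.
public class AlwaysTakenRepository : IEmailRepository
{
    public int Calls { get; private set; }
    public bool Exists(string email)
    {
        Calls++;
        return true;
    }
}

public static class InjectionDemo
{
    // Setting a plain property on the domain object reaches out to the repository.
    public static int RepositoryCallsWhenSettingEmail()
    {
        var repository = new AlwaysTakenRepository();
        var account = new UserAccount(repository);
        try { account.Email = "a@example.com"; }
        catch (Exception) { /* expected: the stub says the address exists */ }
        return repository.Calls;
    }
}
```

Even in a test, the domain object cannot be constructed without a repository, and a simple property assignment triggers a call across the boundary.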

R: In this example IEmailRepository is a volatile dependency that deals with the outside world. By injecting it to the domain model we break its isolation.

ChatGPT: Yes, you are correct. The `IEmailRepository` interface is a volatile dependency that deals with the outside world, and injecting it into the `UserAccount` class can break its isolation.

One approach to solve this issue is to use a Value Object pattern. A Value Object is an immutable object that represents a simple value, such as an email address. The Value Object can be used to validate the uniqueness of the email address without involving the outside world.

Here's an example of how you could do this:

public class UserAccount
{
    public string UserName { get; set; }
    public string Password { get; set; }
    public EmailAddress Email { get; set; }
    private static HashSet<EmailAddress> _emailAddresses = new HashSet<EmailAddress>();
    public UserAccount(string userName, string password, EmailAddress email)
    {
        UserName = userName;
        Password = password;
        if (!_emailAddresses.Add(email))
            throw new Exception("Email already exists");
        Email = email;
    }
}
public struct EmailAddress : IEquatable<EmailAddress>
{
    public string Address { get; }
    public EmailAddress(string address)
    {
        Address = address;
    }
    public bool Equals(EmailAddress other)
    {
        return string.Equals(Address, other.Address, StringComparison.OrdinalIgnoreCase);
    }
    public override bool Equals(object obj)
    {
        return obj is EmailAddress other && Equals(other);
    }
    public override int GetHashCode()
    {
        return StringComparer.OrdinalIgnoreCase.GetHashCode(Address);
    }
    public static bool operator ==(EmailAddress left, EmailAddress right)
    {
        return left.Equals(right);
    }
    public static bool operator !=(EmailAddress left, EmailAddress right)
    {
        return !left.Equals(right);
    }
}

ChatGPT: With this approach, the `UserAccount` class is completely isolated from the outside world, and the uniqueness of the email address is validated by the `EmailAddress` Value Object.

By that time I noticed it had run out of ideas. It stubbornly kept a collection of email addresses in a static field; only the implementation changed. The question of how to populate the collection was dropped, I think, because it led straight back to the lack of proper domain model isolation. Maybe ChatGPT just didn’t want to deal with that anymore…

In this situation, a value object wouldn't resolve the issue; its structural equality was only needed so that instances of EmailAddress could be kept in a HashSet. Even though using value objects is a good approach in complex domain models, keeping that collection in memory is not an optimal solution in a typical real-world scenario.
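
The role structural equality plays here can be sketched as follows. The `EmailAddress` struct is ChatGPT's definition from above, condensed with expression-bodied members and without the operators; `EqualityDemo` is a hypothetical helper added for illustration.

```csharp
using System;
using System.Collections.Generic;

// EmailAddress as ChatGPT defined it: equality is by value and ignores case.
public struct EmailAddress : IEquatable<EmailAddress>
{
    public string Address { get; }
    public EmailAddress(string address)
    {
        Address = address;
    }
    public bool Equals(EmailAddress other) =>
        string.Equals(Address, other.Address, StringComparison.OrdinalIgnoreCase);
    public override bool Equals(object obj) =>
        obj is EmailAddress other && Equals(other);
    public override int GetHashCode() =>
        StringComparer.OrdinalIgnoreCase.GetHashCode(Address);
}

// Hypothetical demo helper, not part of the conversation.
public static class EqualityDemo
{
    // Two separately created values with the same address (in different casing)
    // count as one element, so the second Add is rejected.
    public static bool SecondAddIsRejected()
    {
        var seen = new HashSet<EmailAddress>();
        bool firstAdded = seen.Add(new EmailAddress("User@Example.com"));
        bool secondAdded = seen.Add(new EmailAddress("user@example.com"));
        return firstAdded && !secondAdded;
    }
}
```

The set can only reject addresses it has already seen in this process's memory, which is why the value object on its own does not enforce uniqueness across stored accounts.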

Here you can see the lack of communication between us. I may have failed in expressing my thoughts and provided insufficient information, but ChatGPT didn’t attempt to clarify the situation. It was just offering random solutions. During pair programming, the process is different: developers first attempt to understand the problem and then, through collaboration, they create new perspectives, which lead to a broader understanding and more effective approaches to solving it.

Anyway, I decided to continue.

R: How would you populate _emailAddresses field with existing email addresses?

ChatGPT: To populate the `_emailAddresses` field with existing email addresses, you could use a constructor or a method that takes a list of email addresses as a parameter.

Here's an example of how you could do this:

public class UserAccount
{
    public string UserName { get; set; }
    public string Password { get; set; }
    public EmailAddress Email { get; set; }
    private static HashSet<EmailAddress> _emailAddresses = new HashSet<EmailAddress>();
    public UserAccount(string userName, string password, EmailAddress email)
    {
        UserName = userName;
        Password = password;
        if (!_emailAddresses.Add(email))
            throw new Exception("Email already exists");
        Email = email;
    }
    public static void LoadExistingEmails(IEnumerable<EmailAddress> emailAddresses)
    {
        _emailAddresses = new HashSet<EmailAddress>(emailAddresses);
    }
}
// Note: Implementation of 'EmailAddress' has been removed for brevity.
public struct EmailAddress : IEquatable<EmailAddress> {...}

ChatGPT: In this example, you can call the `LoadExistingEmails` method with a list of email addresses to populate the `_emailAddresses` field. This method can be called when the application starts, or when you need to refresh the list of email addresses.

I felt we were going nowhere but still wanted to continue.

R: We have to change this as we will have hundreds of thousands of user accounts. This solution will have some performance drawbacks.

ChatGPT: Yes, you are correct. If you have hundreds of thousands of user accounts, using a hash set to store all email addresses may not be the best solution.

One approach to improve the performance is to use a database to store the email addresses. You can create a table to store the email addresses and use a repository to interact with the database.

Here's an example of how you could do this:

public class UserAccount
{
    public string UserName { get; set; }
    public string Password { get; set; }
    public EmailAddress Email { get; set; }
    private readonly IEmailRepository _emailRepository;
    public UserAccount(string userName, string password, EmailAddress email, IEmailRepository emailRepository)
    {
        UserName = userName;
        Password = password;
        if (!_emailRepository.IsEmailUnique(email.Address))
            throw new Exception("Email already exists");
        Email = email;
        _emailRepository = emailRepository;
    }
}
// Note: Implementation of 'EmailAddress' has been removed for brevity.
public struct EmailAddress : IEquatable<EmailAddress> {...}
public interface IEmailRepository
{
    bool IsEmailUnique(string email);
}

ChatGPT: With this approach, you can use a database to store the email addresses, and the `IEmailRepository` interface can be used to interact with the database. The repository can be implemented to query the database to check if the email address is unique, which can improve the performance compared to using a hash set.

We made some progress but it was a long journey. We still suffered from having a domain model that wasn’t properly isolated.

R: Again, our domain model is not properly isolated now.

ChatGPT: Yes, you are correct. Injecting the `IEmailRepository` into the `UserAccount` class creates a dependency on an external resource, breaking the isolation of the domain model.

One solution to this issue is to use the repository only in the application layer and keep the domain model isolated. The application layer can then use the repository to check if the email address is unique before creating a new `UserAccount` instance.

Here's an example of how you could do this:

public class UserAccount
{
    public string UserName { get; }
    public string Password { get; }
    public EmailAddress Email { get; }
    public UserAccount(string userName, string password, EmailAddress email)
    {
        UserName = userName;
        Password = password;
        Email = email;
    }
}
// Note: Implementation of 'EmailAddress' has been removed for brevity.
public struct EmailAddress : IEquatable<EmailAddress> {...}
public interface IEmailRepository
{
    bool IsEmailUnique(string email);
}
public class UserAccountService
{
    private readonly IEmailRepository _emailRepository;
    public UserAccountService(IEmailRepository emailRepository)
    {
        _emailRepository = emailRepository;
    }
    public UserAccount CreateUserAccount(string userName, string password, string emailAddress)
    {
        if (!_emailRepository.IsEmailUnique(emailAddress))
            throw new Exception("Email already exists");
        var email = new EmailAddress(emailAddress);
        var userAccount = new UserAccount(userName, password, email);
        return userAccount;
    }
}

ChatGPT: With this approach, the domain model remains isolated, and the `UserAccountService` class is responsible for creating new `UserAccount` instances, using the `IEmailRepository` to check if the email address is unique.

Yes, that was an acceptable solution. However, as we had only just begun to model the UserAccount, I decided to stop there.
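
To show how this final shape can be exercised without a database, here is a minimal sketch. `UserAccount`, `IEmailRepository` and `UserAccountService` follow ChatGPT's last answer. The `EmailAddress` struct is a simplified stand-in, since its implementation was elided above, and `InMemoryEmailRepository` and `ServiceDemo` are hypothetical names introduced only for this example.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the elided EmailAddress value object (assumption).
public struct EmailAddress
{
    public string Address { get; }
    public EmailAddress(string address)
    {
        Address = address;
    }
}

public class UserAccount
{
    public string UserName { get; }
    public string Password { get; }
    public EmailAddress Email { get; }
    public UserAccount(string userName, string password, EmailAddress email)
    {
        UserName = userName;
        Password = password;
        Email = email;
    }
}

public interface IEmailRepository
{
    bool IsEmailUnique(string email);
}

public class UserAccountService
{
    private readonly IEmailRepository _emailRepository;
    public UserAccountService(IEmailRepository emailRepository)
    {
        _emailRepository = emailRepository;
    }
    public UserAccount CreateUserAccount(string userName, string password, string emailAddress)
    {
        if (!_emailRepository.IsEmailUnique(emailAddress))
            throw new Exception("Email already exists");
        return new UserAccount(userName, password, new EmailAddress(emailAddress));
    }
}

// Hypothetical in-memory fake that replaces the database in tests.
public class InMemoryEmailRepository : IEmailRepository
{
    private readonly HashSet<string> _taken;
    public InMemoryEmailRepository(IEnumerable<string> taken)
    {
        _taken = new HashSet<string>(taken, StringComparer.OrdinalIgnoreCase);
    }
    public bool IsEmailUnique(string email) => !_taken.Contains(email);
}

// Hypothetical demo helpers, not part of the conversation.
public static class ServiceDemo
{
    private static UserAccountService NewService() =>
        new UserAccountService(new InMemoryEmailRepository(new[] { "taken@example.com" }));

    // An address the repository already knows is rejected by the service.
    public static bool DuplicateIsRejected()
    {
        try { NewService().CreateUserAccount("bob", "secret", "taken@example.com"); return false; }
        catch (Exception) { return true; }
    }

    // An unknown address produces an account carrying that address.
    public static string EmailOfNewAccount() =>
        NewService().CreateUserAccount("alice", "secret", "alice@example.com").Email.Address;
}
```

In this arrangement only the service needs a test double; `UserAccount` and `EmailAddress` can be created and inspected on their own.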

Use case 2: Summary

This scenario was somewhat more complicated than the previous one, but still relatively straightforward. The main challenge was finding the best approach. Although we reached a satisfactory solution, I still had to steer the tool in the right direction. At times, ChatGPT’s proposed approaches didn’t offer any interesting insights. On the contrary, I had to carefully examine the generated code and identify the problems myself. In the end I just didn’t trust it to provide a good answer. That makes it risky: I can imagine that for developers at the beginning of their programming careers, this could be like stepping into a minefield.

What was interesting was that it clearly didn't understand what the concepts we were trying to model represent in the real world. I think this is one of the reasons why the proposed solutions were often unrealistic. Unfortunately, because ChatGPT cannot analyze problems and sometimes struggles with communication, our “collaboration” was quite one-sided.

I think we should be very clear about what we ask it to do, as it will not ask for clarification itself. As mentioned before, it lacks the spirit of collaboration. The creativity and connection that come from actually being with another person can lead to a flow of alternative ways of thinking about a problem; they can provide new perspectives that might not have been considered before.

Final thoughts

Throughout my career, I have enjoyed numerous collaborative sessions with my colleagues. If I had to pinpoint one driving force behind these sessions, it would be the collaboration itself. We ask questions, clarify, and use our critical thinking to examine problems and potential solutions. We can see the whole picture. Although our judgment may not always be completely accurate, we can still offer valuable insights that make our work effective. We have intuition that can sometimes mislead us, but it can often also be reliable. We have experience and often we have at least a basic understanding of things that we try to model in software engineering. We can also find connections and patterns between different abstract concepts. ChatGPT can't do any of these things. Thus, ChatGPT can’t give anywhere near the same pairing experience.

However, I’m a little bit torn — I still think it’s a great tool despite its shortcomings. The way of interacting with it indeed may be revolutionary. I asked ChatGPT a simple question: "Can you code?" as I was curious about the answer. Well, I should have asked this in the beginning, but I think I had made too many assumptions due to all the hype. This was its response:

“Yes, I can write code in several programming languages, including Python, Java, JavaScript, and more. However, my responses are generated via machine learning algorithms, so while I can understand code and write it in a grammatically correct manner, I may not always produce correct or optimal solutions.”

On reflection, this was a fair response. I think I was expecting too much from it. These are exactly the challenges I faced while trying to collaborate with it. Nevertheless, I believe ChatGPT has potential as a coding assistant. Although the solution generated by ChatGPT may not be optimal, it's still impressive that it can generate code almost immediately based on given requirements. This can provide a solid starting point for further refinement and improvement, and it's especially useful in writing code that doesn't require a lot of abstract thinking, such as application or infrastructure code. Meanwhile, we could focus on the core domain and its complexity. I think it can also help us write unit tests from requirements. In fact, ChatGPT is designed to adhere to best coding practices with the aim of producing code that is readable for humans. Development practices were established to address complexity management issues for humans. Computers, however, do not have these limitations and may not follow human development practices at all. ChatGPT does. In addition, given its strengths, I believe it has the potential to assist people with disabilities in writing code, which is invaluable.

In conclusion, ChatGPT has demonstrated its potential as a valuable assistant in certain tasks. However, it should be noted that its limitations in true collaboration and human-like understanding make it less effective as a partner. While ChatGPT can undoubtedly enhance our capabilities in various areas, it is still important to recognize its role as a tool rather than a true equal partner.

Disclaimer: The statements and opinions expressed in this article are those of the author(s) and do not necessarily reflect the positions of Thoughtworks.
