This is the final part of a three-part series of articles exploring well-factored service architecture. Part One covers the benefits of well-factored services. Part Two covers the service classification model (foundational capabilities, business capabilities, and client experience services), along with the key characteristics of each class and guidelines for building them. This part covers guidelines for defining service boundaries in general.
The right service boundaries hide complexity – domain and technical
In Part One, we talked about business-truthful abstractions being the key to well-factored service architecture. A key characteristic of services based on business-truthful abstractions is that they hide complexity that would otherwise leak out of the service. Hiding complexity inside the service makes it easy to consume and, more importantly, easy to use when composing higher-order capabilities and user experiences quickly and reliably.
Service complexity comes in two forms — domain and technology. Domain complexity could come in the form of complex business rules for a product recommendation engine. Technology complexity often comes from the morass of backend systems needed to support the business. Creating a well-factored product recommendations service means that clients using the service don’t have to worry about either the domain complexity or the technical complexity of the backend data integrations. The complex business rules could be replaced by a machine learning algorithm in the future, and the consumers shouldn’t have to know about it, just get better recommendation results.
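As a minimal sketch of this idea, the Python below puts a hypothetical product recommendations service behind a single business-level method. All names here (`CartItem`, `RuleBasedStrategy`, `ProductRecommendationService`, the SKU values) are assumptions made for illustration and do not come from any real system. The point is only that the caller sees "recommendations for this cart" and nothing about how they are produced.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class CartItem:
    sku: str
    category: str


class RecommendationStrategy(Protocol):
    """Anything that can turn a cart into a ranked list of SKUs."""

    def recommend(self, cart: list[CartItem], limit: int) -> list[str]: ...


class RuleBasedStrategy:
    """Hypothetical business rules: suggest accessories for each cart category."""

    def __init__(self, accessories_by_category: dict[str, list[str]]) -> None:
        self._accessories = accessories_by_category

    def recommend(self, cart: list[CartItem], limit: int) -> list[str]:
        in_cart = {item.sku for item in cart}
        suggestions: list[str] = []
        for item in cart:
            for sku in self._accessories.get(item.category, []):
                # Skip items already in the cart and avoid duplicates.
                if sku not in in_cart and sku not in suggestions:
                    suggestions.append(sku)
        return suggestions[:limit]


class ProductRecommendationService:
    """The only surface clients see; the strategy behind it can change freely."""

    def __init__(self, strategy: RecommendationStrategy) -> None:
        self._strategy = strategy

    def recommendations_for(self, cart: list[CartItem], limit: int = 3) -> list[str]:
        return self._strategy.recommend(cart, limit)


service = ProductRecommendationService(
    RuleBasedStrategy({"camera": ["SKU-LENS", "SKU-TRIPOD"], "laptop": ["SKU-SLEEVE"]})
)
cart = [CartItem("SKU-CAM-1", "camera"), CartItem("SKU-LAP-1", "laptop")]
print(service.recommendations_for(cart))  # ['SKU-LENS', 'SKU-TRIPOD', 'SKU-SLEEVE']
```

Under these assumptions, replacing `RuleBasedStrategy` with a model-backed strategy would not change a single line of client code, which is the property the paragraph above describes.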
Figure 1: Product recommendations offer a business-truthful interface to shopping carts while hiding technical and domain complexity
Defining service boundaries: where to start?
Identifying service boundaries can happen when splitting a monolith into smaller services or designing brand new services. Identifying service boundaries in a monolith has a few advantages. Because of the effort spent building and maintaining the monolith, it is often fairly well-understood by the team — the value it provides to its consumers and its shortcomings. The team might have some good hypotheses around where the service boundaries ought to be, which can be a great starting point. Also, the monolith lends itself well to creating characterization tests that can be used to slowly pull services out of the monolith while guaranteeing functional correctness at the macro level.
On the other hand, defining service boundaries for a new business problem means you don’t have the same mental and technical baggage as when splitting services from a monolith. You could build the service ecosystem using the latest tools and technologies. On the downside, there is a high chance you will get the service boundaries wrong because the team doesn’t have sufficient understanding of the business problem or the traffic access patterns.
Regardless of where you start, it’s important to understand that modern software systems face vastly different expectations from those built in the past. They’re expected to change much more often as new features are introduced; scale to serve a much larger, geographically dispersed user base; be highly available; and provide very low latency responses. In essence, the expectations of today’s customers are shaped by a new reality, which needs to be accounted for when designing service boundaries.
Highly cohesive and loosely coupled services
When designing services, it’s important to keep related concepts, the things that change at the same time, together: that is high cohesion. Low (loose) coupling is about pulling unrelated concepts apart so that they can be built, tested, deployed, evolved and scaled independently. In a tightly coupled (tangled) system, introducing a simple change causes a ripple effect across the entire system, which can be very expensive.
The goal of highly cohesive, loosely coupled services is to minimize the ripple effect. This allows different parts of the system to evolve independently and become sophisticated over time, without necessarily complicating the overall system.
Guidelines for defining service boundaries
Respect transactional boundaries in business flows
This guideline speaks to the high cohesion principle. Service boundaries should align well to transactional boundaries in a business flow. Think of these as “business transactions” rather than persistence or storage-level transactions. The basic idea of a transaction, “all or nothing”, still applies.
Let’s take an example. In an inventory system, there are two types of supply inventory: current supply, which is inventory available for sale at this moment in time, and future supply, which is inventory that will become available in the future.
Assuming that there are clients that are only interested in one or the other type of supply inventory, it might be tempting to split these two types into separate services. However, when supply transfers from future to current or vice versa, it’s important to make sure that it gets counted as exactly one type or the other. Double counting could lead to overselling. This kind of “transfer” is what we mean by a business transaction.
We would recommend keeping both types of supply in the same service, to make it easier to manage the transactional nature of this transfer operation, especially when something goes wrong. In essence, start off simple by having a single service. When there’s a strong reason to separate them and a satisfactory solution to maintain the transactional nature of the transfer operation, then it might make sense to have two separate services.
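The transfer described above can be sketched as a single operation owned by one service. This is an illustrative, in-memory, single-threaded sketch: `SupplyInventoryService`, `SupplyRecord` and `InsufficientSupply` are hypothetical names, and a real service would presumably wrap the two updates in whatever transaction mechanism its data store offers.

```python
from dataclasses import dataclass


class InsufficientSupply(Exception):
    """Raised when a transfer asks for more units than are available."""


@dataclass
class SupplyRecord:
    current: int = 0
    future: int = 0


class SupplyInventoryService:
    """Owns both supply types, so a transfer is one all-or-nothing operation."""

    def __init__(self) -> None:
        self._records: dict[str, SupplyRecord] = {}

    def add_future_supply(self, sku: str, units: int) -> None:
        self._records.setdefault(sku, SupplyRecord()).future += units

    def transfer_future_to_current(self, sku: str, units: int) -> None:
        record = self._records.setdefault(sku, SupplyRecord())
        # Validate everything first so a failure leaves the record untouched.
        if units <= 0:
            raise ValueError("units must be positive")
        if units > record.future:
            raise InsufficientSupply(f"{sku}: only {record.future} future units")
        # Both sides change together; no other service can observe a state
        # in which the units are counted twice or not at all.
        record.future -= units
        record.current += units

    def supply(self, sku: str) -> tuple[int, int]:
        record = self._records.get(sku, SupplyRecord())
        return (record.current, record.future)


inventory = SupplyInventoryService()
inventory.add_future_supply("SKU-1", 10)
inventory.transfer_future_to_current("SKU-1", 4)
print(inventory.supply("SKU-1"))  # (4, 6) as (current, future)
```

If the two supply types lived in separate services, the subtraction and the addition would become two remote calls, and the failure of either one would have to be detected and compensated for explicitly.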
Use “handoff” points in business flows to drive service boundaries
In our experience, in a typical long-running business process, there are rich interactions that happen in one part of the flow, and then there is a natural “handoff” point to other parts of the system. Building service boundaries along those handoff points provides a nice clean interface between services. It helps to cluster all the rich interactions in a single service so that the experience around those interactions, such as failure conditions and latency issues, can be tightly managed. If these rich interactions were spread across multiple services, the result would be a very “chatty” service ecosystem and degraded performance.
So for example, in the order processing flow, there are rich interactions that happen when a customer’s performing checkout such as product lookup, inventory lookup, product recommendations, payment processing and others. Once the order has been checked out, there’s a natural handoff between the customer-facing part of the order processing system and the backend fulfillment part of the order processing system. This can be a good boundary for services thus resulting in a checkout service and a fulfillment service.
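One way to picture this handoff is as a single message passed from checkout to fulfillment. The sketch below is an assumption-laden illustration: the `OrderPlaced` message, the two service classes and the in-process `deque` standing in for a message queue are all hypothetical, and the rich checkout interactions are reduced to a comment.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    """The only thing that crosses the boundary between the two services."""

    order_id: str
    sku_quantities: tuple[tuple[str, int], ...]


class CheckoutService:
    def __init__(self, handoff_queue: deque) -> None:
        self._handoff = handoff_queue
        self._next_id = 1

    def checkout(self, cart: dict[str, int]) -> str:
        if not cart:
            raise ValueError("cannot check out an empty cart")
        # Product lookup, inventory lookup, recommendations and payment
        # would all happen here, inside this one service.
        order_id = f"ORD-{self._next_id}"
        self._next_id += 1
        # The handoff: one message, after which checkout is done.
        self._handoff.append(OrderPlaced(order_id, tuple(sorted(cart.items()))))
        return order_id


class FulfillmentService:
    def __init__(self, handoff_queue: deque) -> None:
        self._handoff = handoff_queue
        self.picked: list[str] = []

    def process_pending(self) -> int:
        processed = 0
        while self._handoff:
            order = self._handoff.popleft()
            self.picked.append(order.order_id)  # stand-in for pick, pack, ship
            processed += 1
        return processed


queue: deque = deque()
checkout_service = CheckoutService(queue)
fulfillment_service = FulfillmentService(queue)

order_id = checkout_service.checkout({"SKU-1": 2, "SKU-2": 1})
print(order_id, fulfillment_service.process_pending())  # ORD-1 1
```

The interface between the two services is narrow (one message type), while the many small interactions of checkout stay inside a single boundary.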
Figure 2: Order management has a set of rich interactions related to the checkout process and then a natural handoff point for fulfillment
Manage the end user’s expectations around information consistency
This guideline speaks to the high cohesion principle. A user has certain expectations around information consistency when accessing a system. For example, when the user’s operating in the customer management system and they update the name, they expect the name to be updated in all the relevant places like user profile page and the home page at the same time. If that doesn’t happen, it would violate their expectations and result in distrust of the system.
It’s hard to keep information consistent across services. If the user profile and the home page were to be modeled as separate services with their separate data stores, there would be some delay in reflecting the name change which might be unacceptable.
When breaking up services, take care not to violate the user’s expectations about which information should be strongly consistent and which can be eventually consistent. It’s important to understand the user’s expectations before you draw up the service boundaries, especially when carving services out of a monolith, where most updates lean toward being strongly consistent. If you intend to change the user’s expectations, make sure to communicate that to the user via training or by providing appropriate informational messages in the system.
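The delay described above can be made concrete with a small simulation. Here a hypothetical `ProfileService` and `HomePageService` each keep their own copy of the user’s name, and changes travel between them through an in-process `deque` that stands in for asynchronous messaging. The names and the propagation mechanism are assumptions for illustration only.

```python
from collections import deque


class ProfileService:
    """Owns the name and publishes every change for other services."""

    def __init__(self, outbox: deque) -> None:
        self._names: dict[str, str] = {}
        self._outbox = outbox

    def update_name(self, user_id: str, name: str) -> None:
        self._names[user_id] = name
        self._outbox.append((user_id, name))

    def name_of(self, user_id: str) -> str:
        return self._names[user_id]


class HomePageService:
    """Keeps its own copy, refreshed only when it consumes change messages."""

    def __init__(self, inbox: deque) -> None:
        self._names: dict[str, str] = {}
        self._inbox = inbox

    def sync(self) -> None:
        while self._inbox:
            user_id, name = self._inbox.popleft()
            self._names[user_id] = name

    def greeting(self, user_id: str) -> str:
        return f"Welcome back, {self._names.get(user_id, 'guest')}"


changes: deque = deque()
profile = ProfileService(changes)
home = HomePageService(changes)

profile.update_name("u1", "Ann Lee")
home.sync()
profile.update_name("u1", "Ann Smith")

stale = home.greeting("u1")  # read before the change has propagated
home.sync()
fresh = home.greeting("u1")  # read after propagation
print(stale, "|", fresh)  # Welcome back, Ann Lee | Welcome back, Ann Smith
```

The window between the update and the next `sync()` is exactly the period in which the user would see two different names, so the question for the service boundary is whether that window is acceptable to them.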
Respect semantic boundaries
This guideline speaks to the low coupling principle. Certain concepts have a different meaning in different contexts. For example, a purchase order is something that has a number of SKUs associated with it, which have been assigned to a vendor for manufacturing and will be delivered to a certain location on a future date. In a vendor management system, the purchase order is a legal contract to produce the goods. In an inventory system, the same purchase order indicates supply against which it can take future orders. To a warehouse management system, it’s about planning space and labor for the incoming goods. The underlying concept is the same, whereas it gets modeled differently in different subdomains.
Service boundaries should allow for different interpretations of the same concept to best support the business’s needs in the respective subdomain. Rather than having a “global” purchase order model, allow services to own and evolve the parts of the purchase order relevant to their domain, while sharing an identifier that uniquely identifies the purchase order across domain boundaries. This is also referred to as contextualizing downstream.
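A rough sketch of what this might look like in code: three context-specific models of the same purchase order that share nothing but an identifier. The class and field names below (`VendorContract`, `FutureSupply`, `InboundShipment`, `po_id` and the rest) are hypothetical and chosen only to mirror the three subdomains mentioned above.

```python
from dataclasses import dataclass, fields
from datetime import date


@dataclass(frozen=True)
class VendorContract:
    """Vendor management view: the purchase order as a legal contract."""

    po_id: str
    vendor: str
    agreed_unit_cost_cents: int


@dataclass(frozen=True)
class FutureSupply:
    """Inventory view: supply against which future orders can be taken."""

    po_id: str
    sku: str
    units: int
    available_from: date


@dataclass(frozen=True)
class InboundShipment:
    """Warehouse view: space and labor to plan for incoming goods."""

    po_id: str
    location: str
    pallets: int
    expected_on: date


po_id = "PO-1001"
contract = VendorContract(po_id, vendor="Acme Textiles", agreed_unit_cost_cents=1250)
supply = FutureSupply(po_id, sku="SKU-SHIRT-M", units=500, available_from=date(2030, 3, 1))
shipment = InboundShipment(po_id, location="DC-East", pallets=4, expected_on=date(2030, 2, 25))

# The only field common to all three models is the shared identifier.
shared = (
    {f.name for f in fields(contract)}
    & {f.name for f in fields(supply)}
    & {f.name for f in fields(shipment)}
)
print(shared)  # {'po_id'}
```

Each service can add or rename its own fields without coordinating with the others, as long as `po_id` keeps identifying the same purchase order everywhere.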
Respect organizational boundaries
This guideline speaks to the low coupling principle. Sometimes the organizational boundaries are too hard to reach across, and hence it might be pragmatic to split services along organizational boundaries. For example, it may be desirable to have a single service to support promotions across various geographies of a retail company. However, that may not be feasible due to the organizational structure in place, where geographies are managed as separate business units and have dedicated development teams to support their needs in a timely fashion.
In this case, it may be pragmatic to create separate promotion services to give each geography the freedom to experiment with different promotions best suited to their needs. Care should be taken to ensure the customer experience isn’t negatively affected when that experience crosses geography boundaries.
Best tool for the job
There are certain problems that are solved very effectively using certain solutions, such as a specific language, framework or data store. For example, managing the social network of customers in the CRM domain can be quite efficient when backed by a graph data store. A graph data store models relationships as first-class entities and it becomes a lot easier and faster to traverse relationships. In such a situation, it would make sense to extract the social network portion of the CRM into a separate service to take advantage of the technology solution, in this case, a graph data store.
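To illustrate the kind of query that motivates this choice, the sketch below finds everyone within a given number of hops of a customer. It uses a plain in-memory adjacency map and a breadth-first search as a stand-in for a graph data store; it does not use or imitate any particular product’s API, and the function name and sample data are hypothetical.

```python
from collections import deque


def connections_within(graph: dict[str, set[str]], start: str, max_hops: int) -> dict[str, int]:
    """Return each person reachable from start in at most max_hops, with the hop count."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if distances[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(person, set()):
            if neighbor not in distances:
                distances[neighbor] = distances[person] + 1
                queue.append(neighbor)
    del distances[start]  # the starting customer is not their own connection
    return distances


social_graph = {
    "ana": {"ben", "cho"},
    "ben": {"ana", "dev"},
    "cho": {"ana"},
    "dev": {"ben", "eli"},
    "eli": {"dev"},
}

print(sorted(connections_within(social_graph, "ana", 2).items()))
# [('ben', 1), ('cho', 1), ('dev', 2)]
```

Because each step of the query simply follows relationships from one person to the next, it maps naturally onto a store that treats relationships as first-class entities, which is the argument for giving this portion of the CRM its own service.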
Factor in operational needs of the service
Services can also be split based on their unique operational needs, such as scaling, availability, latency or security. However, take care not to make technology or operational patterns the primary factor when separating services. The guidelines above on transactional boundaries, semantic boundaries, and respecting the user’s mental model will typically trump technology choices when designing a service boundary.
In summary, defining service boundaries is part intuition, part science. It takes knowledge of the domain and, when that knowledge is absent, a passion for learning the business. The key is to focus intensely on understanding the problem space before jumping into the solution space.
Once you understand the problem space well enough, you can see the natural boundaries in the domain: the boundaries that provide the right level of isolation. When in doubt, err on the side of larger rather than smaller services until you understand the domain well enough. Defining service boundaries is an iterative process. Build something valuable at each increment and iterate until you find the right granularity. A strong Continuous Delivery pipeline is highly recommended, so that you can make changes quickly when you get the service boundaries wrong.
Disclaimer: The statements and opinions expressed in this article are those of the author(s) and do not necessarily reflect the positions of Thoughtworks.