Decoder

Data mesh

Data mesh is an approach to data architecture where ownership of the data is distributed among cross-functional domain teams, who then provide data products to end users.

A data mesh decentralizes ownership of data, challenging traditional thinking that assumed centralized data ownership — such as data warehouses — worked best. Data mesh solves some of the quality and scaling issues that have plagued data warehouses and lakes.

What is it?

A distributed approach to data architecture, where cross-functional domain teams own ‘data products’, which they make available to the enterprise in a self-serve manner.

What’s in it for you?

Data mesh solves some of the quality and scaling issues that have plagued centralized approaches to data analytics.

What are the trade-offs?

Data mesh is complex: it requires a shift in mindset and some form of federated governance.

How is it being used?

Data mesh is suited to organizations with complex data analytics requirements that span multiple teams and systems.

What is it?

A data mesh is simply a distributed or federated approach to architecting a data platform and the teams around it. It removes the traditional bottlenecks caused by structuring your data platform in a centralized manner.

Data meshes advance the state of the art while drawing on lessons learned from modern distributed architectures and platform thinking. As online activity, including IoT, has expanded rapidly over the last 15 years, the importance of data and of good data management has grown with it. Data mesh is a new approach to managing these vast collections of data, which need new architectures and tools to be managed properly.

Conventional wisdom dictated that big analytical data had to be centralized to be useful: it all had to sit in one place, or be managed by a centralized data team, to deliver value. As enterprises face increasing pressure to make sense of ever-increasing volumes of data in near real time, such centralized approaches are no longer suitable.

Data mesh suggests that domain experts are best placed to know how to derive value from their data. They’re encouraged to treat their data as products that can be delivered — in a self-service manner — to the rest of the enterprise. Then other data product teams can leverage such data to generate their own insights, build intelligent applications or aggregate data in new ways for others to consume within the platform.
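To make the idea of a self-service data product more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `DataProduct` and `Catalog` classes, the product names and the sample rows are hypothetical, and a real data mesh platform would add storage, access control and schemas. It only shows the shape of the interaction: one domain publishes a product, and another domain consumes it to publish a derived product of its own.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class DataProduct:
    """A dataset that one domain team owns and serves to the rest of the enterprise."""
    name: str
    owner_domain: str
    description: str
    read: Callable[[], List[dict]]  # self-serve access point for consumers


class Catalog:
    """Hypothetical self-serve registry where products are published and discovered."""

    def __init__(self) -> None:
        self._products: Dict[str, DataProduct] = {}

    def publish(self, product: DataProduct) -> None:
        if product.name in self._products:
            raise ValueError(f"{product.name} is already published")
        self._products[product.name] = product

    def search(self, keyword: str) -> List[str]:
        # Match the keyword against product names and descriptions.
        keyword = keyword.lower()
        return sorted(
            p.name for p in self._products.values()
            if keyword in p.name.lower() or keyword in p.description.lower()
        )

    def get(self, name: str) -> DataProduct:
        return self._products[name]


catalog = Catalog()

# The (hypothetical) sales domain owns and publishes its completed orders.
catalog.publish(DataProduct(
    name="sales.completed_orders",
    owner_domain="sales",
    description="One row per completed order, with region and amount",
    read=lambda: [
        {"order_id": 1, "region": "EU", "amount": 120.0},
        {"order_id": 2, "region": "US", "amount": 80.0},
        {"order_id": 3, "region": "EU", "amount": 50.0},
    ],
))


def revenue_by_region() -> List[dict]:
    """Derived product: the finance domain aggregates the sales product."""
    totals: Dict[str, float] = defaultdict(float)
    for order in catalog.get("sales.completed_orders").read():
        totals[order["region"]] += order["amount"]
    return [{"region": r, "revenue": t} for r, t in sorted(totals.items())]


# The (hypothetical) finance domain publishes its own product for others to use.
catalog.publish(DataProduct(
    name="finance.revenue_by_region",
    owner_domain="finance",
    description="Total revenue per region, derived from sales orders",
    read=revenue_by_region,
))
```

In this sketch the finance team never asks the sales team for an extract: it looks the product up in the catalog and reads it, which is the self-service property the paragraph above describes.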

What’s in it for you?

Data mesh tackles some of the quality and scaling issues that have plagued other, centralized approaches. Done right, data mesh enables your enterprise to draw insights from big data and fuel innovation in your use of data analytics. Your teams will have the freedom to explore new ways of putting data to the best use.

The data mesh approach suggests that cross-functional domain teams are best placed to own their data and enhance its usefulness. That ownership, together with the requirement to make the data available to the wider enterprise as products, ensures proper curation of that data.

What are the trade-offs?

Data mesh is complex and requires a shift in mindset to encourage domain teams to own their data and make it available in a readily consumable way. You’ll need to adopt some form of federated governance to ensure those domain-owned data products can be shared across the enterprise.
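One way to picture federated governance is as a small set of enterprise-wide rules that every domain-owned product must pass, while everything else is left to the owning team. The Python sketch below is illustrative only: the descriptor fields (`name`, `owner_domain`, `schema`) and the three policies are assumptions chosen for the example, not rules prescribed by data mesh.

```python
from typing import Callable, Dict, List

Descriptor = Dict[str, object]               # metadata a domain team supplies for a product
Policy = Callable[[Descriptor], List[str]]   # returns a list of violations (empty if none)


def requires_owner(descriptor: Descriptor) -> List[str]:
    return [] if descriptor.get("owner_domain") else ["missing owner_domain"]


def requires_schema(descriptor: Descriptor) -> List[str]:
    schema = descriptor.get("schema")
    return [] if isinstance(schema, dict) and schema else ["missing schema"]


def requires_namespaced_name(descriptor: Descriptor) -> List[str]:
    name = str(descriptor.get("name", ""))
    owner = str(descriptor.get("owner_domain", ""))
    if owner and name.startswith(owner + "."):
        return []
    return ["name must be prefixed with the owner domain"]


# Hypothetical global rules agreed across domains; each team is free to
# decide everything these rules do not cover.
GLOBAL_POLICIES: List[Policy] = [
    requires_owner,
    requires_schema,
    requires_namespaced_name,
]


def check(descriptor: Descriptor) -> List[str]:
    """Run every global policy and collect the violations in order."""
    violations: List[str] = []
    for policy in GLOBAL_POLICIES:
        violations.extend(policy(descriptor))
    return violations


compliant = {
    "name": "sales.completed_orders",
    "owner_domain": "sales",
    "schema": {"order_id": "int", "amount": "float"},
}
non_compliant = {"name": "orders", "owner_domain": "sales"}

print(check(compliant))      # []
print(check(non_compliant))  # ['missing schema', 'name must be prefixed with the owner domain']
```

The design choice worth noting is that the checks inspect only shared metadata, never the data itself, so the rules can be enforced centrally without taking ownership away from the domains.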

Because your data will be owned by domain teams, you’ll no longer have a single central entity that owns customer data. Instead, customer data has to be built from the various data products created across the enterprise. You may also need an incentive structure that rewards domains for serving and using data as a product, which in turn requires explicit organizational design.
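As a rough illustration of what building customer data from several data products can look like, the Python sketch below merges rows from two domain-owned products into one view per customer. The product names, the fields and the shared `customer_id` key are all assumptions made for the example. In practice, agreeing on such a shared identifier is itself a governance decision.

```python
from collections import defaultdict
from typing import Dict, List


def build_customer_view(
    products: Dict[str, List[dict]], key: str = "customer_id"
) -> Dict[str, dict]:
    """Combine rows from several data products into one record per customer.

    Each field is prefixed with the product it came from, so the owning
    domain of every value stays visible in the combined view.
    """
    view: Dict[str, dict] = defaultdict(dict)
    for product_name, rows in products.items():
        for row in rows:
            customer = view[row[key]]
            for field, value in row.items():
                if field != key:
                    customer[f"{product_name}.{field}"] = value
    return dict(view)


# Hypothetical products served by two different domains.
products = {
    "sales.orders_summary": [
        {"customer_id": "c1", "order_count": 3},
    ],
    "support.open_tickets": [
        {"customer_id": "c1", "open_tickets": 1},
        {"customer_id": "c2", "open_tickets": 0},
    ],
}

customer_view = build_customer_view(products)
```

Here customer `c1` ends up with fields from both domains, while `c2` only has support data, which reflects the trade-off described above: no single team holds the full picture, and the combined view is only as complete as the products that feed it.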

How is it being used?

Data mesh is ideally suited to enterprises that have large data sets spread across many domains, teams and systems. We’re seeing increasing interest in sectors such as retail, health and insurance, where organizations have many disparate sources of data and an urgent need to derive value from them, as well as a need to scale the platform to generate more insights faster while onboarding and creating new datasets.