Sensible defaults for CD4ML

David Tan

Published: June 11, 2021

Applying continuous delivery in machine learning (CD4ML) projects is hard, for a few reasons:

Two worlds (software and data) have collided in recent years, and it takes time and experience for data practitioners to adopt continuous delivery principles and practices (and vice versa!).
Cloud providers ship data tools and platforms quickly, and those tools often focus on storage and compute, leaving CI/CD practices (e.g. unit testing, test data management) as second-class considerations for teams to figure out on their own.
It’s easy to choose a tool or platform and then find ourselves locked in and limited by the tool’s API.

In our experience, CD4ML sensible defaults serve as a north star that helps us navigate this chaotic environment. Instead of looking for a single data platform as a silver bullet, we’ve had greater success by:

Composing implementations from first principles (such as automated testing, shifting quality left, and post-deployment monitoring)
Preferring composition over monolithic platforms
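
To make the two points above concrete, here is a minimal sketch in Python of a pipeline composed from small steps, each of which can be unit tested without any platform, with an automated quality gate that fails the run when the model is not good enough. Everything in it is a hypothetical illustration rather than something taken from a specific project or tool: the function names (train, evaluate, quality_gate, run_pipeline), the toy threshold classifier, the sample data, and the 0.8 accuracy threshold are all assumptions.

```python
from typing import Callable, List, Tuple

Sample = Tuple[float, int]          # (feature, label); hypothetical toy schema
Model = Callable[[float], int]


def train(samples: List[Sample]) -> Model:
    """Fit a toy classifier: the decision threshold is the midpoint of the class means."""
    positives = [x for x, y in samples if y == 1]
    negatives = [x for x, y in samples if y == 0]
    threshold = (sum(positives) / len(positives) + sum(negatives) / len(negatives)) / 2
    return lambda x: 1 if x >= threshold else 0


def evaluate(model: Model, samples: List[Sample]) -> float:
    """Return accuracy as the fraction of correctly predicted labels."""
    correct = sum(1 for x, y in samples if model(x) == y)
    return correct / len(samples)


def quality_gate(accuracy: float, minimum: float = 0.8) -> bool:
    """Automated check that shifts quality left: pass only at or above the minimum."""
    return accuracy >= minimum


def run_pipeline(train_set: List[Sample], test_set: List[Sample], minimum: float = 0.8) -> float:
    """Compose the steps; raise instead of releasing a model that misses the gate."""
    model = train(train_set)
    accuracy = evaluate(model, test_set)
    if not quality_gate(accuracy, minimum):
        raise ValueError(f"accuracy {accuracy:.2f} is below the gate of {minimum:.2f}")
    return accuracy


# Assumed toy data, chosen only so the example is easy to follow.
TRAIN_SET = [(0.1, 0), (0.2, 0), (0.8, 1), (0.9, 1)]
TEST_SET = [(0.0, 0), (0.3, 0), (0.7, 1), (1.0, 1)]

if __name__ == "__main__":
    print(run_pipeline(TRAIN_SET, TEST_SET))
```

Because each step is a plain function, any one of them can be swapped for a platform-specific implementation later without rewriting the tests around the others, which is the practical benefit of favouring composition.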

If you’re interested in discussing how we could help you on this journey, or want to chat about how you’re tackling these challenges, we’d love to hear from you!

Disclaimer: The statements and opinions expressed in this article are those of the author(s) and do not necessarily reflect the positions of Thoughtworks.
