Menü
Tools

Great Expectations

Published: Oct 28, 2020
Last Updated: Apr 13, 2021
Apr 2021
Trial?

We wrote about Great Expectations in the previous edition of the Radar. We continue to like it and have moved it to Trial in this edition. Great Expectations is a framework that enables you to craft built-in controls that flag anomalies or quality issues in data pipelines. Just as unit tests run in a build pipeline, Great Expectations makes assertions during execution of a data pipeline. We like its simplicity and ease of use — the rules stored in JSON can be modified by our data domain experts without necessarily needing data engineering skills.

Oct 2020
Assess?

With the rise of CD4ML, operational aspects of data engineering and data science have received more attention. Automated data governance is one aspect of this development. Great Expectations is a framework that enables you to craft built-in controls that flag anomalies or quality issues in data pipelines. Just as unit tests run in a build pipeline, Great Expectations makes assertions during execution of a data pipeline. This is useful not only for implementing a sort of Andon for data pipelines but also for ensuring that model-based algorithms remain within the operating range determined by their training data. Automated controls like these can help distribute and democratize data access and custodianship. Great Expectations also ships with a profiler tool to help understand the qualities of a particular data set and to set appropriate limits.