This blip is not on the current edition of the Radar. If it was on one of the last few editions it is likely that it is still relevant. If the blip is older it might no longer be relevant and our assessment might be different today. Unfortunately, we simply don't have the bandwidth to continuously review blips from previous editions of the Radar.
Published: May 19, 2020
May 2020

Tooling gaps remain when applying good software engineering practices to data engineering. When one of our teams tried to automate data quality checks between the steps of a data pipeline, they were surprised to find only a few tools in this space. They settled on Deequ, a library for writing tests that resemble unit tests for data sets. Deequ is built on top of Apache Spark and, even though it's published by AWS Labs, it can be used in environments other than AWS.
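To give a sense of what "unit tests for data sets" can look like, here is a minimal sketch in Scala using Deequ's `VerificationSuite` and `Check` classes. The `Order` record, its column names, the sample rows and the `OrderQualityCheck` object are hypothetical and exist only for illustration; the sketch also assumes Spark and the Deequ library are on the classpath and that Spark runs in local mode, which is just one possible non-AWS setup.

```scala
import com.amazon.deequ.{VerificationResult, VerificationSuite}
import com.amazon.deequ.checks.{Check, CheckLevel, CheckStatus}
import com.amazon.deequ.constraints.ConstraintStatus
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical record type, used only for this illustration.
case class Order(id: Long, customer: String, amount: Double)

object OrderQualityCheck {

  // Declares the expectations for the data set and evaluates them with Spark.
  def verify(orders: DataFrame): VerificationResult =
    VerificationSuite()
      .onData(orders)
      .addCheck(
        Check(CheckLevel.Error, "orders sanity checks")
          .hasSize(_ == 3)          // exactly three rows expected in this sample
          .isComplete("id")         // no null ids
          .isUnique("id")           // no duplicate ids
          .isComplete("customer")   // no null customers
          .isNonNegative("amount")) // no negative amounts
      .run()

  def main(args: Array[String]): Unit = {
    // Local Spark session: an assumption made to keep the sketch self-contained.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("deequ-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows standing in for the output of a pipeline step.
    val orders = Seq(
      Order(1L, "alice", 20.0),
      Order(2L, "bob", 35.5),
      Order(3L, "carol", 12.25)
    ).toDF()

    val result = verify(orders)

    if (result.status == CheckStatus.Success) {
      println("All data quality checks passed")
    } else {
      // Report each constraint that did not hold.
      result.checkResults
        .flatMap { case (_, checkResult) => checkResult.constraintResults }
        .filter(_.status != ConstraintStatus.Success)
        .foreach(r => println(s"${r.constraint}: ${r.message.getOrElse("")}"))
    }

    spark.stop()
  }
}
```

In a pipeline, a call like `verify` would typically sit between two steps, with a failed status stopping the run before bad data reaches downstream consumers. Which constraints belong in the check depends entirely on the data set at hand.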