This blip is not on the current edition of the Radar. If it was on one of the last few editions, it is likely that it is still relevant. If the blip is older, it might no longer be relevant and our assessment might be different today. Unfortunately, we simply don't have the bandwidth to continuously review blips from previous editions of the Radar.
Published: Mar 29, 2017
Last Updated: Oct 28, 2020
Oct 2020

Airflow remains our most widely used and favorite open-source workflow management tool for defining data-processing pipelines as directed acyclic graphs (DAGs). This is a growing space, with open-source tools such as Luigi and Argo and vendor-specific tools such as Azure Data Factory and AWS Data Pipeline. However, Airflow differentiates itself with its programmatic definition of workflows rather than limited low-code configuration files, its support for automated testing, its open-source and multiplatform installation, a rich set of integration points to the data ecosystem and large community support. In decentralized data architectures such as data mesh, however, Airflow currently falls short as a centralized workflow orchestrator.
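
To illustrate why a programmatic workflow definition lends itself to automated testing, here is a minimal sketch using only the Python standard library (Python 3.9+ for `graphlib`). It is not Airflow's API: the `WORKFLOW` mapping and the helper names `is_acyclic` and `execution_order` are hypothetical, chosen only to show that a workflow expressed as code can be checked like any other code.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical workflow (not Airflow's API): each task maps to the
# set of tasks it depends on.
WORKFLOW = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def is_acyclic(graph):
    """Return True if the dependency graph contains no cycles."""
    try:
        # prepare() raises CycleError when the graph has a cycle.
        TopologicalSorter(graph).prepare()
    except CycleError:
        return False
    return True


def execution_order(graph):
    """Return one valid order in which the tasks could run."""
    return list(TopologicalSorter(graph).static_order())
```

A test suite could then assert that `is_acyclic(WORKFLOW)` holds and that a task never appears before its dependencies in `execution_order(WORKFLOW)`, which is the kind of check that is hard to express against a low-code configuration file.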

Mar 2017

Airflow is a tool to programmatically create, schedule and monitor data pipelines. By treating directed acyclic graphs (DAGs) as code, it encourages maintainable, versionable and testable data pipelines. We've leveraged this approach in our projects to create dynamic pipelines that resulted in lean and explicit data workflows. Airflow makes it easy to define your own operators and executors and to extend the library so that it fits the level of abstraction that suits your environment.
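
The idea of dynamic pipelines built from custom operators can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than Airflow code: the `Operator` and `RecordOperator` classes, the `build_pipeline` and `run` helpers, and the table names are all hypothetical, and scheduling is reduced to a topological sort from the standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter


class Operator:
    """Hypothetical unit of work; a stand-in, not Airflow's operator class."""

    def __init__(self, task_id):
        self.task_id = task_id

    def execute(self, context):
        raise NotImplementedError


class RecordOperator(Operator):
    """Custom operator that appends its task id to a shared log."""

    def execute(self, context):
        context["log"].append(self.task_id)


def build_pipeline(tables):
    """Generate one load task per table, followed by a single report task."""
    tasks = {}
    deps = {}
    for table in tables:
        task_id = f"load_{table}"
        tasks[task_id] = RecordOperator(task_id)
        deps[task_id] = set()  # load tasks have no upstream dependencies
    tasks["report"] = RecordOperator("report")
    deps["report"] = {f"load_{table}" for table in tables}
    return tasks, deps


def run(tasks, deps):
    """Execute every task after its dependencies and return the run log."""
    context = {"log": []}
    for task_id in TopologicalSorter(deps).static_order():
        tasks[task_id].execute(context)
    return context["log"]
```

Calling `run(*build_pipeline(["users", "orders"]))` executes both load tasks before `report`. Because the pipeline is generated by a loop, adding a table to the list adds a task without any change to the pipeline's structure.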