

Tools

Airflow

NOT ON THE CURRENT EDITION
This blip is not on the current edition of the Radar. If it was on one of the last few editions, it is likely that it is still relevant. If the blip is older, it might no longer be relevant and our assessment might be different today. Unfortunately, we simply don't have the bandwidth to continuously review blips from previous editions of the Radar.
Published: Mar 29, 2017
Last Updated: Oct 28, 2020
Oct 2020
Adopt

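Airflow remains our widely adopted and favorite open-source workflow management tool for building data-processing pipelines as directed acyclic graphs (DAGs). This is a growing space, with open-source tools such as Luigi and Argo and vendor tools such as Azure Data Factory or AWS Data Pipeline. However, Airflow sets itself apart with its programmatic definition of workflows rather than low-code configuration files, its support for automated testing, its open-source and multi-platform nature, its rich set of integration points to the data ecosystem and its broad community support. In decentralized data architectures such as data mesh, however, Airflow falls short because it is a centralized workflow orchestrator.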

Mar 2017
Trial

Airflow is a tool to programmatically create, schedule and monitor data pipelines. By treating Directed Acyclic Graphs (DAGs) as code, it encourages maintainable, versionable and testable data pipelines. We've leveraged this configuration in our projects to create dynamic pipelines that resulted in lean and explicit data workflows. Airflow makes it easy to define your operators and executors and to extend the library so that it fits the level of abstraction that suits your environment.
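
To make the idea of "DAGs as code" concrete, here is a minimal sketch in plain Python. It deliberately does not use Airflow's API: it relies only on the standard library's `graphlib` module (Python 3.9+), and the task names, the `build_pipeline` helper and the table list are hypothetical, chosen purely for illustration. It shows how a pipeline can be generated dynamically from data and then checked with ordinary assertions, which is the property the paragraph above describes.

```python
from graphlib import CycleError, TopologicalSorter


def build_pipeline(tables):
    """Build a hypothetical pipeline as a mapping: task -> set of upstream tasks.

    One extract/transform pair is generated per table, so adding a table
    to the list adds tasks without touching the wiring logic.
    """
    deps = {"start": set()}
    for table in tables:
        deps[f"extract_{table}"] = {"start"}
        deps[f"transform_{table}"] = {f"extract_{table}"}
    # The final task waits for every transform to finish.
    deps["publish"] = {f"transform_{table}" for table in tables}
    return deps


def execution_order(deps):
    """Return one valid order in which the tasks could run."""
    return list(TopologicalSorter(deps).static_order())


def is_acyclic(deps):
    """Return True if the dependency mapping contains no cycle."""
    try:
        TopologicalSorter(deps).prepare()
        return True
    except CycleError:
        return False


pipeline = build_pipeline(["orders", "customers"])
order = execution_order(pipeline)
```

Because the pipeline is an ordinary data structure produced by ordinary code, it can be versioned, reviewed and unit-tested like any other module, for example by asserting that every extract task precedes its transform task. A real Airflow DAG expresses the same dependencies through its own operators and scheduling machinery.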