Enable javascript in your browser for better experience. Need to know to enable it? Go here.

Azure Data Factory for orchestration

Last updated : Mar 29, 2022
NOT ON THE CURRENT EDITION
This blip is not on the current edition of the Radar. If it was on one of the last few editions, it is likely that it is still relevant. If the blip is older, it might no longer be relevant and our assessment might be different today. Unfortunately, we simply don't have the bandwidth to continuously review blips from previous editions of the Radar. Understand more
Mar 2022
Hold ?

For organizations using Azure as their primary cloud provider, Azure Data Factory is currently the default for orchestrating data-processing pipelines. It supports data ingestion, copying data from and to different storage types on prem or on Azure and executing transformation logic. Although we've had adequate experience with Azure Data Factory for simple migrations of data stores from on prem to the cloud, we discourage the use of Azure Data Factory for orchestration of complex data-processing pipelines and workflows. We've had some success with Azure Data Factory when it's used primarily to move data between systems. For more complex data pipelines, it still has its challenges, including poor debuggability and error reporting; limited observability as Azure Data Factory logging capabilities don't integrate with other products such as Azure Data Lake Storage or Databricks, making it difficult to get an end-to-end observability in place; and availability of data source-triggering mechanisms only to certain regions. At this time, we encourage using other open-source orchestration tools (e.g., Airflow) for complex data pipelines and limiting Azure Data Factory for data copying or snapshotting. Our teams continue to use Data Factory to move and extract data, but for larger operations we recommend other, more well-rounded workflow tools.

Nov 2019
Hold ?

Azure Data Factory (ADF) is currently Azure's default product for orchestrating data-processing pipelines. It supports data ingestion, copying data from and to different storage types on prem or on Azure and executing transformation logic. While we've had a reasonable experience with ADF for simple migrations of data stores from on prem to cloud, we discourage the use of Azure Data Factory for orchestration of complex data-processing pipelines. Our experience has been challenging due to several factors, including limited coverage of capabilities that can be implemented through coding first, as it appears that ADF is prioritizing enabling low-code platform capabilities first; poor debuggability and error reporting; limited observability as ADF logging capabilities don't integrate with other products such as Azure Data Lake Storage or Databricks, making it difficult to get an end-to-end observability in place; and availability of data source-triggering mechanisms only to certain regions. At this time, we encourage using other open-source orchestration tools (e.g., Airflow) for complex data pipelines and limit ADF for data copying or snapshotting. We're hoping that ADF will address these concerns to support for more complex data-processing workflows and prioritize access to capabilities through code first.

Published : Nov 20, 2019

Download the PDF

 

 

 

English | Español | Português | 中文

Sign up for the Technology Radar newsletter

 

Subscribe now

Visit our archive to read previous volumes