Last updated: Nov 07, 2016
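Not in the current edition
This entry is not in the current edition of the Technology Radar. If it appeared in one of the last few editions, it is likely still relevant. If it appeared in an older edition, it may no longer be relevant, and our assessment may no longer apply today. Unfortunately, we do not have the bandwidth to continuously re-evaluate entries from past editions.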
Nov 2016
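Trial: Worth trying. It is important to understand why this capability is worth building. Enterprises should try this technology on projects where the risk can be managed.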

A Data Lake is an immutable data store of largely unprocessed "raw" data, acting as a source for data analytics. While the technique can clearly be misused, we have used it successfully at clients, which motivated its move to Trial. We continue to recommend other approaches for collaboration between operational systems, limiting the use of the data lake to reporting, analytics and feeding data into data marts.

Apr 2016
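Trial: Worth trying. It is important to understand why this capability is worth building. Enterprises should try this technology on projects where the risk can be managed.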
Nov 2015
Assess: Worth exploring with the goal of understanding how it will affect your enterprise.

A Data Lake is an immutable data store of largely unprocessed 'raw' data, acting as a source for data analytics. Whereas the more familiar Data Warehouse filters and processes the data before storing it, the lake just captures the raw data, leaving it to the users of that data to carry out the particular analysis that they need. Examples include HDFS or HBase within a Hadoop, Spark or Storm processing framework. Usually only a small group of data scientists work on the raw data, developing streams of processed data into lakeshore data marts for most users to query. A Data Lake should only be used for analytics and reporting. For collaboration between operational systems we prefer using services designed for that purpose.
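
The split described above, raw capture on one side and processed "lakeshore" data marts on the other, can be illustrated with a minimal in-memory sketch. Everything here is hypothetical and chosen only for illustration: the `RawLake` class, the `build_sales_mart` function and the event fields are not taken from any real framework, and a real lake would sit on storage such as HDFS rather than a Python list.

```python
import json
from collections import defaultdict


class RawLake:
    """Hypothetical append-only store: events are captured as-is, never updated."""

    def __init__(self):
        self._records = []  # serialized raw events, in arrival order

    def append(self, event: dict) -> None:
        # Store a serialized copy so later changes to `event` cannot alter the lake.
        self._records.append(json.dumps(event, sort_keys=True))

    def scan(self):
        # Each reader gets freshly decoded copies; stored records stay untouched.
        for line in self._records:
            yield json.loads(line)


def build_sales_mart(lake: RawLake) -> dict:
    """Derive a small 'lakeshore' mart: total sale amount per country."""
    totals = defaultdict(float)
    for event in lake.scan():
        # Filtering and shaping happen at read time, not at capture time.
        if event.get("type") == "sale":
            totals[event["country"]] += event["amount"]
    return dict(totals)


lake = RawLake()
lake.append({"type": "sale", "country": "DE", "amount": 10.0})
lake.append({"type": "click", "page": "/home"})  # unrelated events are kept too
lake.append({"type": "sale", "country": "DE", "amount": 5.5})
lake.append({"type": "sale", "country": "BR", "amount": 3.0})

mart = build_sales_mart(lake)
print(mart)
```

The lake accepts every event without filtering, while the mart is a disposable view that can be rebuilt from the raw records whenever the analysis changes.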

May 2015
Assess: Worth exploring with the goal of understanding how it will affect your enterprise.

An Enterprise Data Lake is an immutable data store of largely unprocessed “raw” data, acting as a source for other processing streams but also made directly available to a significant number of internal, technical consumers using some efficient processing engine. Examples include HDFS or HBase within a Hadoop, Spark or Storm processing framework. We can contrast this with a typical system that collects raw data into some highly restricted space and makes it available to these consumers only as the end result of a highly controlled ETL process.

Embracing the concept of the data lake is about eliminating bottlenecks caused by a shortage of ETL developers or by excessive up-front data model design. It is about empowering developers to create their own data processing pipelines in an agile fashion, when they need them and how they need them (within reasonable limits), and so it has much in common with another model that we think highly of, the DevOps model.

Jan 2015
Assess: Worth exploring with the goal of understanding how it will affect your enterprise.
Published: Jan 28, 2015