Cloudera Impala

This blip is not on the current edition of the radar. If it was on one of the last few editions it is likely that it is still relevant. If the blip is older it might no longer be relevant and our assessment might be different today. Unfortunately, we simply don't have the bandwidth to continuously review blips from previous editions of the radar.
Published: May 05, 2015
Last Updated: Nov 10, 2015
Nov 2015

For a while now, the Hadoop community has been trying to bring low-latency, interactive SQL capability to the Hadoop platform, an effort better known as SQL-on-Hadoop. This has led to several open source systems, such as Cloudera Impala, Apache Drill and Facebook’s Presto, being actively developed through 2014. We think the SQL-on-Hadoop trend signals an important shift: it changes Hadoop's proposition from a batch-oriented technology that was complementary to databases into something that could compete with them. Cloudera Impala was one of the first SQL-on-Hadoop platforms. It is a distributed, massively parallel, C++-based query engine. The core component of the platform is the Impala daemon, which coordinates the execution of a SQL query across one or more nodes of the Impala cluster. Impala is designed to read data from files stored on HDFS in all popular file formats. It uses Hive's metadata catalog so that databases and tables can be shared between the two platforms. Impala comes with a shell as well as JDBC and ODBC drivers for applications to use.
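
The coordination role described above, where one daemon spreads a query across the nodes of a cluster and combines their answers, can be illustrated with a toy scatter-gather sketch in Python. This is not Impala code and does not use any Impala API: the node names, the sample rows and the functions `scan_and_aggregate` and `coordinate` are all hypothetical, and threads stand in for cluster nodes. It only shows the general shape of a massively parallel aggregation such as `SELECT region, SUM(amount) ... GROUP BY region`.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each "node" holds a local slice of a table
# whose rows are (region, amount) pairs.
NODE_DATA = {
    "node-1": [("emea", 10), ("apac", 5)],
    "node-2": [("emea", 7), ("amer", 3)],
    "node-3": [("apac", 1), ("amer", 4)],
}


def scan_and_aggregate(rows):
    """Work done locally on one node: partial SUM(amount) GROUP BY region."""
    partial = Counter()
    for region, amount in rows:
        partial[region] += amount
    return partial


def coordinate(node_data):
    """Coordinator: fan the work out to every node, then merge the partial sums."""
    with ThreadPoolExecutor(max_workers=len(node_data)) as pool:
        partials = list(pool.map(scan_and_aggregate, node_data.values()))
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts per key
    return dict(total)


result = coordinate(NODE_DATA)
print(sorted(result.items()))
```

Each node scans only its own slice of the data, and the coordinator merges small partial results rather than moving raw rows. A real engine adds query planning, data locality and failure handling on top of this basic pattern.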

May 2015