Announcement: Creating Big Data Solutions with Impala

Blog Summary: (AI Summaries by Summarizes)
  • The author's latest screencast on Apache Impala called "Creating Big Data Solutions with Impala" was released on O'Reilly.
  • The author's relationship with Impala started when he joined Cloudera and offered to help with various projects, including Impala.
  • The author created Cloudera's first public VM image to help people use Impala, which eventually became Cloudera Quickstart VM.
  • The author was present at the first Impala session at Strata NYC in 2012, which was standing room only and indicated that Impala was going to be a big deal.
  • Impala filled a need for Big Data solutions, particularly for those with only SQL skills who needed a faster alternative to Hive and a great JDBC connector for their BI tools.

I am proud to announce that my latest screencast on Apache Impala called Creating Big Data Solutions with Impala was released on O’Reilly. This caps off a long relationship with Impala that started well before it was released publicly.

My relationship with Impala started off when I first joined Cloudera. I started learning about the various initiatives at the company. I met the people who were responsible for the projects and offered to help them. One of those projects was Impala.

Cloudera announced Impala at Strata NYC in 2012. To help people use Impala, I created Cloudera’s first public VM (virtual machine) image. This eventually evolved into the Cloudera Quickstart VM. To bring things full circle, the screencast uses the Quickstart VM to run Impala on your machine.

I was there when the first Impala session was given at Strata NYC 2012. It was a packed room. Many people say their sessions are packed, but this was standing room only, and we were in the biggest session room. There were people lining the aisles. I knew Impala was going to be a big deal.

I knew Impala filled a big need for Big Data solutions. I was teaching at Fortune 100 companies, working with people who only had SQL skills. They needed a faster alternative to Hive. They needed something with a great JDBC connector for their BI tools.

Before I left Cloudera, Tom Wheeler and I wrote the definitive course on building Big Data applications. It was called Designing and Building Big Data Applications and covered Impala. We used Impala’s JDBC connector to create an example customer service application. It showed how many different parts of the Hadoop ecosystem all work together.

When Cloudera released Impala, it was Apache licensed from the beginning. I came from closed source companies. When I first heard Mike Olson say we were going to open source Impala from the beginning, it blew my mind. In late 2015, Impala became even more open when it was donated to the Apache Software Foundation.

I invite you to learn more about Apache Impala and join the vast list of companies using Impala in production.
