Strata+Hadoop World and Trends

Jesse Anderson
October 5, 2016
Blog, Business, Data Engineering

Last week, I gave two talks at Strata+Hadoop World. These talks covered some of the up-and-coming technologies in Big Data. I describe Strata as the Super Bowl of Big Data conferences. This is where you’ll find the best minds talking about the present and future conditions of Big Data.

My first session was a tutorial with Tyler Akidau from Google. We covered Apache Beam and some of the interesting features for Big Data.

My second session covered how Apache Spark and Java can be used together. There isn’t a great deal of material on using Spark and Java together. All of my classes teach Spark using only Java. Unless there is a big need for dynamic languages in the use case, I don’t see the need for teams to learn Scala.

Strata Trends

The march towards real-time Big Data continues. I spoke about Kafka at last year’s Strata+Hadoop World. This year, we’re seeing more representation from Apache Flink and Apache Apex. Data engineers should be sure to keep up to date on the latest changes in real-time frameworks.

We’re also seeing more productization of Big Data use cases. There are mature products targeting specific IT use cases that require Big Data.

On the Apache Beam side, I noticed an uptick in early adopters looking at it. Many data engineering teams aren’t looking to rewrite code as they move from framework to framework. Data scientists are looking for a single API to do their programming and analysis with.

Broader Trends

I’m also seeing some broader trends in Big Data. People are starting to agree with my assertion that business value must be established before embarking on a Big Data project. Management teams need to be trained just as much as technical teams.

Gartner wrote an article about the rise of the data executive. This is a C-level position that companies are putting in place. The title is often Chief Data Officer or Chief Analytics Officer. Companies are finally putting data in the C-suite.

ZDNET wrote an article about why Big Data projects fail. While I disagree that the Big Data boom is over, I agree that Big Data projects fail for specific reasons. I’ve seen these issues repeated over and over in companies. To help companies, I’ve created a curriculum for teaching business leaders about Big Data projects and why they’re different from small data ones. I teach how data engineering teams should be run and who should be on them.

Big Data projects fail due to a lack of business value, lack of training, and having the wrong people on the team.

Photo by Alex Moundalexis
