Last week, I gave two talks at Strata+Hadoop World. These talks covered some of the up-and-coming technologies in Big Data. I describe Strata as the Super Bowl of Big Data conferences. This is where you'll find the best minds talking about the present and future state of Big Data.
My second session covered how Apache Spark and Java can be used together. There isn't a great deal of material on using Spark with Java. All of my classes teach how to use Spark using only Java. Unless the use case has a big need for dynamic languages, I don't see the need for teams to learn Scala.
The march towards real-time Big Data continues. I spoke about Kafka at last year's Strata+Hadoop World. This year, we're seeing more representation from Apache Flink and Apache Apex. Data engineers will need to keep up to date on the latest changes in real-time frameworks.
We're also seeing more productization of Big Data use cases. There are mature products targeting specific IT use cases that require Big Data.
On the Apache Beam side, I noticed an uptick in early adopters looking at it. Many data engineering teams aren't looking to rewrite code as they move from framework to framework. Data scientists are looking for a single API for their programming and analysis.
I'm also seeing some broader trends in Big Data. People are starting to agree with my assertion that business value must be established before embarking on a Big Data project. Management teams need to be trained just as much as technical teams.
Gartner wrote an article about the rise of the data executive. This is a C-level position that companies are putting in place. The title is often Chief Data Officer or Chief Analytics Officer. Companies are finally putting data in the C-suite.
ZDNET wrote an article about why Big Data projects fail. While I disagree that the Big Data boom is over, I agree that Big Data projects fail for specific reasons. I've seen these issues repeated over and over in companies. To help companies, I've created a curriculum that teaches business leaders about Big Data projects and why they're different from small data ones. I teach how data engineering teams should be run and which people should be on them.
Big Data projects fail due to a lack of business value, lack of training, and having the wrong people on the team.
Photo by Alex Moundalexis