The Three Components of a Big Data Data Pipeline

by Jesse Anderson | Jan 16, 2019 | Blog, Business | 0 comments

There’s a common misconception in Big Data that you only need one technology to do everything that’s necessary for a data pipeline, and that’s incorrect. Data Engineering != Spark The...
Advice for Small Teams and Startups on Data Engineering

by Jesse Anderson | Dec 19, 2018 | Blog, Business, Data Engineering | 0 comments

Small data engineering teams require different tactics. Much of my writing is geared towards larger companies and teams. How should a startup or small data engineering team in a big company be set up and work? What, if anything, should be done differently? Your First...

Creating a Data Engineering Culture

by Jesse Anderson | Nov 7, 2018 | Blog, Business, Data Engineering, Data Engineering is hard | 0 comments

At DataEngConf Barcelona, I premiered a new talk about the importance of creating a data engineering culture. I share what a data engineering culture is and what management needs to do to be successful with Big Data. You can download the slides from the talk here and...
Why You Can’t Do All of Your Data Engineering with SQL

by Jesse Anderson | Oct 24, 2018 | Blog, Business, Data Engineering, Data Engineering is hard | 0 comments

There is a common misunderstanding in data engineering that you can do everything you need to create a Big Data data pipeline with SQL. This notion is being promoted by some vendors and companies. They’re wrong: you can’t do all of your data engineering...
Thoughts on Cloudera Merging/Buying Hortonworks

by Jesse Anderson | Oct 8, 2018 | Blog, Business | 3 comments

Cloudera has merged with/purchased Hortonworks. As a former Clouderan, it’s interesting to see this move on several levels. I’m going to share my insights from the outside as a former insider. Full Disclosure: Although I’m former Cloudera, I...

Creating Work Queues with Apache Kafka and Apache Pulsar

by Jesse Anderson | Aug 29, 2018 | Blog, Business, Data Engineering, Data Engineering is hard | 2 comments

A common use case for Kafka and Pulsar is to create work queues. The two technologies offer different implementations of this use case. I’ll discuss the ways of implementing work queues in Kafka and Pulsar as well as the relative strengths...

© Jesse Anderson 2022
