Apache Kafka and Amazon Kinesis

by Jesse Anderson | Jul 12, 2017 | Blog, Business, Data Engineering, Data Engineering is hard | 3 comments

This post will focus on the key differences a Data Engineer or Architect needs to know between Apache Kafka and Amazon Kinesis. Cloud vs DIY: Some of the contenders for Big Data messaging systems are Apache Kafka, Amazon Kinesis, and Google Cloud Pub/Sub (discussed in...

The Blame Game

by Jesse Anderson | Jul 5, 2017 | Blog, Business, Data Engineering, Data Engineering is hard | 0 comments

When a Big Data project fails, there’s plenty of blame to go around. When I do the retrospectives with teams who are failing or about to fail, their blame is often misplaced. There’s a focus on blaming the technology. The more difficult considerations of...

Medium Data

by Jesse Anderson | Jun 21, 2017 | Blog, Business, Data Engineering, Data Engineering is hard | 6 comments

Most companies aren’t experiencing Big Data or small data problems. They’re experiencing a witching hour of sorts. This is a point in their growth where their data is too big for small data and too small for Big Data. As I’m teaching at companies,...

The Difficulty of Transitioning to Data Pipelines

by Jesse Anderson | Jun 7, 2017 | Blog, Business, Data Engineering, Data Engineering is hard | 0 comments

There’s a common difficulty that companies are having in transitioning to Big Data, especially Kafka. They’re coming from systems where everything is exposed as an RPC-esque call (remote procedure call/REST call/etc). They’re transitioning to a data...

Five Dysfunctions of a Data Engineering Team

by Jesse Anderson | May 31, 2017 | Blog, Business, Data Engineering, Data Engineering is hard | 0 comments

At Strata London, I premiered a new talk based on my Data Engineering Teams book. Companies are seeing great efficiency gains and ROI from using Big Data technologies. However, the vast majority of teams fail and never get something into production. I want to prevent...

Kafka Topic Design Checklist

by Jesse Anderson | May 17, 2017 | Blog, Business, Data Engineering, Data Engineering is hard | 0 comments

Designing data for consumption in a Kafka topic requires more forethought. Instead of messages being consumed point to point, there are many different consumers. You will need to decide on: Name, Schema, Contents, Key/Ordering, Number of Partitions, Number of...

© Jesse Anderson 2022