by Jesse+ | Oct 24, 2018 | Blog, Business, Data Engineering, Data Engineering is hard
A common misunderstanding in data engineering is that you can build an entire Big Data pipeline with SQL alone. Some vendors and companies promote this notion. They’re wrong: you can’t do all of your data engineering with SQL....
by Jesse+ | Aug 29, 2018 | Blog, Business, Data Engineering, Data Engineering is hard
A common use case for Kafka and Pulsar is creating work queues. The two technologies implement work queues differently. I’ll discuss the ways of implementing work queues in Kafka and Pulsar as well as the relative strengths of...
by Jesse+ | Aug 15, 2018 | Blog, Business, Data Engineering
Here is my keynote from InfiniteConf 2018. I talk about why real-time is gaining so much momentum, what it does for businesses, how it helps data science, and some common use cases.
by Jesse+ | Aug 1, 2018 | Blog, Business, Data Engineering
I’ve been seeing some questions about data pipelines lately. I realized I haven’t written a post that gives the level of detail necessary for a good definition of a data pipeline in the context of data engineering. Instead of just giving my opinion, I’ve brought...
by Jesse+ | Jul 25, 2018 | Blog, Business, Data Engineering
Note: this is a guest post from Sanjoy Roy, who is reviewing my Professional Data Engineering course. Since late 2014, I have been drawn into various analytics projects that required a good mix of skills in both data engineering and data science. There are a lot of...
by Jesse+ | Jul 18, 2018 | Blog, Business, Data Engineering
Engineers starting out with Big Data often ask the same beginner question. An engineer will post on a social media site saying “I need to know which Big Data technology to use. I have 3 billion rows in 10,000 files. The whole dataset is 100 GB. Is Big Data...