It’s been fascinating watching the operational world change over the years. We started out by racking and stacking anything that needed to run. We wised up a bit and started using virtual machines. More recently we’ve moved into containerization. Surely,...
Small data engineering teams require different tactics. Much of my writing is geared towards larger companies and teams. How should a startup or small data engineering team in a big company be set up and work? What, if anything, should be done differently? Your First...
At DataEngConf Barcelona, I premiered a new talk about the importance of creating a data engineering culture. I share what a data engineering culture is and what management needs to do to be successful with Big Data. You can download the slides from the talk here and...
There is a common misunderstanding in data engineering that you can do everything you need to create a Big Data pipeline with SQL. This notion is being promoted by some vendors and companies. They’re wrong, and you can’t do all of your data engineering...
A common use case for Kafka and Pulsar is to create work queues. The two technologies offer different implementations of this use case. I’ll discuss the ways of implementing work queues in Kafka and Pulsar as well as the relative strengths...
Here is my keynote from InfiniteConf 2018. I talk about why real-time is gaining so much momentum, what it does for businesses, how it helps data science, and some common use cases. Can you switch careers to Big Data in 4 months or less? If you’re a Software Engineer...