Why Real-time is the Future

Blog Summary: (AI Summaries by Summarizes)
  • Real-time Big Data is becoming increasingly important for organizations, teams, and individuals.
  • However, in the past, we lacked the systems that could scale to the sizes and amounts of data needed for real-time processing.
  • As a result, many organizations had to resort to batch processing over 24-hour windows, which didn't meet the needs of the business.
  • Some teams tried to shrink their batch windows into smaller and smaller time slices, but this caused operational headaches and the systems couldn't keep up with the demand.
  • The business wanted to be no more than a minute behind what's currently happening, which was impossible with batch processing.

One of the benefits of teaching and consulting is the sheer number of organizations, teams, and people I get to work with. Because I deal with so many different groups, I can see patterns emerge much faster than most.

One pattern I saw early on was real-time Big Data. Organizations wanted to do things in real-time. Teams had projects that required real-time. People had ideas that required real-time systems.

And we couldn’t do it.

We lacked the systems that could scale to the sizes and amounts of data needed. As a direct result, we had to do terrible workarounds.

As I work with my clients around the world, they’re moving from batch processing to real-time processing. They tell me the stories about how they wanted to do real-time, but could only approximate the system in batch.

Let me share one of their stories.

One large financial company was feeling the need for real-time processing. The use case required real-time, but the project was started at a time when real-time Big Data wasn’t feasible. As a result, they had to go with batch processing over 24-hour windows. This didn’t meet the needs of the business, but it was all that was possible.

They tried to shrink their batches into smaller and smaller time windows. What started as a 24-hour batch window gradually decreased to 30-60 minutes. The business was all over the team to turn the data around faster and faster. It wasn’t acceptable to be 24 hours behind.

But the team couldn’t go any lower than 30-60 minutes. Shrinking the batch window that far caused all sorts of operational headaches. The systems just couldn’t keep up with the demand, and the operations team crumbled.

The business wanted to be no more than a minute behind what’s currently happening. There was nothing more that the team could do. They had to move to a real-time system.

I mentored the team through their transition to real-time. They could finally accomplish their original use case and meet its requirements.

Now we have the systems that can scale and do real-time Big Data.
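
To make that concrete, here is a minimal sketch of what the real-time side of such a move can look like, using Apache Kafka's Java consumer client. The broker address and the `transactions` topic are hypothetical placeholders, not details from any client's actual system; the point is that each event is processed within moments of arriving instead of waiting hours for a batch window to close.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class RealTimeConsumer {
    public static void main(String[] args) {
        // Hypothetical connection details -- replace with your own cluster and topic.
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "realtime-demo");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("transactions"));
            while (true) {
                // Poll returns whatever events arrived in the last half second,
                // so processing lags the source by seconds, not hours.
                ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("key=%s value=%s%n",
                            record.key(), record.value());
                }
            }
        }
    }
}
```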

As I work with these teams on their moves to real-time, they’re able to circle back with the business and actually deliver. This is the part that I love. I love being able to remove the pain that a web of terrible workarounds causes and produce a resilient real-time system.

This is why real-time is the future. The business and use case wanted real-time processing. We as data engineers can deliver it now.
