How much do companies lose before training?

Jesse Anderson
January 11, 2017

Blog Summary:

Starting to write code or design a solution before receiving proper training is a bad idea, especially in Big Data.
Making a mistake with small data isn't costly and can be fixed quickly, but making a mistake with Big Data is very costly and can take a while to fix.
Companies that start coding before being trained waste an average of $100,000 to $200,000, and this number can go as high as $1,000,000 to $1,500,000 for companies that waited months before being trained.
Training saves money by avoiding bad ideas or abuses of technology that can turn into major problems and wastes of money down the road.
The average cost of hypothetical "what if" scenarios due to not receiving training is $300,000 to $400,000, based on downtime estimates, extra operations time, and code rewrites.

Sometimes companies will start writing code or designing a solution before I train there. This is usually a bad idea, and it shows the difference between Big Data and small data. Making a mistake with small data isn’t costly and doesn’t take long to fix. Making a mistake with Big Data is very costly and can take a while to fix.

Companies that start coding before they’ve been trained waste an average of $100,000 to $200,000. I’ve seen this number go as high as $1,000,000 to $1,500,000 for companies that waited months before being trained. For them, training was a way to get out of a deep hole.

These numbers are based on my conversations with the engineers about how much time they had already spent, how much time they’ll have to spend fixing things, and the opportunity cost. I’ve written extensively about how training saves you money.

Those numbers cover only the time already wasted up to that point. They don’t cover the hypothetical “what if” of never receiving the training. While I’m training a team, I watch for any bad ideas or abuses of a technology. These are the genesis of major problems down the road, and those problems turn into major wastes of money. The average for this is $300,000 to $400,000.

These numbers are based on downtime estimates, extra operations time, and rewrites of code.

What If

Let me give you an example of a company that avoided a “what if” scenario. I was training at a company on real-time distributed systems. They were going to do a real-time join that wasn’t time-bounded. That means two streams would be joined in real time, but the two streams weren’t temporally in sync. It could take an hour or 12 hours for the other message to come through the system. A join like this is possible, but the design was over-engineered and operationally fragile.
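To make the fragility concrete, here is a minimal sketch of a join with no time bound. It is not the company's code. The class name, keys, and message shapes are all hypothetical, and a real system would keep this state in a distributed store rather than in a Python dict. The point it illustrates is that every unmatched message has to be held until its partner arrives, so the backlog grows with however long the other stream lags.

```python
from dataclasses import dataclass, field


@dataclass
class UnboundedJoin:
    """Toy stream join that holds every unmatched message until its partner arrives."""

    pending_left: dict = field(default_factory=dict)
    pending_right: dict = field(default_factory=dict)

    def on_left(self, key, value):
        # If the partner already arrived, emit the joined record and free its state.
        if key in self.pending_right:
            return (key, value, self.pending_right.pop(key))
        # Otherwise buffer the message with no limit on how long it waits.
        self.pending_left[key] = value
        return None

    def on_right(self, key, value):
        if key in self.pending_left:
            return (key, self.pending_left.pop(key), value)
        self.pending_right[key] = value
        return None

    def buffered(self):
        # Number of messages still waiting for a partner.
        return len(self.pending_left) + len(self.pending_right)


join = UnboundedJoin()

# Hypothetical traffic: 1,000 left-side messages whose partners have not arrived yet.
for i in range(1000):
    join.on_left(f"order-{i}", {"amount": i})
backlog_before = join.buffered()

# One partner finally shows up, possibly hours later.
result = join.on_right("order-7", {"status": "shipped"})
backlog_after = join.buffered()

print(backlog_before, backlog_after, result)
```

With a 12-hour lag, that backlog is 12 hours of traffic that has to survive restarts, rebalances, and failures, which is where the operational cost tends to come from.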

In talking to the engineer, I found a much simpler and less operationally intense method. It still satisfied all of the requirements. The engineer had spent a solid month writing that code. The operations costs would have been weeks of time, ranging from diagnosing weird problems to outright downtime when the system stopped working.

My $25,000 in training saved that company at least $400,000. Had they come to me before starting, the savings would have been at least $500,000. I’ll take ROI like that anytime.
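The arithmetic behind that ROI can be sketched in a few lines. The dollar figures are the ones quoted above, and since they are stated as "at least," the results are lower bounds. The function names are just for illustration.

```python
TRAINING_COST = 25_000
SAVINGS_AFTER_STARTING = 400_000   # trained after a month of coding
SAVINGS_BEFORE_STARTING = 500_000  # had they trained before starting


def return_multiple(savings, cost):
    """Dollars saved per dollar spent on training."""
    return savings / cost


def net_roi_percent(savings, cost):
    """Net return as a percentage of the training cost."""
    return (savings - cost) / cost * 100


multiple_after = return_multiple(SAVINGS_AFTER_STARTING, TRAINING_COST)
roi_after = net_roi_percent(SAVINGS_AFTER_STARTING, TRAINING_COST)
multiple_before = return_multiple(SAVINGS_BEFORE_STARTING, TRAINING_COST)
roi_before = net_roi_percent(SAVINGS_BEFORE_STARTING, TRAINING_COST)

print(f"After starting:  {multiple_after:.0f}x, net ROI {roi_after:.0f}%")
print(f"Before starting: {multiple_before:.0f}x, net ROI {roi_before:.0f}%")
```

That works out to at least 16 dollars saved per dollar of training for the team that had already started, and at least 20 had they trained first.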

If you’re still looking at those numbers and thinking it isn’t possible, you’re still thinking in small data terms. Due to its sheer complexity, a mistake or outright misunderstanding of Big Data technologies is costly.

If you’re starting on a Big Data project or wanting to become a Data Engineer, I strongly urge you to get training. Otherwise, you’ll be risking hundreds of thousands of dollars.
