How Are Programming and Distributed Systems Different?

Blog Summary: (AI Summaries by Summarizes)
  • Programming and distributed systems are two different skills needed in a data engineering team.
  • Programming can be divided into three types of programmers: coders, simple programmers, and advanced programmers.
  • Distributed systems are not easy to work with, and Big Data frameworks only make it easier to concentrate on the code instead of the RPCs and threading.
  • Companies that think Big Data frameworks make things easy are more likely to fail due to a skills gap.
  • To succeed, a data engineering team needs at least one person with both programming and distributed systems skills.

In my book Data Engineering Teams, I treat programming as a separate skill from distributed systems. The section is called “Skills Needed in a Team” and covers the various skills that a data engineering team needs.

Several people have emailed me for clarification about this distinction. Aren’t programming and distributed systems the same thing? How are they different?

Programming

I’ll start with my definition of programming within Big Data.

I find there are three general types of programmers:

  • Coders who code in Excel, HTML, or another quasi-programming language
  • Programmers who write simple systems or use simplified frameworks
  • Programmers who write difficult backend or frontend systems

I wrote an entire article about the programming skills needed for Big Data. That article helps you determine which category each member of your team falls into.

Distributed Systems

A common misconception is that Big Data frameworks make it dead simple to do Big Data. They make it easier, but they don’t make it dead simple. Creating a solution is still very complicated. The frameworks just let you concentrate on the code instead of the RPCs and threading.
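To make the split between “the code” and “the RPCs and threading” concrete, here is a minimal sketch of a word count that runs on a single machine using only Python’s standard library. It is an illustration under stated assumptions, not how any particular framework is implemented, and the names count_words, partition, and run_job are hypothetical. Only count_words is business logic; the rest is the kind of plumbing a framework takes off your hands.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(lines):
    """Business logic: count the words in a batch of lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def partition(items, num_partitions):
    """Plumbing: split the input into roughly equal chunks."""
    return [items[i::num_partitions] for i in range(num_partitions)]


def run_job(lines, num_workers=4):
    """Plumbing: fan the work out to threads and merge the partial results."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        for partial in pool.map(count_words, partition(lines, num_workers)):
            total += partial
    return total


if __name__ == "__main__":
    data = ["big data is big", "data pipelines move data"]
    print(run_job(data))
```

Even in this toy version, most of the lines are plumbing, and it still ignores everything that makes a real distributed system hard: machines failing, network calls, retries, and data that doesn’t fit on one node. A framework hides the plumbing, but someone on the team still has to understand what is being hidden.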

In my experience, the companies that think Big Data frameworks make things easy are the most likely to fail. They assign teams and individual contributors who lack the skills to create the solution. This is the skills gap I talk about in the book, and skills gaps lead to failure.

How Are They Different?

The two skills are different and are rarely found in the same person. For your team to succeed, you will need at least one person with both the programming and distributed systems skills.

I gave a list of types of programmers. Let me show you how each type relates to distributed systems skills.

The “coders” don’t have the distributed systems skills to create a data pipeline. They’re usually the consumers of the data pipeline.

The simple programmers rarely have the distributed systems skills. They’re usually the consumers of the data pipeline.

The advanced programmers are the most likely to have the distributed systems skills, though it’s not guaranteed. They’re the ones creating the data pipeline. They also consume the pipeline and create value from it, and they help the other programmers when they get stuck working with it.
