Data Teams Survey 2023 Results

Blog Summary: (AI Summaries by Summarizes)
  • A survey was conducted between January 24, 2023, and February 28, 2023, to gather data for the book "Data Teams" and to update a previous survey from late 2020.
  • The survey aimed to gather information about how management uses data teams, the value they're creating, and how they're creating it, as well as the economic effects on data teams.
  • 81 respondents participated in the survey.
  • Data science and data engineering teams are well-represented, but operations teams are only present in half of the respondents.
  • Merely saying you have a team doesn't mean it's the right team. The individual contributors must meet the criteria and definitions to represent the job title.

Between January 24, 2023, and February 28, 2023, I ran a survey to get more data for my latest book Data Teams, and to update my previous survey from late 2020. Overall, we had 81 respondents.

This survey was designed to get information about how management uses data teams, the value they’re creating, and how they’re creating it. The survey asked about the best and worst practices that teams are using or experiencing. We rounded out the survey by asking about the economic effects on data teams. In this post, I’ll go through the results and analyze what they mean.

Data Teams

The fundamental thesis of Data Teams is that companies need data science, data engineering, and operations to be successful in their data projects. We started by asking some questions about each respondent’s data teams.

a bar graph showing percentages of which data teams are already in a company

Figure 3 – Of the three data teams, which teams do you already have at your company?

We see that data science and data engineering are well-represented. However, only half of the respondents have an operations team.

a bar graph showing if a company's definition of a data scientist, data engineer, and an operations engineer matches the definition of the title
Figure 4 – Does the company definition of a team match the book’s definition?

Merely saying you have a team doesn’t mean it’s the right team. The individual contributors must meet the criteria and definitions to represent the job title. Data scientists and data engineers are again well-represented, while operations lags in the same way.

Overall, we see opportunities to improve operations. These improvements include establishing an operations team and getting the right people in place.

Maturity and Success

It’s essential to gauge how far the respondents are in their big data journey.

a pie chart showing percentages of how mature a big data effort is in a company

Figure 5 – How mature are your big data efforts?

66.6% of respondents said they are in production or further along, while 33.4% are in pre-production.

a bar graph showing the success rate of big data projects

Figure 6 – How successful do you think your big data projects are? One means “Highly Unsuccessful,” and five means “Highly Successful.”

a bar graph showing the percentage of how successful would the business say projects are

Figure 7 – How successful would the business say your projects are? One means “Highly Unsuccessful,” and five means “Highly Successful.”

I’ve found perceptions of success to be highly varied. To get a higher-fidelity picture of success, I asked two questions: how successful respondents felt their projects were, and how successful the business would say they were (the higher the number, the more successful the project). From the responses, individuals thought the business would rate their projects as more successful than they would rate them themselves.

a bar graph of combined business and personal project success metric

Figure 8 – Combined business and personal project success metric

To get a combined view of the business and personal opinions of success, I added the two numbers to get a range of 2 to 10. Most respondents landed between 6 and 8.
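As a minimal sketch of that combination, assuming each response carries two ratings on the 1 to 5 scale: the field names `self_rating` and `business_rating` and the sample rows below are hypothetical, not the survey's actual columns or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    # Hypothetical field names; the survey's real column names are not given.
    self_rating: int      # respondent's own view, 1 (Highly Unsuccessful) to 5
    business_rating: int  # respondent's view of what the business would say, 1 to 5


def combined_score(response: Response) -> int:
    """Add the two 1-to-5 ratings into a single combined success score."""
    for rating in (response.self_rating, response.business_rating):
        if not 1 <= rating <= 5:
            raise ValueError(f"rating out of range: {rating}")
    return response.self_rating + response.business_rating


# Made-up example rows, for illustration only.
sample = [Response(3, 4), Response(4, 4), Response(1, 2)]
scores = [combined_score(r) for r in sample]
print(scores)
```

Summing keeps both viewpoints in one number, at the cost of hiding which of the two ratings was higher.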

A line chart showing the combined success with number of data teams and its description

Figure 9 – Combined success with the team representation breakdown

Taking the perceived combined business and personal success, I then broke out success grouped with which teams the organization had. We can see some clear correlations:

    • Having all three teams (green) or data science and data engineering (medium blue) creates the highest success
    • Companies with data science and operations (light blue) or data engineering and operations (dark blue) are less common and have more difficulty creating success
    • Companies with only one team (shades of red) have great difficulty creating high value
    • When all teams are missing, there is the lowest success
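One way to produce a breakdown like this is to group the combined scores by which teams each respondent reported and then summarize each group. The sketch below uses the mean as the summary, which is an assumption (the chart itself may plot distributions rather than averages), and the rows are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Made-up (team makeup, combined score) pairs, for illustration only.
responses = [
    (frozenset({"data science", "data engineering", "operations"}), 9),
    (frozenset({"data science", "data engineering", "operations"}), 8),
    (frozenset({"data science", "data engineering"}), 8),
    (frozenset({"data engineering"}), 5),
    (frozenset(), 3),
]


def mean_score_by_makeup(rows):
    """Group combined scores by team makeup and average each group."""
    grouped = defaultdict(list)
    for teams, score in rows:
        grouped[teams].append(score)
    return {teams: mean(scores) for teams, scores in grouped.items()}


averages = mean_score_by_makeup(responses)

# Print the groups from highest to lowest average score.
for teams, avg in sorted(averages.items(), key=lambda kv: kv[1], reverse=True):
    label = ", ".join(sorted(teams)) or "no teams"
    print(f"{label}: {avg:.1f}")
```

Using a `frozenset` as the group key means the order in which a respondent listed their teams doesn't matter.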
    Word cloud of answers to “What do you think you nailed in your management of data teams?”

    Figure 10 – Word cloud of answers to “What do you think you nailed in your management of data teams?”

    To give respondents a way to provide an unfiltered opinion, I asked two questions. Both were designed to provide us with a clear understanding of what the respondents thought they did well or poorly. I created a word cloud to make their responses more intelligible.

    I asked, “What do you think you nailed in your management of data teams?” to get an idea of what went well. Many respondents were happy about their data teams and the people that they had.

    Word cloud of answers to “What do you think are the weakest points in your management of data teams?”

    Figure 11 – Word cloud of answers to “What do you think are the weakest points in your management of data teams?”

    I asked, “What do you think are the weakest points in your management of data teams?” to get an idea of what went poorly. Many respondents lamented that they didn’t connect with or work with the business side as they wished. Others mentioned that they battle shifting priorities.

    As you’ll notice, both word clouds share many of the same words. It almost seems as though the best and worst things conflict. The reality is that the keys to success and failure are the same. For example, the best teams focus on incorporating the business, while the weakest teams fail to focus on the business.

    How Did They Do It?

    To better understand best and worst practices, I asked two questions to get the specific reasons why teams were challenged to create value and what allowed them to create value.

    A line graph showing the biggest challenges to achieving the highest possible value from data

    Figure 12 – What do you think have been the biggest challenges to achieving the highest possible value from your data?

    The respondents consistently selected friction as their biggest challenge. After that, the two most common issues revolved around needing more individual contributors and poor feedback.

    a bar graph showing percentages of which various practices data teams apply to make sure their data projects are meeting business needs

    Figure 13 – Which best practices are you doing to make sure data projects are meeting business needs?

    On the best practice side, teams value working with the business on data projects. Three other consistent best practices were continuous integration, having a qualified data engineering team, and leveraging automation to make tasks easier.

    You’ll remember the questions about the value a person and the business would say was created. I found it interesting to just look at what the highest and lowest value creation respondents would say.

    The highest value creation (combined score of 10) respondents selected having all the teams with the right skills as a best practice. They focused on creating velocity and using automation. For their write-ins, they added “listening and sharing responsibility” along with using CI/CD.

    The lowest value creation (combined score of 3) respondents selected missing many, often all, of the data teams as a significant contributor to low value creation. They also pointed to friction as getting in the way. Oddly enough, all of them selected “hired a consulting company, and they aren’t delivering,” which is a common way I see larger companies fail with data projects. For their write-ins, they added “shifting priorities,” “not being able to persuade the rest of the company that change is ok,” and “poor leadership.”

    COVID-19, Economic Trends, and Data Teams

    To round out the survey, I wanted to get some responses on how COVID-19 affected data teams as we’re deep into its long-term effects.

    A bar graph displaying the percentage of how much negative impact has happened on a data team's productivity upon working remotely

    Figure 14 – Has working remotely impacted your team’s productivity negatively? One means “Strongly Disagree,” and five means “Strongly Agree.”

    Many data teams are working remotely right now. I wanted to determine if this was affecting them negatively. Data teams could be negatively impacted for several reasons, such as home distractions, lack of cluster access, or improper cluster setups. The survey respondents said that remote work either wasn’t affecting them negatively or was neutral.

    A bar graph showing the percentage of changes in the perception of the value of data within a company during the COVID-19 era

    Figure 15 – Has the perception of the value of data within your company changed in the COVID-19 era? One means “Strongly Disagree,” and five means “Strongly Agree.”

    For some companies, COVID-19 was a wake-up call for data. I asked survey respondents whether COVID-19 changed their company’s perception of the value of data (higher numbers mean stronger agreement). Most respondents were either neutral or agreed that the perception had changed.

    A pie chart displaying the percentages of a company's decision in pivoting their data strategy in the COVID-19 era
    Figure 16 – Have you had to pivot your data strategy in the COVID-19 era?

    COVID-19 and the subsequent economic downturn brought tremendous changes to companies worldwide. I asked if these business changes affected their data strategies. Half of the respondents said no, while the other half were either partially or fully pivoting.

    A pie chart displaying the percentage of company's decision in having to change the size of data teams
    Figure 17 – Is your company planning to change the size of the data team? (Note: while companies are laying off, this question is specific to data teams.)

    With all of the announcements of layoffs, I wanted to get an idea of how it directly affects the data teams. 53% of respondents said they are increasing the size of the team, and 34.6% are keeping things the same. Only 12.3% said there could be some decrease.

    These numbers confirm what I’ve seen and heard anecdotally. Company-wide layoffs weren’t affecting the data teams because those teams were already understaffed, and any further decrease would compound issues. I highly recommend that data teams not take layoffs lightly and focus on creating value.

    Changes from 2020 to Now

    With data from the 2020 survey, I wanted to see what, if anything, is changing.

    A line chart comparing the number of data teams and their total value creation in the years 2020 and 2023

    Figure 18 – Differences in value creation between 2020 and 2023

    I hoped and expected to see higher value creation in 2023 compared to 2020. The chart doesn’t show any meaningful change: the average is 7.02 for 2023 and 7.08 for 2020. It is concerning to me that value creation didn’t increase over three years.

    A bar chart displaying the team makeup and descriptions between 2020 and 2023

    Figure 19 – Team makeup and descriptions between 2020 and 2023

    The team makeup and descriptions give me more hope. As we can see, there is a decline in missing and single-team data teams and an increase in two-team data teams.

    What Good and Bad Look Like

    A big part of my work and the goal of this survey is to establish best practices with data. This data allows us to see what the lowest and highest value-creating data teams are doing. I’ve also added methodology into the mix.

    To establish the bad, I took all of the responses with a total value creation score of 4 or lower. For the good, I took all of the responses with a total value creation score of 9 or above.
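The cutoffs above can be sketched as a small bucketing function. The thresholds come from the text; the function name, the constant names, and the "middle" label for everything in between are assumptions added for illustration.

```python
LOW_MAX = 4   # total value creation of 4 or lower counts as "low"
HIGH_MIN = 9  # total value creation of 9 or above counts as "high"


def value_bucket(total_value_creation: int) -> str:
    """Classify a total value creation score into low, middle, or high."""
    if total_value_creation <= LOW_MAX:
        return "low"
    if total_value_creation >= HIGH_MIN:
        return "high"
    # Hypothetical label; the post only compares the low and high groups.
    return "middle"


# Made-up scores, for illustration only.
example_scores = [3, 4, 7, 9, 10]
buckets = [value_bucket(s) for s in example_scores]
print(buckets)
```

Leaving the middle group out of the comparison sharpens the contrast between the two ends, though it also means the charts that follow describe only a subset of the respondents.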

    A bar chart displaying the percentage of various methodologies being used by data teams

    Figure 20 – Which methodologies are data teams using?

    There are some surprises in the breakdown of methodologies. I expected a good representation of DataOps, Data Mesh, and Center of Excellence. I am surprised to see how many people use no methodology or a homegrown one. Perhaps I’ll have to dig into what these homegrown methodologies are.

    I wasn’t surprised about the lack of usage of Data Fabric. As much as Gartner is pushing it, I don’t see it in the field much. Despite Gartner saying Data Mesh is obsolete, 21% of respondents used it.

    A bar chart displaying the various methodologies being used by low and high-value creation teams

    Figure 21 – Methodologies being used by low and high-value creation teams

    Comparing low and high-value creation, we see they use some of the same methodologies. This shared usage shows there are broadly applicable methodologies to help low-value creation teams improve their value creation. We also see high-value creation teams only use some methodologies.

    A bar chart displaying the biggest challenges experienced by low and high-value creation teams

    Figure 22 – Biggest challenges experienced by low and high-value creation teams

    Some problems persist no matter how hard you try or how successful the team is. Both high and low-value teams share some of the same challenges. However, high-value creation teams are experiencing more advanced challenges.

    A bar chart displaying the best practices used by low and high-value creation teams

    Figure 23 – Best practices used by low and high-value creation teams

    The most significant difference between low and high-value creation teams is their use of best practices. The high-value creation teams use far more best practices than their low-value creation counterparts. While the challenges were similar, best practices are what set teams apart.

    A bar chart displaying the team makeup and description for low and high-value creation teams

    Figure 24 – Team makeup and description for low and high-value creation teams

    The final comparison, and another significant difference, is in the team makeup and description. The low-value creation teams skew toward one or two teams, while the high-value creation teams skew toward two or three teams. This divide supports my thesis that all three teams are required to generate the highest possible value.

    Demographic Data

    Pie chart breaking down the survey respondents by their positions in the company.

    Figure 1 – What is your position at your company?

    Since the survey concerns management, we’ll start with the breakdown of positions. 61.7% have a management position. The other positions represented were data engineers, architects, consultants, and project managers.

    Pie chart breaking down the survey respondents by the size of their company.

    Figure 2 – How big is your company?

    Another critical question is the size of the companies represented. Companies of different sizes have different organizational needs, and we can see a range of company sizes represented.

    Key Takeaways

    The data clearly shows a correlation between value creation and having data teams. The highest-value producers credit their data teams, while the lowest-value producers lament their lack of data teams. The highest-value creating teams also use the most best practices.

    It’s critical that management looks at friction and its impact on the data teams. For some companies, this means data projects go nowhere or underperform. Working well with the business side is equally important.

    We can see that COVID-19 and remote work aren’t affecting teams’ productivity. In some companies, the economy has changed the perception of data and caused a pivot in data strategy. Management should look for any productivity issues and check whether their data strategy needs to be updated or pivoted to leverage data better.

    If you’d like to learn more about these results or how to use them to accelerate your data team, I would be happy to talk to you further.  Please reach out to me here.

    A big thanks to everyone who filled out the survey and helped me promote it. It represents a unique look at what’s happening in a vendor-neutral environment.
