- A survey was conducted between January 24, 2023, and February 28, 2023, to gather data for the book "Data Teams" and to update a previous survey from late 2020.
- The survey aimed to gather information about how management uses data teams, the value they're creating, and how they're creating it, as well as the economic effects on data teams.
- 81 respondents participated in the survey.
- Data science and data engineering teams are well-represented, but operations teams are only present in half of the respondents.
- Merely saying you have a team doesn't mean it's the right team. The individual contributors must meet the criteria and definitions to represent the job title.
This survey was designed to get information about how management uses data teams, the value they’re creating, and how they’re creating it. The survey asked about the best and worst practices that teams are using or experiencing. We rounded out the survey by asking about the economic effects on data teams. In this post, I’ll go through the results and analysis of what the results mean.
The fundamental thesis of Data Teams is that companies need data science, data engineering, and operations to be successful in their data projects. We start by asking some questions about each respondent’s data teams.
Figure 3 – Of the three data teams, which teams do you already have at your company?
We see that data science and data engineering are well-represented. However, only about half of the respondents have an operations team.
Merely saying you have a team doesn’t mean it’s the right team. The individual contributors must meet the criteria and definitions to represent the job title. We see well-represented responses for data scientists and data engineers with operations lagging similarly.
Overall, we see opportunities to improve operations. These improvements include the presence of operations and getting the right people in place.
Maturity and Success
It’s essential to gauge how far the respondents are in their big data journey.
Figure 5 – How mature are your big data efforts?
66.6% of respondents said they are in production or further along, while 33.4% are in pre-production.
Figure 6 – How successful do you think your big data projects are? One means “Highly Unsuccessful,” and five means “Highly Successful.”
Figure 7 – How successful would the business say your projects are? One means “Highly Unsuccessful,” and five means “Highly Successful.”
I’ve found perceptions of success to be highly varied. To get a more accurate picture of success, I asked two questions: how successful respondents felt their projects were, and how successful the business would say they were (the higher the number, the more successful the project). From the responses, individuals thought the business would rate their projects as more successful than they would themselves.
Figure 8 – Combined business and personal project success metric
To get a combined view of the business and personal opinions of success, I added the two numbers. Since each answer runs from 1 to 5, the combined score ranges from 2 to 10. This combination showed predominantly 6 to 8 ratings from respondents.
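The combined metric is simple arithmetic, and a minimal sketch may make it concrete. The function name `combined_success` is hypothetical and not taken from the survey tooling; the only assumption is that both answers are integers on the 1 to 5 scale described above.

```python
def combined_success(personal: int, business: int) -> int:
    """Add the personal and business success ratings into one score.

    Both ratings are assumed to be on the survey's 1-5 scale, so the
    result falls between 2 and 10.
    """
    for score in (personal, business):
        if not 1 <= score <= 5:
            raise ValueError(f"rating must be between 1 and 5, got {score}")
    return personal + business
```

For example, a respondent who rated their own view as 3 and the business view as 4 would get a combined score of 7.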
Figure 9 – Combined success with the team representation breakdown
Taking the perceived combined business and personal success, I then broke it out by which teams the organization had. We can see some clear correlations:
- Having all three teams (green) or data science and data engineering (medium blue) creates the highest success
- Companies with data science and operations (light blue) or data engineering and operations (dark blue) have more difficulty creating success, and these combinations are less common
- Companies with only one team (shades of red) have great difficulty creating high value
- When all teams are missing, there is the lowest success
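A breakdown like the one above can be sketched by grouping responses on the set of teams each company has and averaging the combined score per group. This is only an illustration under assumptions: the function name and the input shape (pairs of team names and a combined score) are hypothetical, and any data passed in is up to the reader, not the survey's actual responses.

```python
from collections import defaultdict
from statistics import mean


def mean_success_by_teams(responses):
    """Average the combined success score for each team makeup.

    `responses` is an iterable of (teams, combined_score) pairs, where
    `teams` is any iterable of team names. An empty collection stands
    for a company with none of the three teams.
    """
    groups = defaultdict(list)
    for teams, score in responses:
        # frozenset makes the team makeup hashable and order-independent
        groups[frozenset(teams)].append(score)
    return {makeup: mean(scores) for makeup, scores in groups.items()}
```

Keying on a `frozenset` means "data science and operations" and "operations and data science" land in the same group, which matches how the chart treats team combinations.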
Figure 10 – Word cloud of answers to “What do you think you nailed in your management of data teams?”
To give respondents a way to provide an unfiltered opinion, I asked two open-ended questions. Both were designed to provide us with a clear understanding of what the respondents thought they did well or poorly. I created a word cloud for each question to make the responses easier to digest.
I asked, “What do you think you nailed in your management of data teams?” to get an idea of what went well. Many respondents were happy about their data teams and the people that they had.
Figure 11 – Word cloud of answers to “What do you think are the weakest points in your management of data teams?”
I asked, “What do you think are the weakest points in your management of data teams?” to get an idea of what went poorly. Many respondents lamented that they didn’t connect with or work with the business side as they wished. Others mentioned that they battle shifting priorities.
As you’ll notice, both word clouds share many of the same words. It almost seems as though the best and worst things conflict. The reality is that the keys to success and failure are the same. For example, the best teams focus on incorporating the business, while the weakest teams fail to focus on the business.
How Did They Do It?
To better understand best and worst practices, I asked two questions to get the specific reasons why teams were challenged to create value and what allowed them to create value.
Figure 12 – What do you think have been the biggest challenges to achieving the highest possible value from your data?
The respondents consistently selected friction as their biggest challenge. After that, the two most common issues revolved around needing more individual contributors and poor feedback.
Figure 13 – Which best practices are you doing to make sure data projects are meeting business needs?
On the best practice side, teams value working with the business on data projects. Three other consistent best practices were continuous integration, having a qualified data engineering team, and leveraging automation to make tasks easier.
You’ll remember the questions about how much value the respondent and the business would say was created. I found it interesting to look at just what the highest and lowest value creation respondents said.
The highest value creation (combined score of 10) respondents selected having all the teams with the right skills as a best practice. They focused on creating velocity and using automation. For their write-ins, they added “listening and sharing responsibility” along with using CI/CD.
The lowest value creation (combined score of 3) respondents selected missing many, often all, of the data teams as a significant contributor to low-value creation. They also point to friction as getting in the way. Oddly enough, all of them “hired a consulting company, and they aren’t delivering,” which is a widespread way I see larger companies fail with data projects. For their write-ins, they added “shifting priorities,” “not being able to persuade the rest of the company that change is ok,” and “poor leadership.”
COVID-19, Economic Trends, and Data Teams
To round out the survey, I wanted to get some responses on how COVID-19 affected data teams as we’re deep into its long-term effects.
Figure 14 – Has working remotely impacted your team’s productivity negatively? One means “Strongly Disagree,” and five means “Strongly Agree.”
Many data teams are working remotely right now. I wanted to determine if this was affecting them negatively. There can be several reasons that data teams could be negatively impacted, such as home distractions, lack of cluster access, or improper cluster setups. The survey respondents said that they weren’t being affected negatively or that it was neutral.
Figure 15 – Has the perception of the value of data within your company changed in the COVID-19 era? One means “Strongly Disagree,” and five means “Strongly Agree.”
For some companies, COVID-19 was a wake-up call for data. I asked survey respondents whether COVID-19 changed their company’s perception of the value of data (higher numbers mean stronger agreement). Most respondents were either neutral or agreed that the perception had changed.
COVID-19 and the subsequent economic downturn brought tremendous changes to companies worldwide. I asked if these business changes affected their data strategies. Half of the respondents said no, while the other half were either partially or fully pivoting.
With all of the announcements of layoffs, I wanted to get an idea of how it directly affects the data teams. 53% of respondents said they are increasing the size of the team, and 34.6% are keeping things the same. Only 12.3% said there could be some decrease.
These numbers confirm what I’ve seen and heard anecdotally. Company-wide layoffs weren’t affecting the data teams because they were already understaffed, and any further decrease would compound issues. I highly recommend that data teams not take layoffs lightly and focus on creating value.
Changes Since 2020 to Now
With data from the 2020 survey, I wanted to see what, if anything, is changing.
Figure 18 – Differences in value creation between 2020 and 2023
I hoped and expected to see higher value creation in 2023 compared to 2020. The chart doesn’t show any meaningful changes. The average for 2023 is 7.02, and the average for 2020 is 7.08. It is concerning to me that there wasn’t an increase in value creation over three years.
Figure 19 – Team makeup and descriptions between 2020 and 2023
The team makeup and descriptions give me more hope. As we can see, there is a decline in missing and single-team data teams and an increase in two-team data teams.
What Good And Bad Looks Like
A big part of my work and the goal of this survey is to establish best practices with data. This data allows us to see what the lowest and highest value-creating data teams are doing. I’ve also added methodology into the mix.
To establish the bad, I took all of the responses with a total value creation score of 4 or below. For the good, I took all of the responses with a total value creation score of 9 or above.
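The two cutoffs can be expressed as a small helper. This is a sketch, not the code used for the analysis: the function name `value_bucket` and the label for the responses in between are hypothetical, while the thresholds of 4 and 9 come from the text above.

```python
def value_bucket(combined_score: int) -> str:
    """Classify a combined value creation score (2-10) into a bucket.

    Scores of 4 or below are "low", scores of 9 or above are "high",
    and everything in between is left out of the good/bad comparison.
    """
    if combined_score <= 4:
        return "low"
    if combined_score >= 9:
        return "high"
    return "middle"
```

Only the "low" and "high" buckets feed the comparisons that follow; the "middle" responses are set aside.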
Figure 20 – Which methodologies are data teams using?
There are some surprises in the breakdown of methodologies. I expected a good representation of DataOps, Data Mesh, and Center of Excellence. I am surprised to see how many people use no methodology or a homegrown one. Perhaps I’ll have to dig into what these homegrown methodologies are.
I wasn’t surprised about the lack of usage of Data Fabric. As much as Gartner is pushing it, I don’t see it in the field much. Despite Gartner saying Data Mesh is obsolete, 21% of respondents used it.
Figure 21 – Methodologies being used by low and high-value creation teams
Comparing low and high-value creation, we see they use some of the same methodologies. This shared usage shows there are broadly applicable methodologies that can help low-value creation teams improve. We also see that high-value creation teams use only a subset of the methodologies.
Figure 22 – Biggest challenges experienced by low and high-value creation teams
Some problems never go away, no matter how hard you try or how successful the team is. Both high and low-value teams share some of the same challenges. However, high-value creation teams are experiencing more advanced challenges.
Figure 23 – Best practices used by low and high-value creation teams
The most significant difference between low and high-value creation teams is their use of best practices. The high-value creation teams use far more best practices than their low-value creation counterparts. While the challenges were similar, best practices are what set teams apart.
Figure 24 – Team makeup and description for low and high-value creation teams
The final comparison, and another significant difference, is in the team makeup and description. The low-value creation teams skew toward one or two teams, while the high-value creation teams skew toward two or three teams. This divide definitely supports my thesis that all three teams are required to generate the highest possible value.
Figure 1 – What is your position at your company?
Since the survey concerns management, we’ll start with the breakdown of positions. 61.7% have a management position. The other positions represented were data engineers, architects, consultants, and project managers.
Figure 2 – How big is your company?
Another critical question is the size of the companies represented. Companies of different sizes have different organizational needs, and we can see a wide range of company sizes represented.
The data clearly shows a correlation between value creation and having data teams. The highest-value producers credit their data teams, while the lowest-value producers lament their lack of data teams. The highest-value creating teams also use the most best practices.
It’s critical that management looks at friction and its impact on the data teams. For some companies, this means data projects go nowhere or underperform. Working well with the business side is equally important.
We can see that COVID-19 and remote work aren’t affecting teams’ productivity. In some companies, the economy changes the perception of data within the company and causes them to pivot their data strategy. Management should look for any productivity issues and verify that their data strategy doesn’t need to be slightly updated or pivoted to leverage data better.
If you’d like to learn more about these results or how to use them to accelerate your data team, I would be happy to talk to you further. Please reach out to me here.
A big thanks to everyone who filled out the survey and helped me promote it. It represents a unique look at what’s happening in a vendor-neutral environment.