I Come Not To Bury Cloudera But To Praise It

Blog Summary:
  • Big data vendors, such as MapR and Cloudera, are facing problems.
  • Cloudera's stock value has decreased significantly, and there is a possibility that it may be bought by a private equity firm and sold off in pieces.
  • The big data industry generates low amounts of value relative to the amount spent, which is a dirty little secret.
  • Companies with big data may wake up and start cutting people and big data projects due to the low value created.
  • To achieve value from big data, companies need to hire competent people, train them, and continue to help them. The management team also needs to change and fix how they manage their data teams.

It’s been a tumultuous past few weeks for big data vendors. First MapR is having problems (their update). Now, Cloudera is having problems.

As of today, Cloudera closed at $5.21 (June 6, 2019). To put that in perspective, at its last valuation, Confluent was valued at almost twice what Cloudera is worth now. Put another way, Cloudera is trading at a 2x multiple of yearly revenue. A 2x multiple is really low for a technology company.

For the first time, we face the real prospect of going from 3 Hadoop distributions to 0. I’m worried that Cloudera will get bought by a private equity (PE) firm and sold off in pieces. One of those pieces, CDH/CDP/HDP, isn’t directly profitable, and I’m wondering if the PE firm will really understand that. If they don’t, CDH/CDP/HDP won’t receive the love it received before. Everyone freeloading on the big data train will wonder where all their free updates and major features went.

I’ve spent the last few days thinking about what all of this means. I’ve spent a longer amount of time thinking about Cloudera’s problems. I’ve spent an even longer amount of time thinking about the future of big data. I’m still convinced that you can only do big data with big data tools and it is possible for companies to get massive value from their data. But…

Low Value

We have a big problem in big data. We generate woefully low amounts of value relative to the amount spent. It’s really a dirty little secret we’ve had for a while. Others and I are worried that companies with big data will wake up, calculate their costs, calculate the value created, see the low value, and start cutting people and big data projects. I don’t want to come across as all doom and gloom, because there are companies with big data problems creating massive value with their data using Hadoop, Spark, cloud, etc., but they had to put the effort in first.

There will be a bloodbath of sorts when the big data party stops. Everyone will say that big data was a fad and didn’t really do anything useful. We’ll all move on.

I’ve been helping my clients achieve the value that big data can bring. Run the right way, big data can create significant value. It’s just far more difficult to achieve that value than Cloudera and other vendors told you. There was never an easy button that just magically made all of this open source easier or work better together. This notion of easy is what I spent much of my time combating – when it comes to distributed systems, don’t believe your vendor when they tell you things are easy. This lack of easiness meant that Cloudera couldn’t do everything itself and had to create a partner ecosystem – which they tried to do.

Achieving value from big data was never just a matter of choosing a technology or vendor or bringing all of your data together in one place. Yet, this is what companies focused on – they focused on this because their vendor told them Hadoop, Kafka, or Spark or whatever would just solve the problem. The company would buy the product and still get low value. The promised value was never really achieved.

Cloudera didn’t tell its customers the whole story. To achieve value, the customer would need to hire competent people, train them, and continue to help them. The customer’s management team would actually have to change and fix how they managed their data teams. In Silicon Valley this level of hand-holding doesn’t scale. Silicon Valley wants subscription businesses that are software-only, have high stickiness, and don’t require any consulting.

Before the IPO, Cloudera started to push this way. It started to reduce and eliminate the consulting to focus on subscriptions. But that didn’t make its customers successful. Unsuccessful customers don’t blame themselves – they blame the technology vendor and choose a new one. Cloudera loses another customer to the cloud or another company promising how easy it is to work with their technology.

What Really Is Hadoop?

I was reading and participating in this Twitter thread where they’re talking about MongoDB and Cloudera. The replies were super interesting: people still don’t know what Hadoop is. Maybe that’s really the bigger problem. Understanding Hadoop and its ecosystem is a big undertaking. Finding and solving a true big data problem is another undertaking. Put simply, big data – and not just Hadoop – is a long chain of big undertakings that most companies don’t realize they’ll have to do.

With MongoDB, you could understand it mostly and quickly. It is a database. With Hadoop, it was lots of different things – more correctly, it was a large ecosystem of things. It takes quite a while to understand it fully and in-depth.

This is something I’ve been thinking about lately. People like simple things that are easy to understand. It doesn’t matter whether they’re using the right technology, whether the technology is well-architected, or whether it’s a better fit for the use case overall. People want easy and they want it now. They want to install easily and start using it. Once things get complicated, they double down on those simple technologies whether they’re right or not. This leads to terrible workarounds and duct tape architectures. But the technology behind it is still simple!

The corollary to this is when an organization doesn’t have the time to really understand the technologies and their tradeoffs. They were looking for something easy and didn’t really want or have the ability to go deeper. These projects flounder and go nowhere – while the team blames the technology itself. If they do get into production, the terrible implementation, poor coding, and bad business decisions show up as production outages and projects that can’t be maintained or improved.

Does this mean Hadoop is dead and was purely hype? Were people right that Hadoop could only index the web like Google created it to do? Obviously not. But in people’s minds this is what Hadoop is limited to, and therefore it’s hype. You’d be judicious in adopting Hadoop MapReduce – for example – but the entire big data ecosystem isn’t dead.

The Hortonworks Merger

I don’t believe it’s been talked about enough, but the Hortonworks merger brought productivity to a standstill. I’ve talked to several people who are still there about how smooth the merger was. It wasn’t smooth at all.

When it was first announced I surmised it would be a really difficult merger. I was right and many people left before or shortly after. Publicly, Cloudera said how much overlap there was between the two companies and that they’re working well together.

During the earnings call, Tom Reilly said:

However, our rapid execution on the Cloudera platform caused customers to wait until release to renew and expand their agreements.

That struck me as interesting and I think it shows the real problems of the merger. The merger made it difficult for customers to see the value of their contracts. Cloudera’s output really stopped and customers could see that. Why should they continue to pay for something they may not get? Customers will give you some slack, but when they’re writing million dollar checks, that patience wears thin and they want results. In my company, I have advised my clients not to believe the roadmaps they were given or make plans based on them.

What Can Be Done?

As I said, I came to praise Cloudera, not to bury it. I think that Cloudera – with the right direction and leadership – could be fixed. As a former Clouderan, I’m really sad to have to say that.

People

Cloudera lost its way on the people. When I left, I got a half-hearted “please stay.” I thought that was just my experience or it was me personally. Then I started talking to other people who left and I asked them about their experience leaving. It was all similar. Their management’s reactions ranged from arrogantly saying “don’t let the door hit you” to a half-hearted “please stay.”

It was after these conversations that I realized Cloudera had really lost its way. These weren’t your average people. These were some of the more public faces of Cloudera – often published authors. These people went on to found new companies that are doing well. The new ideas that were the basis of their new companies were often rejected by Cloudera’s management. Cloudera really lost out by not incentivizing people to stay.

Cloudera needs to get the original band back together. Now, that’s going to have to happen with acquisitions and acquihires. In other words, it’s more expensive than it would have been to keep them originally.

This would include removing much of Cloudera’s management. I’m sure many of the problems are due to politics, infighting, and favoritism. Some of these managers are gone, but some are still there. There really needs to be some house cleaning.

There is a level of arrogance at Cloudera. Some of this arrogance is just the Silicon Valley arrogance (I work/worked at Company X therefore I’m smarter/better/faster than you) that you have to deal with, and some people were born that way. You have some really smart people and some who’ve done some quantifiably great things. Cloudera was this weird juxtaposition of egos – some founded and some unfounded. Some of the healthiest egos had the lowest appreciable value creation and that clashed with the people actually creating value at Cloudera. Cloudera lost some good people who tired of the ego clashes.

Technology

Cloudera’s new technologies really didn’t have a market fit or were half-assed approximations of another startup’s technology. It seemed like Cloudera’s salespeople would say they’re losing sales to technology X. Cloudera would then cobble together some open source and position itself against technology X.

There really wasn’t much in the way of technical leadership coming from Cloudera. At one time Cloudera was really leading.

I remember the first time I started asking Cloudera about Kafka. I saw the market need and desire for Kafka early on. I’ll never forget the conversations I had with Clouderans about Kafka. They dismissed it out of hand at a time when it was really taking off. Overall, Cloudera is really missing out on the technical vision of real-time and messaging systems.

If I were Cloudera, I’d be reevaluating the technical stack I’m pushing and see the significantly better technologies out there. These technologies are not the ones Cloudera is currently pushing. Cloudera could establish itself as a leader again. It would take some acquisitions too.

Positioning

I’ll apologize ahead of time to whoever created this “Modern Data Warehouse” marketing slogan, but I don’t think it was a single person. I’m guessing this was the product of a committee.

The “Modern Data Warehouse” positioning is terrible and needs to be removed 1984-style. It’s like saying, “I’ve given up on being interesting or new. I’m going to wear sweat pants from now on – just not ones with holes in them”. A big part of the low value we generate in big data is because of data warehousing and its mindset.

The Edge-to-Edge AI positioning is much better. However, it really only resonates with IoT companies who actually have or need an edge. The majority of Cloudera’s customers aren’t doing edge now and won’t be in the future.

Evangelism

Cloudera never evangelized its products well to developers. Cloudera did inherit some evangelists during the Hortonworks merger.

A quick story about this. I tried to become Cloudera’s first developer advocate before I left. I ended up just leaving. In the fun of trying to play the politics of the job move, I realized Cloudera’s focus wasn’t on developer success. It was on egos and fiefdoms.

Cloudera mostly focused on evangelizing to management, but never focused on evangelizing to developers. I saw that missing piece and tried to fill it. It seems like Cloudera just assumed that developers would be pushed to the technologies by management. I think part of Confluent’s success was realizing the push had to come from both sides: developer advocates pushing the frontline developers, and management hearing about it from other sources.

Getting Back

You could summarize this post by saying if Cloudera wants to survive, it needs to get back to what made early Cloudera so good. This was the right mix of leadership, people, and technology. It’s missing now and I’m hoping they can get their groove back.

A big thanks to the former Clouderans and others who read this and gave feedback.
