Cloudera has merged with/purchased Hortonworks. As a former Clouderan, it’s interesting to see this move on several levels. I’m going to share my insights from the outside as a former insider.

Full Disclosure: Although I’m former Cloudera, I don’t own any shares of Cloudera or Hortonworks and don’t plan to purchase any in the short term. I left Cloudera 4 years ago as of this writing.

One Big Happy Family?

Cloudera and Hortonworks are fierce rivals. This didn’t just play out between the salespeople; it ran through every part of the company. There really were different levels of angst between the individuals at Cloudera and Hortonworks. Personally, I didn’t subscribe to the two minutes of hate against Hortonworks.

There was an unspoken rule that you didn’t leave Cloudera and go to Hortonworks, or vice versa. The few who did were “cut off” by their former coworkers on the other side. You could go anywhere, just not Hortonworks.

I’m honestly curious how much two companies with this much history can bury the hatchet. What do you do if the other person is another salesperson who you hate and have a long history with? Worse yet, what if that person is your boss now?

The programmers were less anti-Hortonworks. They told me they had to work and coordinate with Hortonworks because Hadoop and other projects are open source. People from Cloudera and Hortonworks were all contributing and working together. They needed to get their jobs done.

Open Source Proxy Fights

I keep thinking of the pre-merger years as being more like the Cold War. The two powers never fought each other outright, but there were several proxy wars, like Korea or Vietnam.

Cloudera and Hortonworks fought each other in the trenches of the Apache world. Instead of adopting a single technology, each would create its own and donate it to the Apache Software Foundation. I think the clearest example of this is Apache Sentry and Apache Ranger. Basically, Sentry came from Cloudera and Ranger came from Hortonworks. They do the same thing.

This allowed the two companies to say that their governance projects were open source and that there wasn’t any vendor lock-in. The catch was that neither side supported the other’s project, so in practice it was a de facto lock-in.

I think the end customers will be losers on this one. Their operations teams will have to migrate to whatever Cloudera/Hortonworks decides is the winner. I really doubt they’ll keep both projects going. The best case scenario is that the projects merge and there’s a clear upgrade path. No matter what, they will create an upgrade path. It’s more a question of how ugly and painful it will be.

There were other technologies that were good, but since they were viewed as coming from the competitor, the other side wouldn’t adopt them. Apache Impala was a clear example of this. It came from Cloudera, and that was the perception even after it was donated to the Apache Software Foundation. I’d tell people it would be a cold day in hell before Hortonworks started supporting Impala.

A Better Roadmap?

I’m hoping one win for customers will be better collaboration on the roadmap and new additions. While the development teams mostly worked well together, the bigger picture was the exception. A good example of this was Hadoop’s YARN.

This was one that Hortonworks really pushed through the process. When they IPO’d, they positioned themselves as the company behind YARN and as its best supporter.

The problem was, YARN wasn’t a move forward for Hadoop. It took years to fix. I’m hoping a merged company would avoid this and not push in a change for the sake of distinguishing their company.

The Open-Sourciest

There’s a core company culture difference between Cloudera and Hortonworks. Hortonworks always marketed itself as the most open source. Cloudera marketed that an entirely open source company isn’t sustainable.

This difference in perception wasn’t just at the marketing level. It permeated down to the employees themselves. The people who joined Hortonworks really wanted everything to be open source. The people who joined Cloudera were ok with a mix of closed and open source.

This may not sound like a huge difference, but it made all the difference to the technical people. I’m curious how Cloudera/Hortonworks will strike this balance. This will be enough of an issue that some technical people will leave the new company because it won’t be 100% open source. I honestly don’t believe that Cloudera will start open sourcing everything.

The biggest manifestation of this will be Cloudera Manager and Apache Ambari. Which one lives on after the merger? Cloudera Manager is closed source and proprietary. Apache Ambari is from Hortonworks and open source. There wasn’t any crossover between the two products. If you were a Cloudera customer, you used Cloudera Manager; if you were a Hortonworks customer, you used Ambari. Which one will customers be forced to use?

Business-level

At a business level, I think the merger makes sense. Cloudera and Hortonworks were always battling for the same piece of pie. They were doubling up on all sorts of expenses. Cloudera sponsored Strata. Hortonworks went off and created DataWorks Summit. The salespeople were going after the same companies. I’m really curious how the two companies are going to divvy up the existing accounts where both Cloudera and Hortonworks are deployed.

The other business manifestation is how this plays out for businesses that aren’t Cloudera and Hortonworks. Since neither company has a viable real-time strategy, those smaller, real-time companies really benefit. This move also leaves MapR as the only other large Hadoop player.

I think the market will have an even greater need for independent voices and analysis on technology. There is going to be a larger monoculture. You will have Cloudera saying technology X is the way to go, but the reality is that technology Y is better. End customers are really going to need that external help to get the real story.

No matter what, I wish my former coworkers at Cloudera and friends at Hortonworks the best of luck!
