If you haven’t heard, Confluent announced they’re buying Immerok. This purchase represents a significant shift in strategy for Confluent. I started a Twitter thread with some of my initial thoughts, but I want to write a post giving more analysis and opinions.
In short, I still echo the sentiment from my original tweet: “This was always the way it should have been. I feel sorry for those who bought into ksqlDB.” I’ve always been vocal about the limitations of ksqlDB and Kafka Streams. Now, we’re seeing the market’s reaction to the technologies.
The Future of ksqlDB and Kafka Streams
With this announcement, the future of primarily ksqlDB and, to a lesser extent, Kafka Streams comes into view.
The announcement covers that by saying, “of course we continue to invest in […] KSQL.” I think it’s quite telling that even the announcement doesn’t get ksqlDB’s name right.
The GitHub activity on ksqlDB tells a different story. You’d probably look at this graph and say, “Jesse, it looks like you’re crazy. 2022 was a banner year for activity.”
That’s until you look at Confluent’s Jenkins activity. For those who don’t know, Jenkins is an automation server. It isn’t writing any code; it performs tasks such as activity updates.
Looking back at the activity and removing Jenkins, you can see that ksqlDB’s activity is extremely low. From the commits, I’d argue that ksqlDB is in maintenance mode and without active development. The Jenkins activity just masks the issue.
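If you want to repeat this kind of check on a repository yourself, here is a minimal sketch of the idea, not the exact method behind the graphs. It assumes you have exported the history with `git log --pretty=format:"%an|%ad" --date=short`. The `BOT_MARKERS` value and the sample lines are hypothetical and only show the shape of the data.

```python
from collections import Counter

# Hypothetical bot identities to exclude; the real account name used by
# the automation is an assumption here, not taken from the repository.
BOT_MARKERS = ("jenkins",)


def human_commits_per_year(log_lines, bot_markers=BOT_MARKERS):
    """Count commits per year, skipping authors whose name matches a bot marker.

    Each line is expected in the form "<author>|<YYYY-MM-DD>", which is what
    `git log --pretty=format:"%an|%ad" --date=short` prints.
    """
    counts = Counter()
    for line in log_lines:
        author, _, date = line.strip().rpartition("|")
        if not author or not date:
            continue  # skip blank or malformed lines
        if any(marker in author.lower() for marker in bot_markers):
            continue  # automation account, not a person writing code
        counts[date[:4]] += 1
    return dict(counts)


# Made-up sample input, only to show the shape of the data.
sample = [
    "Jenkins CI|2022-03-01",
    "Alice Example|2022-03-02",
    "jenkins-bot|2022-04-10",
    "Bob Example|2021-11-20",
]
print(human_commits_per_year(sample))
```

Matching on a substring of the author name is crude, so it is worth eyeballing the list of excluded authors before trusting the totals.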
It’s sad to see this happen, as customers who bought into ksqlDB are the losers. I think this factors into Confluent’s strategy, which I’ll discuss later.
Since Kafka Streams is part of the Apache project, I don’t see it going away as quickly. As a post that Ben Lorica and I wrote showed, Kafka Streams does have some demand but isn’t getting much traction. I think Confluent will eventually reduce staff working on Kafka Streams to concentrate on Flink.
You’ve probably never heard of Immerok before. That’s because it was recently founded, but that doesn’t mean it wasn’t formidable.
The story starts with Data Artisans (later renamed to Ververica). It employed some of the top people in the Apache Flink project.
In 2019, Alibaba bought Ververica. Let’s just say this wasn’t the smoothest working relationship.
In 2022, many of the Ververica employees left and formed Immerok. To give you an idea of the number, I’ve included a screenshot of Ververica’s employee count. You can see a significant drop starting in March 2022. This reduction in staff wasn’t from a layoff; it was all attrition of people leaving for Immerok.
A source told me that Confluent had an opportunity to buy Data Artisans around when Alibaba bought them. Confluent decided to pursue their Kafka Streams and ksqlDB strategy instead.
Confluent’s Original Strategy
Confluent has had a tough year. The technology sector as a whole has had a tough time too, but that started in May 2022. Confluent’s problems began months earlier, in February 2022.
I think Confluent’s Flink strategy is part of a broader turnaround strategy because ksqlDB and Kafka Streams aren’t gaining traction. If you look at Confluent’s S-1 filing, ksqlDB factors heavily into their differentiation and value proposition:
“From ksqlDB, which is a native data-in-motion database that allows users to build data-in-motion applications using just a few SQL statements […], we have continued to innovate and make it easier for any organization to harness data in motion.”
“Expand the Scope of our Platform with ksqlDB and Other Investments. […] Our investment in ksqlDB positions us to succeed in this emerging area as it gains adoption with customers. […] We believe our investment in ksqlDB positions us to capture this shift and use it to fuel further growth.”
The S-1 was filed on June 1, 2021. After only a year and a half, we have a significant departure from the original vision.
Confluent’s Flink Strategy
I think Confluent’s Flink strategy will require cutting the message down to just talking about Flink and not Kafka Streams or ksqlDB. There is some supposition that Confluent will create a managed cluster that runs whatever code you want, and you won’t have to know which framework is running the code. In my experience, the leakage of implementation details for something as tricky as streaming will happen. You’ll need to just have one framework.
Comparing Kafka Streams and Flink, Flink can technically do everything that Kafka Streams can. Kafka Streams pales in comparison to Flink’s features. I’ve spent time trying to think of something Kafka Streams can do that Flink can’t, and I haven’t found anything. Confluent tries to say there are certain use cases that Kafka Streams does better, and I’m wondering if those are the vestiges of having to position against Flink. This comparison seems like a clear sign that Flink will be the system of choice.
I think Confluent will be forced to make a clear delineation and only market Flink, or try to abstract it away in a similar way that Cloudera tried to (we’re a platform, don’t worry about the technologies behind it, which didn’t work). Otherwise, marketing and sales pitches will turn into complex messages that won’t help anyone. Streaming systems are already difficult enough, and adding more complexity to the choice will lead to analysis paralysis.
This purchase isn’t a merger like Cloudera and Hortonworks, which were both in the same area with many of the same technologies. Here, Flink overlaps with Kafka Streams and ksqlDB in the same compute function. This overlap will lead to an internal political fight for resources such as marketing and people. Confluent will have to change its go-to-market to tell customers to use Flink. I see the two most significant issues for success being the execution and customer goodwill on a sizable technical change. I’m curious what Confluent’s sales teams have been saying about Flink all these years during customer objections, as that could come back to bite them.
Coalescing Spark Streaming vs. Flink
General-purpose real-time compute is coalescing around Spark Streaming and Flink. Kafka Streams wasn’t even in the conversation if you were doing something complex and stateful. This acquisition means that Databricks and Confluent will go head-to-head more. It will put even more importance on Databricks’ Project Lightspeed to deliver on its goals.
There are fewer choices now, but more mature ones that should have long-term investments to improve and support them. Overall, customers should benefit as long as they don’t have significant investments in ksqlDB or perhaps Kafka Streams.