You might have seen some posts or tweets about Amazon taking open source technology and creating managed services with it. You may not understand why people are upset about this. Let me explain the issues and the background as neutrally as possible.
But first, you’ll need to understand open source business models.
Full disclosure: I don’t own shares of Confluent, Cloudera, or Amazon. That should allow for as neutral a post as possible, though I worked at Cloudera for a while.
Most of these tweet screenshots are from this Twitter thread.
Open Source Business Models
How do you make money off something that is not only free but whose source code is also available? That’s the question that open source companies like Cloudera and Confluent are dealing with.
This meme is from Doug Cutting. The companies can’t sell and make money on just the technology; they have to sell other products. These other products are usually: support, training, consulting, and a console or manager that makes the operations of the product easier. They make money directly off these things.
The problem is that the open source companies have to continue to develop the original project – be it Kafka or Hadoop. Those are expensive software engineers and their salaries don’t directly make the company money. Not just that, all of their work goes into the common pool of code that makes the project better. All of the companies who use the project benefit from those contributions.
The value proposition to customers is that they will get support, training, easier operations, and bug fixes. This is because the company is actively developing and fixing the core project. Some companies really emphasize their contributions, or the fact that the founders of a project work at the company.
Amazon Web Services’ Business Model
Amazon Web Services’ (AWS) business model is different from that of a typical open source company. When Amazon creates a managed service, they focus on making it easy to start or deploy the technology. With Kafka, for example, AWS makes it easy to start and run a Kafka cluster.
The value proposition to customers is that they can easily start using a project. AWS’ continued support or contributions to projects like Kafka or Hadoop are not a key motivation for a customer to start using it.
Open Source Licensing
The Apache license is one of the more liberal licenses out there. There is nothing that legally prevents Amazon from creating a managed service using Kafka. To deal with this, some projects have started to add clauses to their licenses that expressly prohibit creating managed services with their open source project.
The motivator to give back to open source projects is more of a moral or social contract. The social pressure is that companies making extensive use of a project should give back to it.
Update: I’ve received a decent amount of pushback on the moral or social contract part.
Others have said there isn’t even a moral or social contract. There is absolutely no obligation for a company to give back to open source, no matter how you’re using it.
Why the Community Is Upset
With that background, we can talk about the issue at hand. Most of the discourse on the subject has been on blogs and Twitter. I think Neha sums up the issue best:
The community doesn’t like that Amazon is not complying with the social contract. As you saw in the business model, Amazon’s value prop isn’t around improving Kafka; it’s around making it easy to spin up a Kafka cluster.
That sounds like a trivial thing. It shouldn’t make a difference to anyone if Amazon decides to start using Kafka.
The issue is that companies like Cloudera and Confluent can’t compete with the pricing of Amazon. Amazon only needs to employ people who write managed service code. They don’t need to employ the others working on the project directly. Amazon’s costs are significantly lower.
The ability to commercialize the open source without having to pay direct project developers makes it very cheap for Amazon to create managed services with Hadoop and Kafka. When customers focus on the cost differences, they often choose the cheapest solution. This choice is directly affecting vendors like Cloudera.
What Should Be Done?
I asked what people think should be done. The response is to start giving back at a higher than current level.
Adrian Cockcroft responded that Amazon is giving back. Amazon is open sourcing certain projects. The community issue here is whether Amazon is giving back directly to projects like Hadoop or Kafka. Others would add that Amazon should give back at a level relative to its revenue from open source managed services.
Each project has a mailing list that records every commit (code change) made to it. Each email has information like who made the change, the change itself, and who reviewed the change. Using a Google search, you can quickly query for the company that made a change. For example, a search on the Kafka commit mailing list for an Amazon commit shows no commits by someone from Amazon. Likewise, a search of the Hadoop commit mailing list shows none. It’s possible that people from Amazon are contributing, but aren’t using their @amazon.com email addresses.
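If you would rather count than search, the same idea can be sketched in a few lines of Python. This is only a rough illustration: the function name and the sample author strings are made up for this post, and it assumes you have already pulled the author lines ("Name <email>") out of a commit log or mailing list archive yourself. It carries the same caveat as the search: a count by email domain misses anyone who commits from a personal or project address.

```python
from collections import Counter
from email.utils import parseaddr


def count_commits_by_domain(authors):
    """Count commits per email domain from 'Name <email>' author strings.

    Hypothetical helper for illustration; not part of any project's tooling.
    """
    counts = Counter()
    for author in authors:
        # parseaddr splits 'Name <addr>' into (name, addr)
        _, address = parseaddr(author)
        if "@" not in address:
            # Skip entries that carry no usable email address
            continue
        domain = address.rsplit("@", 1)[1].lower()
        counts[domain] += 1
    return counts


# Made-up sample data, not real commit authors
sample_authors = [
    "Alice Example <alice@apache.org>",
    "Bob Example <bob@example.com>",
    "Carol Example <carol@apache.org>",
    "no address here",
]

counts = count_commits_by_domain(sample_authors)
for domain, total in counts.most_common():
    print(domain, total)
```

A domain that never appears simply counts as zero, which is exactly the ambiguity described above: zero commits from a company's domain does not prove zero commits from its employees.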
Felipe Hoffa did a more in-depth look at open source contributions by cloud vendors. The numbers from Amazon were quite low.
Update: Matt Wilson points out this gist with early Amazon contributions back to Hadoop. He also confirms that Amazon employees don’t use their “amazon.com” email address when committing. This is the same standard you’ll find at other open source companies. They’re encouraged to use their “apache.org” email address.
That leaves us with a more difficult question. Does Amazon need to give back? Should we even believe that Amazon should give back? There are diverse opinions on the subject. Some of them go deeper, to the very nature and definition of open source.
Roman also wrote a post with his synopsis.
Edited: To update with more feedback from the thread.