When I start working with a team, one of the first questions I ask is “How much time do you spend creating new features versus making sure those new features don’t break something else?” Put a different way, how much time is your development team spending creating new features versus worrying about how those new features will affect an overly complicated system?
The answer to this question will vary. At large enterprises, on average 70% of developers’ time is spent preventing breakage. Some organizations have told me it is as high as 90%. (If you are a manager or executive, I recommend you ask your team this question, because most managers either don’t know to ask it or don’t know the answer.)
There are obvious OPEX costs due to this time spent, but what about other savings? Often, overly complicated systems make poor use of system resources. Usually, the poorly used resources are the most expensive ones, hardware- and software-wise, like a relational database or mainframe. One company I worked with expected to save over $1,000,000 in hardware costs by reducing system complexity.
To fix these complexity issues, we have to pay down our technical debt. One common way to decrease system complexity is by using a messaging technology like Apache Pulsar and an architectural pattern called event sourcing. The event sourcing pattern solves the problem of distributing knowledge to the applications that need it. Other problems might be solved with a messaging technology and a different architectural pattern.
In enterprise systems, it’s all about knowledge. What just happened? How do I know when something happened? By what means will I know that something happened? These are difficult but worthwhile questions to answer.
Let’s take a non-technical look at event sourcing and how it helps teams create less complex systems. For a more technical look at event sourcing, go here.
Let’s start off by looking at a simple web backend and relational database interaction.
Here, the website’s backend code had an event. Let’s say the user bought something. That knowledge of the user buying something is directly put into the relational database. This is how the vast majority of website systems are created.
Extending the System
The bigger issue appears as we start to add systems.
Let’s say that the marketing team creates a new application that analyzes orders. The app needs to know whenever someone orders. Since the web backend directly puts that knowledge in the relational database, the marketing application will have to constantly query or poll for new orders. Ideally, we’d prefer to be notified when an action happens, but that isn’t possible due to the system design.
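To make the polling problem concrete, here is a minimal sketch of what the marketing application ends up doing. It uses an in-memory SQLite database as a stand-in for the relational database; the `orders` table, its columns, and the `fetch_new_orders` helper are all assumptions made up for illustration.

```python
import sqlite3


def fetch_new_orders(conn, last_seen_id):
    """Return orders with an id above last_seen_id, oldest first."""
    return conn.execute(
        "SELECT id, item FROM orders WHERE id > ? ORDER BY id",
        (last_seen_id,),
    ).fetchall()


# Hypothetical schema standing in for the real order database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")

# The web backend writes the order directly into the database.
conn.execute("INSERT INTO orders (item) VALUES (?)", ("book",))

# The marketing app has to keep asking whether anything changed.
last_seen_id = 0
polls = []
for _ in range(3):
    new_orders = fetch_new_orders(conn, last_seen_id)
    polls.append(new_orders)
    if new_orders:
        last_seen_id = new_orders[-1][0]
```

Only the first of the three polls finds anything. The other two are queries the database had to serve just so the marketing app could learn that nothing happened, and every additional polling application repeats that cost.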
The issue usually has to do with coupling. Coupling means how closely or strongly linked one system is to another. A strongly coupled system like this makes it really difficult to add a new application that needs the same information. A loosely coupled system is one that isn’t directly dependent on another system for its knowledge. Loosely coupled systems often use event sourcing to allow applications to get that information or knowledge without having to worry about the downstream effect.
While simply adding the marketing application doesn’t look too terrible, it’s just the start of new complexity. In the blink of an eye, you have 50 different applications all polling the database and trying to communicate with each other. This is when 90% of your developers’ time goes to trying not to break everyone else’s applications.
System Complexity and Its Direct Effects
When I work with organizations experiencing high system complexity, where 90% of their developers’ time is not creating value, I like to ask “Would you like to double your developers’ productivity without hiring a single new developer?” They always say yes. Let’s imagine 90% of your developers’ time is going into complexity issues and not creating any value. If we reduce this by even a small amount, say 10 percentage points, we double developer productivity: the developers now spend 20% of their time creating value, up from 10%, and 80% on complexity.
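The arithmetic behind that claim is short enough to check directly. This is only a sketch of the example figures used above; the variable names are mine.

```python
# Share of developer time lost to complexity, in percent.
complexity_before = 90
complexity_after = 80  # reduced by 10 percentage points

# Whatever is not lost to complexity is time spent creating value.
value_before = 100 - complexity_before
value_after = 100 - complexity_after

# How many times more value-creating time the same team now has.
productivity_ratio = value_after / value_before
```

Because the starting share of value-creating time is so small, a modest absolute reduction in complexity gives a large relative gain.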
How do we reduce the system complexity? We start to loosely couple our system.
As we see from the diagram, Pulsar is the center of our data movement. We’re using it as the way to move information and knowledge around the system. Now, when we add new applications, like our marketing application, they can be easily added. We won’t create a jumbled mess of applications trying to poll the database. All actions can be published as an event.
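As a rough illustration of the idea, here is a toy in-memory event bus. It is not the Pulsar client API; `EventBus`, `subscribe`, `publish`, and the `"orders"` topic name are hypothetical names chosen for this sketch, and a real messaging system adds persistence, delivery guarantees, and networking on top.

```python
from collections import defaultdict


class EventBus:
    """Toy in-memory stand-in for a messaging system (not the Pulsar API)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        # Register a callable that is invoked for every event on the topic.
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        # Notify every subscriber; the publisher does not know who they are.
        for handler in self._subscribers[topic]:
            handler(event)


bus = EventBus()
stored_orders = []   # stands in for the service that writes to the database
marketing_seen = []  # stands in for the marketing application

bus.subscribe("orders", stored_orders.append)
bus.subscribe("orders", marketing_seen.append)

# The web backend publishes the order once, as an event.
bus.publish("orders", {"user": "alice", "item": "book"})
```

Adding the marketing application was one `subscribe` call. The web backend's publishing code did not change, and nobody polls.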
The Benefits of Event Sourcing
As long as the team adheres to best practices with Pulsar, we can have 50 different systems all being notified when an order happens. When modifying an existing consumer of data, we don’t have to worry about its secondary or tertiary effects. By not having to worry about these effects, the developers are spending their time creating value instead of worrying about how their changes affect others.
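One way to picture why consumers don’t interfere with each other: each one derives its own view from the same stream of events. The sketch below assumes a plain Python list as the event log, and the event fields and view functions are hypothetical.

```python
# A shared, append-only log of events (a plain list in this sketch).
events = [
    {"type": "order_placed", "user": "alice", "total": 30},
    {"type": "order_placed", "user": "bob", "total": 12},
    {"type": "order_placed", "user": "alice", "total": 8},
]


def revenue_view(log):
    """Finance consumer: total revenue from all orders."""
    return sum(e["total"] for e in log if e["type"] == "order_placed")


def orders_per_user_view(log):
    """Marketing consumer: number of orders placed by each user."""
    counts = {}
    for e in log:
        if e["type"] == "order_placed":
            counts[e["user"]] = counts.get(e["user"], 0) + 1
    return counts


revenue = revenue_view(events)
orders_per_user = orders_per_user_view(events)
```

Rewriting `orders_per_user_view` cannot change what `revenue_view` returns, because neither consumer reads the other’s state. They share only the events.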
Moving to event sourcing is well worth the effort. Although I’ve focused on the developer productivity aspect, there are many other benefits. Your operational complexity will go down, resulting in higher uptimes. Your architectural complexity will go down and new hires can come up to speed faster. The list goes on.
If your team is suffering due to the complexity of a system, I highly encourage you to look at event sourcing and Apache Pulsar. They will give your team the tools it needs to get ahead and back to being productive.
Full disclosure: this post was supported by Streamlio. They are commercializing Apache Pulsar.