Unit Testing Kafka Consumers

Blog Summary: (AI Summaries by Summarizes)
  • Unit testing your Kafka code is crucial, especially for your Consumers.
  • Refactor your Consumer code to be able to change it at runtime and create a separate method for creating the KafkaConsumer.
  • Refactor the code that consumes data from the Consumer object to be callable from the unit test and not get stuck in an infinite loop.
  • Use the MockConsumer object for Kafka unit tests of the Consumer code.
  • Instantiate the Consumer to be tested, inject the MockConsumer into it, set up the MockConsumer's topic, partitions, and beginning offsets, send data with the Consumer, and consume the data.

Unit testing your Kafka code is incredibly important. It’s transporting your most important data. This is especially true for your Consumers. They are the end point for using the data. There are often many different Consumers using the data. You’ll want to unit test all of them.

In a previous post, I showed you how to unit test Producers.

Refactoring Your Consumers

First of all, you’ll need to be able to change your Consumer at runtime. Instead of using the KafkaConsumer object directly, you’ll use the Consumer interface.

public Consumer<String, String> consumer;

You can use whichever method for dependency injection, but I’m making the Consumer public so I can change it from the unit test.

Next, you’ll want to refactor the code for creating your KafkaConsumer. The creation of the KafkaConsumer should be in separate method that won’t get called by your production Consumer code.

You’ll also need to refactor the code that consumes the data from the Consumer object. This code will need to be callable from the unit test. Also, the Consumer object often consumes in an infinite loop (while (true)). You need to refactor the actual consumption code so it doesn’t get stuck in an infinite loop.

Unit Testing Your Consumer

Kafka unit tests of the Consumer code use MockConsumer object. The @Before will initialize the MockConsumer before each test.

MockConsumer<String, String> consumer;

public void setUp() {
    consumer = new MockConsumer<String, String>(OffsetResetStrategy.EARLIEST);

Have you been searching for the best data engineering training? You’ve found it. Sign up for my list so you can get my Professional Data Engineering course.

Once we’ve set the objects up, we can start testing.

public void testConsumer() throws IOException {
    // This is YOUR consumer object
    MyTestConsumer myTestConsumer = new MyTestConsumer();
    // Inject the MockConsumer into your consumer
    // instead of using a KafkaConsumer
    myTestConsumer.consumer = consumer;

    consumer.assign(Arrays.asList(new TopicPartition("my_topic", 0)));

    HashMap<TopicPartition, Long> beginningOffsets = new HashMap<>();
    beginningOffsets.put(new TopicPartition("my_topic", 0), 0L);

    consumer.addRecord(new ConsumerRecord<String, String>("my_topic",
                       0, 0L, "mykey", "myvalue0"));
    consumer.addRecord(new ConsumerRecord<String, String>("my_topic", 0,
                       1L, "mykey", "myvalue1"));
    consumer.addRecord(new ConsumerRecord<String, String>("my_topic", 0,
                       2L, "mykey", "myvalue2"));
    consumer.addRecord(new ConsumerRecord<String, String>("my_topic", 0,
                       3L, "mykey", "myvalue3"));
    consumer.addRecord(new ConsumerRecord<String, String>("my_topic", 0,
                       4L, "mykey", "myvalue4"));

    // This is where you run YOUR consumer's code
    // This code will consume from the Consumer and do your logic on it

    // This just tests for exceptions
    // Somehow test what happens with the consume()

We start off by instantiating the Consumer we’re wanting to test. We inject our MockConsumer into the Consumer. Then, the MockConsumer's topic, partitions, and beginning offsets need to be set up. We send some data with the Consumer. All of the data added by the MockConsumer will be consumed by the Consumer. We call the addRecord() method for every ConsumerRecord we want the Consumer to see. Finally, we consume the data.

A quick note that this test only validates that the Consumer doesn’t throw an exception while processing this data. To verify the actual processing or output, you may need to mock another object or gather the output in a last and run your assertions.

Related Posts

zoomed in line graph photo

Data Teams Survey 2023 Follow-Up

Blog Summary: (AI Summaries by Summarizes)Many companies, regardless of size, are using data mesh as a methodology.Smaller companies may not necessarily need a data mesh

Laptop on a table showing a graph of data

Data Teams Survey 2023 Results

Blog Summary: (AI Summaries by Summarizes)A survey was conducted between January 24, 2023, and February 28, 2023, to gather data for the book “Data Teams”

Black and white photo of three corporate people discussing with a view of the city's buildings

Analysis of Confluent Buying Immerok

Blog Summary: (AI Summaries by Summarizes)Confluent has announced the acquisition of Immerok, which represents a significant shift in strategy for Confluent.The future of primarily ksqlDB

Tall modern buildings with the view of the ocean's horizon

Brief History of Data Engineering

Blog Summary: (AI Summaries by Summarizes)Google created MapReduce and GFS in 2004 for scalable systems.Apache Hadoop was created in 2005 by Doug Cutting based on

Big Data Institute horizontal logo

Independent Anniversary

Blog Summary: (AI Summaries by Summarizes)The author founded Big Data Institute eight years ago as an independent, big data consulting company.Independence allows for an unbiased