
Consumer group in Kafka


Introduction

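In the world of queues there are producers and consumers, and on the consuming side we may want to achieve one of two things: either each record is processed by exactly one consumer, so that work is shared, or every consumer receives every record.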

Consumer Group

A consumer group is a set of consumers that share a common group id. When a topic is consumed by consumers in the same group, every record is delivered to only one consumer in that group. This is how Kafka achieves parallel processing of records from a topic.
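To make the delivery rule concrete, here is a toy in-memory model in Python. It is not the Kafka client API: the `deliver` function and the group and consumer names are made up for illustration, and the round-robin choice of consumer is only a stand-in for Kafka's partition-based assignment.

```python
from collections import defaultdict
from itertools import cycle


def deliver(records, groups):
    """Toy model: hand every record to exactly one consumer in each group.

    `groups` maps a (hypothetical) group id to its list of consumer names.
    Round-robin is an assumption standing in for partition-based assignment.
    """
    received = defaultdict(list)  # (group id, consumer) -> records seen
    pickers = {gid: cycle(consumers) for gid, consumers in groups.items()}
    for record in records:
        for gid, picker in pickers.items():
            # Each group gets the record once, via one of its consumers.
            received[(gid, next(picker))].append(record)
    return dict(received)


records = ["r0", "r1", "r2", "r3"]
groups = {"billing": ["b1", "b2"], "audit": ["a1"]}
received = deliver(records, groups)

for (gid, consumer), seen in sorted(received.items()):
    print(gid, consumer, seen)
```

Inside the `billing` group the records are split between `b1` and `b2`, while the single consumer in `audit` sees all of them, because each group consumes the topic independently.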

How does Kafka achieve this?

Each topic consists of one or more partitions. When a consumer starts, it joins a consumer group, and Kafka ensures that each partition is consumed by only one consumer from that group. So if we have a topic with 2 partitions and only one consumer in the group, that consumer will consume records from both partitions.

This also means that if we have 2 partitions and 3 consumers in the group, one of the consumers will sit idle, because there is no partition left to assign to it.
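The two cases above can be sketched with a simplified round-robin assignment. This is only an illustration: in a real cluster the assignment is computed by a configurable assignor (range, round-robin, sticky and so on) during a group rebalance, and `assign_partitions` is a hypothetical helper, not a Kafka function.

```python
def assign_partitions(partitions, consumers):
    """Give every partition exactly one owner within a single group.

    Simplified round-robin; an assumption for illustration only.
    """
    assignment = {consumer: [] for consumer in consumers}
    if not consumers:
        return assignment
    for i, partition in enumerate(sorted(partitions)):
        owner = consumers[i % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# 2 partitions, 1 consumer: the single consumer owns both partitions.
one_consumer = assign_partitions([0, 1], ["c1"])

# 2 partitions, 3 consumers: one consumer ends up with nothing to read.
three_consumers = assign_partitions([0, 1], ["c1", "c2", "c3"])
idle = [c for c, owned in three_consumers.items() if not owned]

print(one_consumer)
print(three_consumers)
print("idle:", idle)
```

Adding consumers beyond the number of partitions does not increase parallelism, so the partition count is the upper bound on how many consumers in a group can be active.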

Multiple instances of the same service

Suppose we have an order service with multiple instances, e.g. 3, and a topic order-notifications with 2 partitions. Two of the instances, say order-service-1 and order-service-2, will own the active consumers in order-service-consumer-group, and order-service-3's consumer will sit idle. The goal is to avoid processing the same record on multiple instances of the service, which would lead to bad outcomes such as duplicate processing. This is achieved by putting all instances of the service in the same consumer group.

Consumer offset

Within a partition, a record is uniquely identified by its offset. These offsets are used to track how far each consumer group has read. Consumers themselves are in charge of tracking their position: the broker does not decide what a consumer has processed, it only stores what consumers report. After processing records, the consumer stores the offset of the next record to read in a special Kafka topic called __consumer_offsets. Storing the offset in this topic is known as committing the offset. This lets a consumer always know which record to consume next from a given partition. Because the committed offset is stored in Kafka, the position of the consumer group is maintained even after restarts.
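The commit-and-resume behaviour can be sketched with a small in-memory model. `OffsetStore` and `consume` are hypothetical names standing in for __consumer_offsets and a consumer's poll-then-commit loop. The sketch assumes a group with no committed offset starts from offset 0, whereas in Kafka the starting point is controlled by the `auto.offset.reset` setting.

```python
class OffsetStore:
    """Toy stand-in for __consumer_offsets.

    Maps (group, topic, partition) to the offset of the next record to read.
    """

    def __init__(self):
        self._committed = {}

    def commit(self, group, topic, partition, next_offset):
        self._committed[(group, topic, partition)] = next_offset

    def committed(self, group, topic, partition):
        # Assumption: start from the beginning when nothing was committed.
        return self._committed.get((group, topic, partition), 0)


def consume(log, store, group, topic, partition, max_records):
    """Read up to max_records from the committed position, then commit."""
    start = store.committed(group, topic, partition)
    batch = log[start:start + max_records]
    store.commit(group, topic, partition, start + len(batch))
    return batch


log = ["a", "b", "c", "d", "e"]  # records of one partition, index = offset
store = OffsetStore()
group, topic = "order-service-consumer-group", "order-notifications"

first = consume(log, store, group, topic, 0, max_records=2)
# Simulated restart: the new consumer keeps no local state and relies
# only on the committed offset to find its position.
second = consume(log, store, group, topic, 0, max_records=2)
# A different group has its own offsets and starts from the beginning.
other = consume(log, store, "another-group", topic, 0, max_records=3)

print(first, second, other)
```

Because offsets are keyed by group, each consumer group progresses through the same partition at its own pace.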

