Introduction
In the world of queues there are producers and consumers, and on the consuming side we may want to achieve one of two things:
- All consumers consume all messages
- Only some targeted consumers consume some messages

Both can be achieved with Kafka.
Consumer Group
A consumer group is a set of consumers that share a common group id. When a topic is consumed by consumers in the same group, every record is delivered to only one consumer in that group. This is how Kafka achieves parallel processing of records from a topic.
How does Kafka achieve this?
Each topic consists of one or more partitions. When a consumer starts, it joins a consumer group, and Kafka ensures that each partition is consumed by only one consumer from that group. So if we have a topic with 2 partitions and only one consumer in a group, that consumer will consume records from both partitions.
This also means that if we have 2 partitions and 3 consumers in the group, then one of the consumers will sit idle.
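The two cases above can be sketched with a small toy model. This is only an illustration: `assign_partitions` is a hypothetical helper that hands out partitions round-robin, and it is not the assignor Kafka actually uses. It keeps the one property that matters here: each partition gets exactly one owner within the group.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over the members of one consumer group.

    Toy round-robin model (an assumption for illustration, not Kafka's real
    assignor): every partition gets exactly one owner, and members that are
    left without a partition stay idle.
    """
    if not consumers:
        return {}
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# One consumer, two partitions: the single consumer reads both.
print(assign_partitions([0, 1], ["consumer-1"]))
# {'consumer-1': [0, 1]}

# Three consumers, two partitions: consumer-3 gets nothing and sits idle.
print(assign_partitions([0, 1], ["consumer-1", "consumer-2", "consumer-3"]))
# {'consumer-1': [0], 'consumer-2': [1], 'consumer-3': []}
```

Adding more consumers than partitions therefore does not add parallelism; the partition count is the upper bound.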
Multiple instances of same service
If we have an order service with multiple instances, e.g. 3, and a topic order-notifications with 2 partitions, then two of the instances, say order-service-1 and order-service-2, will own the active consumers in the order-service-consumer-group, and order-service-3’s consumer will sit idle.
The goal is to avoid processing the same record on more than one instance of the service, since that would lead to bad outcomes. This is achieved by putting all instances of the service in the same consumer group.
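In practice, "same consumer group" comes down to every instance using the same `group.id` setting. Here is a minimal sketch of what that could look like. `consumer_config` is a hypothetical helper, and the broker address `localhost:9092` is an assumption. The keys `bootstrap.servers`, `group.id` and `client.id` are standard Kafka client settings, but how the settings are passed to a consumer depends on the client library you use.

```python
def consumer_config(instance_name):
    """Build consumer settings for one instance of the order service."""
    return {
        "bootstrap.servers": "localhost:9092",  # assumed broker address
        "group.id": "order-service-consumer-group",  # identical on every instance
        "client.id": instance_name,  # differs per instance, used only for identification
    }


configs = [consumer_config(f"order-service-{n}") for n in (1, 2, 3)]

# All three instances end up in a single group, so each record from
# order-notifications is handled by only one of them.
print({config["group.id"] for config in configs})
# {'order-service-consumer-group'}
```

If one instance were started with a different `group.id`, it would form its own group and receive every record again, which is exactly the duplicate processing we want to avoid here.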
Consumer offset
A record is uniquely identified by its offset within a partition. These offsets are used to track which records have been consumed by which consumer group. Consumers themselves are in charge of tracking this offset; the broker does not track consumption progress on its own and only stores what consumers commit.
Once the consumer has read a record, it stores the offset (strictly speaking, the position of the next record to read) in a special Kafka topic called __consumer_offsets. When a consumer stores the offset in this topic, it’s known as committing the offset.
This enables the consumer to always know which record should be consumed next from a given partition. Since the committed offset is stored in Kafka, the position of the consumer group is maintained even after restarts.
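The commit-and-resume behaviour can be modelled in a few lines. This is a toy sketch, not Kafka client code: `ToyOffsetStore` is a hypothetical stand-in for __consumer_offsets, a plain list plays the role of one partition, and starting from offset 0 when nothing has been committed is an assumption (in a real setup this depends on consumer configuration).

```python
class ToyOffsetStore:
    """Stand-in for __consumer_offsets: (group, topic, partition) -> next offset."""

    def __init__(self):
        self._committed = {}

    def commit(self, group, topic, partition, next_offset):
        self._committed[(group, topic, partition)] = next_offset

    def fetch(self, group, topic, partition):
        # Assumption: a group with no committed offset starts at 0.
        return self._committed.get((group, topic, partition), 0)


def consume(log, store, group, topic, partition, max_records):
    """Read up to max_records from the committed position, then commit."""
    start = store.fetch(group, topic, partition)
    batch = log[start:start + max_records]
    if batch:
        # Commit the offset of the next record to read, not the last one read.
        store.commit(group, topic, partition, start + len(batch))
    return batch


log = ["order-created", "order-paid", "order-shipped"]  # one partition's records
store = ToyOffsetStore()
group, topic = "order-service-consumer-group", "order-notifications"

first = consume(log, store, group, topic, 0, max_records=2)
# Simulated restart: the consumer keeps no local state, only the store survives.
second = consume(log, store, group, topic, 0, max_records=2)

print(first)   # ['order-created', 'order-paid']
print(second)  # ['order-shipped']
```

Because the position lives in the store and not in the consumer, the second call picks up at offset 2 without re-reading the first two records.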