In our last weekly knowledge-sharing session at inoio, we discussed our experiences and thoughts on how to design Kafka topics, i.e. how to decide to which topic a (new) event type should be published. In customer projects we have sometimes seen that Kafka topics were not chosen well, so that the choice had to be changed later. Since such a change usually affects several systems and teams, it causes considerable effort, so it pays off to invest some time in this decision up front. How to choose the Kafka topic for an event does not seem to be discussed much in public, so we want to share our thoughts on that here.
First, an important thing to know: when consuming events, ordering is only guaranteed within a topic partition. Because events are usually assigned to a partition based on their key (more precisely, a hash of the key), events must be published with the same key to the same topic (e.g. key = userId, topic = user) to get a guaranteed ordering. Alternatively, it is possible to assign the partition "manually".
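The key-to-partition mapping can be sketched as follows. Note that this is a simplified illustration: Kafka's default partitioner hashes the key bytes with murmur2, while this sketch uses `crc32` as a stand-in, and the partition count of 6 is an assumption.

```python
import zlib

NUM_PARTITIONS = 6  # assumed partition count of the "user" topic


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition. Kafka's default partitioner uses
    murmur2 on the key bytes; crc32 stands in here for illustration."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# The same userId always lands on the same partition, so all events
# published with that key are consumed in publish order:
assert partition_for("user-42") == partition_for("user-42")
```

The important property is only that the mapping is deterministic: equal keys always land on the same partition, and therefore in one ordered sequence.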
Regarding guidelines at first let’s have a look at two extremes:
- One topic per event type
- One topic for all event types
One topic per event type
The consequence here is clear: when there is a happens-before relationship between two events (or event types) E1 and E2, where E1 is published to topic T1 and E2 is published to T2, the order as seen by a consumer is not guaranteed. I.e. in the example above, it might happen that Service 3 consumes E2 before E1, although E1 actually happened before E2.
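This loss of cross-topic ordering can be made concrete with a toy simulation (not the Kafka API): a consumer subscribed to two topics may observe their events in either interleaving.

```python
from itertools import permutations


def interleavings(topics):
    """All orders in which a consumer polling these topics could
    observe the events (one event per topic here, for simplicity)."""
    return {tuple(e for t in perm for e in t) for perm in permutations(topics)}


topic_t1 = ["E1"]  # published first
topic_t2 = ["E2"]  # published afterwards

# Both orders are possible, so E2 may be seen before E1:
assert interleavings([topic_t1, topic_t2]) == {("E1", "E2"), ("E2", "E1")}
```

Within a single topic partition only one of these orders could occur; across two topics, both can.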
The advantages of that approach are that
- consumers receive only events they’re interested in
- fine grained data protection/access policies can be applied (via ACLs)
- the topic configuration can be optimized for each event type, according to the workload
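The last point can be illustrated with the standard `kafka-topics.sh` CLI. The topic names, partition counts, and retention values below are made up for the example:

```shell
# Hypothetical: a high-volume clickstream topic with many partitions
# and short retention (1 day) ...
kafka-topics.sh --bootstrap-server localhost:9092 --create \
  --topic clickstream-events --partitions 24 \
  --config retention.ms=86400000

# ... vs. a low-volume user topic with few partitions and long
# retention (30 days):
kafka-topics.sh --bootstrap-server localhost:9092 --create \
  --topic user-events --partitions 6 \
  --config retention.ms=2592000000
```

With one topic per event type, each of these settings can be tuned to the workload of that single event type.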
One topic for all event types
This means that every consumer receives all events, even if it is only interested in certain event types.
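In practice each consumer then has to filter, i.e. deserialize every record and skip the event types it does not care about. A minimal sketch, assuming a JSON event format with a top-level `type` field (both are assumptions, not a prescribed schema):

```python
import json

# Event types this particular service cares about (hypothetical names):
RELEVANT_TYPES = {"UserRegistered", "UserDeleted"}


def handle(raw_event: str) -> bool:
    """Deserialize one consumed record and process it only if its
    type is relevant for this service. Returns True if handled."""
    event = json.loads(raw_event)
    if event["type"] not in RELEVANT_TYPES:
        return False  # ignored, but still consumed and deserialized
    # ... actual domain logic would go here ...
    return True


assert handle('{"type": "UserRegistered", "userId": "42"}')
assert not handle('{"type": "OrderPlaced", "orderId": "7"}')
```

The filtering itself is cheap, but every consumer still pays the cost of receiving and deserializing all events on the topic.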
Depending on the workload and throughput, this might delay event processing during peaks. To deal with such a situation, the number of partitions (and optionally also the number of consumers per consumer group) can be increased, to scale out and parallelize event processing.
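Scaling out as described can be done with the standard CLI; the topic name and target partition count here are made up:

```shell
# Increase the partition count of an (assumed) shared topic to 12, so
# that up to 12 consumers in one consumer group can process in parallel:
kafka-topics.sh --bootstrap-server localhost:9092 --alter \
  --topic all-events --partitions 12
```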
Another consequence is that the topic configuration (e.g. number of partitions, replication factor) has to match the requirements of possibly very different workloads, which makes optimizing these settings harder.
The advantage of the extreme “global topic” approach is that the ordering for all event types is guaranteed (assuming that keys are properly chosen / partitions are assigned properly).
Conclusion / guidelines
Based on these considerations we would choose something between the two extremes. Here are points that could help to find a good balance:
- A good boundary for topics could be the bounded context / domain or subdomain that a topic or event type belongs to. I.e. if two events A and B belong to different domains, they should probably go into different topics.
- If there’s a very strong relationship between two event types, they might go into the same topic (maybe regardless of the required ordering guarantees).
- If there’s a strong requirement regarding the correct ordering of two events A and B, then they should be published to the same topic (with the same key). The frequency and temporal relatedness (e.g. the typical delay between A and B) may also be relevant and should be considered.
- Events containing very sensitive data, or events where access by services needs to be restricted severely, could be separated from other events with different access restrictions - i.e. these should be published to different topics.
- An event with an extremely high throughput should probably be published to a dedicated topic.
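The access-restriction point can be put into practice with Kafka ACLs via the standard `kafka-acls.sh` CLI. The principal and topic names below are hypothetical:

```shell
# Only the billing service principal may read the (assumed) topic
# containing sensitive payment events:
kafka-acls.sh --bootstrap-server localhost:9092 --add \
  --allow-principal User:billing-service \
  --operation Read \
  --topic payment-events
```

Keeping sensitive events on their own topic is what makes such a narrowly scoped ACL possible in the first place.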
If you have additional points, disagree with some of the above, or have other feedback, please let us know. In case you want to accelerate your project, feel free to call us. We look forward to supporting your project or team and are curious about the individual challenges we could solve together.