From a33a413d15af355272acf56f4a1f92ee8f1cdf0c Mon Sep 17 00:00:00 2001 From: mingdaoy Date: Tue, 25 Feb 2025 21:27:33 +0800 Subject: [PATCH] MINOR: Adjust docs for the committed message (#19022) Reviewers: Andrew Schofield --- docs/design.html | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/design.html b/docs/design.html index a59a83b5bfb..4d52277898a 100644 --- a/docs/design.html +++ b/docs/design.html @@ -263,8 +263,9 @@ Many systems claim to provide "exactly-once" delivery semantics, but it is important to read the fine print, because sometimes these claims are misleading (i.e. they don't translate to the case where consumers or producers can fail, cases where there are multiple consumer processes, or cases where data written to disk can be lost).

- Kafka's semantics are straightforward. When publishing a message we have a notion of the message being "committed" to the log. Once a published message is committed, it will not be lost as long as one broker that - replicates the partition to which this message was written remains "alive". The definition of committed message and alive partition as well as a description of which types of failures we attempt to handle will be + Kafka's semantics are straightforward. When publishing a message we have a notion of the message being "committed" to the log. A message is considered committed only when all replicas in the in-sync replicas (ISR) for that + partition have applied it to their log. Once a published message is committed, it will not be lost as long as one broker that replicates the partition to which this message was written remains "alive". + The definition of committed message and alive partition as well as a description of which types of failures we attempt to handle will be described in more detail in the next section. For now let's assume a perfect, lossless broker and try to understand the guarantees to the producer and consumer. If a producer attempts to publish a message and experiences a network error, it cannot be sure if this error happened before or after the message was committed. This is similar to the semantics of inserting into a database table with an autogenerated key.

@@ -380,7 +381,6 @@ In distributed systems terminology we only attempt to handle a "fail/recover" model of failures where nodes suddenly cease working and then later recover (perhaps without knowing that they have died). Kafka does not handle so-called "Byzantine" failures in which nodes produce arbitrary or malicious responses (perhaps due to bugs or foul play).

- We can now more precisely define that a message is considered committed when all replicas in the ISR for that partition have applied it to their log. Only committed messages are ever given out to the consumer. This means that the consumer need not worry about potentially seeing a message that could be lost if the leader fails. Producers, on the other hand, have the option of either waiting for the message to be committed or not, depending on their preference for tradeoff between latency and durability. This preference is controlled by the acks setting that the producer uses.