Kafka Delivery Semantics (At-most-once, At-least-once, Exactly-once)

Kafka’s delivery guarantees are often summarized as at-most-once, at-least-once, and exactly-once. The behaviour you get depends on producer acks, consumer commit policy, and (for exactly-once) idempotent producer and transactional writes. This article explains the three semantics, how to configure them, and the tradeoffs.

Overview

  • At-most-once: Producer does not retry or uses acks=0; consumer commits offset before processing. Messages can be lost (e.g. broker crash after send, or consumer crash after commit and before process); no duplicate delivery.
  • At-least-once: Producer retries and acks=all; consumer commits offset after processing. If consumer crashes after process but before commit, the same message can be processed again on restart → possible duplicates.
  • Exactly-once: Use idempotent producer (and optionally transactions) so that retries do not create duplicates, and consumer is idempotent or uses transactional read (commit offset with consumed output in one transaction). End-to-end exactly-once requires careful design (producer + consumer + storage).

Example

Example 1: At-least-once (common setup)

Properties
# Producer: retry and ack all in-sync replicas
acks=all
retries=3

# Consumer: process then commit
enable.auto.commit=false
# In code: process record → then consumer.commitSync();
  • If processing succeeds but commit fails (or consumer dies before commit), the same message is redelivered. So you get at-least-once; duplicates are possible.
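The process-then-commit behaviour can be sketched with a minimal simulation (not a real Kafka client; `run_consumer` and the crash flag are illustrative names):

```python
# Simulation of at-least-once delivery: the consumer processes a record,
# then commits the offset; a crash between the two steps causes the same
# record to be redelivered on restart.

def run_consumer(log, start_offset, crash_before_commit_at=None):
    """Process records from `log` starting at `start_offset`.
    Returns (processed_records, committed_offset)."""
    processed = []
    committed = start_offset
    for offset in range(start_offset, len(log)):
        processed.append(log[offset])          # 1. process the record
        if offset == crash_before_commit_at:
            return processed, committed        # crash: offset NOT committed
        committed = offset + 1                 # 2. commit after processing
    return processed, committed

log = ["a", "b", "c"]

# First run crashes after processing "b" but before committing it.
first, committed = run_consumer(log, 0, crash_before_commit_at=1)
# Restart resumes from the last committed offset, so "b" is processed again.
second, _ = run_consumer(log, committed)
print(first + second)   # ['a', 'b', 'b', 'c'] → duplicate "b", nothing lost
```

Nothing is lost, but "b" is processed twice: exactly the at-least-once tradeoff.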

Example 2: At-most-once (avoid duplicates, accept loss)

Properties
acks=0
# or acks=1 with retries=0

# Consumer: commit before process
enable.auto.commit=false
# In code: poll() → commitSync() immediately → then process the batch
  • Not common for critical data; use when loss is acceptable (e.g. metrics) and you want to avoid any duplicate processing.
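Flipping the order of the two steps gives the opposite tradeoff, again as a minimal simulation rather than a real client:

```python
# Simulation of at-most-once delivery: the consumer commits the offset
# first, then processes; a crash between commit and process loses that
# record, but it is never redelivered.

def run_consumer(log, start_offset, crash_before_process_at=None):
    processed = []
    committed = start_offset
    for offset in range(start_offset, len(log)):
        committed = offset + 1                 # 1. commit before processing
        if offset == crash_before_process_at:
            return processed, committed        # crash: "b" is lost
        processed.append(log[offset])          # 2. process the record
    return processed, committed

log = ["a", "b", "c"]

# First run commits "b" and then crashes before processing it.
first, committed = run_consumer(log, 0, crash_before_process_at=1)
second, _ = run_consumer(log, committed)      # restart skips the lost "b"
print(first + second)   # ['a', 'c'] → "b" lost, no duplicates
```

No record is ever processed twice, but the crashed-over record is gone.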

Example 3: Idempotent producer (no duplicate records from retries)

Properties
enable.idempotence=true
# Requires acks=all, retries>0, max.in.flight.requests.per.connection<=5
  • Broker deduplicates by producer ID + sequence; retries do not create duplicate records in the log. Consumer can still see the same record twice if it commits after process and then crashes before commit (at-least-once). Exactly-once consumer needs idempotent handling or transactional consumption.
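The broker-side deduplication can be sketched as a simplified model (the `Broker` class and its fields are illustrative, not Kafka internals; real brokers track sequence numbers per producer ID and partition):

```python
# Sketch of idempotent-producer deduplication: the broker remembers the
# highest sequence number appended per producer ID and drops retried sends
# whose sequence it has already written.

class Broker:
    def __init__(self):
        self.log = []
        self.last_seq = {}          # producer_id → last appended sequence

    def append(self, producer_id, seq, record):
        if self.last_seq.get(producer_id, -1) >= seq:
            return "duplicate"      # retry of an already-written batch
        self.last_seq[producer_id] = seq
        self.log.append(record)
        return "ok"

broker = Broker()
broker.append("p1", 0, "a")
broker.append("p1", 1, "b")
status = broker.append("p1", 1, "b")   # producer retry (the ack was lost)
print(status, broker.log)              # duplicate ['a', 'b'] → log is clean
```

The retry is acknowledged but not appended, so the log stays duplicate-free even though the producer sent the batch twice.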

Core Mechanism / Behavior

  • Producer: acks=0 (fire and forget), acks=1 (leader only), acks=all (in-sync replicas). Retries + acks=all give at-least-once at producer. Idempotence prevents duplicate records from retries.
  • Consumer: Commit offset = “I have processed up to here.” Commit before process → at-most-once (can lose); commit after process → at-least-once (can duplicate). Exactly-once needs idempotent side effects or transactional commit (e.g. offset in same DB transaction as output).
  • Transactions: For exactly-once read-process-write, use Kafka transactions: producer sends to output topic and consumer commits offset in one transactional commit. Out of scope for this short article; see official docs.
Semantic        Producer config (typical)   Consumer behaviour          Result
At-most-once    acks=0 or no retry          Commit before process       May lose, no duplicates
At-least-once   acks=all, retries           Commit after process        No loss, may duplicate
Exactly-once    idempotent + transactions   Idempotent or transactional No loss, no duplicates (with care)

Key Rules

  • Default to at-least-once (acks=all, commit after process) and make consumption idempotent (e.g. by message id or key) so duplicates are safe.
  • Enable idempotent producer to avoid duplicate records in the log from retries; this does not by itself give exactly-once consumption.
  • For exactly-once, combine idempotent producer, transactional producer/consumer, and idempotent or transactional side effects; design and test carefully.
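The first rule, making consumption idempotent by message ID, can be sketched as follows (an assumed pattern, not a Kafka API; in production the seen-ID set would live in a durable store such as a database keyed by message ID):

```python
# Sketch of an idempotent consumer: track processed message IDs so that
# at-least-once redelivery does not repeat the side effect.

processed_ids = set()   # in production: a durable store, not process memory
results = []            # stands in for the real side effect

def handle(message_id, payload):
    if message_id in processed_ids:
        return          # duplicate delivery → skip the side effect
    results.append(payload)
    processed_ids.add(message_id)

# "m2" is redelivered after a crash-before-commit; its effect happens once.
for mid, payload in [("m1", "a"), ("m2", "b"), ("m2", "b"), ("m3", "c")]:
    handle(mid, payload)
print(results)   # ['a', 'b', 'c']
```

With this guard in place, the at-least-once default becomes effectively exactly-once at the level of side effects, which is why it pairs well with the first rule above.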

What's Next

See Idempotency in Message Consumers for implementing idempotent consumers. See Kafka Core Concepts for topics, partitions, and offsets. See Ordering vs Throughput for batching and partitioning.