Choosing Kafka Producer acks=0 / 1 / all: Practical Design for Durability and Latency

2026-04-19
NicheeLab Editorial Team

The Kafka producer's acks setting directly controls the trade-off between write durability and latency. The choice is not simply fast vs. slow — it must be made with the relationship between ISR and min.insync.replicas (MinISR) in mind.

This article aligns with the official behaviors commonly tested on the CCDAK (Confluent Certified Developer for Apache Kafka) exam and distills them into practical decision criteria. Version-sensitive details are treated conservatively with explicit cautions.

acks Basics and the ISR / MinISR Relationship

acks specifies how much acknowledgment the producer waits for before it treats a send as complete. acks=0 waits for nothing, acks=1 waits only for the leader, and acks=all (= -1) waits for ACKs from every ISR (In-Sync Replicas) member. The ISR is the leader plus the set of up-to-date followers, not all assigned replicas.

min.insync.replicas (MinISR) becomes meaningful with acks=all and defines, on the server side, the minimum ISR size required to consider a write successful. For example, on a topic with replication factor 3 and MinISR=2, writing with acks=all succeeds when the ISR has at least 2 members but fails with a NotEnoughReplicas-class error when only 1 remains. This enforces a durability floor at the cost of becoming unwritable under some failure scenarios.

  • acks=0: shortest latency. No error is returned even if the broker is unreachable or the leader fails. Suitable for loss-tolerant use cases.
  • acks=1: success once the leader receives the record. A narrow window of loss exists right after a leader failure, but latency is generally low.
  • acks=all: latency can rise because the producer waits for the slowest ISR node, but combined with MinISR it lets you define a durability floor.

| Setting | Durability | Latency | Availability (RF=3, MinISR=2) |
| --- | --- | --- | --- |
| acks=0 | Weakest (treated as success even on non-delivery) | Shortest | High (no waiting, minimal impact) |
| acks=1 | Medium (loss possible right after a leader failure) | Low to medium | High (depends only on the leader) |
| acks=all | Strong (depends on ISR size) | Medium to high (dragged by the slowest ISR member) | OK with ISR>=2, error when ISR<2 |
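
The rules above can be condensed into a small decision function. The following is a toy model written for this article, not Kafka client code: the class name `AckOutcomeModel`, the `Outcome` values, and the `resolve` method are all hypothetical, and the model ignores timeouts and retries.

```java
import java.util.Locale;

/** Toy model of how a produce request resolves. Not Kafka client code. */
final class AckOutcomeModel {

    enum Outcome { ASSUMED_SUCCESS, LEADER_UNAVAILABLE, ACKED_BY_LEADER, ACKED_BY_ISR, NOT_ENOUGH_REPLICAS }

    /**
     * @param acks     "0", "1", or "all" ("-1" is accepted as an alias of "all")
     * @param leaderUp whether the partition leader is reachable
     * @param isrSize  current ISR size, leader included
     * @param minIsr   the topic's min.insync.replicas
     */
    static Outcome resolve(String acks, boolean leaderUp, int isrSize, int minIsr) {
        String mode = acks.trim().toLowerCase(Locale.ROOT);
        if (mode.equals("0")) {
            // The producer never waits for a response, so it cannot see failures.
            return Outcome.ASSUMED_SUCCESS;
        }
        if (!leaderUp) {
            return Outcome.LEADER_UNAVAILABLE;
        }
        if (mode.equals("1")) {
            // MinISR is not consulted when only the leader acknowledges.
            return Outcome.ACKED_BY_LEADER;
        }
        if (mode.equals("all") || mode.equals("-1")) {
            return isrSize >= minIsr ? Outcome.ACKED_BY_ISR : Outcome.NOT_ENOUGH_REPLICAS;
        }
        throw new IllegalArgumentException("unknown acks value: " + acks);
    }

    public static void main(String[] args) {
        System.out.println(resolve("all", true, 3, 2)); // ACKED_BY_ISR
        System.out.println(resolve("all", true, 1, 2)); // NOT_ENOUGH_REPLICAS
        System.out.println(resolve("0", false, 0, 2));  // ASSUMED_SUCCESS
    }
}
```

The third call is the one worth remembering: with acks=0 the model reports success even though the leader is down, which is exactly why the setting suits only loss-tolerant data.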

acks and replication response flow (conceptual diagram)

Producer (P) --> Leader (L)
                   |
                   v
followers: F1, F2

acks=0:   P -send-> L   (returns immediately)  [no wait for F1/F2]
acks=1:   P -send-> L -ACK-> P    [replication to F1/F2 is asynchronous]
acks=all: P -send-> L -> F1/F2
                         ^   ^
                         |   |
                     ACK(F1,F2 in ISR)
                         \   /
                          --> L -ACK-> P

Relationship of key settings (conceptual notes)

Producer side: acks, retries, delivery.timeout.ms, request.timeout.ms, enable.idempotence
Broker/topic side: replication.factor, min.insync.replicas, unclean.leader.election.enable
Note: acks=all requires ACKs from every ISR member. If MinISR cannot be met, the producer receives an error (such as NotEnoughReplicas).

acks=0: Shortest Latency but No Guarantees

acks=0 treats the send as successful immediately, so the network round-trip is essentially absent from latency. The fundamental risk is that you cannot detect failed delivery to the broker or dropped batches. Increasing retries has limited effect because no error is returned.

It is safer to restrict its use to loss-tolerant telemetry or transient clickstreams. Make sure downstream aggregation is idempotent, that the loss rate can be estimated, and that the overall design tolerates loss.

  • Pros: near-zero round-trip latency, maximum throughput
  • Cons: loss is undetectable; retries and ordering guarantees lose meaning
  • Avoid for: financial transactions, orders, audit trails, and other critical events

| Aspect | Behavior under acks=0 | Operational notes | Alternative |
| --- | --- | --- | --- |
| Error detection | Not possible | Only indirect signals like sender-side queue exhaustion are observable | acks=1 or higher |
| Retries | Limited effect | Only effective for pre-send exceptions | Enable with acks=1 or all |
| Ordering | Records are just sent in submission order | Non-delivery on the server side is still treated as success | Lean toward an idempotent design |
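
Because the producer cannot observe loss under acks=0, any estimate has to come from the consuming side. One possible approach, sketched here under stated assumptions, is to have each source embed a sequence number in its records and count the gaps on arrival. The class `SequenceGapEstimator` and its methods are hypothetical names for this illustration. The sketch assumes that sequences start at 0, increase by 1 per source, and arrive in order. It cannot see records lost after the last one that arrived.

```java
import java.util.HashMap;
import java.util.Map;

/** Consumer-side loss estimate from per-source sequence numbers (illustrative sketch). */
final class SequenceGapEstimator {
    private final Map<String, Long> lastSeen = new HashMap<>();
    private long received;
    private long missing;

    /** Records one arrival and counts any skipped sequence numbers as missing. */
    void observe(String sourceId, long sequence) {
        Long previous = lastSeen.get(sourceId);
        long expected = (previous == null) ? 0L : previous + 1;
        if (sequence < expected) {
            return; // duplicate or late arrival: ignored so aggregation stays idempotent
        }
        missing += sequence - expected;
        received++;
        lastSeen.put(sourceId, sequence);
    }

    long missingCount() {
        return missing;
    }

    /** Fraction of records that never arrived, among those known to have been sent. */
    double estimatedLossRate() {
        long total = received + missing;
        return total == 0 ? 0.0 : (double) missing / total;
    }

    public static void main(String[] args) {
        SequenceGapEstimator estimator = new SequenceGapEstimator();
        long[] arrivals = {0, 1, 4, 5}; // sequences 2 and 3 never arrived
        for (long seq : arrivals) {
            estimator.observe("sensor-a", seq);
        }
        System.out.println(estimator.missingCount());      // 2
        System.out.println(estimator.estimatedLossRate()); // 2 missing out of 6 sent
    }
}
```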

Shortest path under acks=0

Producer -> Leader (success returned immediately)
           (follower replication does not affect the result)

Failure: the producer does not detect a network partition or a leader outage

Producer configuration example (acks=0)

props.put("acks", "0");
props.put("linger.ms", 0);
props.put("batch.size", 131072);
props.put("retries", 0); // has limited effect
// Restrict use to loss-tolerant workloads

acks=1: A Balanced Mode Waiting for Leader Confirmation

acks=1 returns an ACK as soon as the leader receives the record and appends it to the log. Because replication to followers is asynchronous, unreplicated data can be lost in a narrow window right after a leader failure. The balance is adequate for many workloads but does not meet strict durability requirements.

Latency tends to be lower than with acks=all and availability is higher, but shrinking the failure window relies on fast follower replication (healthy networking and storage, appropriate load).

  • Pros: low latency, broad availability, errors are detectable
  • Cons: a data-loss window remains right after a leader failure
  • Mitigations: retries, tuning delivery.timeout.ms, and monitoring to keep replication lag low

| Timing | Leader state | Follower replication | Result |
| --- | --- | --- | --- |
| Just after ACK, followers not yet caught up | Up | Incomplete | Subsequent failure may cause loss |
| After followers have caught up | Up | Complete | Data survives a leader failover |
| Failure before send | Down | None | Client can recover via retries |

Failure window under acks=1 (conceptual timeline)

t0: send -> Leader append -> ACK to Producer
     |<--- unreplicated window --->|
t1: followers have caught up

Failure during [t0, t1): unreplicated data may be lost
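
The window can be made concrete with a small model. This is an illustrative sketch, not broker logic: `Acks1FailureWindow`, `Write`, and `ackedButAtRisk` are hypothetical names, and the timestamps are invented. A write counts as at risk when the leader fails at or after its ACK time (t0) but before a follower holds a copy (t1).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative model of the acks=1 failure window; not Kafka code. */
final class Acks1FailureWindow {

    static final class Write {
        final String id;
        final long ackedAtMs;      // t0: leader appended and ACKed the producer
        final long replicatedAtMs; // t1: a follower has fetched the record

        Write(String id, long ackedAtMs, long replicatedAtMs) {
            this.id = id;
            this.ackedAtMs = ackedAtMs;
            this.replicatedAtMs = replicatedAtMs;
        }
    }

    /** Ids of writes the producer saw as successful but that existed only on the failed leader. */
    static List<String> ackedButAtRisk(List<Write> writes, long leaderFailsAtMs) {
        List<String> atRisk = new ArrayList<>();
        for (Write w : writes) {
            boolean acked = w.ackedAtMs <= leaderFailsAtMs;
            boolean replicated = w.replicatedAtMs <= leaderFailsAtMs;
            if (acked && !replicated) {
                atRisk.add(w.id); // failure fell inside [t0, t1)
            }
        }
        return atRisk;
    }

    public static void main(String[] args) {
        List<Write> writes = Arrays.asList(
                new Write("a", 0, 8),
                new Write("b", 5, 13),
                new Write("c", 10, 18));
        System.out.println(ackedButAtRisk(writes, 12)); // [b, c]
    }
}
```

In the example, write a was already replicated at 8 ms, so a failure at 12 ms leaves only b and c exposed. Shrinking the gap between t0 and t1 is what replication-lag monitoring is meant to achieve.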

Safer companion configuration example

props.put("acks", "1");
props.put("retries", Integer.MAX_VALUE);
props.put("delivery.timeout.ms", 120000);
props.put("request.timeout.ms", 30000);
// Ordering during retries depends on the client/version. If needed, control it by lowering max.in.flight.requests.per.connection, for example

acks=all (= -1): Establishing Durability with MinISR

acks=all waits for ACKs from every ISR member. Latency is pulled by the slowest ISR member, but combined with MinISR you can enforce a durability floor of "fail unless replicated to at least N nodes".

When the ISR drops below MinISR, the producer's write fails and the broker may return errors such as NotEnoughReplicas or NotEnoughReplicasAfterAppend. This design accepts reduced availability in exchange for preventing data loss.

  • acks=all is the baseline for critical events and transactions
  • MinISR should be less than replication.factor (e.g. MinISR=2 is typical for RF=3); if the two are equal, losing any single replica blocks acks=all writes
  • Agree up front whether operations can tolerate temporary unwritability during failures

| RF | MinISR | Current ISR | Result under acks=all |
| --- | --- | --- | --- |
| 3 | 2 | 3 | Success (latency depends on slowest ISR member) |
| 3 | 2 | 2 | Success (just barely allowed) |
| 3 | 2 | 1 | Failure (error because below MinISR) |
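
The trade-off in the table comes down to two subtractions, shown here as a minimal sketch. The class and method names are hypothetical, and the model assumes every assigned replica is in the ISR before the failures begin.

```java
/** Arithmetic behind RF and MinISR sizing (illustrative sketch). */
final class ReplicationTolerance {

    private static void check(int replicationFactor, int minIsr) {
        if (minIsr < 1 || minIsr > replicationFactor) {
            throw new IllegalArgumentException("need 1 <= MinISR <= RF");
        }
    }

    /** Replicas that may fall out of the ISR while acks=all writes still succeed. */
    static int replicasThatMayBeOutOfSync(int replicationFactor, int minIsr) {
        check(replicationFactor, minIsr);
        return replicationFactor - minIsr;
    }

    /** Minimum number of copies that exist when an acks=all write is acknowledged. */
    static int copiesGuaranteedAtAck(int replicationFactor, int minIsr) {
        check(replicationFactor, minIsr);
        return minIsr;
    }

    public static void main(String[] args) {
        int rf = 3;
        for (int minIsr = 1; minIsr <= rf; minIsr++) {
            System.out.println("RF=" + rf + " MinISR=" + minIsr
                    + " tolerated-out-of-sync=" + replicasThatMayBeOutOfSync(rf, minIsr)
                    + " copies-at-ack=" + copiesGuaranteedAtAck(rf, minIsr));
        }
    }
}
```

With RF=3, MinISR=2 tolerates one out-of-sync replica and guarantees two copies at acknowledgment. MinISR=3 guarantees three copies but tolerates none, and MinISR=1 tolerates two but guarantees only the leader's copy.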

acks=all with ISR / MinISR

RF=3, MinISR=2
ISR={Leader, F1, F2}

Case A: ISR=3 -> both F1 and F2 ACK to L -> L ACK -> P (success)
Case B: ISR=2 -> only F1 ACKs to L -> L ACK -> P (success)
Case C: ISR=1 -> too few in-sync replicas (below MinISR) -> error returned

Configuration on both the topic and the client side

# Set MinISR on the topic
kafka-configs --bootstrap-server <broker> --alter \
  --entity-type topics --entity-name critical-events \
  --add-config min.insync.replicas=2

// Producer (example)
props.put("acks", "all");
props.put("retries", Integer.MAX_VALUE);
props.put("delivery.timeout.ms", 120000);
props.put("enable.idempotence", true);  // when strict deduplication is required
// Note: specific default values differ by client/version; check the official documentation for the version you use
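
When idempotence is enabled, the Kafka producer documentation requires acks=all, retries greater than 0, and max.in.flight.requests.per.connection of at most 5. The sketch below checks a `Properties` object against those three rules before it is handed to a producer. `IdempotenceConfigCheck` is a hypothetical helper for illustration, not part of the Kafka client. It inspects only keys that are set explicitly, because defaults vary by version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Pre-flight check of idempotence-related producer settings (illustrative sketch). */
final class IdempotenceConfigCheck {

    // Values may have been stored as String, Integer, or Boolean, so read them as Objects.
    private static String value(Properties p, String key) {
        Object v = p.get(key);
        return v == null ? null : v.toString().trim();
    }

    /** Returns human-readable violations; an empty list means no conflict was found. */
    static List<String> violations(Properties p) {
        List<String> problems = new ArrayList<>();
        if (!"true".equalsIgnoreCase(value(p, "enable.idempotence"))) {
            return problems; // rules below apply only when idempotence is explicitly on
        }
        String acks = value(p, "acks");
        if (acks != null && !acks.equals("all") && !acks.equals("-1")) {
            problems.add("acks must be all (-1), found " + acks);
        }
        String retries = value(p, "retries");
        if (retries != null && Integer.parseInt(retries) <= 0) {
            problems.add("retries must be greater than 0");
        }
        String inFlight = value(p, "max.in.flight.requests.per.connection");
        if (inFlight != null && Integer.parseInt(inFlight) > 5) {
            problems.add("max.in.flight.requests.per.connection must be <= 5");
        }
        return problems;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("enable.idempotence", true);
        props.put("acks", "1"); // conflicts with idempotence
        props.put("retries", Integer.MAX_VALUE);
        System.out.println(violations(props));
    }
}
```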

Surrounding Settings and Monitoring That Affect Latency

Beyond acks, linger.ms and batch.size control the sender-side wait and batching. With many small messages, raising linger modestly to batch records is an effective way to amortize networking and syscall overhead in practice.

delivery.timeout.ms is the overall retry budget, and request.timeout.ms is the wait per request. Compression (compression.type) and appropriate partition counts also affect latency and throughput. Always surface producer-side request-latency and broker-side replica.lag in metrics to continuously validate that your acks choice is appropriate.

  • Many small messages: consider linger.ms in the 5-20 ms range
  • Few large messages: keep linger near 0 and increase batch.size
  • Network-latency dominated: enable compression and batch records to reduce round-trips
  • Severe replica lag: acks=all increases latency — first resolve the cause (I/O latency, GC, slow brokers)

| Setting | Primary effect | Side effects / cautions | Application guideline |
| --- | --- | --- | --- |
| linger.ms | Increases send wait, improves batch efficiency | Adds latency equal to the wait | Tune starting from 5-20 ms |
| batch.size | Increases data per request | Too large increases wait time | Start around 64-256 KB |
| compression.type | Saves bandwidth, increases CPU usage | Watch for CPU bottlenecks | Compare snappy vs. zstd |
| delivery.timeout.ms | Overall retry budget | Too short leads to premature failures | At least linger.ms + request.timeout.ms; as a rule of thumb >= 2 × request.timeout.ms |
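
The relationship between the timeout settings can be checked with simple arithmetic. The Kafka documentation states that delivery.timeout.ms should be at least request.timeout.ms plus linger.ms. The sketch below tests that floor and gives a rough worst-case count of attempts that fit in the budget if every request times out. `TimeoutBudgetCheck` and its methods are hypothetical names. The attempt count is a simplified model that assumes a fixed retry.backoff.ms and ignores metadata refreshes and queueing.

```java
/** Rough arithmetic on producer timeout settings (simplified model, not client behavior). */
final class TimeoutBudgetCheck {

    /** Documented floor: delivery.timeout.ms >= request.timeout.ms + linger.ms. */
    static boolean satisfiesMinimum(long deliveryTimeoutMs, long requestTimeoutMs, long lingerMs) {
        return deliveryTimeoutMs >= requestTimeoutMs + lingerMs;
    }

    /** Attempts that fit in the budget when every attempt runs into request.timeout.ms. */
    static long worstCaseAttempts(long deliveryTimeoutMs, long requestTimeoutMs,
                                  long lingerMs, long retryBackoffMs) {
        long remaining = deliveryTimeoutMs - lingerMs;
        if (remaining < requestTimeoutMs) {
            return 0;
        }
        // The first attempt costs one request timeout; each retry adds backoff plus a timeout.
        return 1 + (remaining - requestTimeoutMs) / (retryBackoffMs + requestTimeoutMs);
    }

    public static void main(String[] args) {
        System.out.println(satisfiesMinimum(120_000, 30_000, 10));       // true
        System.out.println(worstCaseAttempts(120_000, 30_000, 10, 100)); // 3
    }
}
```

With delivery.timeout.ms=120000 and request.timeout.ms=30000, only about three fully timed-out attempts fit in the budget. In that situation the delivery timeout, not a large retries value, is what bounds retrying.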

Conceptual decomposition of latency

send wait (linger) |-> send/wait (RTT x round trips) |-> broker processing |-> follower replication (acks-dependent)
|------------------ client side ------------------|   |---------------------- server side ----------------------|

Latency budgeting (rough estimate)

# Example: RTT=5ms, linger=10ms, broker processing=2ms, follower replication=8ms
# acks=0:  ~ linger(10) + send (OS/queue) ≈ 10-12ms
# acks=1:  linger(10) + RTT(5) + processing(2) ≈ 17ms
# acks=all: linger(10) + RTT(5) + processing(2) + replication(8) ≈ 25ms
# Actual values vary with batching, load, GC, and I/O. Calibrate against metrics
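
The same estimate can be written as a small function so that different inputs are easy to try. This is an additive back-of-the-envelope model that mirrors the figures above. `LatencyBudget` and `estimateMs` are hypothetical names, and for acks=0 the function returns only the linger component as a lower bound.

```java
/** Additive latency estimate per acks mode (back-of-the-envelope model). */
final class LatencyBudget {

    static double estimateMs(String acks, double lingerMs, double rttMs,
                             double brokerMs, double replicationMs) {
        switch (acks) {
            case "0":
                return lingerMs; // no response awaited; OS and queue overhead not modeled
            case "1":
                return lingerMs + rttMs + brokerMs;
            case "all":
            case "-1":
                return lingerMs + rttMs + brokerMs + replicationMs;
            default:
                throw new IllegalArgumentException("unknown acks value: " + acks);
        }
    }

    public static void main(String[] args) {
        // Inputs from the example: linger=10, RTT=5, broker processing=2, replication=8 (ms)
        System.out.println(estimateMs("0", 10, 5, 2, 8));   // 10.0
        System.out.println(estimateMs("1", 10, 5, 2, 8));   // 17.0
        System.out.println(estimateMs("all", 10, 5, 2, 8)); // 25.0
    }
}
```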

CCDAK Exam Perspective and Operational Checklist

On the CCDAK, classic questions cover the relationship between acks, MinISR, and ISR, the outcomes during failures, and combinations with the idempotent producer and transactions. It is especially important to understand that "acks=all means every ISR member, not every assigned replica" and that "writes fail when ISR drops below MinISR".

In production, it is common to disable unclean.leader.election.enable to avoid data loss. Because defaults and exact behaviors are version-dependent, always confirm against the official documentation for the version running in your environment.

  • If the requirement is zero loss, consider acks=all paired with MinISR>=2 (RF>=3)
  • If "writes must continue during outages" is paramount, consider acks=1 and minimize the failure window via monitoring
  • When using idempotence or transactions, acks=all is a prerequisite (some clients auto-adjust to it)
  • Agree with stakeholders on expected behavior during failure (fail fast vs. accept delay)

| Scenario | Recommended acks | Additional conditions | Metrics to monitor |
| --- | --- | --- | --- |
| Critical events (no loss) | all | RF>=3, MinISR>=2, idempotence enabled | replica.lag, request-latency |
| General workloads (balanced) | 1 | Large retries, monitor delivery delays | failed-requests-rate |
| Loss tolerant, shortest latency | 0 | Compensate via audit design | Send queue backlog, drop rate (downstream) |

Simple decision flow for choosing acks

Loss tolerant? --yes--> acks=0
    |
    no
    v
Availability first during outages? --yes--> acks=1
    |
    no
    v
acks=all (+ set MinISR)
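
The flow translates directly into code, which can be handy for encoding the agreed policy in a shared configuration helper. `AcksDecision` and `chooseAcks` are hypothetical names. The two boolean inputs stand for the two questions in the flow.

```java
/** The decision flow for acks expressed as a function (illustrative sketch). */
final class AcksDecision {

    static String chooseAcks(boolean lossTolerant, boolean availabilityFirstDuringOutage) {
        if (lossTolerant) {
            return "0";
        }
        if (availabilityFirstDuringOutage) {
            return "1";
        }
        return "all"; // pair with min.insync.replicas on the topic
    }

    public static void main(String[] args) {
        System.out.println(chooseAcks(true, false));  // 0
        System.out.println(chooseAcks(false, true));  // 1
        System.out.println(chooseAcks(false, false)); // all
    }
}
```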

Configuration templates (by profile)

# Durable (no loss allowed)
acks=all
retries=2147483647
delivery.timeout.ms=120000
# Add transactions as needed
enable.idempotence=true

# Balanced (general business workloads)
acks=1
retries=2147483647
delivery.timeout.ms=120000

# Ultra-low-latency (loss tolerant)
acks=0
linger.ms=0
batch.size=131072
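
In application code, the three profiles might be produced by a small factory so that services select a profile by name instead of copying settings. This is a sketch using only `java.util.Properties`. `ProducerProfiles` and the profile names are hypothetical, and connection settings such as bootstrap.servers and the serializers are deliberately left out and must be added before constructing a real producer.

```java
import java.util.Properties;

/** Builds the acks-related part of a producer configuration by profile name (sketch). */
final class ProducerProfiles {

    static Properties forProfile(String profile) {
        Properties p = new Properties();
        switch (profile) {
            case "durable":
                p.setProperty("acks", "all");
                p.setProperty("retries", String.valueOf(Integer.MAX_VALUE));
                p.setProperty("delivery.timeout.ms", "120000");
                p.setProperty("enable.idempotence", "true");
                break;
            case "balanced":
                p.setProperty("acks", "1");
                p.setProperty("retries", String.valueOf(Integer.MAX_VALUE));
                p.setProperty("delivery.timeout.ms", "120000");
                break;
            case "ultra-low-latency":
                p.setProperty("acks", "0");
                p.setProperty("linger.ms", "0");
                p.setProperty("batch.size", "131072");
                break;
            default:
                throw new IllegalArgumentException("unknown profile: " + profile);
        }
        return p;
    }

    public static void main(String[] args) {
        Properties durable = forProfile("durable");
        System.out.println(durable.getProperty("acks"));    // all
        System.out.println(durable.getProperty("retries")); // 2147483647
    }
}
```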

Check Your Understanding

CCDAK

Question 1

A topic has replication factor 3 and min.insync.replicas=2. You need to write critical payment events with minimal loss, and writes must continue even when one broker is taken offline for planned maintenance. Which configuration best meets these requirements?

  1. Producer: acks=all, enable.idempotence=true. Broker/topic: keep min.insync.replicas=2.
  2. Producer: acks=1. Broker/topic: lower min.insync.replicas to 1.
  3. Producer: acks=0. Broker/topic: no changes.
  4. Producer: acks=all. Broker/topic: raise min.insync.replicas to 3.

Correct answer: 1

With RF=3 and MinISR=2, taking one broker offline still typically leaves ISR=2, and acks=all confirms replication to at least two nodes before success, which meets both the durability and the availability requirement. Option 2 reduces durability. Option 3 cannot even detect loss. Option 4 loses availability because writes fail the moment ISR drops to 2. Idempotence additionally prevents duplicates caused by retries and preserves ordering.

Frequently Asked Questions

Does acks=all mean writes are completed to ALL replicas?

No. acks=all means ACKs from every member of the ISR (In-Sync Replicas), not all assigned replicas. Lagging followers that have dropped out of the ISR are not part of the condition.

Does acks=all guarantee zero loss even under disk failure?

Not absolutely. Kafka does not force a synchronous fsync per message, so replication is the primary defense against a single-node failure. Combining acks=all with an appropriate MinISR (e.g. 2) ensures success only after the record is replicated to multiple nodes, which raises durability, but the risk of loss when several replicas fail at once cannot be reduced to zero.

How do acks relate to the idempotent producer and transactions?

When enabling idempotence or transactions, acks=all is a prerequisite (or the client automatically configures equivalent behavior). The exact defaults and constraints depend on the client and version, so verify the official documentation for the Kafka/Confluent client you are using.

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.

