Kafka acks Configuration: Durability vs Latency (2026)

The Kafka producer's acks setting directly controls the trade-off between write durability and latency. The choice is not simply fast vs. slow — it must be made with the relationship between ISR and min.insync.replicas (MinISR) in mind.

This article aligns with the official behaviors commonly tested on the CCDAK (Confluent Certified Developer for Apache Kafka) exam and distills them into practical decision criteria. Version-sensitive details are treated conservatively with explicit cautions.

acks Basics and the ISR / MinISR Relationship

acks specifies at which stage the producer waits for an acknowledgment after sending. acks=0 waits for nothing, acks=1 waits only for the leader, and acks=all (= -1) waits for ACKs from every ISR (In-Sync Replicas) member. The ISR is the leader plus the set of up-to-date followers, not all assigned replicas.

min.insync.replicas (MinISR) becomes meaningful with acks=all and defines, on the server side, the minimum ISR size required to consider a write successful. For example, on a topic with replication factor 3 and MinISR=2, writing with acks=all succeeds when the ISR has at least 2 members but fails with a NotEnoughReplicas-class error when only 1 remains. This enforces a durability floor at the cost of becoming unwritable under some failure scenarios.

acks=0: shortest latency. No error is returned even if the broker is unreachable or the leader fails. Suitable for loss-tolerant use cases.
acks=1: success once the leader receives the record. A narrow window of loss exists right after a leader failure, but latency is generally low.
acks=all: latency can rise because the producer waits for the slowest ISR node, but combined with MinISR it lets you define a durability floor.

Setting	Durability	Latency	Availability (RF=3, MinISR=2)
acks=0	Weakest (treated as success even on non-delivery)	Shortest	High (no waiting, minimal impact)
acks=1	Medium (loss possible right after a leader failure)	Low to medium	High (depends only on the leader)
acks=all	Strong (depends on ISR size)	Medium to high (dragged by the slowest ISR member)	OK with ISR>=2, error when ISR<2
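
The comparison above can be condensed into a small model of how many replicas must hold the record before the producer sees success. This is a minimal sketch using only the Java standard library; the class `AcksModel` and the method `replicasAwaited` are hypothetical names for illustration and are not part of the Kafka client API.

```java
/** Hypothetical illustration; not part of the Kafka client API. */
final class AcksModel {
    private AcksModel() {}

    /**
     * Number of replicas (leader included) that must have the record
     * before the producer's send is reported as successful.
     */
    static int replicasAwaited(String acks, int isrSize) {
        switch (acks) {
            case "0":
                return 0;        // the producer does not wait for any broker
            case "1":
                return 1;        // leader append only
            case "all":
            case "-1":
                return isrSize;  // every current ISR member, leader included
            default:
                throw new IllegalArgumentException("unknown acks value: " + acks);
        }
    }

    public static void main(String[] args) {
        for (String acks : new String[] {"0", "1", "all"}) {
            System.out.println("acks=" + acks + " waits for "
                    + replicasAwaited(acks, 3) + " replica(s) when ISR=3");
        }
    }
}
```

Note that the acks=all result follows the current ISR size, not the replication factor, which is why a shrinking ISR lowers both latency and the number of copies confirmed.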

acks and replication response flow (conceptual diagram)

Producer (P) --> Leader (L)
                   |
                   v
followers: F1, F2

acks=0:   P -send-> L   (returns immediately)  [no wait for F1/F2]
acks=1:   P -send-> L -ACK-> P    [replication to F1/F2 is asynchronous]
acks=all: P -send-> L -> F1/F2
                         ^   ^
                         |   |
                     ACK(F1,F2 in ISR)
                         \   /
                          --> L -ACK-> P

Relationship of key settings (conceptual notes)

Producer side: acks, retries, delivery.timeout.ms, request.timeout.ms, enable.idempotence
Broker/topic side: replication.factor, min.insync.replicas, unclean.leader.election.enable
Note: acks=all waits for ACKs from every ISR member. If MinISR cannot be met, the producer receives an error (such as NotEnoughReplicas).

acks=0: Shortest Latency but No Guarantees

acks=0 treats the send as successful immediately, so the network round-trip is essentially absent from latency. The fundamental risk is that you cannot detect failed delivery to the broker or dropped batches. Increasing retries has limited effect because no error is returned.

It is safer to restrict its use to loss-tolerant telemetry or transient clickstreams. Make sure downstream aggregation is idempotent and estimable, and that the overall design tolerates loss.

Pros: near-zero round-trip latency, maximum throughput
Cons: loss is undetectable; retries and ordering guarantees lose meaning
Avoid for: financial transactions, orders, audit trails, and other critical events

Aspect	Behavior under acks=0	Operational notes	Alternative
Error detection	Not possible	Only indirect signals like sender-side queue exhaustion are observable	acks=1 or higher
Retries	Limited effect	Only effective for pre-send exceptions	Enable with acks=1 or all
Ordering	Records are just sent in submission order	Non-delivery on the server side is still treated as success	Lean toward an idempotent design

Shortest path under acks=0

Producer -> Leader (success returned immediately)
           (follower replication does not affect the result)

Failure: the producer does not detect a network outage or a leader failure

Producer configuration example (acks=0)

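Properties props = new Properties(); // java.util.Properties
props.put("acks", "0");
props.put("linger.ms", 0);
props.put("batch.size", 131072);
props.put("retries", 0); // limited effect: without a broker response, nothing triggers a retry
// Restrict use to loss-tolerant workloads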

acks=1: A Balanced Mode Waiting for Leader Confirmation

acks=1 returns an ACK as soon as the leader receives the record and appends it to the log. Because replication to followers is asynchronous, unreplicated data can be lost in a narrow window right after a leader failure. The balance is adequate for many workloads but does not meet strict durability requirements.

Latency tends to be lower than with acks=all and availability is higher, but shrinking the failure window relies on fast follower replication (healthy networking and storage, appropriate load).

Pros: low latency, broad availability, errors are detectable
Cons: a data-loss window remains right after a leader failure
Mitigations: retries, tuning delivery.timeout.ms, and monitoring to keep replication lag low

Timing	Leader state	Follower replication	Result
Just after ACK, followers not yet caught up	Up	Incomplete	Subsequent failure may cause loss
After followers have caught up	Up	Complete	Data survives a subsequent leader failover
Failure before send	Down	None	Client can recover via retries

Failure window under acks=1 (conceptual timeline)

t0: send -> Leader append -> ACK to Producer
     |<--- unreplicated window --->|
t1: followers caught up

Failure during [t0, t1): unreplicated data may be lost
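
The timeline above reduces to a half-open interval check. The following is a minimal sketch, assuming all timestamps share one clock and are given in milliseconds; `Acks1Window` and `mayLoseRecord` are hypothetical names used only for illustration.

```java
/** Hypothetical illustration of the acks=1 loss window [t0, t1). */
final class Acks1Window {
    private Acks1Window() {}

    /**
     * Returns true when a leader failure at failureMs falls inside the window
     * between the leader's ACK (ackMs, t0) and the moment the followers have
     * caught up (caughtUpMs, t1). Inside that window the record exists only
     * on the failed leader and may be lost.
     */
    static boolean mayLoseRecord(long ackMs, long caughtUpMs, long failureMs) {
        return failureMs >= ackMs && failureMs < caughtUpMs;
    }

    public static void main(String[] args) {
        long t0 = 0;  // leader appended and acknowledged
        long t1 = 8;  // followers caught up
        System.out.println("failure at 3 ms, possible loss: " + mayLoseRecord(t0, t1, 3));
        System.out.println("failure at 8 ms, possible loss: " + mayLoseRecord(t0, t1, 8));
    }
}
```

The model makes the mitigation concrete: anything that moves t1 closer to t0 (lower replication lag) shrinks the interval in which a failure can cost data.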

Safer companion configuration example

props.put("acks", "1");
props.put("retries", Integer.MAX_VALUE);
props.put("delivery.timeout.ms", 120000);
props.put("request.timeout.ms", 30000);
// Ordering during retries depends on the client and version. If needed, control it by lowering max.in.flight.requests.per.connection

acks=all (= -1): Establishing Durability with MinISR

acks=all waits for ACKs from every ISR member. Latency is pulled by the slowest ISR member, but combined with MinISR you can enforce a durability floor of "fail unless replicated to at least N nodes".

When the ISR drops below MinISR, the producer's write fails and the broker may return errors such as NotEnoughReplicas or NotEnoughReplicasAfterAppend. This design accepts reduced availability in exchange for preventing data loss.

acks=all is the baseline for critical events and transactions
MinISR should be less than replication.factor (e.g. MinISR=2 is typical for RF=3); setting it equal to RF makes the partition unwritable as soon as any one replica leaves the ISR
Agree up front whether operations can tolerate temporary unwritability during failures

RF	MinISR	Current ISR	Result under acks=all
3	2	3	Success (latency depends on slowest ISR member)
3	2	2	Success (just barely allowed)
3	2	1	Failure (error because below MinISR)
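
The rows above follow a single rule: with acks=all, the write is accepted only while the current ISR size is at least MinISR. Below is a minimal sketch of that rule; `MinIsrCheck`, `Outcome`, and `acksAllOutcome` are hypothetical names, and the model ignores timing effects such as an ISR that shrinks after the leader has already appended the record.

```java
/** Hypothetical model of the broker-side MinISR check under acks=all. */
final class MinIsrCheck {
    private MinIsrCheck() {}

    enum Outcome { SUCCESS, NOT_ENOUGH_REPLICAS }

    static Outcome acksAllOutcome(int replicationFactor, int minIsr, int currentIsr) {
        if (currentIsr < 1 || currentIsr > replicationFactor) {
            throw new IllegalArgumentException(
                    "ISR size must be between 1 and the replication factor");
        }
        // The write succeeds only while enough replicas are in sync.
        return currentIsr >= minIsr ? Outcome.SUCCESS : Outcome.NOT_ENOUGH_REPLICAS;
    }

    public static void main(String[] args) {
        for (int isr = 3; isr >= 1; isr--) {
            System.out.println("RF=3, MinISR=2, ISR=" + isr + " -> "
                    + acksAllOutcome(3, 2, isr));
        }
    }
}
```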

acks=all with ISR / MinISR

RF=3, MinISR=2
ISR={Leader, F1, F2}

Case A: ISR=3 -> L <= ACK from both F1 and F2 -> L ACK -> P (success)
Case B: ISR=2 -> L <= ACK from F1 only -> L ACK -> P (success)
Case C: ISR=1 -> ISR below MinISR (not enough in-sync followers) -> error returned

Configuration on both the topic and the client side

# Set MinISR on the topic
kafka-configs --alter --entity-type topics --entity-name critical-events \
  --add-config min.insync.replicas=2 --bootstrap-server <broker>

// Producer (example)
props.put("acks", "all");
props.put("retries", Integer.MAX_VALUE);
props.put("delivery.timeout.ms", 120000);
props.put("enable.idempotence", true);  // when strict deduplication is required
// Note: specific default values differ by client and version, so check the official documentation for the version you use

Surrounding Settings and Monitoring That Affect Latency

Beyond acks, linger.ms and batch.size control the sender-side wait and batching. With many small messages, raising linger modestly to batch records is an effective way to amortize networking and syscall overhead in practice.

delivery.timeout.ms bounds the total time from send() returning until success or failure is reported, including batching delay and retries, while request.timeout.ms is the wait per request. Compression (compression.type) and appropriate partition counts also affect latency and throughput. Always surface producer-side request latency (request-latency-avg) and broker-side replica lag (for example, under-replicated partitions) in metrics to continuously validate that your acks choice is appropriate.

Many small messages: consider linger.ms in the 5-20 ms range
Few large messages: keep linger near 0 and increase batch.size
Network-latency dominated: enable compression and batch records to reduce round-trips
Severe replica lag: acks=all increases latency — first resolve the cause (I/O latency, GC, slow brokers)

Setting	Primary effect	Side effects / cautions	Application guideline
linger.ms	Increases send wait, improves batch efficiency	Adds latency equal to the wait	Tune starting from 5-20 ms
batch.size	Increases data per request	Too large increases wait time	Start around 64-256 KB
compression.type	Saves bandwidth, increases CPU usage	Watch for CPU bottlenecks	Compare snappy vs. zstd
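delivery.timeout.ms	Overall delivery budget (batching + retries)	Too short leads to premature failures	Must be >= linger.ms + request.timeout.ms; leave headroom for retries (e.g. >= 2 × request.timeout.ms)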
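
The Kafka producer documentation states that delivery.timeout.ms should be greater than or equal to the sum of request.timeout.ms and linger.ms. The sketch below checks that relationship before a configuration is handed to a producer. It uses only `java.util.Properties`; `TimeoutBudget` and its methods are hypothetical helpers, and every key is required explicitly because defaults vary by client version.

```java
import java.util.Properties;

/** Hypothetical pre-flight check for producer timeout settings. */
final class TimeoutBudget {
    private TimeoutBudget() {}

    static boolean isConsistent(long deliveryTimeoutMs, long requestTimeoutMs, long lingerMs) {
        return deliveryTimeoutMs >= lingerMs + requestTimeoutMs;
    }

    static boolean isConsistent(Properties props) {
        return isConsistent(
                required(props, "delivery.timeout.ms"),
                required(props, "request.timeout.ms"),
                required(props, "linger.ms"));
    }

    // Reads a setting that must be present; no defaults are assumed here.
    private static long required(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null) {
            throw new IllegalArgumentException("missing setting: " + key);
        }
        return Long.parseLong(value.trim());
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("delivery.timeout.ms", "120000");
        props.setProperty("request.timeout.ms", "30000");
        props.setProperty("linger.ms", "10");
        System.out.println("timeouts consistent: " + isConsistent(props));
    }
}
```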

Conceptual decomposition of latency

send wait (linger) |-> send/wait (RTT round trip) |-> broker processing |-> follower replication (depends on acks)
                   |------------ client side ------------| |------------------- server side -------------------|

Latency budgeting (rough estimate)

# Example: RTT=5ms, linger=10ms, broker processing=2ms, follower replication=8ms
# acks=0:  ~ linger(10) + send (OS/queue) ≈ 10–12ms
# acks=1:  linger(10) + RTT(5) + processing(2) ≈ 17ms
# acks=all: linger(10) + RTT(5) + processing(2) + replication(8) ≈ 25ms
# Actual values vary with batching, load, GC, and I/O. Calibrate with metrics
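
The rough estimate above can be written as a small additive model. This is a sketch under the same simplifying assumptions (fixed component latencies, no queueing or batching effects); `LatencyBudget` and `estimateMs` are hypothetical names, and the local send overhead of acks=0 is deliberately left out.

```java
/** Hypothetical additive latency model for the three acks settings. */
final class LatencyBudget {
    private LatencyBudget() {}

    static int estimateMs(String acks, int lingerMs, int rttMs, int brokerMs, int replicationMs) {
        switch (acks) {
            case "0":
                return lingerMs;                                  // no round trip is awaited
            case "1":
                return lingerMs + rttMs + brokerMs;               // leader append only
            case "all":
            case "-1":
                return lingerMs + rttMs + brokerMs + replicationMs; // plus ISR replication
            default:
                throw new IllegalArgumentException("unknown acks value: " + acks);
        }
    }

    public static void main(String[] args) {
        for (String acks : new String[] {"0", "1", "all"}) {
            System.out.println("acks=" + acks + " -> ~"
                    + estimateMs(acks, 10, 5, 2, 8) + " ms");
        }
    }
}
```

Feeding measured values (for example, the replication component taken from broker metrics) into such a model is one way to sanity-check whether a latency target is reachable with acks=all before changing production settings.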

CCDAK Exam Perspective and Operational Checklist

On the CCDAK, classic questions cover the relationship between acks, MinISR, and ISR, the outcomes during failures, and combinations with the idempotent producer and transactions. It is especially important to understand that "acks=all means every ISR member, not every assigned replica" and that "writes fail when ISR drops below MinISR".

In production, it is common to disable unclean.leader.election.enable to avoid data loss. Because defaults and exact behaviors are version-dependent, always confirm against the official documentation for the version running in your environment.

If the requirement is zero loss, consider acks=all paired with MinISR>=2 (RF>=3)
If "writes must continue during outages" is paramount, consider acks=1 and minimize the failure window via monitoring
When using idempotence or transactions, acks=all is a prerequisite (some clients auto-adjust to it)
Agree with stakeholders on expected behavior during failure (fail fast vs. accept delay)
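
The idempotence prerequisite in the list above can be checked before deployment. The Kafka producer documentation lists three conditions for enable.idempotence=true: acks must be all, retries must be greater than 0, and max.in.flight.requests.per.connection must be at most 5. The sketch below checks only values that are set explicitly and assumes unset keys fall back to compatible client defaults, which is worth verifying for your client version. `IdempotenceCheck` and `violations` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical pre-flight check for idempotent-producer prerequisites. */
final class IdempotenceCheck {
    private IdempotenceCheck() {}

    static List<String> violations(Properties props) {
        List<String> problems = new ArrayList<>();
        if (!"true".equalsIgnoreCase(props.getProperty("enable.idempotence"))) {
            return problems; // idempotence not requested: nothing to check
        }
        String acks = props.getProperty("acks");
        if (acks != null && !acks.equals("all") && !acks.equals("-1")) {
            problems.add("acks must be all (or -1), found " + acks);
        }
        String retries = props.getProperty("retries");
        if (retries != null && Integer.parseInt(retries.trim()) <= 0) {
            problems.add("retries must be greater than 0");
        }
        String inFlight = props.getProperty("max.in.flight.requests.per.connection");
        if (inFlight != null && Integer.parseInt(inFlight.trim()) > 5) {
            problems.add("max.in.flight.requests.per.connection must be 5 or less");
        }
        return problems;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("enable.idempotence", "true");
        props.setProperty("acks", "1");
        System.out.println(violations(props));
    }
}
```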

Scenario	Recommended acks	Additional conditions	Metrics to monitor
Critical events (no loss)	all	RF>=3, MinISR>=2, idempotence enabled	replica.lag, request-latency
General workloads (balanced)	1	Large retries, monitor delivery delays	failed-requests-rate
Loss tolerant, shortest latency	0	Compensate via audit design	Send queue backlog, drop rate (downstream)

Simple decision flow for choosing acks

Loss tolerable? --yes--> acks=0
    |
    no
    v
Availability first during outages? --yes--> acks=1
    |
    no
    v
    acks=all (+ set MinISR)
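
The decision flow above maps directly onto two boolean questions. The following is a minimal sketch; `AcksChooser` and `chooseAcks` are hypothetical names, and real decisions will usually weigh more inputs (latency targets, replication factor, client version).

```java
/** Hypothetical encoding of the two-question decision flow. */
final class AcksChooser {
    private AcksChooser() {}

    static String chooseAcks(boolean lossTolerable, boolean availabilityFirstDuringOutage) {
        if (lossTolerable) {
            return "0";    // shortest latency, loss goes undetected
        }
        if (availabilityFirstDuringOutage) {
            return "1";    // keeps accepting writes, with a narrow loss window
        }
        return "all";      // pair with min.insync.replicas on the topic
    }

    public static void main(String[] args) {
        System.out.println("telemetry: acks=" + chooseAcks(true, false));
        System.out.println("general workload: acks=" + chooseAcks(false, true));
        System.out.println("payments: acks=" + chooseAcks(false, false));
    }
}
```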

Configuration templates (by profile)

# Durable (no loss tolerated)
acks=all
retries=2147483647
delivery.timeout.ms=120000
# Add transactions as well if needed
enable.idempotence=true

# Balanced (general workloads)
acks=1
retries=2147483647
delivery.timeout.ms=120000

# Ultra-low-latency (loss tolerant)
acks=0
linger.ms=0
batch.size=131072

Check Your Understanding

CCDAK

Question 1

A topic has replication factor 3 and min.insync.replicas=2. You need to write critical payment events with minimal loss, and writes must continue even when one broker is taken offline for planned maintenance. Which configuration best meets these requirements?

A. Producer: acks=all, enable.idempotence=true. Broker/topic: keep min.insync.replicas=2.
B. Producer: acks=1. Broker/topic: lower min.insync.replicas to 1.
C. Producer: acks=0. Broker/topic: no changes.
D. Producer: acks=all. Broker/topic: raise min.insync.replicas to 3.

Correct answer: A

With RF=3 and MinISR=2, taking one broker offline still typically leaves ISR=2, and acks=all confirms replication to at least two nodes before success — meeting the durability/availability balance. B reduces durability; C cannot even detect loss; D loses availability because writes fail the moment ISR drops to 2. Idempotence reinforces deduplication and ordering.

Frequently Asked Questions

Does acks=all mean writes are completed to ALL replicas?

No. acks=all means ACKs from every member of the ISR (In-Sync Replicas), not all assigned replicas. Lagging followers that have dropped out of the ISR are not part of the condition.

Does acks=all guarantee zero loss even under disk failure?

Not absolutely. Kafka does not force a synchronous fsync per message, so replication is the primary defense against a single-node failure. Combining acks=all with an appropriate MinISR (e.g. 2) ensures success only after the record is replicated to multiple nodes, which raises durability, but the worst-case design risk cannot be reduced to zero.

How do acks relate to the idempotent producer and transactions?

When enabling idempotence or transactions, acks=all is a prerequisite (or the client automatically configures equivalent behavior). The exact defaults and constraints depend on the client and version, so verify the official documentation for the Kafka/Confluent client you are using.


Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
