Through its transaction and state store machinery, Kafka Streams can align input offsets, internal state, and outputs into a single atomic write unit. This is what Exactly-Once Semantics (EOS) actually delivers.
This article walks through the right way to pick processing.guarantee, plus the configuration concerns at the app, broker, and topic levels. It also calls out the pitfalls that show up repeatedly on the CCDAK (Confluent Certified Developer for Apache Kafka) exam.
EOS in Kafka Streams bundles the following into a single transaction per processing cycle, eliminating duplicates and inconsistencies:
1) Consumer offset commits for the inputs, 2) internal state store changes (RocksDB ⇄ changelog), and 3) record writes to output topics. On failures, restarts, or rebalances, these are either all applied or all rolled back — never partially.
Each processing.guarantee mode changes whether duplicates can occur, how offsets are committed, and the state-store consistency model. The exam specifically targets candidates who mix up the values and their semantics.
In production, exactly_once_v2 is the default choice unless you are emitting side effects directly to external systems. Switch to at_least_once only when latency or throughput requirements demand it.
| Mode | Duplicates | Offset commit | State / output consistency |
|---|---|---|---|
| at_least_once | Possible (reprocessing can duplicate) | Normal group commit (outside any transaction) | Non-atomic (partial application possible) |
| exactly_once_v2 | None (no duplicates within Kafka) | Sent inside the transaction via SendOffsetsToTransaction | State updates, outputs, and offsets all committed atomically |
Under EOS v2, Streams uses one producer per stream thread and pulls output topics, the internal changelog, and the input offset commits into a single transaction. On commit, everything becomes visible atomically; on abort, everything is discarded.
When a thread is replaced after a rebalance or crash, producer fencing blocks commits from any zombie producer, keeping the system consistent.
Transaction boundaries under EOS v2 (conceptual diagram)
Start by setting processing.guarantee=exactly_once_v2 in the app, and keep application.id stable (unchanged across redeploys). Specify replication.factor as well to give internal topics proper durability.
On the broker side, give the transaction log enough replication and a sufficient min ISR. Keep the transaction timeout within the broker's upper limit.
Kafka Streams (Java) configuration example: Exactly-Once v2
Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "orders-app");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092,broker2:9092");
// Exactly-Once v2 を明示
props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE_V2);
// 内部トピックの耐障害性(StreamsConfig.TOPIC_PREFIX を利用)
props.put(StreamsConfig.TOPIC_PREFIX + "replication.factor", 3);
// トランザクション関連(producer.* で上書き可能)
props.put("producer.transaction.timeout.ms", 600000); // 10 分(broker 上限以下に)
// レイテンシ/スループット調整の一例
props.put("producer.linger.ms", 5);
props.put("producer.batch.size", 32768);
// コミット間隔(EOS では短めでも一貫性は保たれる)
props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 100);
// スレッド並列度
props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 2);
StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> orders = builder.stream("orders");
KTable<String, Long> counts = orders.groupByKey().count();
counts.toStream().to("orders-agg");
KafkaStreams streams = new KafkaStreams(builder.build(), props);
streams.start();On a graceful shutdown, you must commit or abort the current transaction before stopping. Use SIGTERM handling and close(Duration) to do this.
During a rebalance, the system has to wait for in-flight transactions to settle before tasks move. Tuning the thread count and input partitioning to keep transactions short keeps operations stable.
Choosing processing.guarantee=exactly_once_v2 commits input offsets, internal state updates, and outputs atomically in the same transaction. External systems are not included.
In Streams, you do not normally assign transactional.id yourself — it is generated and managed internally. The non-negotiable rule is to keep application.id stable.
CCDAK
問題 1
A Kafka Streams app aggregates orders and writes the results to a single output topic. You want internal state updates, outputs, and input offsets to commit atomically across restarts and rebalances, and to avoid any duplicates inside Kafka. Which combination of settings and prerequisites is correct?
正解: A
Enabling Exactly-Once requires setting processing.guarantee=exactly_once_v2 on the Streams side and keeping application.id stable. The broker must provide adequate replication and ISR for the transaction log (transaction.state.log.*). B ignores the Streams context, C has nothing to do with duplicate prevention, and D does not guarantee application-level consistency.
Is exactly_once_v2 always the default?
No. In most versions the default is at_least_once. If you are designing around Exactly-Once, explicitly set processing.guarantee=exactly_once_v2.
Does EOS also make writes to external databases atomic?
No. EOS is limited to Kafka-internal effects: output topics, internal topics, and input offsets. For external databases, combine it with two-phase commit, the outbox pattern, or unique constraints / idempotent APIs on the external side.
What happens when a transaction times out?
The transaction is aborted, so the outputs, changelog updates, and offset commits in that batch never become visible. The processing is retried. Setting transaction.timeout.ms appropriately and keeping per-batch processing time short helps.
Practice with certification-focused question sets
無料で問題を解いてみるNicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
Kafka Topics & Partitions: Distribution Fundamentals (2026)
How Kafka topics and partitions enable scale — ordering guar...
CCDAK Exam Guide: Confluent Certified Developer (2026)
Complete prep for the CCDAK exam — Producer/Consumer API, St...
CCAAK Exam Guide: Confluent Certified Administrator (2026)
Pass the CCAAK exam — cluster management, partitions, securi...
Kafka Replicas & ISR: Fault Tolerance Explained (2026)
Replica placement, in-sync replicas (ISR), leader election. ...
Kafka Offsets: Commit Modes & Consumer Position (2026)
Offset semantics — auto vs. manual commit, __consumer_offset...