Kafka retries improve availability, but left as-is, producer retries can cause duplicate writes. Idempotent Producer is the standard feature for achieving "no duplicates within the same partition" through producer configuration alone, with no deduplication logic in the application.
This article covers the internals of enable.idempotence, required/recommended settings, failure scenarios, operations, and exam prep. Wording is kept careful where behavior varies by version. For specifics, follow the official Kafka and Confluent documentation.
When a producer detects a send failure due to a network outage or leader change, it retries. With plain retries alone, the same record can be written twice to the same partition. Filtering this out at the application layer is costly, and in queuing or payment systems it can be fatal.
Idempotent Producer prevents duplicate acceptance at the broker using a producer ID and sequence number. This keeps inserts within a partition unique even when retries occur. CCDAK frequently tests the "required combination of settings" and "scope of guarantees".
Idempotent Producer identifies each record batch with a ProducerId (PID) assigned by the broker and a monotonically increasing per-partition sequence number. The broker (leader) retains received PID and sequence numbers, discards known combinations as duplicates, and only accepts batches in the correct order.
Because acks=all reports success only after all in-sync replicas (ISR) have the record, the sequence state survives a leader change, which helps prevent the same record from appearing twice even across network outages or leader changes. Depending on the client version, the producer can recover automatically from errors such as OutOfOrderSequenceException or UnknownProducerIdException, bumping the epoch and re-initializing when needed.
Idempotent Producer deduplication flow (conceptual)
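The duplicate check described above can be sketched as a toy model of a single partition leader. This is only an illustration under simplifying assumptions, not Kafka's broker code: the class PartitionDedupSketch and its Result enum are hypothetical names, and each batch carries one sequence number, whereas the real protocol tracks sequence ranges per batch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of one partition leader's duplicate check (hypothetical, not broker code). */
class PartitionDedupSketch {
    enum Result { APPENDED, DUPLICATE, OUT_OF_ORDER }

    // Last accepted sequence number per producer ID (in-memory only in this sketch).
    private final Map<Long, Integer> lastSeqByPid = new HashMap<>();
    private final List<String> log = new ArrayList<>();

    Result append(long pid, int seq, String payload) {
        int last = lastSeqByPid.getOrDefault(pid, -1);
        if (seq <= last) {
            return Result.DUPLICATE;     // already written: do not append again
        }
        if (seq != last + 1) {
            return Result.OUT_OF_ORDER;  // gap: an earlier batch has not arrived yet
        }
        log.add(payload);
        lastSeqByPid.put(pid, seq);
        return Result.APPENDED;
    }

    List<String> log() {
        return log;
    }

    public static void main(String[] args) {
        PartitionDedupSketch partition = new PartitionDedupSketch();
        System.out.println(partition.append(7L, 0, "order-1")); // APPENDED
        System.out.println(partition.append(7L, 0, "order-1")); // DUPLICATE (retry of the same batch)
        System.out.println(partition.append(7L, 2, "order-3")); // OUT_OF_ORDER (sequence 1 is missing)
        System.out.println(partition.append(7L, 1, "order-2")); // APPENDED
        System.out.println(partition.log());                    // [order-1, order-2]
    }
}
```

The retried batch is recognized by its (PID, sequence) pair and never reaches the log a second time, which is the per-partition guarantee the article describes.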
To enable Idempotent Producer, enable.idempotence=true is the core setting. To stay consistent, you also need acks=all, retries>0 (effectively a large value in most implementations), and max.in.flight.requests.per.connection within the constraint (generally 5 or less). Depending on the client, conflicting settings may raise an error or be auto-overridden/limited. The exam often tests these dependency relationships.
delivery.timeout.ms caps the total time across all retries. Once exceeded, the send is treated as a failure, but duplicates from intermediate retries are still prevented by Idempotent Producer. For throughput optimization, combine linger.ms and batch.size, but balance against your latency budget. Managed environments such as Confluent Cloud typically assume acks=all and support Idempotent Producer by default.
Note: defaults and auto-adjustment behavior can vary by client and version. Always check the latest official documentation.
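The dependency between these settings can be checked before a producer is created. The sketch below is a hypothetical pre-flight helper (IdempotenceConfigCheck is not part of any Kafka library) that works on plain string keys. It treats missing keys as violations so that the combination is spelled out explicitly, which is a choice of this sketch rather than client behavior; the real client performs its own validation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical pre-flight check for the idempotence-related settings. */
class IdempotenceConfigCheck {

    static List<String> violations(Properties props) {
        List<String> problems = new ArrayList<>();

        if (!Boolean.parseBoolean(props.getProperty("enable.idempotence", "false"))) {
            problems.add("enable.idempotence must be true");
        }

        // acks=-1 is the numeric spelling of acks=all.
        String acks = props.getProperty("acks", "");
        if (!acks.equals("all") && !acks.equals("-1")) {
            problems.add("acks must be all");
        }

        Integer retries = intOrNull(props, "retries");
        if (retries == null || retries <= 0) {
            problems.add("retries must be set and greater than 0");
        }

        Integer inFlight = intOrNull(props, "max.in.flight.requests.per.connection");
        if (inFlight == null || inFlight < 1 || inFlight > 5) {
            problems.add("max.in.flight.requests.per.connection must be set between 1 and 5");
        }
        return problems;
    }

    private static Integer intOrNull(Properties props, String key) {
        String raw = props.getProperty(key);
        return raw == null ? null : Integer.valueOf(raw.trim());
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("enable.idempotence", "true");
        props.setProperty("acks", "1"); // conflicts with idempotence
        props.setProperty("retries", "10");
        props.setProperty("max.in.flight.requests.per.connection", "5");
        System.out.println(violations(props)); // [acks must be all]
    }
}
```

Running such a check in a unit test or at startup surfaces conflicting settings early, instead of relying on whether a given client rejects or silently overrides them.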
| Delivery Guarantee Mode | Key Settings | Duplicates | Ordering |
|---|---|---|---|
| At-most-once | acks=0 or acks=1, retries=0 | None, but high risk of loss | Weak guarantees |
| At-least-once | acks=1 or acks=all, retries>0, enable.idempotence=false | Possible (retries can duplicate) | Can be tightened with max.in.flight=1 |
| Idempotent | enable.idempotence=true, acks=all, retries>0, max.in.flight<=5 | Prevented within the same partition | Strong (per partition) |
| Transactional EOS | Idempotent plus transactional.id, with proper consumer isolation | Prevented end-to-end (can span multiple partitions) | Strong (transaction boundary) |
Minimal Java Producer example (idempotence + throughput-aware)
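import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

Properties props = new Properties();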
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
// Core of idempotence
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
props.put(ProducerConfig.ACKS_CONFIG, "all");
props.put(ProducerConfig.RETRIES_CONFIG, Integer.toString(Integer.MAX_VALUE));
props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, "5"); // ordering and consistency
// throughput and stability
props.put(ProducerConfig.LINGER_MS_CONFIG, "20");
props.put(ProducerConfig.BATCH_SIZE_CONFIG, "65536");
props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, "120000");
KafkaProducer<String, String> producer = new KafkaProducer<>(props);
ProducerRecord<String, String> rec = new ProducerRecord<>("orders", "order-123", "payload");
producer.send(rec, (md, ex) -> {
    if (ex != null) {
        // Supplementary logging on failure; duplicates are prevented by idempotence
        ex.printStackTrace();
    }
});
producer.flush();
producer.close();
Network outages, leader changes, and log truncation can cause state between producer and broker to drift. Idempotent Producer attempts recovery via epoch management and sequence control, but application-side retry design and monitoring are still important.
Typical errors and their behavior are listed below. The exact trigger conditions depend on broker and client versions and configuration, so test against your own operating environment.
Whether idempotence is on or off is clear from configuration, but in production you must continuously monitor retry rate, throttling, and latency trends. Put producer metrics (record-error-rate, record-retry-rate, request-latency-avg, produce-throttle-time-avg, batch-size-avg, records-per-request-avg, and so on) on a dashboard and review them alongside the count of sends that fail after exceeding delivery.timeout.ms.
The baseline for throughput optimization is to strengthen batching with linger.ms and batch.size while securing parallelism under the max.in.flight<=5 constraint. Key design also matters: stably routing the same key to the same partition makes application-level consistency easier to maintain.
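Stable key routing can be illustrated with a minimal sketch. The hash below is a stand-in chosen for brevity: the Java client's default partitioner hashes the serialized key with murmur2, so the partition numbers computed here will not match a real cluster. KeyRoutingSketch and partitionFor are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrates key-to-partition routing with a stand-in hash (not murmur2). */
class KeyRoutingSketch {

    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        // floorMod keeps the result in [0, numPartitions) even for negative hashes.
        return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }

    public static void main(String[] args) {
        int first = partitionFor("order-123", 6);
        int second = partitionFor("order-123", 6);
        System.out.println(first == second); // true: same key, same partition

        // Changing the partition count can move a key to a different partition.
        System.out.println(partitionFor("order-123", 6) + " vs " + partitionFor("order-123", 8));
    }
}
```

Because per-partition ordering and deduplication both depend on a key staying in one partition, changing the partition count of a keyed topic is worth treating as a deliberate design decision.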
In managed environments (such as Confluent Cloud), acks=all is assumed and recommended, and Idempotent Producer is available as standard. Client-side defaults and constraints can be updated over time, so check the docs for your specific environment when adopting it.
The exam tends to focus on Idempotent Producer's guarantee scope, the required combination of settings, and how it differs from transactions. Be clear that simply raising retries does not prevent duplicates, understand what acks=all means, and know the max.in.flight constraint.
Also remember that Idempotent Producer is "write-side duplicate prevention" and is separate from consumer-side duplicate reads. For end-to-end Exactly-Once, be ready to distinguish using transactions or Kafka Streams EOS mode in your answers.
CCDAK
Question 1
You want to prevent retry duplicates within the same partition on a Kafka producer. Which combination of settings meets the requirement?
Correct answer: A
Idempotent Producer requires enable.idempotence=true and must align with acks=all. For ordering and consistency, max.in.flight.requests.per.connection must stay within the constraint (generally 5 or less). B has no idempotence, so duplicates can occur. C conflicts because of acks=1. D is also wrong because transactions are built on idempotence internally, and transactional.id alone does not satisfy the requirement.
Is enable.idempotence enabled by default?
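It depends on the client and version. In the Apache Kafka Java client the default has been true since version 3.0; in earlier versions, and in some other clients, it may be false. If duplicate prevention is a requirement, set it to true explicitly and align related settings such as acks=all. Defaults can vary across clients and versions, so check the official documentation when adopting it.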
Does Idempotent Producer also prevent duplicate reads on the consumer side?
No. Idempotent Producer prevents duplicate acceptance on the write side. Consumer-side duplicates must be handled via offset commit strategies or transactions (read_committed). For end-to-end Exactly-Once, consider transactions or Kafka Streams EOS.
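One application-side way to handle redelivered records is to remember the highest offset already processed per partition and skip anything at or below it. The sketch below is a hypothetical guard (OffsetGuardSketch is not part of the Kafka consumer API) and keeps its state in memory. A real system would need to persist that state together with the processing result, for example in the same database transaction, which is assumed away here.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical application-side guard against reprocessing redelivered records. */
class OffsetGuardSketch {
    // Highest offset already processed, keyed by "topic-partition".
    private final Map<String, Long> processed = new HashMap<>();

    /** Returns true if the record is new and should be processed. */
    boolean shouldProcess(String topic, int partition, long offset) {
        String topicPartition = topic + "-" + partition;
        long last = processed.getOrDefault(topicPartition, -1L);
        if (offset <= last) {
            return false; // redelivery after a rebalance or restart
        }
        processed.put(topicPartition, offset);
        return true;
    }

    public static void main(String[] args) {
        OffsetGuardSketch guard = new OffsetGuardSketch();
        System.out.println(guard.shouldProcess("orders", 0, 42L)); // true
        System.out.println(guard.shouldProcess("orders", 0, 42L)); // false: same offset seen again
        System.out.println(guard.shouldProcess("orders", 1, 42L)); // true: different partition
    }
}
```

This complements Idempotent Producer rather than replacing it: the producer feature keeps duplicates out of the log, while a guard like this addresses the same record being read twice.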
Should max.in.flight.requests.per.connection be pinned to 1?
Pinning to 1 is sometimes chosen to enforce strict ordering under At-least-once without idempotence, but with Idempotent Producer you can preserve ordering and consistency at up to 5. Since 1 sacrifices significant throughput, tune between 1 and 5 based on requirements.
NicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.