The producer serializes records on the application thread, picks a target partition via the partitioner, buffers them in the RecordAccumulator, and lets a Sender thread batch them and ship them to the broker. Understanding this end-to-end flow and the role of each setting makes both exam prep and real-world tuning much easier.
This article focuses on stable concepts in line with the official documentation. It avoids version-dependent defaults and instead concentrates on the effects of each setting and the trade-offs involved.
The producer's main path is: serialization → partitioning → buffering (RecordAccumulator) → sending (Sender thread) → ACK → callback. The application-side send call is typically non-blocking and the actual delivery happens in the background.
Batches are built per topic-partition and are flushed by either size (batch.size) or time (linger.ms). Compression is applied per batch, so batching and compression.type should be tuned together for the best throughput efficiency.
Responses come back according to the acks setting, and on failure the producer retries within the bounds of retries, backoff, and delivery.timeout.ms. Ordering guarantees depend heavily on max.in.flight.requests.per.connection and whether idempotence is enabled.
Producer send path (conceptual)
Minimal example (synchronous send to make the behavior visible)
Properties p = new Properties();
p.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
p.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringSerializer.class.getName());
p.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringSerializer.class.getName());
try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
ProducerRecord<String, String> rec = new ProducerRecord<>("orders", "order-1", "payload");
RecordMetadata md = producer.send(rec).get(); // synchronous confirmation for learning purposes
System.out.printf("topic=%s partition=%d offset=%d%n", md.topic(), md.partition(), md.offset());
}
The producer builds batches per topic-partition. batch.size sets the upper limit per batch in bytes, while linger.ms makes the producer wait for additional records to fatten the batch when there are not enough to send right away. The standard recipe for high throughput is a non-trivial linger.ms, a suitable batch.size, and an enabled compression.type.
Of course, adding wait time increases end-to-end latency. For latency-sensitive workloads, keep linger.ms small and only wait modestly at low rates. Balance against the SLA, network, and CPU cost.
Typical batch-oriented configuration (excerpt)
Properties p = new Properties();
p.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
p.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.ByteArraySerializer.class.getName());
p.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.ByteArraySerializer.class.getName());
// Fatten the batches
p.put(ProducerConfig.BATCH_SIZE_CONFIG, 64 * 1024); // Start measuring around 64 KiB
p.put(ProducerConfig.LINGER_MS_CONFIG, 5); // Start near 5 ms and tune to the SLA
p.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4"); // A good balance of low latency and compression ratio
// Buffer management
p.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 64L * 1024 * 1024); // Total buffer cap
p.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 30_000); // Maximum time send is allowed to block
acks determines what counts as a successful response. acks=0 returns immediately, acks=1 waits for the leader write, and acks=all waits for the in-sync replicas to follow along. For durability the standard recipe is acks=all combined with min.insync.replicas on the broker/topic side (the broker returns an error when the ISR drops below the threshold).
retries controls how many times to resend on failure, and retry.backoff.ms is the wait between retries. delivery.timeout.ms is the budget from the first send attempt to final success or failure, covering linger and retry waits. Think of request.timeout.ms as the per-request budget and delivery.timeout.ms as the overall budget that contains them.
When max.in.flight.requests.per.connection is large, retries can cause records to arrive out of order. If ordering matters, enable idempotence or conservatively lower max.in.flight.
| acks | Broker response condition | Durability | Latency / throughput tendency |
|---|---|---|---|
| 0 | Counted as success immediately after send (no response wait) | Low (treated as success even if the record never reaches the broker) | Lowest latency, highest throughput |
| 1 | Responds once the write to the leader succeeds | Medium (data can be lost if the leader fails) | Low latency, high throughput |
| all | Responds after acknowledgement from all ISR members that satisfy min.insync.replicas | High (when the ISR is healthy) | Slightly higher latency, medium to high throughput (mitigated by batching and compression) |
Retries and timeouts on failure (key points)
Properties p = new Properties();
// Reliability-focused
p.put(ProducerConfig.ACKS_CONFIG, "all");
// Generous retry count (design the exact number around your SLA and failure modes)
p.put(ProducerConfig.RETRIES_CONFIG, 100);
p.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, 200); // Interval that lets the broker recover
// Overall delivery budget (includes linger and retries)
p.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120_000);
// Per-request wait limit (too small leads to spurious timeouts)
p.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 30_000);
// Preserving ordering (1 is recommended when idempotence is off)
p.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 1);
Records with the same key land in the same partition, preserving their order. Without a key, the default partitioner spreads load using a round-robin or sticky strategy. High-cardinality keys distribute evenly, but hot keys concentrate load on a few partitions.
When log compaction is in use, the key is the unit that gets overwritten by the latest value. Design keys with ordering, compaction efficiency, and hot-partition avoidance all in mind. When necessary, implement business-specific distribution rules with a custom partitioner.
Skeleton of a custom partitioner (Java)
public class ModPartitioner implements org.apache.kafka.clients.producer.Partitioner {
@Override
public void configure(java.util.Map<String, ?> configs) {}
@Override
public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, org.apache.kafka.common.Cluster cluster) {
int partitions = cluster.partitionCountForTopic(topic);
if (keyBytes == null || partitions <= 0) return 0;
// Example: assume a numeric ID and do a light modulo distribution
int k = java.nio.ByteBuffer.wrap(keyBytes).getInt();
return Math.floorMod(k, partitions);
}
@Override
public void close() {}
}
// Usage: set the class name on ProducerConfig.PARTITIONER_CLASS_CONFIG
enable.idempotence=true enables deduplication so that retries do not produce duplicate writes. Internally the producer uses a producer ID and sequence numbers so the broker can ignore duplicates. Enabling it implies compatible settings such as acks=all and a max.in.flight no greater than 5.
Transactions provide consistency across multiple topics and partitions. Keep transactional.id stable and follow the flow: initTransactions → beginTransaction → send... → commitTransaction (abort on failure). On the consumer side, setting read_committed exposes only committed records.
Minimal transactional producer pattern (Java)
Properties p = new Properties();
p.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
p.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringSerializer.class.getName());
p.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringSerializer.class.getName());
p.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
p.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "orders-tx-01");
try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
producer.initTransactions();
try {
producer.beginTransaction();
producer.send(new ProducerRecord<>("orders", "k1", "v1"));
producer.send(new ProducerRecord<>("audit", "k1", "v1"));
producer.commitTransaction();
} catch (Exception e) {
producer.abortTransaction();
throw e;
}
}
In production, monitoring and backpressure handling are critical. Put metrics like record-error-rate, request-latency-avg, outgoing-byte-rate, and bufferpool-wait-time-total on a dashboard so you can spot delivery errors and timeout trends early.
Inject failures (broker shutdown, network delay, shrinking ISR) and verify that your combination of acks, retries, delivery.timeout.ms, and idempotence behaves as expected before rolling the configuration out to production.
Production-grade producer example (with callback)
Properties p = new Properties();
p.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
p.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringSerializer.class.getName());
p.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringSerializer.class.getName());
// Balance of reliability and ordering
p.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
p.put(ProducerConfig.ACKS_CONFIG, "all");
p.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120_000);
p.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 30_000);
p.put(ProducerConfig.LINGER_MS_CONFIG, 5);
p.put(ProducerConfig.BATCH_SIZE_CONFIG, 64 * 1024);
p.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
Callback cb = (md, ex) -> {
if (ex != null) {
System.err.println("send failed: " + ex.getClass().getSimpleName() + " - " + ex.getMessage());
// Wire up logging, metrics, and DLQ routing here
} else {
System.out.printf("ok %s-%d@%d%n", md.topic(), md.partition(), md.offset());
}
};
for (int i = 0; i < 1000; i++) {
ProducerRecord<String, String> rec = new ProducerRecord<>("orders", "cust-" + (i % 10), "payload-" + i);
producer.send(rec, cb);
}
producer.flush();
}
CCDAK
問題 1
Requirements: succeed only after replicas have followed, retry automatically on failure, avoid duplicates from retries, and preserve ordering within each key. Which producer configuration is the best fit?
正解: A
Durability requires acks=all. Deduplicating retries and preserving ordering is best handled by enable.idempotence=true, which requires a compatible max.in.flight setting (no greater than 5). B lacks durability and retries, C falls below the durability requirement, and D disables retries, so extending delivery.timeout.ms cannot satisfy the requirements.
Does acks=all wait for responses from all replicas?
It waits for responses from the ISR (In-Sync Replica) set, not from every replica. If min.insync.replicas cannot be satisfied, the broker returns an error and the producer receives an exception.
What happens when you increase linger.ms?
Under light load the producer waits longer to grow each batch, which improves compression efficiency and throughput at the cost of higher end-to-end latency. Tune in the range of a few to a few tens of milliseconds while measuring against your SLA and cost targets.
What is the difference between delivery.timeout.ms and request.timeout.ms?
request.timeout.ms is the wait limit for a single request, while delivery.timeout.ms is the overall budget from the first send attempt including retries and queuing. Once delivery.timeout.ms is exceeded the send is marked as failed with a TimeoutException.
Practice with certification-focused question sets
無料で問題を解いてみるNicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
Kafka Topics & Partitions: Distribution Fundamentals (2026)
How Kafka topics and partitions enable scale — ordering guar...
CCDAK Exam Guide: Confluent Certified Developer (2026)
Complete prep for the CCDAK exam — Producer/Consumer API, St...
CCAAK Exam Guide: Confluent Certified Administrator (2026)
Pass the CCAAK exam — cluster management, partitions, securi...
Kafka Replicas & ISR: Fault Tolerance Explained (2026)
Replica placement, in-sync replicas (ISR), leader election. ...
Kafka Offsets: Commit Modes & Consumer Position (2026)
Offset semantics — auto vs. manual commit, __consumer_offset...