Kafka Producer Basics: Send, Ack, Batch, Compression (2026)

The producer serializes records on the application thread, picks a target partition via the partitioner, buffers them in the RecordAccumulator, and lets a Sender thread batch them and ship them to the broker. Understanding this end-to-end flow and the role of each setting makes both exam prep and real-world tuning much easier.

This article focuses on stable concepts in line with the official documentation. It avoids version-dependent defaults and instead concentrates on the effects of each setting and the trade-offs involved.

End-to-End View of the Send Flow

The producer's main path is: serialization → partitioning → buffering (RecordAccumulator) → sending (Sender thread) → ACK → callback. The application-side send call is typically non-blocking and the actual delivery happens in the background.

Batches are built per topic-partition and are flushed by either size (batch.size) or time (linger.ms). Compression is applied per batch, so batching and compression.type should be tuned together for the best throughput efficiency.

Responses come back according to the acks setting, and on failure the producer retries within the bounds of retries, backoff, and delivery.timeout.ms. Ordering guarantees depend heavily on max.in.flight.requests.per.connection and whether idempotence is enabled.

Serializer: key.serializer and value.serializer convert records to byte arrays
Partitioner: chooses the destination partition from the key (without a key, the load is balanced across partitions)
RecordAccumulator: groups records of the same topic-partition into batches
Sender thread: wraps batches into Produce requests and sends them to brokers
ACK and callback: success or failure is reported via onCompletion

Producer send path (conceptual)

Minimal example (synchronous send to make the behavior visible)

Properties p = new Properties();
p.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
p.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringSerializer.class.getName());
p.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringSerializer.class.getName());
try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
    ProducerRecord<String, String> rec = new ProducerRecord<>("orders", "order-1", "payload");
    RecordMetadata md = producer.send(rec).get(); // synchronous confirmation for learning purposes
    System.out.printf("topic=%s partition=%d offset=%d%n", md.topic(), md.partition(), md.offset());
}

Batching and Throughput Tuning

The producer builds batches per topic-partition. batch.size sets the upper limit per batch in bytes, while linger.ms makes the producer wait for additional records to fatten the batch when there are not enough to send right away. The standard recipe for high throughput is a non-trivial linger.ms, a suitable batch.size, and an enabled compression.type.

Of course, adding wait time increases end-to-end latency. For latency-sensitive workloads, keep linger.ms small and only wait modestly at low rates. Balance against the SLA, network, and CPU cost.

batch.size is per partition, so memory usage scales with the number of partitions being produced to
compression.type can be gzip, snappy, lz4, or zstd. Pick one with a clear understanding of the CPU vs. compression ratio trade-off
buffer.memory caps the total in-flight buffer. When full, send blocks for up to max.block.ms, then throws TimeoutException
Small messages plus a few milliseconds of linger.ms plus compression often yields a large jump in network efficiency

Typical batch-oriented configuration (excerpt)

Properties p = new Properties();
p.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
p.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.ByteArraySerializer.class.getName());
p.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.ByteArraySerializer.class.getName());
// Fatten the batches
p.put(ProducerConfig.BATCH_SIZE_CONFIG, 64 * 1024);       // Start measuring around 64 KiB
p.put(ProducerConfig.LINGER_MS_CONFIG, 5);                // Start near 5 ms and tune to the SLA
p.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");    // A good balance of low latency and compression ratio
// Buffer management
p.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 64L * 1024 * 1024); // Total buffer cap
p.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 30_000);             // Maximum time send is allowed to block

Reliability and Retry Control (acks, retries, delivery.timeout.ms)

acks determines what counts as a successful response. acks=0 returns immediately, acks=1 waits for the leader write, and acks=all waits for the in-sync replicas to follow along. For durability the standard recipe is acks=all combined with min.insync.replicas on the broker/topic side (the broker returns an error when the ISR drops below the threshold).

retries controls how many times to resend on failure, and retry.backoff.ms is the wait between retries. delivery.timeout.ms is the budget from the first send attempt to final success or failure, covering linger and retry waits. Think of request.timeout.ms as the per-request budget and delivery.timeout.ms as the overall budget that contains them.

When max.in.flight.requests.per.connection is large, retries can cause records to arrive out of order. If ordering matters, enable idempotence or conservatively lower max.in.flight.

High durability: acks=all plus an appropriate min.insync.replicas on the broker/topic side
High throughput: acks=1, while keeping replica count and availability requirements in mind
If success does not arrive within delivery.timeout.ms, a TimeoutException is returned (the budget covers retries)
Ordering guarantee: use enable.idempotence=true. Without it, use max.in.flight=1 to preserve order, at the cost of throughput

acks	Broker response condition	Durability	Latency / throughput tendency
0	Counted as success immediately after send (no response wait)	Low (treated as success even if the record never reaches the broker)	Lowest latency, highest throughput
1	Responds once the write to the leader succeeds	Medium (data can be lost if the leader fails)	Low latency, high throughput
all	Responds after acknowledgement from all ISR members that satisfy min.insync.replicas	High (when the ISR is healthy)	Slightly higher latency, medium to high throughput (mitigated by batching and compression)

Retries and timeouts on failure (key points)

Properties p = new Properties();
// Reliability-focused
p.put(ProducerConfig.ACKS_CONFIG, "all");
// Generous retry count (design the exact number around your SLA and failure modes)
p.put(ProducerConfig.RETRIES_CONFIG, 100);
p.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, 200); // Interval that lets the broker recover
// Overall delivery budget (includes linger and retries)
p.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120_000);
// Per-request wait limit (too small leads to spurious timeouts)
p.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 30_000);
// Preserving ordering (1 is recommended when idempotence is off)
p.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 1);

Partitioning and Key Design

Records with the same key land in the same partition, preserving their order. Without a key, the default partitioner spreads load using a round-robin or sticky strategy. High-cardinality keys distribute evenly, but hot keys concentrate load on a few partitions.

When log compaction is in use, the key is the unit that gets overwritten by the latest value. Design keys with ordering, compaction efficiency, and hot-partition avoidance all in mind. When necessary, implement business-specific distribution rules with a custom partitioner.

Use a key at the granularity where ordering matters (for example, customerId or orderId)
Mitigate hot partitions by adding a suffix to the key for pseudo-sharding, or by rate-limiting
A custom partitioner affects throughput, so roll it out while measuring

Skeleton of a custom partitioner (Java)

public class ModPartitioner implements org.apache.kafka.clients.producer.Partitioner {
  @Override
  public void configure(java.util.Map<String, ?> configs) {}

  @Override
  public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, org.apache.kafka.common.Cluster cluster) {
    int partitions = cluster.partitionCountForTopic(topic);
    if (keyBytes == null || partitions <= 0) return 0;
    // Example: assume a numeric ID and do a light modulo distribution
    int k = java.nio.ByteBuffer.wrap(keyBytes).getInt();
    return Math.floorMod(k, partitions);
  }

  @Override
  public void close() {}
}
// Usage: set the class name on ProducerConfig.PARTITIONER_CLASS_CONFIG

Idempotence and Transactions (Key Points for Exactly-Once)

enable.idempotence=true enables deduplication so that retries do not produce duplicate writes. Internally the producer uses a producer ID and sequence numbers so the broker can ignore duplicates. Enabling it implies compatible settings such as acks=all and a max.in.flight no greater than 5.

Transactions provide consistency across multiple topics and partitions. Keep transactional.id stable and follow the flow: initTransactions → beginTransaction → send... → commitTransaction (abort on failure). On the consumer side, setting read_committed exposes only committed records.

Idempotence: deduplicates on retry and strengthens ordering (limits depend on the implementation and version)
Transactions: atomicity across multiple partitions. Heavier weight, useful when consistency is required
Reusing the same transactional.id is fenced off so that one of the producers is forcibly invalidated

Minimal transactional producer pattern (Java)

Properties p = new Properties();
p.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
p.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringSerializer.class.getName());
p.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringSerializer.class.getName());
p.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
p.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "orders-tx-01");
try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
  producer.initTransactions();
  try {
    producer.beginTransaction();
    producer.send(new ProducerRecord<>("orders", "k1", "v1"));
    producer.send(new ProducerRecord<>("audit", "k1", "v1"));
    producer.commitTransaction();
  } catch (Exception e) {
    producer.abortTransaction();
    throw e;
  }
}

Implementation Example and Operational Checklist

In production, monitoring and backpressure handling are critical. Put metrics like record-error-rate, request-latency-avg, outgoing-byte-rate, and bufferpool-wait-time-total on a dashboard so you can spot delivery errors and timeout trends early.

Inject failures (broker shutdown, network delay, shrinking ISR) and verify that your combination of acks, retries, delivery.timeout.ms, and idempotence behaves as expected before rolling the configuration out to production.

Metrics: when record-error-rate rises, start by inspecting broker responses and retry settings
Buffer exhaustion: confirm buffer.memory, max.block.ms, and the produce rate are consistent
Paths that must be ordered: use enable.idempotence=true and keep max.in.flight conservative
SLA hygiene: do not conflate request.timeout.ms with delivery.timeout.ms

Production-grade producer example (with callback)

Properties p = new Properties();
p.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
p.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringSerializer.class.getName());
p.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringSerializer.class.getName());
// Balance of reliability and ordering
p.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
p.put(ProducerConfig.ACKS_CONFIG, "all");
p.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120_000);
p.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 30_000);
p.put(ProducerConfig.LINGER_MS_CONFIG, 5);
p.put(ProducerConfig.BATCH_SIZE_CONFIG, 64 * 1024);
p.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");

try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
  Callback cb = (md, ex) -> {
    if (ex != null) {
      System.err.println("send failed: " + ex.getClass().getSimpleName() + " - " + ex.getMessage());
      // Wire up logging, metrics, and DLQ routing here
    } else {
      System.out.printf("ok %s-%d@%d%n", md.topic(), md.partition(), md.offset());
    }
  };
  for (int i = 0; i < 1000; i++) {
    ProducerRecord<String, String> rec = new ProducerRecord<>("orders", "cust-" + (i % 10), "payload-" + i);
    producer.send(rec, cb);
  }
  producer.flush();
}

Check Your Understanding

CCDAK

問題 1

Requirements: succeed only after replicas have followed, retry automatically on failure, avoid duplicates from retries, and preserve ordering within each key. Which producer configuration is the best fit?

acks=all, enable.idempotence=true, max.in.flight.requests.per.connection no greater than 5
acks=1, retries=0, max.in.flight.requests.per.connection=1
acks=0, linger.ms=50, compression.type=zstd
acks=all, retries=0, and increase delivery.timeout.ms

正解: A

Durability requires acks=all. Deduplicating retries and preserving ordering is best handled by enable.idempotence=true, which requires a compatible max.in.flight setting (no greater than 5). B lacks durability and retries, C falls below the durability requirement, and D disables retries, so extending delivery.timeout.ms cannot satisfy the requirements.

Frequently Asked Questions

Does acks=all wait for responses from all replicas?

It waits for responses from the ISR (In-Sync Replica) set, not from every replica. If min.insync.replicas cannot be satisfied, the broker returns an error and the producer receives an exception.

What happens when you increase linger.ms?

Under light load the producer waits longer to grow each batch, which improves compression efficiency and throughput at the cost of higher end-to-end latency. Tune in the range of a few to a few tens of milliseconds while measuring against your SLA and cost targets.

What is the difference between delivery.timeout.ms and request.timeout.ms?

request.timeout.ms is the wait limit for a single request, while delivery.timeout.ms is the overall budget from the first send attempt including retries and queuing. Once delivery.timeout.ms is exceeded the send is marked as failed with a TimeoutException.

Check what you learned with practice questions

Practice with certification-focused question sets

無料で問題を解いてみる

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.

Kafka Producer API Basics: Send Flow and Key Configuration Settings