
Kafka Offsets and Commits: Position Management and at-least-once Fundamentals

2026-04-19
NicheeLab Editorial Team

A Kafka "offset" is not just a number: it represents the "next position to read" for a consumer group on a partition. When you commit determines whether you risk loss or duplication.

This article follows the CCDAK exam focus areas (offset semantics, commit methods, delivery semantics, and rebalance impact) and is grounded in stable, officially documented behavior.

Offset and Position Basics

Kafka offsets are sequential numbers per partition, and a consumer group stores the "next offset to read" for each partition in __consumer_offsets (an internal, compacted topic). Critically, what's recorded is "the offset after the last processed record."

On startup, the consumer fetches the latest commit from the group coordinator and resumes reading from that position. If no commit exists, the starting position is governed by auto.offset.reset (earliest or latest). Distinguish between the fetch position (next to be retrieved) and the committed position (the restart point).

  • Offsets are tracked per partition, not per topic
  • Commits store the "next to read" position (e.g., last processed = 4 → commit = 5)
  • On startup and rebalance, the consumer resumes from the committed position
  • Log compaction keeps __consumer_offsets close to just the latest commit for each group and partition

A visual map of offsets and commit positions

[Figure: topic payments, partition 0, offsets 0 to 9, consumer group G1. Last processed = 4; committed (next) = 5; nextToFetch = 7; offsets 5 and 6 are in flight (fetched, not committed).]

On restart, the consumer resumes from committed = 5. A failure while in-flight records are uncommitted can replay 5 and 6 (at-least-once).
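
The arithmetic behind this map can be sketched in a few lines. This is only an illustrative model, not Kafka client code: the class name OffsetPositions and both methods are hypothetical, and plain long values stand in for offsets.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: models offset arithmetic only, no Kafka client involved.
class OffsetPositions {

    // The value to commit is the offset AFTER the last fully processed record.
    static long offsetToCommit(long lastProcessedOffset) {
        return lastProcessedOffset + 1;
    }

    // Offsets that were fetched but never committed are read again after a restart.
    static List<Long> replayedAfterRestart(long committed, long nextToFetch) {
        List<Long> replayed = new ArrayList<>();
        for (long offset = committed; offset < nextToFetch; offset++) {
            replayed.add(offset);
        }
        return replayed;
    }

    public static void main(String[] args) {
        long committed = offsetToCommit(4); // last processed = 4
        System.out.println("committed = " + committed);
        System.out.println("replayed  = " + replayedAfterRestart(committed, 7));
    }
}
```

With the values from the map (last processed = 4, nextToFetch = 7), the commit is 5 and the replay window covers offsets 5 and 6.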

Comparing Commit Strategies and How to Choose

Kafka consumer commits can be automatic (enable.auto.commit) or manual (commitSync/commitAsync). The choice changes the likelihood of loss and duplication, as well as throughput and latency.

The fundamental rule for at-least-once is "commit only after the side effect (e.g., the external write) completes." Committing before processing yields at-most-once (with possible loss), and the longer you delay commits, the more duplicate reprocessing you accept on failure.

  • Commit after processing = prevents loss (accepts duplicates)
  • Commit before processing = low latency but risks loss (a wrong-answer pattern on the exam)
  • Commit per batch to balance overhead and reprocessing volume

Strategy | Typical config / API | Loss / duplication risk | Latency / overhead
Auto commit | enable.auto.commit=true, auto.commit.interval.ms | Loss is possible if an offset is committed before its processing finishes (e.g., records handed to another thread). Duplicates if a failure occurs before the next auto commit | Low (simple, but coarse control)
Manual sync commit | enable.auto.commit=false, commitSync() | Minimizes loss. Duplicates on failure and restart | Higher (waits for ack)
Manual async commit | enable.auto.commit=false, commitAsync() | A failed async commit is not retried automatically; duplicates possible | Lower (throughput-focused)
Batched sync commit | commitSync() every N records or interval | Duplicates are bounded by the batch. Loss minimized | Moderate (balanced)
Transactional integration | Producer sendOffsetsToTransaction + read_committed | Atomically ties read-write-commit together. Suppresses duplicates and loss | Higher (more complex design)
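
To make the difference between committing before and after processing concrete, here is a toy simulation. It is a sketch under simplifying assumptions (one partition, exactly one crash, an immediate restart from the committed position); CommitTimingDemo and its members are hypothetical names and no Kafka client is involved.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (not Kafka client code): one partition, one crash, then a restart.
class CommitTimingDemo {

    enum Strategy { COMMIT_BEFORE_PROCESSING, COMMIT_AFTER_PROCESSING }

    // Returns the offsets whose side effect reached the external system, in order.
    // The consumer crashes once while handling crashAtOffset, then restarts
    // from the committed position.
    static List<Long> run(Strategy strategy, long endOffset, long crashAtOffset) {
        List<Long> sideEffects = new ArrayList<>();
        long committed = 0;   // "next offset to read" stored for the group
        long position = 0;    // in-memory fetch position
        boolean crashed = false;

        while (position < endOffset) {
            long offset = position;
            boolean crashNow = !crashed && offset == crashAtOffset;

            if (strategy == Strategy.COMMIT_BEFORE_PROCESSING) {
                committed = offset + 1;       // commit first
                if (crashNow) {               // crash before the side effect
                    crashed = true;
                    position = committed;     // restart skips this record
                    continue;
                }
                sideEffects.add(offset);
            } else {
                sideEffects.add(offset);      // side effect first
                if (crashNow) {               // crash before the commit
                    crashed = true;
                    position = committed;     // restart re-reads this record
                    continue;
                }
                committed = offset + 1;
            }
            position = offset + 1;
        }
        return sideEffects;
    }

    public static void main(String[] args) {
        System.out.println(run(Strategy.COMMIT_BEFORE_PROCESSING, 6, 3));
        System.out.println(run(Strategy.COMMIT_AFTER_PROCESSING, 6, 3));
    }
}
```

For offsets 0 to 5 with a crash at offset 3, committing first leaves 3 unprocessed (at-most-once), while committing afterwards processes 3 twice (at-least-once).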

Implementation Patterns for at-least-once

The basic loop is "poll → process (durable write to the external system) → commit." Treat batches as units and tune so heavy processing doesn't exceed max.poll.interval.ms. Apply backpressure with pause/resume when needed.

The more idempotent the downstream system (same-key upserts, dedup), the more you absorb the duplicate impact of at-least-once. If you truly need exactly-once, consider transactions (producer EOS with read_committed).

  • Use enable.auto.commit=false and commit explicitly
  • commitSync after processing succeeds; mix in commitAsync when throughput demands it
  • On exception, do not commit; seek back to the failed offset (or let the consumer restart) so the record is read again, and route it to a DLQ if it keeps failing
  • Tune the balance between batch size (max.poll.records) and processing time

Java consumer (manual commits for at-least-once)

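// Required library: org.apache.kafka:kafka-clients
// `log` and writeToDbIdempotent(...) are application-provided and not shown here.
import java.time.Duration;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.CommitFailedException;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;
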
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
props.put(ConsumerConfig.GROUP_ID_CONFIG, "etl-payments-g1");
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "500");

try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
    consumer.subscribe(Collections.singletonList("payments"));
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
        // Preserve per-partition ordering while processing
        for (TopicPartition tp : records.partitions()) {
            List<ConsumerRecord<String, String>> tpRecords = records.records(tp);
            for (ConsumerRecord<String, String> r : tpRecords) {
                // 1) Durable write to the external system (idempotent / replayable)
                writeToDbIdempotent(r.key(), r.value());
            }
            // 2) Sync-commit lastOffset+1 for each partition
            long lastOffset = tpRecords.get(tpRecords.size() - 1).offset();
            Map<TopicPartition, OffsetAndMetadata> commit = new HashMap<>();
            commit.put(tp, new OffsetAndMetadata(lastOffset + 1));
            try {
                consumer.commitSync(commit);
            } catch (CommitFailedException e) {
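                // Typically the group has already rebalanced and this consumer no longer owns the partition.
                // The new owner resumes from the last committed offset and reprocesses (at-least-once)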
                log.warn("commit failed, will retry by reprocessing", e);
            }
        }
    }
}

// Tip: under overload, combine consumer.pause(assignments) / resume to avoid exceeding max.poll.interval.ms
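
A quick budget check helps when tuning max.poll.records against processing time. The helper below is a rough sketch: PollBudget, its methods, and the headroom percentage are hypothetical, and it assumes records are processed sequentially at a roughly constant per-record cost. The values 500 and 300000 ms in main are the client defaults for max.poll.records and max.poll.interval.ms.

```java
// Hypothetical helper for sizing a poll batch against max.poll.interval.ms.
class PollBudget {

    // True if processing a full batch sequentially would overrun the poll interval.
    static boolean exceedsPollInterval(int maxPollRecords, long perRecordMs, long maxPollIntervalMs) {
        return (long) maxPollRecords * perRecordMs > maxPollIntervalMs;
    }

    // Largest batch that fits within the given share (in percent) of the poll interval.
    static int maxSafeRecords(long perRecordMs, long maxPollIntervalMs, int headroomPercent) {
        if (perRecordMs <= 0) {
            throw new IllegalArgumentException("perRecordMs must be positive");
        }
        long budgetMs = maxPollIntervalMs * headroomPercent / 100;
        return (int) Math.max(1, budgetMs / perRecordMs);
    }

    public static void main(String[] args) {
        // 500 records at 700 ms each against a 300000 ms interval
        System.out.println(exceedsPollInterval(500, 700, 300_000));
        System.out.println(maxSafeRecords(700, 300_000, 50));
    }
}
```

If the batch does not fit, either lower max.poll.records, raise max.poll.interval.ms, or pause the partitions while the slow work runs.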

Rebalance Pitfalls and How to Address Them

Rebalances occur on member join/leave or subscription changes. If processed offsets aren't committed before partitions are revoked, the inheriting consumer reprocesses the gap as duplicates (acceptable under at-least-once, but the volume of duplicates grows).

The standard pattern is to register a ConsumerRebalanceListener and commit immediately in onPartitionsRevoked. The cooperative sticky assignor still requires the same care on each incremental revoke. If the gap between poll() calls exceeds max.poll.interval.ms, typically because processing takes too long, the consumer is removed from the group and a rebalance follows, so use pause/resume and batch tuning to stay within the limit.

  • Commit the latest fully-processed position in onPartitionsRevoked
  • For long processing, pause the partitions, keep calling poll() while the external write runs, then resume
  • session.timeout.ms and heartbeat.interval.ms affect group stability
  • Commit too often → more overhead; commit too rarely → more duplicate reprocessing
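
The bookkeeping needed for a commit on revoke can be sketched without the client library. ProcessedOffsetTracker below is a hypothetical helper that uses plain partition numbers; in a real consumer the equivalent map would be a Map<TopicPartition, OffsetAndMetadata> passed to commitSync from inside onPartitionsRevoked.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical bookkeeping for "commit the latest fully-processed position on revoke".
class ProcessedOffsetTracker {
    private final Map<Integer, Long> lastProcessed = new HashMap<>();

    // Call after the side effect for this record has completed.
    void markProcessed(int partition, long offset) {
        lastProcessed.merge(partition, offset, Long::max);
    }

    // Call when partitions are revoked: returns partition -> next offset to read,
    // and stops tracking those partitions because another consumer now owns them.
    Map<Integer, Long> offsetsToCommitOnRevoke(Collection<Integer> revokedPartitions) {
        Map<Integer, Long> toCommit = new HashMap<>();
        for (Integer partition : revokedPartitions) {
            Long last = lastProcessed.remove(partition);
            if (last != null) {
                toCommit.put(partition, last + 1); // commit last processed + 1
            }
        }
        return toCommit;
    }

    public static void main(String[] args) {
        ProcessedOffsetTracker tracker = new ProcessedOffsetTracker();
        tracker.markProcessed(0, 10);
        tracker.markProcessed(0, 11);
        tracker.markProcessed(1, 3);
        System.out.println(tracker.offsetsToCommitOnRevoke(List.of(0)));
    }
}
```

Partitions with no processed records are left out of the map, so nothing is committed for them and their previous commit stays in effect.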

Offset Reset and Reprocessing Fundamentals

When no commit exists, or the committed offset is out of range (e.g., the data expired past the retention window), the start position is governed by auto.offset.reset (earliest, latest, or none). To choose the position yourself, seek explicitly with seekToBeginning/seekToEnd or reset the group's offsets with the admin tooling.

When you need to reprocess, seek a specific partition back to an earlier offset after assignment, or read "as a brand new consumer group" under a different group.id. In either case, assume duplicates under at-least-once and keep downstream writes idempotent for safety.

  • Control temporary rewinds with assign/seek (after assignment)
  • For broad redo, consider kafka-consumer-groups --reset-offsets
  • auto.offset.reset applies only when no valid committed offset exists (nothing committed, or the committed offset is out of range)
  • Limit the affected partitions to localize the impact of reprocessing
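
The decision about where a consumer starts can be modeled as a small function. This is a simplified sketch of the documented behavior, not client code: StartPositionResolver is a hypothetical name, logStartOffset and logEndOffset stand for the partition's first retained offset and next offset to be written, and an IllegalStateException stands in for the exception the client raises when auto.offset.reset=none.

```java
import java.util.OptionalLong;

// Simplified, hypothetical model of how the starting position is chosen.
class StartPositionResolver {

    static long resolve(OptionalLong committed, long logStartOffset, long logEndOffset,
                        String autoOffsetReset) {
        if (committed.isPresent()) {
            long c = committed.getAsLong();
            if (c >= logStartOffset && c <= logEndOffset) {
                return c; // a valid commit always wins over auto.offset.reset
            }
        }
        // No commit, or the committed offset is out of range: apply the reset policy.
        switch (autoOffsetReset) {
            case "earliest":
                return logStartOffset;
            case "latest":
                return logEndOffset;
            default:
                throw new IllegalStateException("no valid offset and reset policy is " + autoOffsetReset);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve(OptionalLong.of(5), 0, 10, "latest"));
        System.out.println(resolve(OptionalLong.empty(), 0, 10, "earliest"));
        System.out.println(resolve(OptionalLong.of(2), 4, 10, "earliest"));
    }
}
```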

CCDAK Exam Tips and Common Misconceptions

What's committed is the "next position to read," not the offset of the last processed record itself. With that in mind, you should be able to explain the restart position and duplication patterns under failure.

Committing offsets for multiple partitions in one call does not atomically bind the commit to writes against an external system. If a failure lands between the external side effect and the commit, duplicates can occur. To target exactly-once for Kafka-to-Kafka pipelines, choose a design built on transactions (sendOffsetsToTransaction on the producer, read_committed on downstream consumers); writes to external systems still need idempotency.

  • at-most-once = commit before processing, risk of loss
  • at-least-once = commit after processing, duplicates accepted
  • exactly-once = designed around EOS (transactions) and idempotency
  • Exceeding max.poll.interval.ms → rebalance → more duplicates from uncommitted work
  • __consumer_offsets is an internal topic (don't update it directly)
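
The idea behind the transactional design, that outputs and consumed offsets become visible together or not at all, can be illustrated with a toy model. AtomicOffsetsModel and its methods are hypothetical and only mimic the shape of the idea; the real mechanism is the producer's transaction API (including sendOffsetsToTransaction) together with read_committed consumers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model: staged outputs and staged offsets are published in one step or discarded.
class AtomicOffsetsModel {
    private final List<String> visibleOutputs = new ArrayList<>();
    private final Map<Integer, Long> committedOffsets = new HashMap<>();
    private final List<String> pendingOutputs = new ArrayList<>();
    private final Map<Integer, Long> pendingOffsets = new HashMap<>();

    void send(String output) {
        pendingOutputs.add(output);
    }

    void sendOffsets(int partition, long nextOffset) {
        pendingOffsets.put(partition, nextOffset);
    }

    // Publish outputs and offsets together.
    void commit() {
        visibleOutputs.addAll(pendingOutputs);
        committedOffsets.putAll(pendingOffsets);
        clearPending();
    }

    // Discard both: the input will be read again and no output is visible.
    void abort() {
        clearPending();
    }

    private void clearPending() {
        pendingOutputs.clear();
        pendingOffsets.clear();
    }

    List<String> visibleOutputs() {
        return new ArrayList<>(visibleOutputs);
    }

    long committedOffset(int partition) {
        return committedOffsets.getOrDefault(partition, 0L);
    }

    public static void main(String[] args) {
        AtomicOffsetsModel tx = new AtomicOffsetsModel();
        tx.send("out-0");
        tx.sendOffsets(0, 1);
        tx.abort();  // failure: neither the output nor the offset is published
        tx.send("out-0");
        tx.sendOffsets(0, 1);
        tx.commit(); // retry succeeds: both are published exactly once
        System.out.println(tx.visibleOutputs() + " @ " + tx.committedOffset(0));
    }
}
```

Because the aborted attempt leaves no visible output and no advanced offset, the retry neither loses nor duplicates the record from the point of view of a reader that only sees committed data.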

Check Your Understanding

CCDAK

Question 1

A Kafka consumer performs durable writes to an external database and must preserve at-least-once. Which approach is most appropriate?

  A. Set enable.auto.commit=true and shorten auto.commit.interval.ms
  B. Set enable.auto.commit=false and call commitSync() after processing completes
  C. Call commitAsync() before processing to minimize latency, then write
  D. Set max.poll.records=1 to prevent duplicates

Correct answer: B

A synchronous commit after processing (the durable external write) completes is the foundation of at-least-once, so B is correct. A leaves commit timing to the client, independent of whether the write has finished; C commits before processing and risks loss; D doesn't fundamentally prevent duplicates.

Frequently Asked Questions

Is the committed value the "last processed offset" or the "next offset to read"?

It's the "next offset to read." For example, if the last processed offset was 4, you commit 5. On restart, the consumer resumes from 5 and offsets 0-4 are not reprocessed.

How can I completely eliminate duplicates?

Manual commits on the consumer side alone cannot prevent duplicates. To deduplicate, make the downstream system idempotent (upserts on the same key, unique constraints) or use Kafka transactions (producer EOS plus sendOffsetsToTransaction, with the consumer using read_committed).

How can I rewind offsets to reprocess historical data?

After assignment, seek a specific partition back to an earlier offset, or use the admin tool (kafka-consumer-groups --reset-offsets) to adjust the group's offsets. For broad reprocessing, using a fresh group.id is also a common approach.

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.

