Kafka Streams Windowing: Tumbling, Hopping, Session (2026)

Windowing is the time-axis design decision: which records belong together, and where do you draw the boundary. In Kafka, you almost always reason about this in terms of event time.

The first step is distinguishing Tumbling (fixed-length, non-overlapping), Hopping (fixed-length, overlapping), and Session (variable-length, mergeable) — and picking the right one for the job.

Windowing Basics and Time Semantics

Kafka's windowing slices a continuous stream by time so you can aggregate or join records inside each slice. Both the Kafka Streams DSL and ksqlDB default to event time (the record's timestamp). Stream time advances to the maximum event timestamp observed so far.

To handle late events (records that arrive out of order), you configure how long after the window ends you are still willing to accept them — the Grace period. A window closes when stream-time reaches window-end + grace, and records belonging to that window after that point are discarded as late.

To handle time correctly, configure TimestampExtractor (in Kafka Streams) or the TIMESTAMP column (in ksqlDB) so each record carries the event time you actually want to window on.

Event time is the default. Approximating with processing time is discouraged on the exam and in production.
Stream time advances to the highest timestamp seen. A single record with a high timestamp can close older windows en masse.
Late tolerance is set explicitly via Grace. Too short and you lose accuracy; too long and both latency and memory usage climb.

Tumbling vs. Hopping vs. Session (Comparison Table)

The three window types differ in how their boundaries are formed and whether they overlap. Pick one based on your requirements: whether duplicate counting is acceptable, whether you need to detect user-activity clusters, and the reporting granularity you want. CCDAK repeatedly tests the definitions and the right use case for each.

Session windows are variable-length and split by an inactivity gap. The key fact — both for production and the exam — is that a late event can cause two previously separate sessions to merge.

Tumbling: fixed length, non-overlapping. Ideal for per-minute KPIs and any case where you want exactly one result per window.
Hopping: fixed length, overlapping. Great for sliding analytics (moving averages, rolling counts). Use it knowing that records will be counted multiple times.
Session: variable length, with merging. The strongest choice for capturing bursts of user activity as a single unit.

Type	Boundary / Length	Overlap	Grace (Late Tolerance)
Tumbling	Fixed length. Example: every minute [00:00, 00:01)	None (non-overlapping)	Accepts records up to Grace after window-end
Hopping	Fixed length plus an advance step. Example: 5-minute size with a 1-minute advance	Yes (the same event lands in multiple windows)	Accepts records up to Grace after window-end (per window)
Session	Variable length. Split by an inactivity gap and extended by new activity	Mergeable (late events can join sessions together)	Accepts records up to Grace after the session ends

Late Events, Grace, and Stream Time (Walk-through)

A window stays open until stream-time reaches window-end + Grace. Because stream-time advances to the maximum observed timestamp, a single future-dated record can suddenly close several older windows at once.

The diagram below uses Tumbling windows [0,5) and [5,10) with Grace=2. A late event with ts=4 that arrives while stream-time is 6 is still admitted, because window [0,5) stays open until 5+2=7.

Window close condition: stream-time >= window-end + grace
Late records are admitted only before the window closes. Anything arriving afterward is dropped (observable via metrics).
Records with high timestamps push stream-time forward, so understand your data's timestamp distribution at design time.

Visualizing window-end and Grace (Tumbling, Grace=2)

time --->
0    2    4    6    8   10
|----|----|----|----|----|
[ W1:0-5 )                [ W2:5-10 )
|-------------------------|-------------------------

イベント到着:
 e1: ts=3, arrive@3   -> 属する: W1（オンタイム）
 e2: ts=6, arrive@6   -> 属する: W2（オンタイム）=> stream-time = 6
 e3: ts=4, arrive@6   -> 遅延だが W1 に取り込み可能（stream-time 6 < 5+Grace 7）
 e4: ts=1, arrive@9   -> W1は既に閉鎖（stream-time>=7）=> 遅着として破棄

Implementation Patterns (Kafka Streams DSL and ksqlDB)

In the Kafka Streams DSL, use TimeWindows.ofSizeAndGrace and SessionWindows.ofInactivityGapAndGrace to declare the window size and Grace explicitly. For Hopping, set the advance step with advanceBy. Session windows require a merge function (the Aggregator Merger).

In ksqlDB, the WINDOW clause selects TUMBLING / HOPPING / SESSION, paired with SIZE, ADVANCE BY, and GRACE PERIOD. Add EMIT FINAL when you only want results once the window closes (suppressing intermediate updates). Exact clause placement and support vary by version, so check the documentation for the version you run before going to production.

Kafka Streams: use ofSizeAndGrace / ofInactivityGapAndGrace. Set retention to at least window size + Grace.
Hopping: set the advance step with advanceBy. Plan for duplicate aggregation downstream.
ksqlDB: use the WINDOW clause and GRACE PERIOD, plus EMIT FINAL when you need it.

Sample code for Kafka Streams DSL and ksqlDB

// Kafka Streams (Java)
KStream<String, String> source = builder.stream("input");

// 1) Tumbling 1分 + Grace 5分（イベント時刻ベース）
KTable<Windowed<String>, Long> tumbling = source
    .groupByKey()
    .windowedBy(TimeWindows.ofSizeAndGrace(Duration.ofMinutes(1), Duration.ofMinutes(5)))
    .count();

// 必要ならウィンドウ閉鎖後のみ出力（中間を抑制）
KTable<Windowed<String>, Long> tumblingFinal = tumbling
    .suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()));

// 2) Hopping（長さ5分・1分スライド）+ Grace 5分
KTable<Windowed<String>, Long> hopping = source
    .groupByKey()
    .windowedBy(
        TimeWindows.ofSizeAndGrace(Duration.ofMinutes(5), Duration.ofMinutes(5))
                   .advanceBy(Duration.ofMinutes(1))
    )
    .count();

// 3) Session（非活動ギャップ5分・Grace 10分）
KTable<Windowed<String>, Long> sessions = source
    .groupByKey()
    .windowedBy(SessionWindows.ofInactivityGapAndGrace(Duration.ofMinutes(5), Duration.ofMinutes(10)))
    .aggregate(
        () -> 0L,
        (key, value, agg) -> agg + 1,
        (agg1, agg2) -> agg1 + agg2, // セッションマージ時の結合関数
        Materialized.with(Serdes.String(), Serdes.Long())
    );

// ksqlDB
-- ストリーム定義（イベント時刻としてtsを使用）
CREATE STREAM clicks (user_id VARCHAR, ts BIGINT)
  WITH (KAFKA_TOPIC='clicks', VALUE_FORMAT='JSON', TIMESTAMP='ts');

-- Tumbling 1分 + Grace 5分、最終結果のみ
CREATE TABLE clicks_tumbling AS
  SELECT user_id, COUNT(*) AS c
  FROM clicks
  WINDOW TUMBLING (SIZE 1 MINUTE, GRACE PERIOD 5 MINUTES)
  GROUP BY user_id
  EMIT FINAL;

-- Hopping（長さ5分・1分スライド）+ Grace 5分
CREATE TABLE clicks_hopping AS
  SELECT user_id, COUNT(*) AS c
  FROM clicks
  WINDOW HOPPING (SIZE 5 MINUTES, ADVANCE BY 1 MINUTE, GRACE PERIOD 5 MINUTES)
  GROUP BY user_id
  EMIT FINAL;

-- Session（ギャップ60秒・Grace 5分）
CREATE TABLE clicks_session AS
  SELECT user_id, COUNT(*) AS c
  FROM clicks
  WINDOW SESSION (60 SECONDS, GRACE PERIOD 5 MINUTES)
  GROUP BY user_id
  EMIT FINAL;

-- 注: 実運用ではバージョンによりWINDOW/GRACE句の詳細が異なる場合があります。

State Store Retention, Suppression, and Output Strategy

Windowed aggregations keep internal state (the window store). Retention must be at least window size + Grace. When using ofSizeAndGrace or ofInactivityGapAndGrace in Kafka Streams, configuring retention to satisfy that constraint is the standard practice in production.

Suppression buffers intermediate results until the window closes, then emits only the final value downstream. untilWindowCloses is clean conceptually, but the buffer grows and adds memory pressure, so evaluate it against your load and latency requirements. EMIT FINAL plays a similar role in ksqlDB.

Retention should be size + Grace at minimum. Less than that, and late events cannot be accepted.
Suppression / EMIT FINAL trades latency for memory. Use it when you only need the final KPI value.
High-cardinality keys make store size balloon, so plan key design and compacted topics together.

CCDAK Exam Tips and Common Pitfalls

Recurring exam topics include window-type definitions, event time vs. processing time, late tolerance via Grace, and session merge behavior. Memorize two facts in particular: the merge condition for sessions (a late event spanning the inactivity gap joins two sessions) and the window close condition (stream-time >= window-end + grace).

Hopping inherently double-counts, Suppression and EMIT FINAL each have specific use cases, and insufficient retention silently breaks late-event handling. These are misunderstood in production and weaponized as trick questions on the exam.

Do not forget to configure TimestampExtractor / TIMESTAMP. Do not fall back to processing-time approximations.
Session windows recompute their aggregate on merge — a merge function is required.
Window closing is driven by stream-time, not by record arrival time.

Check Your Understanding

CCDAK

問題 1

You want a 1-minute Tumbling aggregation in Kafka Streams that uses event time and accepts late events up to 5 minutes after window-end. Which configuration is most appropriate?

Pass TimeWindows.ofSizeAndGrace(Duration.ofMinutes(1), Duration.ofMinutes(5)) to windowedBy
Use TimeWindows.of(Duration.ofMinutes(1)).advanceBy(Duration.ofMinutes(1))
Configure only Suppressed.untilWindowCloses to allow 5 minutes of lateness
Set Grace to 0 and only extend Materialized#withRetention

正解: A

Late tolerance is controlled by Grace, so ofSizeAndGrace(1 minute, 5 minutes) is correct. advanceBy sets the advance step for Hopping, not late tolerance. Suppression controls output timing — separate from late acceptance. Retention must be large enough, but with Grace set to 0, late events are still rejected.

Frequently Asked Questions

What happens to late events that arrive after Grace expires? Can they still be captured?

Once the window closes (stream-time >= window-end + grace), any records that arrive afterward are dropped (Kafka Streams counts them in a metric). If you need to capture them, implement your own lateness check upstream of the window (for example, compare current stream-time to the record's timestamp) and route late records to a separate topic. ksqlDB does not provide a built-in late-record output.

When do session windows merge, and what gets recomputed?

A merge happens when two sessions for the same key — previously separated by an inactivity gap — get bridged by a late-arriving event. On merge, the start and end timestamps are updated and the aggregate is recomputed via the merge function (the aggregator merger).

Should I use event time or processing time?

Both Kafka Streams and ksqlDB are designed around event time. Configure TimestampExtractor or the TIMESTAMP column properly so windowing uses the time embedded in the data itself. Processing time is fine for latency measurements or rough dashboards, but it is not appropriate for accurate aggregations.

Check what you learned with practice questions

Practice with certification-focused question sets

無料で問題を解いてみる

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.

Kafka Windowing: When to Use Tumbling, Hopping, and Session Windows