Kafka Broker Sizing Guide: CPU, Memory, and Disk Design

2026-04-19
NicheeLab Editorial Team

Kafka broker performance is determined by the balance of CPU, memory (JVM heap and OS page cache), disk, and network. Rather than chasing specific numbers, identifying the primary resource bottleneck from your workload (message size, compression, replication, consumer count, whether compaction is enabled) leads to more stable sizing.

This article walks through design and validation steps aligned with the core concepts that frequently appear on CCAAK (Confluent's administrator certification): page-cache priority, sequential I/O, acks and min.insync.replicas, and partitions and parallelism.

Workload Assumptions and Throughput Model

Start by deciding how many bytes per second you ingest and emit. Let B_in be producer input throughput (MB/s), RF the replication factor, and CG the number of consumer groups consuming concurrently. Network inbound is roughly B_in, inter-broker replication outbound is B_in × (RF − 1), and client-facing outbound is B_in × CG. Writes are sequential and absorbed by the OS page cache, so I/O does not immediately become random (heavy compaction or re-reads will increase disk load).

Sizing is about identifying which resource saturates first. Heavy compression or TLS pushes CPU to the limit, while heavy compaction or many readers stresses disk bandwidth and the page cache. CCAAK frequently tests two facts: Kafka writes logs sequentially and relies on the OS page cache, and replication amplifies network and disk load.

The table below maps typical workloads to their primary bottlenecks. Calibrate with actual measurements in practice.

  • Important: raising RF increases both network and disk load linearly.
  • When multiple consumer groups tail concurrently, broker outbound bandwidth approaches B_in × the number of groups.
  • For small messages, CPU and thread-switching costs tend to dominate over raw throughput.

Workload | Primary bottleneck | CPU strategy | Memory strategy
--- | --- | --- | ---
Low-latency writes (larger batches) | Network I/O and sequential writes | Moderate core count, threads matched to NIC queues | Prioritize page cache (modest heap)
Heavy compression/TLS (smaller messages) | CPU (compression and encryption) | More vCPUs; prefer LZ4/Snappy | Reserve headroom for GC stability
Heavy log compaction | Disk bandwidth and IOPS, background cleaner | Mid to high (cleaner compression load) | Give the page cache more memory
Many consumer groups with heavy reads | Outbound network and page cache | Moderate (deserialization load) | Prioritize page cache

Rough formula for an initial broker count estimate

Input throughput B_in_MBps
Replication factor RF
Concurrent consumer groups CG
Allowed inbound bandwidth per broker C_in_MBps
Allowed outbound bandwidth per broker C_out_MBps
Allowed disk bandwidth per broker C_disk_MBps

Network inbound load   ≈ B_in_MBps
Network outbound load  ≈ B_in_MBps × (RF - 1 + CG)
Disk write load        ≈ B_in_MBps × RF   (sequential I/O; compaction etc. can push it higher)

Calculate N for each resource and take the max:
N_in   = ceil( B_in_MBps                  / C_in_MBps )
N_out  = ceil( B_in_MBps × (RF - 1 + CG)  / C_out_MBps )
N_disk = ceil( B_in_MBps × RF             / C_disk_MBps )
N = max(N_in, N_out, N_disk) × headroom factor (e.g. 1.3-1.5)

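Note: refresh C_* with measurements and iterate. N_in counts producer traffic only; followers also receive the replication traffic that leaders send, so validate the inbound limit under replication load as well.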
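
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive sizing tool: the function name `estimate_brokers`, the `BrokerLimits` class, and all input numbers are hypothetical, and the result is rounded up again after the headroom factor because brokers are whole machines.

```python
import math
from dataclasses import dataclass


@dataclass
class BrokerLimits:
    """Measured per-broker limits in MB/s (the C_* values in the formula)."""
    c_in: float
    c_out: float
    c_disk: float


def estimate_brokers(b_in, rf, cg, limits, headroom=1.3):
    """Return the per-resource broker counts and the overall estimate."""
    n_in = math.ceil(b_in / limits.c_in)
    n_out = math.ceil(b_in * (rf - 1 + cg) / limits.c_out)
    n_disk = math.ceil(b_in * rf / limits.c_disk)
    # Apply headroom to the tightest resource, then round up to whole brokers.
    n = math.ceil(max(n_in, n_out, n_disk) * headroom)
    return {"n_in": n_in, "n_out": n_out, "n_disk": n_disk, "n": n}


# Hypothetical inputs: 200 MB/s ingest, RF=3, 2 consumer groups.
limits = BrokerLimits(c_in=100, c_out=200, c_disk=150)
result = estimate_brokers(200, 3, 2, limits)
print(result)  # {'n_in': 2, 'n_out': 4, 'n_disk': 4, 'n': 6}
```

In this example outbound network and disk tie as the limiting resources. Whatever the formula returns, the cluster also needs at least RF brokers so that every replica can be placed on a different broker.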

CPU Sizing Principles

CPU is consumed primarily by compression, encryption, serialization/deserialization, and request handling. With compression.type=producer on the topic, the broker generally stores the received compressed batch as-is without recompressing, which reduces CPU load (recompression can still happen when down-conversion is required for compatibility).

Thread counts can be tuned via num.network.threads, num.io.threads, and num.replica.fetchers, but oversubscribing threads beyond the core count or NIC/disk parallelism inflates context switches and backfires. Heavier compression makes CPU the bottleneck faster, so prefer low-overhead codecs like LZ4 or Snappy, and raise parallelism via partition count when needed.

  • Exam tip: compression.type=producer avoids broker-side recompression, which is friendlier to CPU.
  • For small messages, per-message protocol overhead dominates; batching improves CPU efficiency.
  • Down-conversion and enabling transactions add extra CPU and disk overhead.
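
The effect of batching on small messages can be illustrated with rough arithmetic. The sketch below assumes every batch is filled completely and ignores record headers and compression, so it only shows the order of magnitude; the function name and the throughput figure are hypothetical, while 16,384 bytes is the Java producer's default batch.size.

```python
def batches_per_second(throughput_bytes_per_s, message_bytes, batch_bytes):
    """Approximate number of batches per second the broker must handle.

    Simplifying assumptions: batches are always full, and headers and
    compression are ignored.
    """
    records_per_batch = max(1, batch_bytes // message_bytes)
    messages_per_s = throughput_bytes_per_s / message_bytes
    return messages_per_s / records_per_batch


rate = 50_000_000  # hypothetical 50 MB/s of 100-byte messages
unbatched = batches_per_second(rate, 100, 100)     # one record per batch
batched = batches_per_second(rate, 100, 16_384)    # 163 records per batch
print(unbatched, round(batched))
```

Under these assumptions the same byte rate arrives as roughly 3,000 batches per second instead of 500,000, which is why batching relieves per-request CPU cost.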

Memory (Heap and OS Page Cache)

Kafka does not hold log data on the JVM heap; it leans heavily on the OS page cache to smooth out disk I/O. The JVM heap is used for metadata, the control path, network buffers, and similar items. Oversize it and GC pauses stretch out; undersize it and you hit metadata or buffer shortages. The rule of thumb: keep the heap modest and give the rest to the page cache.

A larger page cache means reads from tailing consumers are served from memory rather than disk, which keeps latency stable. For workloads with heavy compaction or rescans, dedicating even more memory to the page cache pays off.

  • Exam tip: Kafka's performance depends heavily on the OS page cache.
  • Set the heap to a moderate size prioritizing GC stability; give off-heap and the page cache more room.
  • Increasing partition count grows metadata volume and adds heap pressure.
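
One way to reason about the heap versus page cache split is to ask how many seconds of recent writes the page cache could hold. The sketch below is an upper-bound estimate under stated assumptions: the kernel shares the cache with other file activity, and the function name, the `os_reserved_gb` parameter, and all numbers are hypothetical.

```python
def page_cache_window_s(total_ram_gb, heap_gb, os_reserved_gb, write_mb_per_s):
    """Seconds of recent log writes that could fit in the page cache.

    write_mb_per_s is the broker's total write rate, including the
    replicas it hosts as a follower.
    """
    cache_gb = total_ram_gb - heap_gb - os_reserved_gb
    if cache_gb <= 0:
        raise ValueError("no memory left for the page cache")
    return cache_gb * 1024 / write_mb_per_s


# Hypothetical broker: 64 GB RAM, 6 GB heap, 2 GB for the OS and other
# processes, writing 150 MB/s to disk.
window = page_cache_window_s(64, 6, 2, 150)
print(round(window))  # about 382 seconds
```

Consumers that lag by less than this window are likely to be served from memory; a larger heap shrinks the window without helping log reads.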

Figure: Broker data path and the role of memory. Labels in the diagram: Producer, Network Threads, Page Cache (OS), Disk, Heap (Req/Meta), Consumer (Fetch/Read).

Disk and Log Layout

Kafka logs are written sequentially and asynchronously flushed from the OS page cache. With multiple disks, the common pattern is to enumerate a directory per mount in log.dirs and run JBOD-style. Since replication provides fault tolerance, RAID redundancy inside a single broker is not required. Prioritize sequential write bandwidth over I/O consistency.

Log segment size (log.segment.bytes) and retention policy (retention.ms/bytes) directly drive disk usage and cleaner workload. Heavy compaction calls for fast SSD/NVMe, ample bandwidth, and a properly tuned log.cleaner.threads.

  • Exam tip: Kafka assumes sequential I/O and the page cache. JBOD is recommended over RAID in most contexts (replication is assumed).
  • Assign multiple log.dirs to separate physical disks; a single partition lives in one directory.
  • Tighter retention saves disk space but can increase background compaction and delete I/O.
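
The link between retention and disk usage can be made concrete with a rough capacity estimate. This sketch assumes time-based retention, an even spread of partitions across brokers, and an average (not peak) ingest rate measured after compression; the function name, the `free_fraction` parameter, and the inputs are hypothetical.

```python
def disk_per_broker_tb(b_in_mb_s, rf, retention_hours, brokers, free_fraction=0.3):
    """Approximate disk capacity needed per broker, in TB (1 TB = 1,000,000 MB)."""
    retained_mb = b_in_mb_s * rf * retention_hours * 3600
    per_broker_mb = retained_mb / brokers
    # Keep free_fraction of each disk unused for growth and reassignment.
    return per_broker_mb / (1 - free_fraction) / 1_000_000


# Hypothetical cluster: 50 MB/s average ingest, RF=3, 24 h retention, 6 brokers.
capacity = disk_per_broker_tb(50, 3, 24, 6)
print(round(capacity, 2))  # 3.09
```

Doubling either RF or the retention period doubles the result, so both settings belong in the capacity plan alongside the bandwidth figures.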

Replication and Availability Trade-offs

Raising RF increases per-write network output (to followers) and disk writes. High-availability configurations (acks=all combined with min.insync.replicas) push up the cost in capacity and throughput. The exam frequently asks about the relationship between acks=all and min.insync.replicas, and the conditions under which writes are refused when the ISR shrinks.

Enabling rack-aware placement strengthens resilience to a single-rack failure, but inter-broker traffic now crosses racks, so east-west network bandwidth needs attention too.

  • With RF=3, acks=all, and min.insync.replicas=2, writes continue even if one broker fails (as long as the ISR stays at 2 or more).
  • Raising RF linearly increases N_out and N_disk, so reflect this in your broker-count estimate.
  • Rack-aware placement spreads a partition's replicas across different racks.
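
The relationship between acks=all, min.insync.replicas, and the ISR size described above reduces to a simple comparison. The helper names below are hypothetical and the sketch models only the acceptance rule, not leader election or unclean recovery.

```python
def accepts_acks_all_writes(isr_size, min_insync_replicas):
    """True if a partition with this ISR size accepts acks=all produce requests."""
    return isr_size >= min_insync_replicas


def tolerated_replica_failures(rf, min_insync_replicas):
    """Replicas that can leave the ISR before acks=all writes are refused."""
    return rf - min_insync_replicas


rf, min_isr = 3, 2
for failed in range(rf):
    isr = rf - failed
    print(f"failed={failed} isr={isr} accepts={accepts_acks_all_writes(isr, min_isr)}")
```

With RF=3 and min.insync.replicas=2 the loop shows writes continuing with one failed replica and being refused with two; in that state the producer receives a NotEnoughReplicasException.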

Capacity Planning Procedure and Validation

Plan around the cycle of hypothesize → validate → refine. Collect peaks as well as averages, along with the message-size distribution and whether batching is used. Estimate post-scaling traffic including RF and CG, run load tests on a small number of nodes to measure the actual single-broker limits (C_in, C_out, C_disk), and recompute the required broker count with headroom, such as the 1.3-1.5 factor used in the estimate formula.

In operation, monitor BytesInPerSec/BytesOutPerSec, RequestHandlerAvgIdlePercent, NetworkProcessorAvgIdlePercent, UnderReplicatedPartitions, IsrShrinksPerSec, LogFlush/Fetch/Produce latencies, disk usage and I/O wait, and GC metrics so you can spot bottlenecks early and decide when to scale out.

  • Aggregate workload characteristics (B_in, average and p95 message size, RF, CG, whether compaction is used).
  • Run producer/consumer load tests on a small cluster to measure C_in, C_out, and C_disk.
  • Calculate broker count from the formula, then add headroom for future growth and maintenance.
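
The monitoring step above can be sketched as a small check over a metrics snapshot. Everything here is illustrative: the function name is hypothetical, the 0.3 idle floor is an assumed threshold rather than an official limit, the idle metrics are assumed to be fractions between 0 and 1, and collecting the values (for example over JMX) is outside the sketch.

```python
def find_warnings(metrics, idle_floor=0.3):
    """Return human-readable warnings for a snapshot of broker metrics.

    `metrics` is a plain dict keyed by metric name.
    """
    warnings = []
    if metrics.get("UnderReplicatedPartitions", 0) > 0:
        warnings.append("replication is falling behind")
    for name in ("RequestHandlerAvgIdlePercent", "NetworkProcessorAvgIdlePercent"):
        # Low idle ratios suggest the corresponding thread pool is saturating.
        if metrics.get(name, 1.0) < idle_floor:
            warnings.append(f"{name} below {idle_floor}")
    return warnings


snapshot = {
    "UnderReplicatedPartitions": 0,
    "RequestHandlerAvgIdlePercent": 0.22,
    "NetworkProcessorAvgIdlePercent": 0.61,
}
print(find_warnings(snapshot))  # ['RequestHandlerAvgIdlePercent below 0.3']
```

A check like this is only a starting point; thresholds should be tuned against the limits measured in the load tests.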

Check Your Understanding

CCAAK

Question 1

Which statement best describes Kafka broker disk design? Assume replication is used to provide high availability.

  1. Because Kafka relies on the OS page cache and uses sequential I/O extensively, prefer JBOD (multiple physical disks enumerated in log.dirs) over RAID redundancy inside a single broker, and rely on replication for availability.
  2. Random I/O dominates, so RAID10 is required for every workload to maintain consistency.
  3. The page cache is barely used, so maximizing the heap to reduce disk dependence is the right call.
  4. Raising the replication factor barely changes network or disk load.

Correct answer: 1

Per the official design principles, Kafka assumes sequential I/O and the OS page cache, with fault tolerance provided by replication. JBOD (assigning multiple log.dirs to separate physical disks) is the common pattern, and raising RF increases network and disk load linearly.

Frequently Asked Questions

Should I size based on average load or peak load?

Size the broker count based on peak load and add headroom for future growth, maintenance windows, and temporary skew during incidents. Sizing for the average leaves you saturated during spikes, which hurts latency and shrinks the ISR.
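
A small worked example shows how far the two approaches can diverge. The throughput samples and the per-broker inbound limit below are hypothetical, and only the inbound resource is considered to keep the comparison short.

```python
import math

# Hypothetical ingest samples in MB/s, including two spikes.
samples = [80, 90, 85, 240, 95, 88, 260, 92]
c_in = 100  # assumed per-broker inbound limit in MB/s

average = sum(samples) / len(samples)
peak = max(samples)

brokers_for_avg = math.ceil(average / c_in)
brokers_for_peak = math.ceil(peak / c_in)
print(average, peak, brokers_for_avg, brokers_for_peak)  # 128.75 260 2 3
```

A cluster sized for the 128.75 MB/s average would be saturated during both spikes, before any headroom is even considered.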

Which is recommended, JBOD or RAID?

Because Kafka relies on replication, JBOD (assigning multiple independent disks via log.dirs) is more common than RAID redundancy inside a single broker. The goal is sequential write bandwidth; fault tolerance comes from RF, the ISR, and rack-aware placement.

Are there guidelines for the JVM heap size?

Keep the heap modest, prioritizing GC stability, and leave the remaining memory to the OS page cache. Oversizing the heap does not help because log data does not live on the heap, and it can actually make GC pauses worse.
