Kafka Quotas and Bandwidth Control: Designing and Implementing Per-Client Limits

2026-04-19
NicheeLab Editorial Team

When you run Kafka across multiple tenants or business units, heavy traffic from one client can easily starve the others. Kafka quotas exist to tame this resource contention and hand each client a predictable share of throughput.

This article walks through how to design, configure, and observe quotas for stable cluster operations, drawing on the official documentation and going deep enough to cover what the CCAAK certification asks about.

Kafka Quota Basics: What You Limit and How

There are three primary Kafka client quotas: producer_byte_rate (bytes per second a producer may write), consumer_byte_rate (bytes per second a consumer may fetch), and request_percentage (the percentage of time a client may use on a broker's request handler and network threads). When a limit is exceeded the broker does not reject the request; it throttles the client by delaying it, and the client learns the length of the delay from throttle_time_ms in the response.

The limit is smoothed over an observation window, so short bursts are tolerated while the average rate is enforced (controlled on the broker side by quota.window.num and quota.window.size.seconds). Quotas can be attached to a user, a client-id, a user+client-id combination, or a default entity that covers every client not matched explicitly. The most specific matching entity wins, and a default applies only when nothing more specific matches.

  • Target metrics: producer_byte_rate, consumer_byte_rate, request_percentage
  • Control mechanism: soft throttling via response delay (throttle_time_ms)
  • Smoothing: averages across an observation window, absorbing momentary bursts
  • Scope of application: user, client-id, user+client-id, default (wildcard)
  • Precedence: more specific configurations win (user+client-id > user > client-id > default)
| Entity | Scope | Typical use | Sample key |
| --- | --- | --- | --- |
| user | Authenticated user (SASL Principal) | Total cap per department/team | users:alice |
| client-id | Per application/process | Allocate across apps owned by the same user | clients:etl-writer |
| user+client-id | User x app combination | Tighten only one specific app within a user | users:alice,clients:etl-writer |
| default | Default shared by all clients | Initial cap for new or unclassified clients | users:* or clients:* |

Throttling flow (conceptual)

[Figure: Client A (id=A, user=alice) and Client B (id=B, user=bob) each exchange Produce/Fetch requests and delayed responses with the broker, whose per-entity Quota Manager tracks the window average and applies the throttle.]
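
The window averaging and delay shown in the flow can be sketched in a few lines of Python. This is a toy model, not the broker's code: the class WindowedQuota and its methods are hypothetical names, the defaults only mirror quota.window.num (11) and quota.window.size.seconds (1), and the delay formula assumes the broker waits just long enough for the windowed average to fall back to the limit.

```python
from collections import deque


class WindowedQuota:
    """Toy model of a byte-rate quota averaged over fixed-size windows.

    Hypothetical class for illustration; it approximates the idea of
    window averaging and is not Kafka's actual implementation.
    """

    def __init__(self, limit_bytes_per_sec, num_windows=11, window_size_sec=1.0):
        self.limit = limit_bytes_per_sec
        self.num_windows = num_windows
        self.window_size = window_size_sec
        self.samples = deque()  # (window_start_time, bytes_in_window)

    def record(self, now, nbytes):
        """Record nbytes at time `now` (seconds) and return the delay in ms."""
        window_start = now - (now % self.window_size)
        if self.samples and self.samples[-1][0] == window_start:
            start, total = self.samples.pop()
            self.samples.append((start, total + nbytes))
        else:
            self.samples.append((window_start, nbytes))
        # Drop windows that have aged out of the observation period.
        oldest_allowed = window_start - (self.num_windows - 1) * self.window_size
        while self.samples[0][0] < oldest_allowed:
            self.samples.popleft()
        return self._throttle_ms(now)

    def _throttle_ms(self, now):
        elapsed = max(now - self.samples[0][0], self.window_size)
        observed = sum(b for _, b in self.samples) / elapsed
        if observed <= self.limit:
            return 0
        # Delay X chosen so that observed * elapsed / (elapsed + X) == limit.
        return int((observed - self.limit) / self.limit * elapsed * 1000)
```

In this model a client that sends 2,000 bytes within the first second against a 1,000 bytes/s limit is delayed by about one second, while a 5,000-byte burst after ten idle seconds passes without delay because the average over the window stays under the limit.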

Representative quota keys (for clients)

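# Dynamic quota keys that can be set for clients (excerpt)
# - producer_byte_rate: upper limit on bytes produced per second
# - consumer_byte_rate: upper limit on bytes fetched per second
# - request_percentage: percentage of time on request handler and network threads
# When any of these is exceeded, the broker throttles by delaying the client.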
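
Because request_percentage is expressed relative to a single thread, it helps to see how large the total budget is. The sketch below assumes the rule from the Kafka documentation that a quota of n% represents n% of one thread, so a broker's total capacity is (num.io.threads + num.network.threads) x 100%. The function names are illustrative, and the defaults of 8 I/O threads and 3 network threads are the stock broker settings.

```python
def request_capacity_percent(num_io_threads=8, num_network_threads=3):
    """Total request-time capacity of one broker, in percent of one thread."""
    return (num_io_threads + num_network_threads) * 100


def share_of_broker(request_percentage, num_io_threads=8, num_network_threads=3):
    """Fraction of a broker's request-handling time that a quota allows."""
    capacity = request_capacity_percent(num_io_threads, num_network_threads)
    return request_percentage / capacity


# With the default thread counts the capacity is 1100%, so a quota of
# request_percentage=110 allows roughly a tenth of the broker's thread time.
print(request_capacity_percent())
print(round(share_of_broker(110), 3))
```

A value such as request_percentage=50 therefore does not mean half of the broker; with default thread counts it is closer to 5% of the available thread time.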

Entities and Precedence: user, client-id, combinations, default

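Quotas can be set at multiple levels simultaneously. At evaluation time the most specific match wins, and if nothing matches the broker falls back to a more general setting. For example, if a quota is configured for the user=alice, client-id=etl combination, it is used; otherwise the broker falls back to user=alice, then client-id=etl, and finally to default. One subtlety: a default configured at the user level outranks a quota set on client-id alone, so avoid mixing user-level defaults with client-id-only quotas unless that is the intent.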

Using default lets you keep unregistered clients from running wild while selectively raising the cap for critical apps — a layered approach to enforcement.

  • Precedence (high → low): user+client-id > user > client-id > default
  • Default is your safety net. Set it properly first, then loosen on a case-by-case basis.
  • Baking a client.id naming convention into operations makes observability and allocation much easier.
| Design pattern | Benefit | Caveats |
| --- | --- | --- |
| Strict default + per-entity exceptions | Safe even for unregistered clients | Operational overhead grows as exceptions accumulate |
| Allocate per user | Easy to budget by department | Hard to differentiate between apps owned by the same user |
| Combine user+client-id | Fine-grained control | Entries pile up; periodic review is required |

Precedence in pseudocode (match order)

# Pseudocode: the first match wins
if quota.exists(user=u, client=c): use that
elif quota.exists(user=u):        use that
elif quota.exists(client=c):      use that
else:                             use default
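
The simplified chain above can be expanded into a runnable sketch. The version below follows the full match order listed in the Apache Kafka documentation, in which defaults can exist at both the user and the client-id level; the dictionary layout, the DEFAULT marker, and the function name resolve_quota are hypothetical and chosen only for illustration.

```python
DEFAULT = "<default>"  # stands in for an entity configured as a default


def resolve_quota(quotas, user, client_id):
    """Return (matched_key, value) for the most specific configured quota.

    `quotas` maps (user, client_id) tuples to a limit; None means that
    dimension is not part of the entity. Hypothetical structure.
    """
    candidates = [
        (user, client_id),     # user + client-id
        (user, DEFAULT),       # user + default client-id
        (user, None),          # user only
        (DEFAULT, client_id),  # default user + client-id
        (DEFAULT, DEFAULT),    # default user + default client-id
        (DEFAULT, None),       # default user
        (None, client_id),     # client-id only
        (None, DEFAULT),       # default client-id
    ]
    for key in candidates:
        if key in quotas:
            return key, quotas[key]
    return None, None  # nothing configured: the client is not throttled


quotas = {
    (None, DEFAULT): 1_048_576,
    ("alice", None): 2_097_152,
    ("alice", "etl-writer"): 524_288,
}
print(resolve_quota(quotas, "alice", "etl-writer"))
print(resolve_quota(quotas, "alice", "report"))
print(resolve_quota(quotas, "bob", "report"))
```

Note the position of the default-user entries: they sit above the client-id-only entries, which is why a user-level default can shadow a client-id quota.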

Configuration and Verification: kafka-configs.sh and the Admin API

Client quotas are applied to the cluster as dynamic configurations. Use kafka-configs.sh (with --bootstrap-server) or the Admin API. Changes take effect almost immediately and require no broker restart.

After configuring, verify with describe and use alter/remove as needed. The changes are stored as metadata in the cluster and take effect across all brokers.

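  • Byte-rate quotas are in bytes/second and request_percentage is a percentage; both are evaluated as averages over the observation window.
  • Configurations are defined cluster-wide (not for specific brokers), but each broker enforces the limit independently, so a client that talks to several brokers can reach the limit on each of them.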
  • Undo with --delete-config or by deleting the entity.
| Operation | CLI example | Key point |
| --- | --- | --- |
| Set | kafka-configs.sh --alter --add-config ... | Separate multiple keys with commas |
| Inspect | kafka-configs.sh --describe --entity-type ... | Be explicit about the target entity |
| Delete | kafka-configs.sh --alter --delete-config ... | When unset, falls back to a less specific level or default |

kafka-configs.sh examples (official-docs syntax)

# Per client-id
bin/kafka-configs.sh --bootstrap-server localhost:9092 \
  --alter --add-config 'producer_byte_rate=1048576,consumer_byte_rate=1048576' \
  --entity-type clients --entity-name etl-writer

# Per user (SASL Principal name)
bin/kafka-configs.sh --bootstrap-server localhost:9092 \
  --alter --add-config 'producer_byte_rate=2097152' \
  --entity-type users --entity-name alice

# user+client-id (combination)
bin/kafka-configs.sh --bootstrap-server localhost:9092 \
  --alter --add-config 'consumer_byte_rate=524288' \
  --entity-type users --entity-name alice \
  --entity-type clients --entity-name etl-writer

# Verify with describe
bin/kafka-configs.sh --bootstrap-server localhost:9092 \
  --describe --entity-type clients --entity-name etl-writer
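
When quotas are managed from scripts, it can help to generate these commands rather than type them by hand. The following is a minimal sketch that only builds the argument list and does not run anything; the helper quota_command and its signature are hypothetical, while the flags it emits (--alter, --add-config, --entity-type, --entity-name, --entity-default) are the ones kafka-configs.sh accepts.

```python
import shlex


def quota_command(configs, user=None, client_id=None, bootstrap="localhost:9092"):
    """Build a kafka-configs.sh argument list that sets client quotas.

    Pass the string "<default>" for user or client_id to target the
    default entity. Hypothetical helper for illustration only.
    """
    if user is None and client_id is None:
        raise ValueError("specify user, client_id, or both")
    args = [
        "bin/kafka-configs.sh", "--bootstrap-server", bootstrap, "--alter",
        "--add-config", ",".join(f"{k}={v}" for k, v in configs.items()),
    ]
    for entity_type, name in (("users", user), ("clients", client_id)):
        if name is None:
            continue
        args += ["--entity-type", entity_type]
        args += ["--entity-default"] if name == "<default>" else ["--entity-name", name]
    return args


cmd = quota_command({"consumer_byte_rate": 524288}, user="alice", client_id="etl-writer")
print(shlex.join(cmd))
```

Returning a list instead of a single string keeps the result safe to hand to subprocess.run without shell quoting concerns; shlex.join is used only to print a copy-pasteable form.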

Sizing the Limits: Working Backward From Effective Throughput

Set limits by accounting for expected message size, QPS, compression ratio, and overhead such as headers. Start by allocating around 80% of expected demand, then adjust step by step while watching throttle_time_ms, processing delay, and latency.

Consumer and producer sides can be asymmetric. For example, with heavy writes but reads concentrated in only a few groups, it is reasonable to keep consumer_byte_rate on the lower side.

  • Rough math: limit [bytes/s] ≈ average record size x QPS x safety factor
  • Short observation windows oscillate; design alongside smoothing settings.
  • Use request_percentage alongside, matched to your peak-latency SLO.
| Design item | Guideline | Notes |
| --- | --- | --- |
| producer_byte_rate | Start at ~80% of write peak | Raise if throttling is frequent |
| consumer_byte_rate | Match downstream processing capacity | If lag builds up while fetches are throttled, raise the limit |
| request_percentage | Relative share under contention | Limits time on request handler and network threads, not just bandwidth |

Calculation example

# Sending records averaging 10 KB at 1500 rps (assuming a similar size after compression)
# 10 * 1024 * 1500 = 15,360,000 bytes/s ≈ 14.6 MiB/s
# Initial limit: roughly 12–13 MiB/s (adjust while observing)
# → producer_byte_rate ≈ 13 * 1024 * 1024 = 13631488
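
The same arithmetic can be wrapped in a small helper so the numbers are easy to rerun as demand changes. This is a rough sketch: the function name initial_byte_rate is hypothetical, the 0.8 default reflects the article's "around 80%" starting point, and the optional brokers parameter assumes traffic is spread evenly across brokers, since each broker enforces the quota independently.

```python
def initial_byte_rate(avg_record_bytes, records_per_sec, safety_factor=0.8, brokers=1):
    """Starting value for a byte-rate quota, in bytes per second per broker.

    Assumes load is spread evenly over `brokers`; hypothetical helper.
    """
    demand = avg_record_bytes * records_per_sec
    return round(demand * safety_factor / brokers)


demand = 10 * 1024 * 1500
limit = initial_byte_rate(10 * 1024, 1500)
print(demand, limit, round(limit / 2**20, 1))
```

Treat the result as a first guess only; the throttle-time metrics decide whether the value needs to move up or down.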

Observability and Troubleshooting: throttle_time and Metrics

Clients detect throttling via the throttle_time_ms field in the response. Continuously non-zero values strongly suggest the limit is being hit; the Java clients also surface this as metrics such as produce-throttle-time-avg and fetch-throttle-time-avg. On the broker side, per-request throttle times and rates are exposed as metrics (for example, ThrottleTimeMs per request type).

Excessive throttling causes producer buffers to fill up and latency to spike, while consumers see longer fetch intervals leading to app-side delay and growing lag. Start by checking the target entity's configuration and whether default is sweeping it up.

  • Many client implementations emit warnings about throttling or waits in their logs.
  • Use describe to confirm which level (user / client-id / default) is actually in effect.
  • Short observation windows produce false positives; tune smoothing as needed.
| Symptom | Likely cause | What to check |
| --- | --- | --- |
| Rising throttle_time_ms | Quota exceeded | Limit and precedence for the target entity |
| Growing consumer lag | consumer_byte_rate is set too low | Alignment with the group's processing capacity |
| P99 latency regression | request_percentage too low or contended | Relative allocation under contention |

JMX/metric examples (names vary by environment)

# Example: throttle time per request type
# kafka.network:type=RequestMetrics,name=ThrottleTimeMs,request=Produce
# kafka.network:type=RequestMetrics,name=ThrottleTimeMs,request=FetchConsumer
# Plot these as time series on a dashboard and compare them against the target entity's configuration
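
To separate sustained throttling from an occasional spike, a simple rule over recent throttle-time samples is often enough. The sketch below is one possible heuristic, not a Kafka feature: the function name and both thresholds (80% non-zero samples, at least 5 samples) are arbitrary assumptions to be tuned for your environment.

```python
def sustained_throttling(samples_ms, min_fraction=0.8, min_samples=5):
    """Return True when most recent throttle-time samples are non-zero.

    `samples_ms` is a list of throttle times in milliseconds, for example
    scraped from client or broker metrics. Hypothetical heuristic.
    """
    if len(samples_ms) < min_samples:
        return False  # too little data to call it sustained
    nonzero = sum(1 for s in samples_ms if s > 0)
    return nonzero / len(samples_ms) >= min_fraction


print(sustained_throttling([120, 80, 95, 110, 60]))  # every sample is non-zero
print(sustained_throttling([0, 0, 300, 0, 0, 0]))    # a single isolated spike
```

Requiring a minimum number of samples keeps a short observation period from triggering an alert on its own.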

Operational Patterns and Caveats: Multi-Tenancy and Replication Differences

In multi-tenant setups, lock down default and only relax it for critical workloads or nightly batches. Standardize a client.id naming convention (such as team-app-purpose) and review the inventory each quarter to prevent rot.

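Throttling for replication traffic lives in a different configuration domain than client quotas: it is controlled by settings such as leader.replication.throttled.rate and follower.replication.throttled.rate, typically applied during operations like partition reassignment. It should not be conflated with controlling application clients.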

  • Set default first to keep unexpected new clients in check.
  • For critical workloads, grant the minimum necessary relaxation at user+client-id level.
  • Treat replication throttling settings as a separate concept (used to curb network usage during reassignment).
  • CCAAK angle: nail down the precedence and key names, the fact that throttling is implemented as delay, and the concept of observation windows.
| Target | Control mechanism | Common confusion |
| --- | --- | --- |
| Application clients | producer/consumer_byte_rate, request_percentage | User/client precedence |
| Replication | Dedicated replication throttle settings | Not a client quota |
| Unregistered clients | Strict default configuration | Exceptions must be reviewed periodically |

Operational inventory (pseudocode procedure)

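# 1) Collect the client.id values and Principals actually in use (metrics/logs)
# 2) Confirm that they are protected by default
# 3) Relax limits only for critical workloads, at the user+client-id level
# 4) Mark entries unused for 90 days or more as candidates for deletion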
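
Step 4 of the procedure is easy to automate once last-seen dates are available. The following sketch assumes you already have a mapping from quota entity to the date it was last observed (how that is collected is outside its scope); the function name, the key format, and the sample data are hypothetical.

```python
from datetime import date, timedelta


def stale_entries(last_seen, today, max_idle_days=90):
    """Return quota entities not seen for more than max_idle_days, sorted."""
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(entity for entity, seen in last_seen.items() if seen < cutoff)


# Hypothetical inventory: entity key -> date the client was last observed.
last_seen = {
    "users:alice,clients:etl-writer": date(2026, 4, 1),
    "clients:legacy-export": date(2025, 11, 30),
}
print(stale_entries(last_seen, today=date(2026, 4, 19)))
```

The function only reports candidates; removing a quota entry should remain a reviewed, manual decision.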

Check Your Understanding

CCAAK

Question 1

In a Kafka cluster, you want to suppress unclassified new clients while granting only user=analytics with client-id=nightly-batch a high write bandwidth. Which configuration is most appropriate?

  1. Set a low producer_byte_rate at default and configure a higher producer_byte_rate specifically for the user=analytics, client-id=nightly-batch combination.
  2. Set a high producer_byte_rate on client-id=nightly-batch and do nothing else.
  3. Set a high producer_byte_rate on user=analytics and do nothing else.
  4. Set the same request_percentage for all clients.

Correct answer: 1

Suppressing unclassified clients is what default is for. To relax limits only for a specific user x client pair, lean on the rule that the more specific entity (user+client-id) wins. Setting only client-id or only user could ripple to other clients or other users under those scopes.

Frequently Asked Questions

Does a quota reject requests or just delay them?

Kafka client quotas are essentially soft throttling. When the limit is exceeded the broker delays the response, and the client receives throttle_time_ms back.

Can I tell which entity-level quota is actually in effect?

Use kafka-configs.sh --describe to inspect configurations and watch whether the client-side throttle_time_ms increases. Precedence is applied in the order user+client-id > user > client-id > default.

Can I control replication throttling with client quotas as well?

No. Replication throttling lives in a separate configuration domain. The client-facing producer/consumer_byte_rate and request_percentage do not control it.
