Binding Kafka Topics to Schemas: A Practical Guide to Preventing Compatibility Breakage

2026-04-19
NicheeLab Editorial Team

Kafka brokers themselves do not validate the structure of messages. In production, the key is to combine Schema Registry, naming strategies, and compatibility levels to enforce "which schema is allowed to land in which topic."

This article focuses on the binding between topics and schemas and explains the design and operations that prevent compatibility breakage. It also covers the Confluent CCDAK staples: Subject naming strategies, compatibility levels, and broker-side validation.

Why Bind Topics to Schemas?

Kafka is a high-throughput log, and brokers know nothing about message structure. Schema validation is the job of the client serializer/deserializer and Schema Registry. If the relationship between topics and schemas is left ambiguous, breaking changes slip through and cause deserialization failures or aggregation-job outages in production.

The core of binding design is to use the Subject naming strategy to decide the unit at which compatibility is managed, enforce compatibility in the Registry, and optionally turn on broker-side validation. On top of that, run compatibility checks in CI/CD so that breaking changes are rejected at registration time.

  • Kafka brokers do not validate structure. Validation is handled by the serializer, the Registry, and (optionally) broker-side validation.
  • The Subject naming strategy decides how compatibility is grouped.
  • A three-layer defense — pre-registration checks, Registry compatibility, and broker-side validation — is the most effective way to prevent compatibility breakage.

End-to-end validation points

[Figure: end-to-end validation points. Components: Producer (Avro/JSON/Protobuf), Schema Registry (compatibility), Kafka Topic / Broker (optional: schema validation), Consumer, Alerts/Monitoring (DLQ, metrics, logs). Flows: register/check, topic/subject binding, serialize (magic byte + id), append/fetch bytes, deserialize with id, errors if incompatible.]

Choosing a Subject Naming Strategy

In Schema Registry, the Subject naming strategy — the unit of registration — determines how compatibility is grouped. The default is TopicNameStrategy, which binds the value schema to <topic>-value. When multiple event types share a single topic, RecordNameStrategy or TopicRecordNameStrategy works well in practice.

CCDAK tends to ask which strategy fits which use case, the pitfalls of mixing types, and the available options when different event types share a topic.

  • Single event type per topic: TopicNameStrategy is the clearest and easiest to operate.
  • Multiple event types per topic: use TopicRecordNameStrategy to manage compatibility independently per type.
  • To keep record names stable, standardize Avro namespaces and names across the organization.
Strategy | Subject Unit | Best Fit | Watch Out For
TopicNameStrategy | <topic>-key / <topic>-value | One topic = one event type. Producers and consumers clearly handle the same type. | Mixing multiple types in the same topic causes compatibility checks to collide easily.
RecordNameStrategy | <recordFullName> | Type-centric reuse, when you want to share the same record name across different topics. | Evolving the same record name across different topics widens the blast radius.
TopicRecordNameStrategy | <topic>-<recordFullName> | Multiple event types in the same topic, with compatibility managed independently per type. | You need consistent record naming, and the number of subjects increases.
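
To make the subject units in the table concrete, here is a minimal sketch of how each strategy derives a subject name. It is plain string logic written for illustration, not the Confluent strategy classes themselves; the class name SubjectNames and the record names under com.example.orders are hypothetical.

```java
/** Illustrative only: mirrors the subject units from the table, not the Confluent classes. */
final class SubjectNames {

    // TopicNameStrategy: <topic>-key or <topic>-value
    static String topicName(String topic, boolean isKey) {
        return topic + (isKey ? "-key" : "-value");
    }

    // RecordNameStrategy: the fully qualified record name, independent of the topic
    static String recordName(String recordFullName) {
        return recordFullName;
    }

    // TopicRecordNameStrategy: <topic>-<recordFullName>
    static String topicRecordName(String topic, String recordFullName) {
        return topic + "-" + recordFullName;
    }

    public static void main(String[] args) {
        String created = "com.example.orders.OrderCreated";     // hypothetical record name
        String cancelled = "com.example.orders.OrderCancelled"; // hypothetical record name

        System.out.println(topicName("orders", false));
        System.out.println(recordName(created));
        System.out.println(topicRecordName("orders", created));
        System.out.println(topicRecordName("orders", cancelled));
    }
}
```

With TopicRecordNameStrategy the two event types land in two separate subjects, which is why their compatibility histories never interfere with each other.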

Specifying the Subject naming strategy on the client (Java example)

Properties p = new Properties();
p.put("key.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
p.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
p.put("schema.registry.url", "https://registry:8081");
// Use this when a single topic carries multiple event types
p.put("value.subject.name.strategy",
      "io.confluent.kafka.serializers.subject.TopicRecordNameStrategy");

Compatibility Levels and Evolution Rules

Schema Registry has both global and per-subject compatibility levels. The common default is BACKWARD, which guarantees you can read old data with the latest schema. In production, BACKWARD_TRANSITIVE (compatibility with every prior version) is usually recommended.

Avro, JSON Schema, and Protobuf differ in their evolution rules, but in practice the common safe playbook is: add new fields with defaults, and avoid deleting required fields or changing types. Even when multiple types share a topic, TopicRecordNameStrategy lets you run compatibility checks independently per type.

  • Backward-compatible and safe: adding fields with defaults, widening to a union (mind compatibility rules).
  • Likely breaking: adding a field without a default (breaks BACKWARD), removing a field that has no default (breaks FORWARD and FULL), narrowing types (long → int), changing the meaning of a field.
  • TRANSITIVE demands compatibility against every prior version, catching gaps and making it well-suited to production.
  • The global setting is just a starting point — ultimately, configure compatibility explicitly at the subject level.
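
The difference between BACKWARD and BACKWARD_TRANSITIVE is easiest to see on a small history. The sketch below uses a deliberately simplified model (a schema is just a map from field name to "has a default", and types are ignored), so it is not Avro's real resolution logic; the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Map;

/** Toy model: a schema is a map of field name -> whether the field declares a default. */
final class BackwardCheck {

    // A reader can decode writer data if every reader field is either
    // present in the writer schema or has a default to fall back on.
    static boolean canRead(Map<String, Boolean> reader, Map<String, Boolean> writer) {
        return reader.entrySet().stream()
                .allMatch(field -> writer.containsKey(field.getKey()) || field.getValue());
    }

    // BACKWARD: the candidate only has to read data written with the latest version.
    static boolean backward(Map<String, Boolean> candidate, List<Map<String, Boolean>> history) {
        return history.isEmpty() || canRead(candidate, history.get(history.size() - 1));
    }

    // BACKWARD_TRANSITIVE: the candidate has to read data written with every prior version.
    static boolean backwardTransitive(Map<String, Boolean> candidate, List<Map<String, Boolean>> history) {
        return history.stream().allMatch(old -> canRead(candidate, old));
    }

    public static void main(String[] args) {
        Map<String, Boolean> v1 = Map.of("id", false);                 // only id
        Map<String, Boolean> v2 = Map.of("id", false, "note", true);   // note added with a default
        Map<String, Boolean> v3 = Map.of("id", false, "note", false);  // default on note dropped
        List<Map<String, Boolean>> history = List.of(v1, v2);

        System.out.println("BACKWARD: " + backward(v3, history));
        System.out.println("BACKWARD_TRANSITIVE: " + backwardTransitive(v3, history));
    }
}
```

In this history v3 passes the BACKWARD check because v2 already contains note, but it fails BACKWARD_TRANSITIVE because data written with v1 has no note and v3 offers no default. This is the kind of gap the transitive levels are meant to catch.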

Configuring compatibility and running pre-checks (REST API)

# Set global compatibility to BACKWARD_TRANSITIVE
curl -s -X PUT -H 'Content-Type: application/json' \
  --data '{"compatibility":"BACKWARD_TRANSITIVE"}' \
  https://registry:8081/config

# Override at the subject level
curl -s -X PUT -H 'Content-Type: application/json' \
  --data '{"compatibility":"FULL_TRANSITIVE"}' \
  https://registry:8081/config/orders-value

# Test compatibility before registering (checks the new schema against version "latest")
curl -s -X POST -H 'Content-Type: application/json' \
  --data '{"schemaType":"AVRO","schema":"{\"type\":\"record\",...}"}' \
  https://registry:8081/compatibility/subjects/orders-value/versions/latest

CI/CD and Environment Separation to Prevent Compatibility Breakage

The most common incident is a breaking change slipping past review and getting registered. The fix is to manage schemas as an artifact independent from applications, run a Registry compatibility check on every PR, and block the merge if it fails. Limit registration permissions to a CI robot user.

For environment separation, run a separate Registry per environment and promote schemas DEV → STG → PRD. Assume IDs do not match across environments and judge identity by content (fingerprint).

  • Manage schemas in a dedicated repository. Use tags and release notes to make evolution visible.
  • Call /compatibility on every PR and fail the check on breaking changes.
  • Only CI executes POST to /subjects. Manual registration is forbidden.
  • On the production Registry, allow broad read access but strict writes, and always keep audit logs.
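
Because schema IDs differ between environments, promotion tooling needs some way to compare schemas by content. A minimal sketch of such a fingerprint follows, assuming SHA-256 over the schema text with whitespace outside string literals removed. This normalization is a simplification chosen for the example (it does not reorder attributes or resolve names), and the class name SchemaFingerprint is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Naive content fingerprint for comparing schemas across DEV, STG, and PRD. */
final class SchemaFingerprint {

    // Drops whitespace that sits outside JSON string literals, so formatting
    // differences between environments do not change the fingerprint.
    static String stripInsignificantWhitespace(String json) {
        StringBuilder out = new StringBuilder();
        boolean inString = false;
        boolean escaped = false;
        for (char c : json.toCharArray()) {
            if (inString) {
                out.append(c);
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
                out.append(c);
            } else if (!Character.isWhitespace(c)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    static String sha256Hex(String schemaJson) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(
                    stripInsignificantWhitespace(schemaJson).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }

    public static void main(String[] args) {
        String dev = "{\"type\":\"record\",\"name\":\"Order\",\"fields\":[]}";
        String prd = "{ \"type\": \"record\",\n  \"name\": \"Order\",\n  \"fields\": [] }";
        System.out.println(sha256Hex(dev).equals(sha256Hex(prd)));
    }
}
```

A real pipeline would more likely hash a proper canonical form, such as Avro's Parsing Canonical Form, so that semantically identical schemas with reordered attributes also match.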

Example CI job (pseudocode): compatibility check → register on pass

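# 1) Compatibility test. The endpoint returns HTTP 200 even for an incompatible
#    schema, so check is_compatible in the response body (jq -e exits non-zero on false).
curl -sf -X POST -H 'Content-Type: application/json' \
  --data @candidate-schema.json \
  https://registry:8081/compatibility/subjects/orders-value/versions/latest \
  | jq -e '.is_compatible' > /dev/null || exit 1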

# 2) Register only after the check passes
curl -f -X POST -H 'Content-Type: application/json' \
  --data @candidate-schema.json \
  https://registry:8081/subjects/orders-value/versions
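
One detail matters when this gate is scripted: the compatibility endpoint reports its verdict in the response body as is_compatible, and an incompatible schema still comes back with HTTP 200. A minimal Java sketch of the gate logic, using only the JDK HTTP client types, might look like the following. The class name CompatibilityGate is hypothetical, and the regex is a shortcut for a real JSON parser.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.regex.Pattern;

/** Sketch of a CI gate around POST /compatibility/subjects/{subject}/versions/latest. */
final class CompatibilityGate {

    // Shortcut for the example; a real job would parse the JSON body properly.
    private static final Pattern COMPATIBLE =
            Pattern.compile("\"is_compatible\"\\s*:\\s*true");

    // Builds the compatibility-check request; the body is the same JSON document
    // that would later be posted to /subjects/{subject}/versions.
    static HttpRequest checkRequest(String registryUrl, String subject, String body) {
        return HttpRequest.newBuilder(
                        URI.create(registryUrl + "/compatibility/subjects/" + subject + "/versions/latest"))
                .header("Content-Type", "application/vnd.schemaregistry.v1+json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    // HTTP 200 alone is not a pass: the verdict is carried in the response body.
    static boolean isCompatible(int status, String responseBody) {
        return status == 200 && COMPATIBLE.matcher(responseBody).find();
    }

    public static void main(String[] args) {
        HttpRequest request = checkRequest("https://registry:8081", "orders-value",
                "{\"schemaType\":\"AVRO\",\"schema\":\"\\\"string\\\"\"}");
        System.out.println(request.method() + " " + request.uri());

        // In a real job the request would be sent with
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()).
        System.out.println(isCompatible(200, "{\"is_compatible\":true}"));
        System.out.println(isCompatible(200, "{\"is_compatible\":false}"));
    }
}
```

The CI job would fail the build whenever isCompatible returns false and run the registration step only otherwise.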

Runtime Enforcement: Broker Validation, Connect, and ksqlDB

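Confluent Server lets you turn on broker-side schema validation per topic. The broker then rejects messages whose schema ID is not registered in Schema Registry under the subject that the topic's naming strategy maps to. It checks the ID only and does not validate the payload against the schema, so treat it as a complement to Registry compatibility rather than a replacement. Consider enabling it on critical topics.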

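In Kafka Connect, configure the Avro/JSON Schema/Protobuf converters with Schema Registry integration by setting schema.registry.url on each converter. The schemas.enable option belongs to the plain JsonConverter, which does not use Schema Registry, so it plays no role here. ksqlDB depends on the schema of its source topics, so evaluate in advance whether evolution will impact downstream streams and tables.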

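  • Broker-side validation: enable confluent.key.schema.validation and confluent.value.schema.validation on the topic. If the topic uses a non-default naming strategy, also set confluent.key.subject.name.strategy and confluent.value.subject.name.strategy so the broker looks up the same subjects as the clients.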
  • Connect: configure AvroConverter / ProtobufConverter / JsonSchemaConverter together with schema.registry.url.
  • ksqlDB: verify persistent-query compatibility and prepare a migration procedure in advance for any breaking change.

Enabling broker-side schema validation on a topic (Confluent Server)

# Apply to an existing topic
kafka-configs --bootstrap-server broker:9092 \
  --alter --topic orders \
  --add-config confluent.value.schema.validation=true,confluent.key.schema.validation=true

# Connect example (Avro); schemas.enable applies only to JsonConverter, so it is omitted
"key.converter":"io.confluent.connect.avro.AvroConverter",
"value.converter":"io.confluent.connect.avro.AvroConverter",
"key.converter.schema.registry.url":"https://registry:8081",
"value.converter.schema.registry.url":"https://registry:8081"

Monitoring and Incident-Response Playbook

Stopping compatibility breakage at registration time is best, but you still need observability for runtime failures. Typical signals are consumer deserialization exceptions, broker rejections, and 409 errors from the Registry. Combine DLQs, metrics, and audit logs to enable early detection and rollback.

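Rollback comes down to two choices: revert the schema, or update consumers to handle the new schema. The former removes the offending version (for example with a soft delete) so that the prior version becomes latest again; re-posting a schema that is already registered under the subject only returns its existing ID and does not create a new version. The latter limits blast radius via a gradual canary rollout.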

  • Visualize Registry 4xx/5xx rates, schema-registration counts, and compatibility-check failure counts.
  • Route consumer deserialization exceptions to a DLQ and alert on the count.
  • Maintain a schema-rollback runbook and rehearse reactivating prior versions.
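
The DLQ bullet above can be sketched as a small wrapper around the deserializer. This is a minimal illustration under stated assumptions: the consumer is assumed to fetch raw bytes and deserialize them itself, the in-memory list stands in for a real DLQ topic, the counter stands in for a real metric, and the class name DlqRouter is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

/** Routes records that fail deserialization to a dead-letter store and counts them. */
final class DlqRouter<T> {

    private final Function<byte[], T> deserializer;
    private final List<byte[]> deadLetters = new ArrayList<>(); // stand-in for a DLQ topic
    private final AtomicLong failures = new AtomicLong();       // stand-in for an alerting metric

    DlqRouter(Function<byte[], T> deserializer) {
        this.deserializer = deserializer;
    }

    // Returns the decoded value, or empty if the record was diverted to the DLQ.
    Optional<T> handle(byte[] rawValue) {
        try {
            return Optional.ofNullable(deserializer.apply(rawValue));
        } catch (RuntimeException e) {
            deadLetters.add(rawValue);   // keep the original bytes for later replay
            failures.incrementAndGet();  // alert when this count rises
            return Optional.empty();
        }
    }

    long failureCount() {
        return failures.get();
    }

    List<byte[]> deadLetters() {
        return deadLetters;
    }

    public static void main(String[] args) {
        // Hypothetical deserializer: the payload is a decimal integer encoded as UTF-8.
        DlqRouter<Integer> router = new DlqRouter<>(
                bytes -> Integer.parseInt(new String(bytes, StandardCharsets.UTF_8)));

        System.out.println(router.handle("42".getBytes(StandardCharsets.UTF_8)));
        System.out.println(router.handle("oops".getBytes(StandardCharsets.UTF_8)));
        System.out.println("failures=" + router.failureCount());
    }
}
```

Keeping the untouched bytes in the DLQ is the important design choice: once consumers are fixed or the schema is reverted, the same records can be replayed without loss.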

Check Your Understanding

CCDAK

Question 1

A single topic 'orders' stores multiple event types such as OrderCreated and OrderCancelled. Existing consumers must keep reading with backward compatibility, and breaking changes must be rejected at registration time. Which combination is most appropriate?

  1. Use TopicNameStrategy and set global compatibility to FULL. Broker-side validation is unnecessary.
  2. Use TopicRecordNameStrategy, set each subject's compatibility to BACKWARD_TRANSITIVE, and enable confluent.value.schema.validation on the topic.
  3. Use RecordNameStrategy with FORWARD compatibility, and absorb breaking changes in the consumer.
  4. Use TopicNameStrategy and ban breaking changes via operational rules, relying only on review.

Correct answer: 2

When multiple event types share the same topic, TopicRecordNameStrategy is the right choice because it manages compatibility independently per type. BACKWARD_TRANSITIVE is the production-grade compatibility level and rejects breaking changes at Registry registration. On top of that, enabling broker-side schema validation in the Confluent Server topic configuration rejects, at runtime, messages whose schema ID is not registered for the topic.

Frequently Asked Questions

Should I configure global compatibility or per-subject compatibility?

Start with a global default of BACKWARD, but in production configure compatibility explicitly per subject. For critical topics, prefer BACKWARD_TRANSITIVE or FULL_TRANSITIVE. Treat the global setting as a default value and give the final authority to the subject level.

Can I mix Avro and JSON Schema in the same topic?

You should avoid this. The Confluent wire format resolves by schema ID, but the deserializer type is not auto-detected. Mixing formats within one topic complicates consumer code and operations unnecessarily and becomes a source of incidents. Standardize on a single format per topic.
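
To see why the schema ID alone is not enough, it helps to look at what the wire format carries: one magic byte (0), a 4-byte big-endian schema ID, and then the encoded payload. The following minimal sketch extracts the ID from a record value; the class name WireFormat is hypothetical, and the trailing payload byte in main is made up for the example.

```java
import java.nio.ByteBuffer;

/** Reads the schema ID from a value framed in the Confluent wire format. */
final class WireFormat {

    static final byte MAGIC_BYTE = 0x0;
    static final int HEADER_LENGTH = 1 + Integer.BYTES; // magic byte + 4-byte schema ID

    static int schemaId(byte[] value) {
        if (value == null || value.length < HEADER_LENGTH) {
            throw new IllegalArgumentException("value is too short to carry a schema ID");
        }
        ByteBuffer buffer = ByteBuffer.wrap(value); // ByteBuffer is big-endian by default
        if (buffer.get() != MAGIC_BYTE) {
            throw new IllegalArgumentException("unknown magic byte");
        }
        return buffer.getInt();
    }

    public static void main(String[] args) {
        // Frame with schema ID 42 followed by one made-up payload byte.
        byte[] framed = ByteBuffer.allocate(HEADER_LENGTH + 1)
                .put(MAGIC_BYTE)
                .putInt(42)
                .put((byte) 7)
                .array();
        System.out.println(schemaId(framed));
    }
}
```

Nothing in these five bytes says whether the payload is Avro, JSON Schema, or Protobuf, which is why a consumer configured with one deserializer cannot safely read a topic that mixes formats.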

How should I evolve key schemas?

Keys drive partitioning and compaction, so the safest rule is not to evolve them. If you must change a key, manage it under the <topic>-key subject separately and evaluate backward compatibility strictly. Avoid changes that break hash stability (type changes or serialization-format changes); if needed, plan a migration to a new topic.
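
The hash-stability concern can be illustrated with the serialized bytes themselves. Kafka's default partitioner hashes the serialized key bytes, so the same logical key encoded differently may map to a different partition. In the sketch below, Arrays.hashCode is only a stand-in for the real hash (murmur2), and the class name KeyStability and the partition count of 6 are assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Shows that one logical key yields different bytes under different encodings. */
final class KeyStability {

    // Stand-in for the partitioner: any hash over the key bytes makes the same point.
    static int partitionFor(byte[] keyBytes, int numPartitions) {
        return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }

    static byte[] asInt(int id) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(id).array();
    }

    static byte[] asLong(long id) {
        return ByteBuffer.allocate(Long.BYTES).putLong(id).array();
    }

    static byte[] asString(String id) {
        return id.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        int partitions = 6; // hypothetical partition count
        System.out.println("int    -> " + partitionFor(asInt(42), partitions));
        System.out.println("long   -> " + partitionFor(asLong(42L), partitions));
        System.out.println("string -> " + partitionFor(asString("42"), partitions));
    }
}
```

Because the three encodings produce different byte arrays, records for the same entity can end up on different partitions after a key change. That loses per-key ordering and leaves compaction treating old and new encodings as different keys, which is why a migration to a new topic is often the safer path.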

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.

