Kafka Schema Registry Schema References: Reusing Common Types with Version Control

2026-04-19
NicheeLab Editorial Team

Schema references are a Schema Registry feature for reusing a common type definition across multiple schemas. They eliminate duplicate definitions and make the blast radius of any change explicit.

References are pinned to a specific subject and version. Because they do not auto-follow latest, you can plan migrations deliberately.

Schema Reference Basics: Why Use Them

Schema references let one schema refer, by name, to another schema already registered in Schema Registry and reuse its definitions. They are supported for Avro, Protobuf, and JSON Schema, which makes it possible to centrally manage cross-cutting types such as a shared Address type or a common error format.

References are registered against a root (parent) schema by binding a reference name, target subject, and target version. At runtime, only the root schema ID is written into the Kafka message, and the deserializer resolves the references from the Registry to fetch the complete schema.

The benefits: fewer duplicated type definitions, localized changes, easier review, and centralized compatibility checks. The downsides: you must manage registration order (common types first, then the root) and explicitly update reference versions.

  • Supported formats: Avro / Protobuf / JSON Schema
  • Messages carry only the root Schema ID (reference IDs are not embedded)
  • References are recorded as "subject + pinned version" and do not auto-update to latest

Registration and Resolution Flow: How References Work

Registration is simple. 1) First, register the common type (e.g., Address) as its own subject. 2) Then register the root schema that uses it (e.g., Order), specifying the common type's subject name and version in references. Compatibility checks run against the fully expanded schema after resolution.

The meaning of a reference's name varies by format. As a general guide: Avro uses the fully qualified type name, Protobuf uses the import name (.proto), and JSON Schema typically matches the identifier name or path used in $ref.

At runtime, both producer and consumer work only with the root Schema ID; the serializer and deserializer fetch the root schema and any schemas it references from Schema Registry behind the scenes. The wire format does not change based on whether references are used.

  • Registration order: referenced (common type) first, then referencing (root)
  • Specify name/subject/version in references (latest is not allowed)
  • Resolution is handled by the serializer/deserializer using schemas fetched from the Registry; application code only deals with the usual root Schema ID

Reference resolution flow (conceptual diagram)

[Diagram: the Producer (Serializer) writes [magic][schema-id][data] to the Kafka topic. The Consumer (Deserializer) uses the schema-id to fetch/resolve the schema from Schema Registry (root subject / references: name → subject@version) and works with the resolved schema graph. Example: Order@v2 (S1) → name: Address → S2@v3]
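
The [magic][schema-id][data] layout shown in the diagram can be inspected by hand. The following is a minimal sketch, assuming the Confluent wire format of a single magic byte (0) followed by a 4-byte big-endian schema ID. The hex string is a made-up record value, not output captured from a real topic.

```shell
# Hypothetical record value, hex-encoded:
#   00        magic byte
#   0000002a  schema ID (4 bytes, big-endian)
#   deadbeef  placeholder standing in for the serialized payload
record_hex="000000002adeadbeef"

# Characters 1-2 are the magic byte, characters 3-10 are the schema ID.
magic_hex=$(printf '%s' "$record_hex" | cut -c1-2)
id_hex=$(printf '%s' "$record_hex" | cut -c3-10)
schema_id=$((0x$id_hex))   # hex -> decimal

echo "magic=$magic_hex schema_id=$schema_id"   # prints: magic=00 schema_id=42
```

The ID extracted here is always that of the root schema (Order in the example above); no ID for the referenced Address schema appears in the record.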

Example: registering an Avro schema with references (Schema Registry REST API)

## 1) Register the common type Address
curl -s -X POST \
  -H 'Content-Type: application/vnd.schemaregistry.v1+json' \
  http://localhost:8081/subjects/com.example.types.Address/versions \
  -d '{
    "schemaType": "AVRO",
    "schema": "{\n  \"type\": \"record\",\n  \"name\": \"Address\",\n  \"namespace\": \"com.example.types\",\n  \"fields\": [\n    {\"name\": \"street\", \"type\": \"string\"},\n    {\"name\": \"city\", \"type\": \"string\"}\n  ]\n}"
  }'

## 2) Register the root type Order (references Address)
curl -s -X POST \
  -H 'Content-Type: application/vnd.schemaregistry.v1+json' \
  http://localhost:8081/subjects/com.example.sales.Order/versions \
  -d '{
    "schemaType": "AVRO",
    "references": [
      {
        "name": "com.example.types.Address",
        "subject": "com.example.types.Address",
        "version": 1
      }
    ],
    "schema": "{\n  \"type\": \"record\",\n  \"name\": \"Order\",\n  \"namespace\": \"com.example.sales\",\n  \"fields\": [\n    {\"name\": \"id\", \"type\": \"string\"},\n    {\"name\": \"shippingAddress\", \"type\": \"com.example.types.Address\"}\n  ]\n}"
  }'

# Note: the subject names above are examples. In production, the Subject Naming Strategy determines them (e.g., orders-value under TopicNameStrategy). The references themselves behave the same way.
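
After registering both schemas, it can be useful to confirm what the Registry recorded. The sketch below only builds and prints the relevant REST URLs, so it is safe to run without a registry. REGISTRY_URL and the two helper function names are assumptions made for illustration.

```shell
REGISTRY_URL="${REGISTRY_URL:-http://localhost:8081}"

# URL listing the IDs of schemas that reference SUBJECT at VERSION.
referenced_by_url() {  # usage: referenced_by_url SUBJECT VERSION
  printf '%s/subjects/%s/versions/%s/referencedby\n' "$REGISTRY_URL" "$1" "$2"
}

# URL returning one registered version of SUBJECT, including its references.
schema_version_url() {  # usage: schema_version_url SUBJECT VERSION
  printf '%s/subjects/%s/versions/%s\n' "$REGISTRY_URL" "$1" "$2"
}

referenced_by_url com.example.types.Address 1
schema_version_url com.example.sales.Order latest

# Against a running registry, fetch them with curl, for example:
#   curl -s "$(referenced_by_url com.example.types.Address 1)"
```

Checking referencedby before changing or deleting a common type gives a quick view of which root schemas depend on that pinned version.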

Reuse Design Patterns: Subject Naming Strategy and Extracting Common Types

When reusing a common type across multiple topics, the Subject Naming Strategy you pick matters. The default TopicNameStrategy creates separate <topic>-key / <topic>-value subjects per topic, so the same type ends up registered under different subjects on different topics.

To consolidate a common type into a single subject for reuse, consider RecordNameStrategy. It uses the fully qualified record name (e.g., com.example.types.Address) as the subject, so the same type stays in one subject across topics. TopicRecordNameStrategy combines the topic with the record name, so its subjects remain separate per topic.

Because references are pinned to subject@version, publishing a new version of the common type does not affect root schemas until each one explicitly updates its reference. This enables gradual rollouts.

  • TopicNameStrategy: per-topic isolation (default)
  • RecordNameStrategy: centralized management by type name (good for common types)
  • TopicRecordNameStrategy: granularity by both type and topic

| Item | Avro | Protobuf | JSON Schema |
|---|---|---|---|
| How references are expressed | Named type resolution (reuse via fully qualified name) | Dependency resolution via import (.proto) | Reference resolution via $ref |
| Meaning of references.name | Fully qualified type name (e.g., com.example.types.Address) | Import name (e.g., common.proto) | Match the identifier/path used in $ref |
| Wire format | Root Schema ID only (reference IDs are not embedded) | Same as Avro | Same as Avro |
| Naming strategy best suited to common-type reuse | RecordNameStrategy | RecordNameStrategy or TopicNameStrategy + package management | RecordNameStrategy (effective for value-schema reuse) |
| Evolution gotchas (representative) | Adding a required field without a default is backward-incompatible | Reusing an existing field number is incompatible | Narrowing a type (e.g., shrinking an enum) tends to be incompatible |
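
The three naming strategies described above differ only in how the subject name is derived from the topic and the record name. The following sketch mimics that derivation in shell to make the difference concrete. The subject_for function is a hypothetical helper, not part of any Confluent tool, and the topic names are examples.

```shell
# Derive the subject name the way each strategy is described above.
subject_for() {  # usage: subject_for STRATEGY TOPIC RECORD_NAME [key|value]
  strategy=$1
  topic=$2
  record=$3
  part=${4:-value}
  case "$strategy" in
    TopicNameStrategy)       printf '%s-%s\n' "$topic" "$part" ;;    # <topic>-key / <topic>-value
    RecordNameStrategy)      printf '%s\n' "$record" ;;              # fully qualified record name
    TopicRecordNameStrategy) printf '%s-%s\n' "$topic" "$record" ;;  # <topic>-<record name>
    *) echo "unknown strategy: $strategy" >&2; return 1 ;;
  esac
}

# Same record type used on two topics.
for topic in orders shipments; do
  for strategy in TopicNameStrategy RecordNameStrategy TopicRecordNameStrategy; do
    printf '%-24s %-10s -> %s\n' "$strategy" "$topic" \
      "$(subject_for "$strategy" "$topic" com.example.types.Address)"
  done
done
```

Only RecordNameStrategy yields the same subject (com.example.types.Address) for both topics. In the Confluent serializers the strategy is selected with the key.subject.name.strategy and value.subject.name.strategy settings, for example value.subject.name.strategy=io.confluent.kafka.serializers.subject.RecordNameStrategy.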

Compatibility and Versioning: How Checks Apply with References

Schema Registry compatibility is configured per subject and is checked against the fully resolved schema at registration time. When references are involved, comparison happens with the common type inlined, so breaking changes to a common type get caught when the root is registered.

Compatibility levels include Backward / Forward / Full, each with a Transitive variant. Transitive checks compatibility not just against the immediately previous version but against all versions. If a common type is widely reused, FULL_TRANSITIVE is the safe choice.

Because references are version-pinned, publishing a new version of a common type does not immediately affect existing roots. The standard practice is to update and re-register the referenced version on each root incrementally.

  • Checks run against the fully expanded schema (references included)
  • Transitive compatibility helps catch regressions over the long run
  • Pinned-version references make gradual migrations easier
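
As a rough illustration of the points above, the sketch below wraps two Schema Registry REST calls: setting a subject-level compatibility mode and testing a candidate schema against the latest registered version. The function names, REGISTRY_URL, and the request-body file are assumptions for illustration. Only the JSON body is printed when the block runs, so no registry is needed to try it.

```shell
REGISTRY_URL="${REGISTRY_URL:-http://localhost:8081}"
CONTENT_TYPE='Content-Type: application/vnd.schemaregistry.v1+json'

# JSON body that sets a compatibility mode.
compat_config_body() {  # usage: compat_config_body LEVEL
  printf '{"compatibility": "%s"}\n' "$1"
}

# Set the compatibility mode for one subject (needs a running registry).
set_compatibility() {  # usage: set_compatibility SUBJECT LEVEL
  curl -s -X PUT -H "$CONTENT_TYPE" \
    "$REGISTRY_URL/config/$1" -d "$(compat_config_body "$2")"
}

# Test a candidate schema against the latest version of SUBJECT.
# BODY_FILE holds the same JSON as a registration request:
# "schemaType", "schema", and "references".
check_compatibility() {  # usage: check_compatibility SUBJECT BODY_FILE
  curl -s -X POST -H "$CONTENT_TYPE" \
    "$REGISTRY_URL/compatibility/subjects/$1/versions/latest" -d @"$2"
}

compat_config_body FULL_TRANSITIVE   # prints: {"compatibility": "FULL_TRANSITIVE"}
```

With a registry available, set_compatibility com.example.types.Address FULL_TRANSITIVE would tighten the common type's subject, and check_compatibility returns a JSON object with an is_compatible field that a CI job can inspect before registering anything.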

Operational Practices: Avoiding Breaking Changes and Rolling Migrations

Common-type changes ripple widely, so set Compatibility on the strict side (e.g., FULL_TRANSITIVE) and proceed in order: publish a new version of the common type, validate it, update references on each root, then roll producers out in stages.

The wire format is just the root Schema ID, so the presence or absence of references does not change Kafka message compatibility. The focus is whether registration succeeds and whether deserialization succeeds. Stop compatibility violations at registration, and use schema evolution rules (e.g., providing defaults) to guarantee consumer-side compatibility.

  • Publish and stabilize the common type first; then each root catches up its reference
  • Combine compatibility levels with CI to block breaking changes
  • Automate pre-flight validation against a sandbox Registry
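
The rollout order above can be sketched as two registration requests: first a new Address version, then the Order root re-registered with its reference bumped. This is an illustrative sketch only. The postalCode field, the helper function names, and REGISTRY_URL are assumptions, and the curl call is wrapped in a function that is not invoked here.

```shell
REGISTRY_URL="${REGISTRY_URL:-http://localhost:8081}"

# Address with one added nullable field that has a default (hypothetical name: postalCode).
# Single quotes keep the JSON escapes literal.
ADDRESS_V2_SCHEMA='{\"type\":\"record\",\"name\":\"Address\",\"namespace\":\"com.example.types\",\"fields\":[{\"name\":\"street\",\"type\":\"string\"},{\"name\":\"city\",\"type\":\"string\"},{\"name\":\"postalCode\",\"type\":[\"null\",\"string\"],\"default\":null}]}'
ORDER_SCHEMA='{\"type\":\"record\",\"name\":\"Order\",\"namespace\":\"com.example.sales\",\"fields\":[{\"name\":\"id\",\"type\":\"string\"},{\"name\":\"shippingAddress\",\"type\":\"com.example.types.Address\"}]}'

# Registration body for the new Address version.
address_body() {
  printf '{"schemaType": "AVRO", "schema": "%s"}\n' "$ADDRESS_V2_SCHEMA"
}

# Registration body for Order with the Address reference pinned to a version.
order_body() {  # usage: order_body ADDRESS_VERSION
  printf '{"schemaType": "AVRO", "references": [{"name": "%s", "subject": "%s", "version": %s}], "schema": "%s"}\n' \
    com.example.types.Address com.example.types.Address "$1" "$ORDER_SCHEMA"
}

# POST a body to a subject (needs a running registry; not called here).
register() {  # usage: register SUBJECT BODY
  curl -s -X POST -H 'Content-Type: application/vnd.schemaregistry.v1+json' \
    "$REGISTRY_URL/subjects/$1/versions" -d "$2"
}

# Rollout order once a registry is available:
#   register com.example.types.Address "$(address_body)"   # next Address version (2 if only v1 exists)
#   register com.example.sales.Order   "$(order_body 2)"   # Order now pins Address@2
order_body 2
```

The Order schema text itself is unchanged between the two root versions; only the pinned reference version differs, which is why each root can catch up on its own schedule.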

CCDAK Prep Notes: Exam Angles to Lock In

The differences between Subject Naming Strategies and whether they support reuse (TopicNameStrategy / RecordNameStrategy / TopicRecordNameStrategy).

The required elements when registering a reference (name/subject/version) and the fact that there is no "latest" auto-tracking.

Compatibility checks run after reference resolution, and the Transitive setting means comparison against the full history.

  • Defaults to separate <topic>-key / <topic>-value subjects
  • RecordNameStrategy is the strong choice for common-type reuse
  • There is no automatic upgrade of references (explicit re-registration is required)

Check Your Understanding

CCDAK

Question 1

You are reusing a common Address type across multiple topics. You want to add one optional field to Address and share it on every topic without duplicating the definition. Keeping the wire format unchanged and migrating gradually, which approach is best?

  1. Keep each topic on TopicNameStrategy and re-define Address on every topic at registration time
  2. Set the value serializer's subject naming strategy (value.subject.name.strategy) to RecordNameStrategy, manage Address under a single subject, and re-register each root schema so references point at the new version of that subject
  3. Set Schema Registry compatibility to NONE and change things freely
  4. Omit version on the reference so it auto-follows latest

Correct answer: 2

Consolidate the common type into a single subject with RecordNameStrategy and pin references via subject@version. Apply a backward-compatible change (adding an optional field) to Address, publish the new version, then re-register each root schema so its references point at the new version. The wire format is preserved and migration can proceed gradually.

Frequently Asked Questions

Can you reference across different schema types (e.g., from Avro to Protobuf)?

No. References resolve only within the same schemaType: Avro references Avro, Protobuf references Protobuf, and JSON Schema references JSON Schema.

If a referenced schema is updated, are existing producers immediately affected?

No. References are pinned to a specific version. Existing producers and root schemas continue to use the prior version until you explicitly update the reference and re-register.

What happens to deserialization of schemas with references if Schema Registry goes down?

If the client has the required schemas cached locally, work continues; but uncached Schema IDs or reference resolution still need Registry access. A highly available, redundant Registry deployment is recommended.

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.


© 2026 NicheeLab All rights reserved.