Schema References: Composing Avro/Protobuf Schemas (2026)

Schema references are a Schema Registry feature for reusing a common type definition across multiple schemas. They eliminate duplicate definitions and make the blast radius of any change explicit.

References are pinned to a specific subject and version. Because they do not auto-follow latest, you can plan migrations deliberately.

Schema Reference Basics: Why Use Them

Schema References let a schema refer by name to another schema already registered in Schema Registry and reuse its definition. They are supported for Avro, Protobuf, and JSON Schema, which lets you centrally manage cross-cutting types such as a shared Address type or a common error format.

References are registered against a root (parent) schema by binding a reference name, target subject, and target version. At runtime, only the root schema ID is written into the Kafka message, and the deserializer resolves the references from the Registry to fetch the complete schema.

The benefits: fewer duplicated type definitions, localized changes, easier review, and centralized compatibility checks. The downsides: you must manage registration order (common types first, then the root) and update reference versions explicitly.

Supported formats: Avro / Protobuf / JSON Schema
Messages carry only the root Schema ID (reference IDs are not embedded)
References are recorded as "subject + pinned version" and do not auto-update to latest
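
To make the "root Schema ID only" point concrete, here is a minimal sketch of the Confluent framing used for Avro and JSON Schema payloads: one magic byte (0) followed by the 4-byte big-endian schema ID. The function names frame and unframe are hypothetical helpers, not client APIs, and the sketch ignores the extra message-index bytes that the Protobuf serializer writes after the ID.

```python
import struct

MAGIC_BYTE = 0  # marker byte at the start of every framed message


def frame(schema_id: int, payload: bytes) -> bytes:
    """Prefix a serialized payload with the magic byte and the root schema ID."""
    # ">bI" = big-endian, 1 signed byte + 4-byte unsigned int, no padding (5 bytes)
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + payload


def unframe(message: bytes):
    """Split a framed message into (root schema ID, payload bytes)."""
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[5:]


# Hypothetical root schema ID 42; no reference IDs appear anywhere in the header.
message = frame(42, b"abc")
root_id, body = unframe(message)
```

Whether the root schema has zero references or ten, the header stays five bytes long, which is why adding references does not alter the message layout.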

Registration and Resolution Flow: How References Work

Registration is simple. 1) First, register the common type (e.g., Address) as its own subject. 2) Then register the root schema that uses it (e.g., Order), specifying the common type's subject name and version in references. Compatibility checks run against the fully expanded schema after resolution.
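
When more than two schemas are involved, the "referenced first, then referencing" order can be derived rather than tracked by hand. The sketch below is one possible approach using Python's standard graphlib; the dependency map and the registration_order helper are illustrative assumptions, not part of Schema Registry.

```python
from graphlib import TopologicalSorter


def registration_order(depends_on):
    """Return subjects ordered so that every referenced schema precedes its users.

    depends_on maps each subject to the set of subjects it references.
    """
    # TopologicalSorter treats the mapped values as predecessors,
    # so dependencies are emitted before the schemas that use them.
    return list(TopologicalSorter(depends_on).static_order())


deps = {
    "com.example.sales.Order": {"com.example.types.Address"},
    "com.example.types.Address": set(),
}
order = registration_order(deps)
```

A circular dependency raises graphlib.CycleError, which is a useful early signal because a reference can only point at a schema that is already registered.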

The meaning of a reference's name varies by format. As a general guide: Avro uses the fully qualified type name, Protobuf uses the import name (.proto), and JSON Schema typically matches the identifier name or path used in $ref.

At runtime, both producer and consumer put only the root Schema ID on the wire. When a client encounters an ID it has not cached, it looks up the root schema and the schemas it references from Schema Registry. The client wire format does not change based on whether references are used.

Registration order: referenced (common type) first, then referencing (root)
Specify name/subject/version in references (latest is not allowed)
Resolution goes through the Registry: the message carries only the normal root Schema ID, and the referenced schemas are looked up from the Registry when needed
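
The lookup described above can be sketched with an in-memory stand-in for the Registry. Everything here (FakeRegistry, Reference, RegisteredSchema, resolve, and the schema IDs 1 and 2) is hypothetical and only models the idea: start from the root ID, follow each pinned subject/version, and collect the referenced schemas.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Reference:
    name: str
    subject: str
    version: int


@dataclass
class RegisteredSchema:
    schema_id: int
    schema: str
    references: list = field(default_factory=list)


class FakeRegistry:
    """In-memory stand-in for Schema Registry (illustration only)."""

    def __init__(self):
        self.by_id = {}
        self.by_subject_version = {}

    def add(self, subject, version, entry):
        self.by_id[entry.schema_id] = entry
        self.by_subject_version[(subject, version)] = entry


def resolve(registry, schema_id):
    """Return {reference name: schema text} for everything reachable from the root ID."""
    resolved = {}

    def visit(entry):
        for ref in entry.references:
            if ref.name in resolved:
                continue
            target = registry.by_subject_version[(ref.subject, ref.version)]
            visit(target)  # nested references are resolved first
            resolved[ref.name] = target.schema

    visit(registry.by_id[schema_id])
    return resolved


registry = FakeRegistry()
registry.add("com.example.types.Address", 1,
             RegisteredSchema(1, '{"type":"record","name":"Address"}'))
registry.add("com.example.sales.Order", 1,
             RegisteredSchema(2, '{"type":"record","name":"Order"}',
                              [Reference("com.example.types.Address",
                                         "com.example.types.Address", 1)]))
resolved = resolve(registry, 2)
```

Only the ID 2 would travel with the message; the Address schema is found through the pinned reference stored with the Order schema.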

Reference resolution flow (conceptual diagram)

Example: registering an Avro schema with references (Schema Registry REST API)

# 1) Register the common type Address
curl -s -X POST \
  -H 'Content-Type: application/vnd.schemaregistry.v1+json' \
  http://localhost:8081/subjects/com.example.types.Address/versions \
  -d '{
    "schemaType": "AVRO",
    "schema": "{\n  \"type\": \"record\",\n  \"name\": \"Address\",\n  \"namespace\": \"com.example.types\",\n  \"fields\": [\n    {\"name\": \"street\", \"type\": \"string\"},\n    {\"name\": \"city\", \"type\": \"string\"}\n  ]\n}"
  }'

# 2) Register the root type Order (references Address)
curl -s -X POST \
  -H 'Content-Type: application/vnd.schemaregistry.v1+json' \
  http://localhost:8081/subjects/com.example.sales.Order/versions \
  -d '{
    "schemaType": "AVRO",
    "references": [
      {
        "name": "com.example.types.Address",
        "subject": "com.example.types.Address",
        "version": 1
      }
    ],
    "schema": "{\n  \"type\": \"record\",\n  \"name\": \"Order\",\n  \"namespace\": \"com.example.sales\",\n  \"fields\": [\n    {\"name\": \"id\", \"type\": \"string\"},\n    {\"name\": \"shippingAddress\", \"type\": \"com.example.types.Address\"}\n  ]\n}"
  }'

# Note: the subject names above are examples. In production, the Subject Naming Strategy determines the subject (e.g., orders-value under TopicNameStrategy). References themselves behave the same way.
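
The hand-escaped "schema" strings in the curl calls are easy to get wrong. As a rough sketch, the same request body can be built programmatically; build_registration_body is a hypothetical helper, and the only Registry-specific assumption is the body shape already shown above (schemaType, schema as a JSON-encoded string, and an optional references array).

```python
import json


def build_registration_body(schema, references=None, schema_type="AVRO"):
    """Build the JSON body for POST /subjects/<subject>/versions."""
    # The schema itself is embedded as a string, so it is JSON-encoded twice overall.
    body = {"schemaType": schema_type, "schema": json.dumps(schema)}
    if references:
        body["references"] = references
    return json.dumps(body)


order_schema = {
    "type": "record",
    "name": "Order",
    "namespace": "com.example.sales",
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "shippingAddress", "type": "com.example.types.Address"},
    ],
}
order_body = build_registration_body(
    order_schema,
    references=[{
        "name": "com.example.types.Address",
        "subject": "com.example.types.Address",
        "version": 1,
    }],
)
```

The resulting string can be passed to curl with -d or sent with any HTTP client; the sketch deliberately stops short of making a network call.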

Reuse Design Patterns: Subject Naming Strategy and Extracting Common Types

When reusing a common type across multiple topics, the Subject Naming Strategy you pick matters. The default TopicNameStrategy creates separate <topic>-key / <topic>-value subjects per topic, so the same type ends up registered under different subjects on different topics.

To consolidate a common type into a single subject for reuse, consider RecordNameStrategy. It uses the fully qualified record name (e.g., com.example.types.Address) as the subject, so the same type stays in one subject across topics. TopicRecordNameStrategy also keys on the record name but prefixes it with the topic, so it yields one subject per topic and type rather than a single shared subject.

Because references are pinned to subject@version, publishing a new version of the common type does not affect root schemas until each one explicitly updates its reference. This enables gradual rollouts.

TopicNameStrategy: per-topic isolation (default)
RecordNameStrategy: centralized management by type name (good for common types)
TopicRecordNameStrategy: granularity by both type and topic
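
The three strategies differ only in how the subject string is derived. The sketch below mirrors those naming rules in plain Python; subject_name is a hypothetical helper for illustration, not the serializer's own implementation.

```python
def subject_name(strategy, topic, record_name, is_key=False):
    """Derive the subject a serializer would register under for each strategy."""
    if strategy == "TopicNameStrategy":
        return f"{topic}-{'key' if is_key else 'value'}"
    if strategy == "RecordNameStrategy":
        return record_name  # the topic plays no part
    if strategy == "TopicRecordNameStrategy":
        return f"{topic}-{record_name}"
    raise ValueError(f"unknown strategy: {strategy}")


record = "com.example.types.Address"
per_topic = {t: subject_name("TopicNameStrategy", t, record) for t in ("orders", "shipments")}
shared = {t: subject_name("RecordNameStrategy", t, record) for t in ("orders", "shipments")}
scoped = subject_name("TopicRecordNameStrategy", "orders", record)
```

With TopicNameStrategy the two topics produce two subjects (orders-value and shipments-value), while RecordNameStrategy maps both to the single subject com.example.types.Address.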

Item	Avro	Protobuf	JSON Schema
How references are expressed	Named type resolution (reuse via fully qualified name)	Dependency resolution via import (.proto)	Reference resolution via $ref
Meaning of references.name	Fully qualified type name (e.g., com.example.types.Address)	Import name (e.g., common.proto)	Match the identifier/path used in $ref
Wire format	Root Schema ID only (reference IDs are not embedded)	Same as Avro	Same as Avro
Naming strategy best suited to common-type reuse	RecordNameStrategy	RecordNameStrategy or TopicNameStrategy + package management	RecordNameStrategy (effective for value-schema reuse)
Evolution gotchas (representative)	Adding a required field without a default is backward-incompatible	Reusing an existing field number is incompatible	Narrowing a type (e.g., shrinking an enum) tends to be incompatible

Compatibility and Versioning: How Checks Apply with References

Schema Registry compatibility is configured per subject and is checked against the fully resolved schema at registration time. When references are involved, comparison happens with the common type inlined, so breaking changes to a common type get caught when the root is registered.

Compatibility levels include BACKWARD, FORWARD, and FULL, each with a _TRANSITIVE variant. The non-transitive levels check only against the immediately previous version, while the transitive ones check against all registered versions. If a common type is widely reused, FULL_TRANSITIVE is the strictest and therefore the safest choice.
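
The difference between the plain and transitive levels comes down to which earlier versions a candidate is compared against. The sketch below models only that selection plus one toy Avro rule (a field added without a default breaks backward compatibility). Both helpers are hypothetical simplifications; the Registry's real checker covers many more cases.

```python
def versions_to_check(level, history):
    """Pick the prior versions a candidate schema is compared against."""
    if level == "NONE" or not history:
        return []
    if level.endswith("_TRANSITIVE"):
        return list(history)   # every registered version
    return [history[-1]]       # only the latest registered version


def added_fields_have_defaults(old, new):
    """Toy BACKWARD rule for Avro records: fields added in `new` need a default
    so that a reader using `new` can still decode data written with `old`."""
    old_names = {f["name"] for f in old["fields"]}
    return all("default" in f for f in new["fields"] if f["name"] not in old_names)


v1 = {"fields": [{"name": "street", "type": "string"}, {"name": "city", "type": "string"}]}
v2 = {"fields": v1["fields"] + [{"name": "zip", "type": "string", "default": ""}]}
bad = {"fields": v2["fields"] + [{"name": "country", "type": "string"}]}  # no default

history = [v1, v2]
ok_v2 = added_fields_have_defaults(v1, v2)
ok_bad = all(added_fields_have_defaults(old, bad)
             for old in versions_to_check("BACKWARD_TRANSITIVE", history))
```

Here the candidate that adds country without a default is rejected, while the earlier change that added zip with a default passes.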

Because references are version-pinned, publishing a new version of a common type does not immediately affect existing roots. The standard practice is to update and re-register the referenced version on each root incrementally.

Checks run against the fully expanded schema (references included)
Transitive compatibility helps catch regressions over the long run
Pinned-version references make gradual migrations easier
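
Moving a root schema to a newer version of a common type amounts to rewriting one entry in its references list and re-registering. A minimal sketch of that re-pinning step, with repin as a hypothetical helper operating on the same reference dictionaries used in the REST body:

```python
import copy


def repin(references, subject, new_version):
    """Return a copy of `references` with `subject` pinned to `new_version`."""
    updated = copy.deepcopy(references)  # leave the caller's list untouched
    matched = False
    for ref in updated:
        if ref["subject"] == subject:
            ref["version"] = new_version
            matched = True
    if not matched:
        raise KeyError(f"no reference to subject {subject!r}")
    return updated


current_refs = [{
    "name": "com.example.types.Address",
    "subject": "com.example.types.Address",
    "version": 1,
}]
next_refs = repin(current_refs, "com.example.types.Address", 2)
```

Until the root schema is re-registered with next_refs, it keeps resolving Address at version 1, which is what makes a root-by-root migration possible.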

Operational Practices: Avoiding Breaking Changes and Rolling Migrations

Common-type changes ripple widely, so set Compatibility on the strict side (e.g., FULL_TRANSITIVE) and proceed in order: publish a new version of the common type, validate it, update references on each root, then roll producers out in stages.

The wire format is just the root Schema ID, so the presence or absence of references does not change Kafka message compatibility. What matters is whether registration succeeds and whether deserialization succeeds. Stop compatibility violations at registration time, and follow schema evolution rules (e.g., providing defaults for new fields) to keep consumers compatible.

Publish and stabilize the common type first; then each root catches up its reference
Combine compatibility levels with CI to block breaking changes
Automate pre-flight validation against a sandbox Registry
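
A pre-flight check in CI can use the Registry's compatibility endpoint (POST /compatibility/subjects/<subject>/versions/<version>), which tests a candidate schema without registering it. The sketch below only builds the request; the sandbox host name and the build_compatibility_request helper are assumptions for illustration.

```python
import json
import urllib.request
from urllib.parse import quote


def build_compatibility_request(base_url, subject, schema, references=None,
                                version="latest", schema_type="AVRO"):
    """Build (but do not send) a compatibility-check request for a candidate schema."""
    body = {"schemaType": schema_type, "schema": json.dumps(schema)}
    if references:
        body["references"] = references
    url = f"{base_url}/compatibility/subjects/{quote(subject, safe='')}/versions/{version}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="POST",
    )


candidate = {
    "type": "record", "name": "Order", "namespace": "com.example.sales",
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "shippingAddress", "type": "com.example.types.Address"},
    ],
}
request = build_compatibility_request(
    "http://sandbox-registry.example:8081",   # hypothetical sandbox Registry
    "com.example.sales.Order",
    candidate,
    references=[{"name": "com.example.types.Address",
                 "subject": "com.example.types.Address", "version": 2}],
)
```

Sending the request with urllib.request.urlopen(request) should return a JSON body containing an is_compatible flag, which a CI job can use to fail the build before anything reaches the production Registry.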

CCDAK Prep Notes: Exam Angles to Lock In

The differences between Subject Naming Strategies and whether they support reuse (TopicNameStrategy / RecordNameStrategy / TopicRecordNameStrategy).

The required elements when registering a reference (name/subject/version) and the fact that there is no "latest" auto-tracking.

Compatibility checks run after reference resolution, and the Transitive setting means comparison against the full history.

TopicNameStrategy is the default and uses separate <topic>-key / <topic>-value subjects
RecordNameStrategy is the strong choice for common-type reuse
There is no automatic upgrade of references (explicit re-registration is required)

Check Your Understanding

CCDAK

Question 1

You are reusing a common Address type across multiple topics. You want to add one optional field to Address and share it on every topic without duplicating the definition. Keeping the wire format unchanged and migrating gradually, which approach is best?

A. Keep each topic on TopicNameStrategy and re-define Address on every topic at registration time
B. Configure the value serializer to use RecordNameStrategy, manage Address under a single subject, and re-register each root schema so its references point at the new version of that subject
C. Set Schema Registry compatibility to NONE and change things freely
D. Omit version on the reference so it auto-follows latest

Answer: B

Consolidate the common type into a single subject with RecordNameStrategy and pin references via subject@version. Apply a backward-compatible change (adding an optional field) to Address, publish the new version, then re-register each root schema so its references point at the new version. The wire format is preserved and migration can proceed gradually.

Frequently Asked Questions

Can you reference across different schema types (e.g., from Avro to Protobuf)?

No. References resolve only within the same schemaType: Avro references Avro, Protobuf references Protobuf, and JSON Schema references JSON Schema.

If a referenced schema is updated, are existing producers immediately affected?

No. References are pinned to a specific version. Existing producers and root schemas continue to use the prior version until you explicitly update the reference and re-register.

What happens to deserialization of schemas with references if Schema Registry goes down?

If the client has the required schemas cached locally, work continues; but uncached Schema IDs or reference resolution still need Registry access. A highly available, redundant Registry deployment is recommended.


Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
