Avro Schemas in Kafka: Evolution & Defaults (2026)

If you use Avro with Kafka, getting compatibility modes and default values wrong will bite you. This article maps field additions, removals, type changes, and aliases (renames) onto Schema Registry compatibility rules.

CCDAK frequently tests the difference between Backward/Forward/Full and their Transitive variants, when defaults are required, and the ordering of unions that include null. This article covers the design rules you need on the job together with the exam takeaways.

Compatibility Modes: What They Mean and How to Pick One

Avro compatibility is defined by the resolution rules used when a reader schema interprets data written with a writer schema. Whenever you register a new schema, Confluent Schema Registry validates it against existing versions using the selected compatibility mode.

In practice, Backward compatibility is the most common choice — it matches the workflow of upgrading consumers before producers. Pick Forward when producers need to roll out first, or Full when you need bidirectional safety. Transitive variants tighten validation so that every past version, not just the previous one, is checked.

Backward: guarantees the new reader schema can read data written by old writers
Forward: guarantees old reader schemas can read data written by the new writer
Full: satisfies both Backward and Forward at the same time
Transitive variants: validate against every past version, not just the previous one
Compatibility is configured per subject (for example, <topic>-value)

Compatibility mode	Impact on existing consumers	Impact on existing producers	Typical allowed changes
BACKWARD	Consumers using the new schema can still read old data	Existing producers keep working unchanged	Field additions (default required), type promotions (e.g. int→long), and renames via aliases
FORWARD	Old consumers can read data written with the new schema	Tighter constraints on the new producer	Field removal (provided the old reader has a default) and a limited set of forward-safe edits
FULL	Readable in both directions, old ⇔ new	Strict constraints on both old and new sides	Adding or removing fields that have defaults; one-way changes such as type promotions (int→long), alias-only renames, and enum symbol changes fail in one of the two directions
NONE	No validation — breaking changes pass through	Not recommended outside of short experiments	—
TRANSITIVE variants	Guarantee compatibility against every past version	Ensures the release chain never breaks at any point	Available as BACKWARD_TRANSITIVE / FORWARD_TRANSITIVE / FULL_TRANSITIVE
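
The difference between the base modes and their Transitive variants comes down to which (reader, writer) pairs get checked. The following is a minimal sketch of that bookkeeping only: the function name `required_checks` and the string version labels are hypothetical, and nothing here talks to a real Schema Registry.

```python
from typing import List, Tuple

# A check is a (reader, writer) pair: the reader must be able to read
# data produced with the writer schema.
Check = Tuple[str, str]

BASE_MODES = {"BACKWARD", "FORWARD", "FULL", "NONE"}


def required_checks(mode: str, existing: List[str], new: str) -> List[Check]:
    """List the (reader, writer) pairs implied by a compatibility mode.

    `existing` holds already-registered versions, oldest first.
    `new` is the candidate version. Labels are plain strings (hypothetical).
    """
    transitive = mode.endswith("_TRANSITIVE")
    base = mode[: -len("_TRANSITIVE")] if transitive else mode
    if base not in BASE_MODES or (transitive and base == "NONE"):
        raise ValueError(f"unknown compatibility mode: {mode}")
    if base == "NONE":
        return []  # no validation at all

    # Non-transitive modes look only at the latest registered version.
    targets = existing if transitive else existing[-1:]
    checks: List[Check] = []
    for old in targets:
        if base in ("BACKWARD", "FULL"):
            checks.append((new, old))  # new reader reads old data
        if base in ("FORWARD", "FULL"):
            checks.append((old, new))  # old reader reads new data
    return checks
```

With versions `["v1", "v2"]` and a candidate `"v3"`, `BACKWARD` yields only `("v3", "v2")`, while `BACKWARD_TRANSITIVE` also adds `("v3", "v1")`.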

Figure: Reading and writing Avro through Schema Registry

Adding Fields: Default Value Design

The most common change is adding a field. To preserve Backward compatibility, every added field must declare a default value. Since old data has no value for the new field, the reader falls back to that default.

To express "optional," include null in the union, put null first in the union array, and set the default to null. Per the Avro spec, the default of a union must match the first type listed in the array.

Defaults for complex types are heavily constrained: a record default must be a JSON object whose values match the record's fields, and an array or map default is typically just an empty collection. When you can, make the field nullable and default it to null, which is much easier to maintain.

Always attach a default to added fields (Backward compatibility)
Optional fields use ["null", "type"] order with "default": null
Do not default numeric fields to zero out of convenience — missing and zero usually mean different things in the business
For larger structures, start nullable with a null default and consider making them required later
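
The rules in this section can be sketched as three small helpers that operate on schemas parsed into Python dicts. This is an illustration under simplifying assumptions, not how Avro or Schema Registry implement resolution: the helper names are hypothetical, records are plain dicts, and aliases and nested types are ignored.

```python
from typing import Any, Dict, List

Schema = Dict[str, Any]


def added_fields_without_default(old: Schema, new: Schema) -> List[str]:
    """Fields present only in `new` that lack a default (Backward risk)."""
    old_names = {f["name"] for f in old["fields"]}
    return [
        f["name"]
        for f in new["fields"]
        if f["name"] not in old_names and "default" not in f
    ]


def is_well_formed_optional(field: Dict[str, Any]) -> bool:
    """True for the ["null", ...] union with an explicit null default."""
    t = field.get("type")
    return (
        isinstance(t, list)
        and len(t) > 0
        and t[0] == "null"
        and "default" in field
        and field["default"] is None
    )


def read_with_defaults(old_record: Dict[str, Any], reader: Schema) -> Dict[str, Any]:
    """Mimic the reader falling back to defaults for missing fields."""
    out: Dict[str, Any] = {}
    for f in reader["fields"]:
        name = f["name"]
        if name in old_record:
            out[name] = old_record[name]
        elif "default" in f:
            out[name] = f["default"]
        else:
            raise ValueError(f"no value and no default for field {name!r}")
    return out
```

Reading an old record such as `{"id": "a"}` with a reader schema that adds `note` as `["null", "string"]` with default `null` produces `note` set to `None`; the same read raises an error if `note` has no default.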

Pitfalls of Removing or Tightening Fields

Removing a field is dangerous from a Forward-compatibility standpoint. Once the new writer stops emitting the field, old readers fail to resolve unless their reader schema already had a default. In practice you cannot patch the old reader schema after the fact, so avoid straight deletions.

The safe play is gradual deprecation: first switch the field to nullable with default null, upgrade every consumer, and only then consider removing it in a future major version. The same applies to tightening — any design that cannot fall back to nullable is a breaking change.

Before deleting: switch to nullable + default null → update every consumer → monitor → delete later
Forward compatibility breaks the moment an old consumer expects a field that has no default
Making a field required is a last resort — consider whether input validation can do the job instead
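
The deletion hazard can be made concrete with a small check: which fields does an old reader expect that the new writer no longer emits and that have no default to fall back on? The sketch below assumes schemas parsed into dicts and ignores aliases; `fields_breaking_forward` and the sample schemas are hypothetical names for illustration.

```python
from typing import Any, Dict, List

Schema = Dict[str, Any]


def fields_breaking_forward(old_reader: Schema, new_writer: Schema) -> List[str]:
    """Fields the old reader needs but cannot obtain from the new writer."""
    written = {f["name"] for f in new_writer["fields"]}
    return [
        f["name"]
        for f in old_reader["fields"]
        if f["name"] not in written and "default" not in f
    ]


id_field = {"name": "id", "type": "string"}

# Old reader where `note` is required (no default).
reader_required = {
    "type": "record",
    "name": "Order",
    "fields": [id_field, {"name": "note", "type": "string"}],
}

# Old reader after the deprecation step: nullable with default null.
reader_optional = {
    "type": "record",
    "name": "Order",
    "fields": [
        id_field,
        {"name": "note", "type": ["null", "string"], "default": None},
    ],
}

# New writer that has dropped `note`.
writer_without_note = {"type": "record", "name": "Order", "fields": [id_field]}
```

Dropping `note` breaks `reader_required` but not `reader_optional`, which is why the nullable step has to reach every consumer before the field is deleted.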

Type Changes, Renames, and Aliases

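Only type promotion is safe. Under the Avro spec a reader can promote int to long, float, or double; long to float or double; and float to double. The spec also lets string and bytes be read as each other, but that changes how the payload is interpreted (bytes that are not valid UTF-8 will not decode cleanly), so treat the pair with care and prefer a union that holds both when you need to evolve in that direction.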

For renames, use aliases. When you change a field or record name, listing the old name in aliases lets the resolver map the old field onto the new one — a Backward-compatible rename.

Adding a new enum symbol is Backward compatible (new readers can read old data) but unsafe for Forward compatibility (old readers cannot interpret the unknown symbol).

Safe type promotions: int→long, long→double, float→double
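Dangerous: boolean→int, narrowing such as long→int, or reducing decimal precision; string↔bytes resolves under the Avro spec but reinterprets the payload, so handle it deliberately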
For renames, add aliases: ["old_name"] on the new schema
Enum additions are Backward-compatible but break Forward compatibility; removals flip the direction and are equally risky
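
The three mechanisms in this section (numeric promotion, alias lookup, enum symbols) can be sketched as lookups. This is a simplified illustration, not the Avro library's resolver: the helper names are hypothetical, only the numeric promotions are modelled, and the alias lookup follows the rule that aliases are declared on the reader (new) schema.

```python
from typing import Any, Dict, List, Optional

# writer type -> reader types that may read it (numeric promotions only)
NUMERIC_PROMOTIONS = {
    "int": {"long", "float", "double"},
    "long": {"float", "double"},
    "float": {"double"},
}


def reader_can_read(writer_type: str, reader_type: str) -> bool:
    """True if the types match or the writer type promotes to the reader type."""
    return writer_type == reader_type or reader_type in NUMERIC_PROMOTIONS.get(
        writer_type, set()
    )


def find_writer_field(
    reader_field: Dict[str, Any], writer_fields: List[Dict[str, Any]]
) -> Optional[Dict[str, Any]]:
    """Match a reader field to a writer field by name or by reader-side alias."""
    accepted = [reader_field["name"], *reader_field.get("aliases", [])]
    for wf in writer_fields:
        if wf["name"] in accepted:
            return wf
    return None


def unknown_symbols(writer_symbols: List[str], reader_symbols: List[str]) -> List[str]:
    """Symbols the writer may emit that the reader does not declare."""
    return [s for s in writer_symbols if s not in reader_symbols]
```

A reader field `order_id` with `aliases: ["id"]` finds the writer's `id` field, whereas an old reader that only knows `id` finds nothing in a writer that emits `order_id`, which is why alias renames only help in the Backward direction.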

Record Evolution by Example

A typical evolution combines an added field (with a default), a type promotion, and a rename via aliases. Set compatibility to Backward and roll out consumers before producers.

Below is a v1→v2 evolution example. v2 preserves Backward compatibility by making the added field nullable with default null, promoting the numeric type, and using aliases for the rename.

The Schema Registry subject is typically <topic>-value
Configure compatibility per subject and treat it as immutable once you go live
Declare aliases on the new schema to absorb the old name

Avro schema evolution and compatibility settings

# v1: initial schema
{
  "type": "record",
  "name": "Order",
  "namespace": "nl.shop",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "amount", "type": "int"},
    {"name": "status", "type": {"type": "enum", "name": "OrderStatus", "symbols": ["NEW", "PAID"]}}
  ]
}

# v2: field addition, type promotion, and rename (Backward compatible)
{
  "type": "record",
  "name": "Order",
  "namespace": "nl.shop",
  "fields": [
    {"name": "order_id", "type": "string", "aliases": ["id"]},
    {"name": "amount", "type": "long"},
    {"name": "status", "type": {"type": "enum", "name": "OrderStatus", "symbols": ["NEW", "PAID", "CANCELLED"]}},
    {"name": "note", "type": ["null", "string"], "default": null}
  ]
}

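# Key points
# - amount: int->long is a type promotion
# - order_id: the field name changes, but aliases absorbs the old name id
# - note: union ordered as ["null", "string"] with default null
# - enum addition (CANCELLED): fine for Backward; under Forward, old readers cannot interpret the new symbol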

# Schema Registry: set the subject's compatibility to Backward (e.g. <topic>-value)
# Adjust the API path to match your environment
curl -X PUT \
  -H "Content-Type: application/json" \
  --data '{"compatibility": "BACKWARD"}' \
  http://schema-registry:8081/config/my-orders-value
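
Before registering v2, it can help to ask the registry whether the candidate passes the subject's compatibility mode. The sketch below builds that request with the standard library, assuming the Schema Registry REST compatibility resource (`POST /compatibility/subjects/<subject>/versions/latest`) and reusing the host and subject from the curl example above. The helper names are hypothetical, and `is_compatible` needs a reachable registry to run.

```python
import json
import urllib.request
from typing import Any, Dict


def build_compatibility_request(
    base_url: str, subject: str, schema: Dict[str, Any]
) -> urllib.request.Request:
    """Build a POST that tests `schema` against the subject's latest version."""
    # The registry expects the Avro schema as a JSON-encoded string
    # inside the "schema" property of the request body.
    body = json.dumps({"schema": json.dumps(schema)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/compatibility/subjects/{subject}/versions/latest",
        data=body,
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="POST",
    )


def is_compatible(request: urllib.request.Request) -> bool:
    """Send the request and read the registry's verdict (needs a live registry)."""
    with urllib.request.urlopen(request) as response:
        return bool(json.load(response)["is_compatible"])


candidate = {
    "type": "record",
    "name": "Order",
    "namespace": "nl.shop",
    "fields": [
        {"name": "order_id", "type": "string", "aliases": ["id"]},
        {"name": "note", "type": ["null", "string"], "default": None},
    ],
}
request = build_compatibility_request(
    "http://schema-registry:8081", "my-orders-value", candidate
)
```

Running this check in CI before registration keeps incompatible schemas out of the subject without relying on producers to fail at runtime.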

Serializer and Registry Settings in Production

Confluent's Avro serializer/deserializer embeds a schema ID in each message and works with the Registry to fetch the writer schema. With SpecificRecord, the reader schema comes from the generated class; with GenericRecord, the deserializer interprets data using the writer schema as-is.
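
To make the embedded schema ID concrete, here is a minimal sketch that splits a message into its schema ID and Avro payload. It assumes the Confluent wire format of one magic byte (0) followed by a 4-byte big-endian schema ID, a detail this article does not spell out; the function name is hypothetical and the payload is left undecoded.

```python
import struct
from typing import Tuple


def split_confluent_frame(message: bytes) -> Tuple[int, bytes]:
    """Return (schema_id, avro_payload) from a serialized message value."""
    if len(message) < 5:
        raise ValueError("message too short to contain a schema ID header")
    # ">bI": big-endian, 1-byte magic marker, 4-byte unsigned schema ID.
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != 0:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[5:]
```

A deserializer uses the extracted ID to fetch the writer schema from the registry, then resolves the payload against its reader schema.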

In production it is safer to turn off auto.register.schemas and handle registration and review through a separate pipeline. TopicNameStrategy is the typical subject naming strategy, but consider RecordNameStrategy when records are heavily reused across topics.

schema.registry.url is mandatory
Default to auto.register.schemas=false and pre-register with compatibility checks in CI
value.subject.name.strategy: TopicNameStrategy / RecordNameStrategy / TopicRecordNameStrategy
Set specific.avro.reader=true to use SpecificRecord as the reader schema
Do not casually enable use.latest.version — always run compatibility and consistency checks first
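
The three subject naming strategies differ only in how the subject string is derived, which in turn decides which schemas are checked against each other. A minimal sketch of that derivation follows; the function name is hypothetical, and the record name `nl.shop.Order` is taken from the example schema in this article.

```python
def subject_name(
    strategy: str, topic: str, record_fullname: str, is_key: bool = False
) -> str:
    """Derive the Schema Registry subject for a given naming strategy."""
    if strategy == "TopicNameStrategy":
        # One subject per topic and per key/value side.
        return f"{topic}-{'key' if is_key else 'value'}"
    if strategy == "RecordNameStrategy":
        # One subject per record type, shared across topics.
        return record_fullname
    if strategy == "TopicRecordNameStrategy":
        # One subject per (topic, record type) pair.
        return f"{topic}-{record_fullname}"
    raise ValueError(f"unknown strategy: {strategy}")
```

With `TopicNameStrategy`, the topic `my-orders` maps to the subject `my-orders-value`, matching the subject used in the curl example earlier.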

CCDAK Checklist

On the exam, scoring well comes down to memorising the definitions in matched sets. Questions typically describe a concrete schema change and ask which compatibility modes allow it and what default it requires.

Backward: field additions require a default; the union default must match the first type
Forward: field removal tends to fail unless the old reader already has a default
Full: must satisfy both directions at once, which makes it the strictest of the three base modes
Only type promotion is safe (int→long→double); substitutions such as boolean→int or string→int are not allowed
For renames, add aliases on the new schema
Enum additions are Backward-compatible but break Forward compatibility

Check Your Understanding

CCDAK

Question 1

The value-side subject is configured with BACKWARD compatibility. You want to add a new optional field note to the existing Order record. Which change is the safest?

A. Add note as type string with no default
B. Add note as type ["null", "string"] with default null
C. Add note as type ["string", "null"] with default ""
D. Add note as type ["null", {"type":"record",...}] with default {}

Correct answer: B

Under Backward compatibility, any added field must have a default. To make it optional, the union must start with null and the default must be null as well — so ["null", "string"] with default=null is the correct choice.

Frequently Asked Questions

Which branch of a union should the default value match?

The Avro spec requires the union default to match the first type in the array. For optional fields, order the union as ["null", "type"] and set the default to null.

How far can you safely change a field's type?

Only type promotion is safe: typical examples are int→long, long→double, and float→double. The Avro spec also resolves string↔bytes, but that reinterprets the payload, so handle it deliberately. Changes like boolean→int are not allowed, and you should avoid logical-type regressions such as reducing decimal precision.

How do enum symbol additions and removals affect compatibility?

Adding a symbol is Backward compatible (new readers can read old data) but breaks Forward compatibility (old readers cannot interpret the new symbol). Removal is the opposite direction and even riskier, so it should generally be avoided.

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
