Protobuf with Kafka & Schema Registry (2026)

The biggest reasons to adopt Protobuf on Kafka are its compact wire format, strong typing, and ease of schema evolution. The prerequisite, however, is handling field numbers correctly.

This article assumes a typical implementation using Confluent Schema Registry. It walks through safe handling of field numbers and compatibility, in line with the CCDAK exam scope, using concrete examples.

Why Use Protobuf on Kafka, and Wire-Format Fundamentals

Confluent's Protobuf serializer prepends a magic byte, a 4-byte schema ID, and (for Protobuf) a list of message indexes to the message payload, and integrates with Schema Registry to make schema evolution safe. Consumers fetch the schema by ID and use it to interpret the byte stream.
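As a rough illustration of that framing, the sketch below splits the header off a record value. It assumes the commonly documented Confluent layout (magic byte 0, then a 4-byte big-endian schema ID); the function name split_confluent_header and the sample bytes are made up for this example, and the Protobuf message indexes that follow the schema ID are left unparsed.

```python
import struct

MAGIC_BYTE = 0  # assumed Confluent wire-format marker

def split_confluent_header(record_value: bytes) -> tuple[int, bytes]:
    """Return (schema_id, remainder) for a Confluent-framed record value."""
    if len(record_value) < 5:
        raise ValueError("record too short for magic byte + schema ID")
    # ">bI" = big-endian, 1 signed byte (magic) + 4-byte unsigned int (schema ID)
    magic, schema_id = struct.unpack(">bI", record_value[:5])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    # For Protobuf, the remainder starts with message indexes, then the payload.
    return schema_id, record_value[5:]

# Hypothetical record: schema ID 42, one index byte, then a 3-byte Protobuf payload
framed = bytes([0]) + (42).to_bytes(4, "big") + b"\x00" + b"\x08\x96\x01"
schema_id, rest = split_confluent_header(framed)
print(schema_id, rest.hex())
```

In a real consumer the KafkaProtobufDeserializer does this for you; the sketch is only meant to show why a consumer needs Schema Registry access to make sense of the bytes.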

On the wire, Protobuf identifies fields by number, not by name. Because the numbers carry the meaning, reusing or reassigning them breaks backward compatibility. For compatibility, what matters is a field's number and type, not its name.

Use Schema Registry compatibility checks to catch breaking changes before they hit production
Unknown fields are ignored by older clients in Protobuf, so additions can be made safely with proper design
The wire format is compact; assigning small numbers to frequently used fields improves encoding efficiency
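To make the points above concrete, here is a minimal hand-rolled decoder, offered as a sketch rather than a replacement for a Protobuf runtime. It handles only varint and length-delimited fields, and the names read_varint, decode_known, old_reader, and new_reader are invented for this example. Note that it simply drops unknown fields, whereas real runtimes typically retain them.

```python
def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Read one base-128 varint starting at pos; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def decode_known(buf: bytes, known: dict[int, str]) -> dict[str, object]:
    """Decode wire types 0 and 2; keep only field numbers the reader knows."""
    out, pos = {}, 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x7
        if wire_type == 0:        # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:      # length-delimited (string, bytes, message)
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        if number in known:       # unknown numbers are skipped, not an error
            out[known[number]] = value
    return out

# order_id=150 (field 1), customer_id="c1" (field 2), status=1 (field 20)
payload = b"\x08\x96\x01" + b"\x12\x02c1" + b"\xa0\x01\x01"
old_reader = {1: "order_id", 2: "customer_id"}
new_reader = {**old_reader, 20: "status"}
print(decode_known(payload, old_reader))
print(decode_known(payload, new_reader))
```

The payload contains no field names at all. The old reader still decodes fields 1 and 2 and steps over field 20, which is why adding a field under a fresh number is safe.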

Figure: Relationship between Kafka, Protobuf, and Schema Registry

Design Principles for Field Numbers

Protobuf field numbers range from 1 to 536,870,911, with 19000-19999 reserved internally and unavailable. Numbers are the core of wire compatibility, and the iron rule is: once released, never change and never reuse them.

Numbers differ in encoding cost. Tags 1-15 are shorter on the wire, so assigning them to high-frequency or near-required fields is efficient. Reserve number ranges by logical block or team in advance, and operations stay stable as the schema grows.
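The encoding cost can be checked with a few lines of arithmetic. The sketch below assumes the standard Protobuf tag layout, (field_number << 3) | wire_type encoded as a varint; tag_size is a helper name made up for this example.

```python
def tag_size(field_number: int, wire_type: int = 0) -> int:
    """Bytes needed for the varint-encoded tag (field_number << 3 | wire_type)."""
    key = (field_number << 3) | wire_type
    size = 1
    while key >= 0x80:   # each varint byte carries 7 bits of payload
        key >>= 7
        size += 1
    return size

for n in (1, 15, 16, 2047, 2048):
    print(n, tag_size(n))
```

Numbers 1-15 fit in a one-byte tag, 16-2047 take two bytes, and larger numbers take three or more. The saving is per field occurrence, so it adds up mainly for fields that appear in nearly every message.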

Never reuse a number once released (mark it reserved when deleting)
Prioritize numbers 1-15 for high-frequency fields
Reserve unused numbers up front for future extensions (e.g., 100-199 for extensions)
Do not use 19000-19999 (reserved by Protobuf)
Express type changes or moves into a oneof as new fields with new numbers; deprecate the old number, then reserve it
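These rules are mechanical enough to lint. The following sketch shows one possible pre-commit style check; check_field_number and its retired and in_use arguments are hypothetical names, and in practice protoc itself already rejects reserved and duplicate numbers within a single .proto file.

```python
MAX_FIELD_NUMBER = 536_870_911              # 2**29 - 1
PROTOBUF_RESERVED = range(19_000, 20_000)   # 19000-19999

def check_field_number(number: int, retired: set[int], in_use: set[int]) -> list[str]:
    """Return a list of rule violations for a proposed field number."""
    problems = []
    if not 1 <= number <= MAX_FIELD_NUMBER:
        problems.append("out of range")
    if number in PROTOBUF_RESERVED:
        problems.append("inside 19000-19999 (reserved by Protobuf)")
    if number in retired:
        problems.append("previously released and now reserved")
    if number in in_use:
        problems.append("already assigned to another field")
    return problems

print(check_field_number(10, retired={10}, in_use={1, 2, 20, 30}))
print(check_field_number(21, retired={10}, in_use={1, 2, 20, 30}))
```

The retired set is the part protoc cannot know unless it is written down as reserved statements, which is the reason to reserve numbers at deletion time.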

.proto example: reservation and addition pattern

syntax = "proto3";
package com.example;

message OrderV1 {
  // High-frequency fields get small numbers
  int64 order_id = 1;
  string customer_id = 2;
  // Numeric status code (to be replaced by an enum later)
  int32 status_code = 10;
}

// After evolution
message OrderV2 {
  int64 order_id = 1;
  string customer_id = 2;
  // We want to replace status_code with an enum -> add with a new number
  // Old number 10 is deprecated -> reserve it so it cannot be reused
  reserved 10; // old: status_code
  // You can also reserve the name
  reserved "status_code";

  // New enum-based status. Old clients ignore it as unknown
  OrderStatus status = 20;

  // When removing a field: [deprecated = true] -> later move to reserved
  string note = 30 [deprecated = true];
}

enum OrderStatus {
  ORDER_STATUS_UNSPECIFIED = 0; // 0 is the default
  ORDER_STATUS_PLACED = 1;
  ORDER_STATUS_SHIPPED = 2;
  ORDER_STATUS_CANCELLED = 3;
}

Compatibility Modes and Safe Evolution Patterns

Schema Registry has three main compatibility modes: Backward, Forward, and Full. In Protobuf, adding a new field (with a new number) is backward-compatible in most cases. Older consumers ignore unknown fields, so existing reads continue to work.

Breaking changes include renumbering or reusing numbers, type changes, moving an existing field into a oneof, and changing enum numbers. Renaming may be valid on the wire, but it tends to break code generation and validation, so in practice treat it as breaking.

Add with a new number; keep existing numbers as they are
Delete in stages: deprecate -> stop using -> mark reserved
When changing a type, introduce a new field with a new number and run them in parallel for a transition period
For enums, adding values is allowed; changing numbers is not
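A simplified way to picture these patterns is a diff keyed by field number. The sketch below is not how Schema Registry implements its check; diff_fields and the dictionary representation of a message are assumptions made purely for illustration.

```python
Field = tuple[str, str]  # (name, type)

def diff_fields(old: dict[int, Field], new: dict[int, Field]) -> dict[str, list[int]]:
    """Classify changes between two versions of a message, keyed by field number."""
    report = {"added": [], "type_changed": [], "renamed": [], "removed": []}
    for number, (name, ftype) in new.items():
        if number not in old:
            report["added"].append(number)         # new number: the safe pattern
            continue
        old_name, old_type = old[number]
        if ftype != old_type:
            report["type_changed"].append(number)  # same number, new type: breaking
        if name != old_name:
            report["renamed"].append(number)       # treat as breaking in practice
    report["removed"] = [n for n in old if n not in new]
    return report

order_v1 = {1: ("order_id", "int64"), 2: ("customer_id", "string"),
            10: ("status_code", "int32")}
safe = {**order_v1, 20: ("status", "OrderStatus")}    # add with a new number
unsafe = {**order_v1, 10: ("status", "OrderStatus")}  # reuse number 10
print(diff_fields(order_v1, safe))
print(diff_fields(order_v1, unsafe))
```

The safe variant reports only an addition, while the unsafe variant reports both a type change and a rename on number 10.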

Mode	Allowed change examples	Representative failure cases
Backward	Add a new optional-equivalent field with a new number; update doc comments on existing fields	Renumbering or reusing existing field numbers; changing scalar types (e.g., int32 -> string)
Forward	Remove a field that is no longer used, after deprecating it (old consumers read new messages and see the default value)	Dropping data that old consumers still depend on (messages written with the new schema become unusable for them)
Full	Comment tweaks; adding a completely independent new field; renames that leave numbers and types untouched (wire-compatible, but best avoided)	Renumbering or reusing numbers; moving an existing field into a oneof; changing enum numbers

Operational Procedure for Deletion, Reuse, and Rename

Drive deletions and replacements in stages. Removing a field outright can break compilation in applications that still reference it, or fail compatibility checks. The safe order is: mark deprecated -> stop usage -> mark reserved.

Reserve both the number and the name to prevent accidental reuse later. Manage schema history alongside Schema Registry and bake compatibility checks into PR reviews to cut down on incidents.

Step 1: mark the target field as deprecated and announce its retirement
Step 2: wait until producers have fully migrated to the new field (parallel-run period)
Step 3: remove the old field from .proto and register both number and name as reserved
Step 4: pin compatibility on Schema Registry to Full (or your org standard) to block breaking changes
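Step 3 is the one most easily forgotten, so it is a natural candidate for a PR check. The sketch below assumes the field lists and reserved sets have already been extracted from the old and new .proto files; unreserved_removals is a hypothetical helper, not part of any Protobuf or Confluent tooling.

```python
def unreserved_removals(old_fields: dict[int, str], new_fields: dict[int, str],
                        reserved_numbers: set[int], reserved_names: set[str]) -> list[str]:
    """Report removed fields whose number or name was not marked reserved."""
    problems = []
    for number, name in old_fields.items():
        if number in new_fields:
            continue  # field still present, nothing to check
        if number not in reserved_numbers:
            problems.append(f"number {number} removed but not reserved")
        if name not in reserved_names:
            problems.append(f'name "{name}" removed but not reserved')
    return problems

customer_v1 = {1: "id", 2: "email", 5: "phone"}
customer_v2 = {1: "id", 2: "email"}
print(unreserved_removals(customer_v1, customer_v2, {5}, {"phone"}))
print(unreserved_removals(customer_v1, customer_v2, {5}, set()))
```

An empty list means the deletion followed the procedure; anything else is a reason to block the merge.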

Standard pattern when deleting in .proto

message CustomerV2 {
  int64 id = 1;
  string email = 2;
  // Old phone is retired -> deleted and reserved
  reserved 5;           // forbid number reuse
  reserved "phone";     // forbid name reuse
}

Compatibility Pitfalls for oneof, map, and enum

Adding a field to a oneof means older clients see that branch as an unknown field and treat the oneof as unset, so the intended semantics may be lost. Even if Schema Registry validation passes, the change tends to be breaking at the application level.

On the wire, a map is encoded as a repeated entry message with a key field (number 1) and a value field (number 2). Changing the key or value type is therefore breaking. For enums, adding values is safe, but changing or reassigning value numbers is breaking. Renames may be allowed on the wire, but avoid them as a rule for the sake of generated code and readability.

oneof: avoid moving an existing field into a oneof (breaking)
map: do not change key types, and do not change value types; replace with a new field if needed
enum: adding values is allowed, changing or reusing numbers is not; when deleting, treat the number as reserved-equivalent and forbid reuse
Use renames as a last resort; even when wire-compatible, they can break toolchain compatibility
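The map rule is easier to accept once the bytes are visible. This sketch hand-encodes a single map<string, int32> entry under the standard layout (key as entry field 1, value as entry field 2). The helper names and field number 7 are arbitrary choices for the example, and only non-negative values are handled.

```python
def varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_map_entry(field_number: int, key: str, value: int) -> bytes:
    """Encode one map<string, int32> entry as a length-delimited entry message."""
    k = key.encode("utf-8")
    entry = b"\x0a" + varint(len(k)) + k   # entry field 1 (key), wire type 2
    entry += b"\x10" + varint(value)       # entry field 2 (value), wire type 0
    return varint((field_number << 3) | 2) + varint(len(entry)) + entry

print(encode_map_entry(7, "a", 3).hex())  # prints 3a050a01611003
```

Because the key and value are ordinary fields inside the entry, changing either type alters the wire types a reader expects, which is the same kind of break as changing the type of any other field.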

Operational Settings and CCDAK Checkpoints

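Subject Naming Strategy determines how schemas are shared or separated across topics. Pick among TopicNameStrategy (the default; the subject is <topic>-key or <topic>-value), RecordNameStrategy (the subject is the fully qualified message name, so one schema history is shared across topics), and TopicRecordNameStrategy (the subject is <topic>-<message name>, scoped per topic and type) depending on requirements.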
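The three strategies differ only in how the subject string is built. The function below mirrors that naming logic for study purposes; subject_name is a made-up helper, and the topic and message names are examples rather than anything the serializer requires.

```python
def subject_name(strategy: str, topic: str, record_name: str, is_key: bool = False) -> str:
    """Return the Schema Registry subject each naming strategy would use."""
    suffix = "key" if is_key else "value"
    if strategy == "TopicNameStrategy":
        return f"{topic}-{suffix}"        # one subject per topic and key/value
    if strategy == "RecordNameStrategy":
        return record_name                # one subject per message type, any topic
    if strategy == "TopicRecordNameStrategy":
        return f"{topic}-{record_name}"   # one subject per topic and message type
    raise ValueError(f"unknown strategy: {strategy}")

for s in ("TopicNameStrategy", "RecordNameStrategy", "TopicRecordNameStrategy"):
    print(s, "->", subject_name(s, "my-orders", "com.example.OrderV2"))
```

Since compatibility is enforced per subject, the strategy also decides which schema versions are compared against each other.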

Use io.confluent.kafka.serializers.protobuf.KafkaProtobufSerializer on the producer and KafkaProtobufDeserializer on the consumer. To have the deserializer return a generated class, set specific.protobuf.value.type; without it, the deserializer returns a DynamicMessage.

Compatibility level can be set globally or per subject. A safe baseline is Backward in dev and Full in production, blocking breaking changes operationally.

CCDAK check: be able to answer instantly on the meaning of each compatibility mode and examples of allowed/disallowed changes
Be able to choose a Subject Naming Strategy by case (schema reuse vs. separation)
Be able to explain Producer/Consumer Protobuf SerDe configuration and Schema Registry integration

Example Schema Registry and client configuration

# Producer properties (excerpt)
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=io.confluent.kafka.serializers.protobuf.KafkaProtobufSerializer
schema.registry.url=http://localhost:8081
# Consumer side: set the deserializer, and the generated class to return (example)
value.deserializer=io.confluent.kafka.serializers.protobuf.KafkaProtobufDeserializer
specific.protobuf.value.type=com.example.OrderV2
# When you want to change Subject Naming Strategy (example)
value.subject.name.strategy=io.confluent.kafka.serializers.subject.RecordNameStrategy

# Compatibility mode (per-subject)
# The subject below assumes the default TopicNameStrategy on topic my-orders
# Valid values include BACKWARD, FORWARD, FULL, their _TRANSITIVE variants, and NONE
curl -X PUT -H "Content-Type: application/json" \
  --data '{"compatibility": "FULL"}' \
  http://localhost:8081/config/my-orders-value
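The same call can be scripted, for example from a deployment job. The sketch below only builds the request with the Python standard library and does not send it; compatibility_request is a hypothetical helper, and the URL and subject are the example values used above.

```python
import json
import urllib.parse
import urllib.request

def compatibility_request(base_url: str, subject: str, level: str) -> urllib.request.Request:
    """Build a PUT /config/{subject} request that sets the compatibility level."""
    body = json.dumps({"compatibility": level}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/config/{urllib.parse.quote(subject, safe='')}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )

req = compatibility_request("http://localhost:8081", "my-orders-value", "FULL")
print(req.get_method(), req.full_url, req.data)
# To send it against a running registry: urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the logic easy to unit test without a live Schema Registry.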

Check with a Sample Question

CCDAK

Question 1

An order event's Protobuf schema currently uses a numeric status_code (number 10). Going forward, you want to introduce an enum-typed status to make meaning explicit, with existing consumers migrating in phases. Which response is safest?

A. Add status with a new number, mark old status_code's number 10 and name as reserved, and keep compatibility mode at Full
B. Keep number 10 as is and only change its type from int32 to enum
C. Rename the old field to status and swap its type later
D. Temporarily disable compatibility checks and then reuse number 10 for the new purpose

Correct answer: A

The safest path is to add the enum field with a new number and protect the old field by deprecating it and then marking it reserved. Reusing numbers or changing types is breaking, and disabling compatibility checks is not a recommended operational practice.

Frequently Asked Questions

Is it OK to leave gaps between field numbers?

Yes, it is fine. Reserving headroom for future extensions or team-by-team splits is a sound practice. Because only numbers 1-15 get the one-byte tag, allocate those to high-frequency fields first and leave the next block open for future use.

Can I change just the field type while keeping the same field number?

No. The field number is the field's identity, so changing its type is a breaking change. Add a new field with a new number and gradually retire the old one.

Can I use Protobuf without a Schema Registry?

Technically yes, but you lose automatic compatibility checks and centralized schema distribution, which raises operational overhead considerably. For CCDAK and production use, design with Schema Registry as a prerequisite.

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.

Kafka x Protobuf Schema Design: Field Numbers and Compatibility in Practice