The biggest reasons to adopt Protobuf on Kafka are its compact wire format, strong typing, and ease of schema evolution. The prerequisite, however, is handling field numbers correctly.
This article assumes a typical setup with Confluent Schema Registry and walks through how to manage field numbers and compatibility safely, within the scope of the CCDAK exam, using concrete examples.
Confluent's Protobuf serializer prepends a magic byte, a 4-byte schema ID, and a message-index list to the Protobuf payload, and integrates with Schema Registry to make schema evolution safe. Consumers fetch the schema by ID and use it to interpret the byte stream.
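To make the framing concrete, here is a minimal sketch that splits a record value into the schema ID and the rest of the bytes. It assumes the standard Confluent layout of a zero magic byte followed by a big-endian 4-byte schema ID; the function name and the sample frame are hypothetical, and the sketch does not decode the message-index list that precedes the Protobuf payload.

```python
import struct

MAGIC_BYTE = 0  # Confluent framing starts with a zero byte


def split_confluent_frame(record_value: bytes) -> tuple[int, bytes]:
    """Return (schema_id, remainder) for a Confluent-framed record value."""
    if len(record_value) < 5:
        raise ValueError("record is too short to carry the 5-byte header")
    # ">bI": big-endian, 1-byte magic, 4-byte unsigned schema ID (no padding)
    magic, schema_id = struct.unpack(">bI", record_value[:5])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, record_value[5:]


# Hypothetical frame: magic byte, schema ID 42, message-index byte, payload bytes.
frame = b"\x00" + (42).to_bytes(4, "big") + b"\x00" + b"\x08\x01"
schema_id, rest = split_confluent_frame(frame)
print(schema_id, rest.hex())
```

In real consumers the deserializer does this for you; the sketch is only useful for inspecting raw bytes while debugging.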
On the wire, Protobuf identifies fields by number. Because the numbers carry meaning, reusing or reassigning them breaks backward compatibility. Understand that the substance (number and type) matters more than the appearance (field name).
Relationship between Kafka, Protobuf, and Schema Registry
Protobuf field numbers range from 1 to 536,870,911, with 19000-19999 reserved internally and unavailable. Numbers are the core of wire compatibility, and the iron rule is: once released, never change and never reuse them.
Numbers differ in encoding cost. Field numbers 1-15 encode their tag in a single byte, while 16-2047 take two, so assigning the low numbers to high-frequency fields is efficient. Reserve number ranges by logical block or team in advance, and operations stay stable as the schema grows.
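The range rule and the encoding cost can both be checked with a few lines of arithmetic. The sketch below assumes the standard Protobuf tag layout, where the tag is the varint encoding of `(field_number << 3) | wire_type`; the function names are hypothetical.

```python
MAX_FIELD_NUMBER = 2**29 - 1          # 536,870,911
RESERVED_RANGE = range(19000, 20000)  # reserved for the Protobuf implementation


def is_usable_field_number(number: int) -> bool:
    """True if the number is inside the valid range and outside 19000-19999."""
    return 1 <= number <= MAX_FIELD_NUMBER and number not in RESERVED_RANGE


def tag_size(number: int, wire_type: int = 0) -> int:
    """Bytes needed for the field tag: varint of (number << 3) | wire_type."""
    key = (number << 3) | wire_type
    size = 1
    while key >= 0x80:  # each varint byte carries 7 bits of the key
        key >>= 7
        size += 1
    return size


for n in (1, 15, 16, 2047, 2048):
    print(n, tag_size(n))
```

The jump from one byte to two happens between 15 and 16, which is why the low numbers are worth budgeting carefully.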
.proto example: reservation and addition pattern
syntax = "proto3";
package com.example;

message OrderV1 {
  // High-frequency fields get small numbers
  int64 order_id = 1;
  string customer_id = 2;
  // Future status code
  int32 status_code = 10;
}

// After evolution
message OrderV2 {
  int64 order_id = 1;
  string customer_id = 2;
  // We want to replace status_code with an enum -> add with a new number
  // Old number 10 is deprecated -> reserve it so it cannot be reused
  reserved 10; // old: status_code
  // You can also reserve the name
  reserved "status_code";
  // New enum-based status. Old clients ignore it as unknown
  OrderStatus status = 20;
  // When removing a field: [deprecated = true] -> later move to reserved
  string note = 30 [deprecated = true];
}

enum OrderStatus {
  ORDER_STATUS_UNSPECIFIED = 0; // 0 is the default
  ORDER_STATUS_PLACED = 1;
  ORDER_STATUS_SHIPPED = 2;
  ORDER_STATUS_CANCELLED = 3;
}
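Schema Registry has three main compatibility modes: Backward, Forward, and Full. Each also has a transitive variant that checks against all registered versions rather than only the latest, and NONE disables checking. In Protobuf, adding a new field (with a new number) is backward-compatible in most cases. Older consumers ignore unknown fields, so existing reads continue to work.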
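The "older consumers ignore unknown fields" behavior follows directly from the wire format. The following is a hand-rolled sketch, not the Protobuf library: it handles only non-negative varints (wire type 0) and length-delimited values (wire type 2), and every function name in it is hypothetical. It encodes a message with fields 1, 2, and 20, then decodes it with a reader that only knows fields 1 and 2.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Read one varint starting at pos; return (value, next position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_field(number: int, value) -> bytes:
    if isinstance(value, int):    # wire type 0: varint
        return encode_varint(number << 3) + encode_varint(value)
    data = value.encode("utf-8")  # wire type 2: length-delimited
    return encode_varint((number << 3) | 2) + encode_varint(len(data)) + data


def decode(buf: bytes, known: dict[int, str]) -> tuple[dict, list[int]]:
    """Decode known fields by number; collect the numbers of unknown fields."""
    fields, unknown, pos = {}, [], 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 7
        if wire_type == 0:
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        if number in known:
            fields[known[number]] = value
        else:
            unknown.append(number)  # skipped, but the read does not fail
    return fields, unknown


# A new producer writes order_id (1), customer_id (2), and the new status (20).
payload = encode_field(1, 1001) + encode_field(2, "c-42") + encode_field(20, 2)
# An old consumer only knows fields 1 and 2.
old_view, skipped = decode(payload, {1: "order_id", 2: "customer_id"})
print(old_view, skipped)
```

Because the wire type tells the reader how many bytes to skip, a field with an unfamiliar number never blocks the fields that follow it. Note that only numbers appear in the payload; the names live in the reader's own schema.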
Breaking changes include renumbering or reusing numbers, type changes, moving an existing field into a oneof, and changing enum numbers. Renaming may be valid on the wire, but it tends to break code generation and validation, so in practice treat it as breaking.
| Mode | Allowed change examples | Representative failure cases |
|---|---|---|
| Backward | Add a new optional-equivalent field with a new number; update doc comments on existing fields | Renumbering or reusing existing field numbers; changing scalar types (e.g., int32 -> string) |
| Forward | Deprecate and eventually remove a field that is no longer used (old consumers simply see the default value) | Removing a field that old consumers still effectively require (data written with the new schema becomes unusable for them) |
| Full | Comment tweaks; adding a completely independent new field with a new number | Renumbering or reusing numbers; moving an existing field into a oneof; changing enum numbers |
Drive deletions and replacements in stages. Removing a field outright breaks compilation in older apps or fails compatibility checks. The safe order is: deprecate flag -> stop usage -> mark reserved.
Reserve both the number and the name to prevent accidental reuse later. Manage schema history alongside Schema Registry and bake compatibility checks into PR reviews to cut down on incidents.
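One way to bake the check into a PR pipeline is to ask Schema Registry whether a candidate schema is compatible before merging. The sketch below uses only the standard library and assumes the registry's REST compatibility endpoint (`POST /compatibility/subjects/{subject}/versions/latest`); the function names, subject, and schema text are illustrative, and `is_compatible` needs a reachable registry to run.

```python
import json
import urllib.parse
import urllib.request


def build_compat_request(base_url: str, subject: str, proto_source: str) -> urllib.request.Request:
    """Build the compatibility-check request for a candidate Protobuf schema."""
    body = json.dumps({"schemaType": "PROTOBUF", "schema": proto_source}).encode("utf-8")
    encoded_subject = urllib.parse.quote(subject, safe="")
    return urllib.request.Request(
        url=f"{base_url}/compatibility/subjects/{encoded_subject}/versions/latest",
        data=body,
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="POST",
    )


def is_compatible(request: urllib.request.Request) -> bool:
    """Send the request; requires a running Schema Registry."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["is_compatible"]


candidate = 'syntax = "proto3"; message CustomerV2 { int64 id = 1; string email = 2; }'
request = build_compat_request("http://localhost:8081", "my-orders-value", candidate)
print(request.get_method(), request.full_url)
```

A CI job could call `is_compatible(request)` and fail the build on `False`, so a breaking change is caught before the producer ever tries to register it.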
Standard pattern when deleting in .proto
message CustomerV2 {
  int64 id = 1;
  string email = 2;
  // Old phone is retired -> deleted and reserved
  reserved 5; // forbid number reuse
  reserved "phone"; // forbid name reuse
}
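The "delete, then reserve" rule is easy to forget, so a small review-time check can help. The following is a deliberately naive, regex-based sketch with hypothetical function names: it works on the body of a single flat message, ignores nested messages and enums, and does not understand `to max` ranges. It reports field numbers that the old message used and the new one neither keeps nor reserves.

```python
import re


def _strip_comments(body: str) -> str:
    return re.sub(r"//.*", "", body)


def field_numbers(message_body: str) -> dict[int, str]:
    """Map field number -> field name for a flat message body."""
    pairs = re.findall(r"(\w+)\s*=\s*(\d+)\s*[;\[]", _strip_comments(message_body))
    return {int(number): name for name, number in pairs}


def reserved_numbers(message_body: str) -> set[int]:
    """Collect numbers from `reserved 5;` and `reserved 9 to 11;` clauses."""
    numbers: set[int] = set()
    for clause in re.findall(r"\breserved\s+([^;\"]+);", _strip_comments(message_body)):
        for part in clause.split(","):
            bounds = [int(x) for x in re.findall(r"\d+", part)]
            if bounds:
                numbers.update(range(bounds[0], bounds[-1] + 1))
    return numbers


def unreserved_removals(old_body: str, new_body: str) -> set[int]:
    """Numbers the old message used that the new one neither keeps nor reserves."""
    return set(field_numbers(old_body)) - set(field_numbers(new_body)) - reserved_numbers(new_body)


old = """
  int64 id = 1;
  string email = 2;
  string phone = 5;
"""
with_reserved = """
  int64 id = 1;
  string email = 2;
  reserved 5;        // forbid number reuse
  reserved "phone";  // forbid name reuse
"""
without_reserved = """
  int64 id = 1;
  string email = 2;
"""
print(unreserved_removals(old, with_reserved))
print(unreserved_removals(old, without_reserved))
```

A real check would parse the .proto properly; this sketch only illustrates the rule being enforced.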
Adding a field to a oneof leaves older clients treating that branch as unknown, which can fail to honor the intended semantics. Even if Schema Registry validation passes, it tends to be breaking at the application level.
On the wire, a map is equivalent to a repeated message of key/value entries, so changing the key or value type is breaking. For enums, adding values is safe, but changing or reassigning value numbers is breaking. Renames may be allowed on the wire, but avoid them as a rule for the sake of generated code and readability.
Subject Naming Strategy directly drives your reuse strategy. Pick among TopicNameStrategy (default, <topic>-value), RecordNameStrategy (share by message type), and TopicRecordNameStrategy (topic x type) depending on requirements.
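The three strategies differ only in how the subject string is derived. The function below is an illustrative stand-in (it is not part of the Confluent client, and its name is hypothetical) that mirrors the documented naming patterns.

```python
def subject_name(strategy: str, topic: str, record_name: str, is_key: bool = False) -> str:
    """Derive the Schema Registry subject for a given naming strategy."""
    suffix = "key" if is_key else "value"
    if strategy == "TopicNameStrategy":
        return f"{topic}-{suffix}"        # one subject per topic
    if strategy == "RecordNameStrategy":
        return record_name                # one subject per message type
    if strategy == "TopicRecordNameStrategy":
        return f"{topic}-{record_name}"   # one subject per topic and type
    raise ValueError(f"unknown strategy: {strategy}")


for strategy in ("TopicNameStrategy", "RecordNameStrategy", "TopicRecordNameStrategy"):
    print(strategy, "->", subject_name(strategy, "my-orders", "com.example.OrderV2"))
```

Since compatibility is enforced per subject, the strategy also decides which schemas are checked against each other: per topic, per type across topics, or per topic and type.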
Use io.confluent.kafka.serializers.protobuf.KafkaProtobufSerializer on the producer and KafkaProtobufDeserializer on the consumer. To have the deserializer return a generated class, set specific.protobuf.value.type.
Compatibility level can be set globally or per subject. A safe baseline is Backward in dev and Full in production, blocking breaking changes operationally.
Example Schema Registry and client configuration
# Producer properties (excerpt)
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=io.confluent.kafka.serializers.protobuf.KafkaProtobufSerializer
schema.registry.url=http://localhost:8081
# When you want to change Subject Naming Strategy (example)
value.subject.name.strategy=io.confluent.kafka.serializers.subject.RecordNameStrategy

# Consumer properties (excerpt)
key.deserializer=org.apache.kafka.common.serialization.StringDeserializer
value.deserializer=io.confluent.kafka.serializers.protobuf.KafkaProtobufDeserializer
schema.registry.url=http://localhost:8081
# When you want a specific type back (example)
specific.protobuf.value.type=com.example.OrderV2

# Compatibility mode (per subject): BACKWARD, FORWARD, FULL, etc.
# The subject below assumes the default TopicNameStrategy on topic my-orders.
curl -X PUT -H "Content-Type: application/json" \
  --data '{"compatibility": "FULL"}' \
  http://localhost:8081/config/my-orders-value
CCDAK
Question 1
An order event's Protobuf schema currently uses a numeric status_code (number 10). Going forward, you want to introduce an enum-typed status to make meaning explicit, with existing consumers migrating in phases. Which response is safest?
Correct answer: A
The safest path is to add the enum field with a new number and protect the old field by deprecating it and then marking it reserved. Reusing numbers or changing types is breaking, and disabling compatibility checks is not a recommended operational practice.
Is it OK to leave gaps between field numbers?
Yes, it is fine. Reserving headroom for future extensions or team-by-team splits is a sound practice. Because numbers 1-15 are scarce, spend them on the fields you set most often and leave gaps in the higher blocks for future use.
Can I change just the field type while keeping the same field number?
No. The field number is the field's identity, so changing its type is a breaking change. Add a new field with a new number and gradually retire the old one.
Can I use Protobuf without a Schema Registry?
Technically yes, but you lose automatic compatibility checks and centralized schema distribution, which drives operational overhead through the roof. For CCDAK and production use, design with Schema Registry as a prerequisite.
NicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.