Connect Converters: Avro, JSON, String, Protobuf (2026)

Kafka is fundamentally about writing and reading byte arrays, but in practice the layer that converts those bytes into formats humans and downstream systems can work with is critical. That layer is the client Serializer/Deserializer and the Kafka Connect Converter.

This article focuses on Connect's key.converter and value.converter: how to choose them, how to integrate with Schema Registry, and how to avoid common pitfalls. We also clarify the distinctions CCDAK tends to test.

Converter vs Serializer: Roles and Positioning

Kafka clients (Producer/Consumer) use Serializers and Deserializers to convert in-app objects to and from byte arrays. Kafka Connect, on the other hand, uses Converters (key.converter / value.converter) to handle data and schemas at the boundary between a connector and Kafka.

They differ in implementation, where they are configured, and how schema-aware they are. A Converter translates between Connect's internal data model (Schema + Struct, etc.) and the bytes on Kafka, and integrates cleanly with SMTs (Single Message Transforms), retries, and error handling.

Serializer/Deserializer: operates at the application boundary. Configured through the Kafka client API.
Converter: operates at the Connect worker/connector boundary. Configured independently for key and value.
Schema awareness: many Converter implementations are designed around embedded schemas and Schema Registry integration.

Target	Layer	Typical classes / config example
Converter (Connect)	Kafka Connect connector boundary	org.apache.kafka.connect.storage.StringConverter / io.confluent.connect.avro.AvroConverter (key.converter, value.converter)
Serializer/Deserializer (client)	Producer/Consumer application boundary	org.apache.kafka.common.serialization.StringSerializer / StringDeserializer (key.serializer, value.deserializer)
SerDe (Streams)	Kafka Streams (paired with types)	org.apache.kafka.common.serialization.Serdes.String() (passed to the builder)

[Figure: Where conversion happens in Kafka (app vs Connect)]

Serializer vs Converter configuration snippets, side by side

# Producer (application side)
bootstrap.servers=broker:9092
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=org.apache.kafka.common.serialization.ByteArraySerializer

# Consumer (application side)
bootstrap.servers=broker:9092
key.deserializer=org.apache.kafka.common.serialization.StringDeserializer
value.deserializer=org.apache.kafka.common.serialization.ByteArrayDeserializer
auto.offset.reset=earliest

# Kafka Connect (worker / global defaults)
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=true
# (When using Schema Registry)
# key.converter=io.confluent.connect.avro.AvroConverter
# value.converter=io.confluent.connect.avro.AvroConverter
# key.converter.schema.registry.url=http://schema-registry:8081
# value.converter.schema.registry.url=http://schema-registry:8081

key.converter and value.converter: Roles and Separate Configuration

Kafka records hold key and value separately, and Connect configures key.converter and value.converter independently. This matters in practice. Partitioning typically hashes the key bytes, so keeping the key representation stable directly affects data placement across the cluster and how records are grouped.
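To make the partitioning point concrete, here is a rough Python port of the 32-bit murmur2 hash that Kafka's default partitioner applies to key bytes. It is a sketch for experimentation rather than a reference implementation, and the partition count and key values are made up for illustration. It shows that the same logical key, written as a string versus as an 8-byte long, produces different bytes and is therefore hashed independently.

```python
import struct


def murmur2(data: bytes) -> int:
    """32-bit murmur2, ported as a sketch of the hash used on Kafka key bytes."""
    m, r, mask = 0x5BD1E995, 24, 0xFFFFFFFF
    length = len(data)
    h = (0x9747B28C ^ length) & mask
    # Mix the input four bytes at a time (little-endian words).
    for i in range(0, length - length % 4, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * m) & mask
        k ^= k >> r
        k = (k * m) & mask
        h = (h * m) & mask
        h ^= k
    # Mix the remaining 1-3 bytes, if any.
    tail = data[length - length % 4:]
    if tail:
        h ^= int.from_bytes(tail, "little")
        h = (h * m) & mask
    h ^= h >> 13
    h = (h * m) & mask
    h ^= h >> 15
    return h


def partition_for(key_bytes: bytes, num_partitions: int) -> int:
    """Clear the sign bit, then take the remainder by the partition count."""
    return (murmur2(key_bytes) & 0x7FFFFFFF) % num_partitions


NUM_PARTITIONS = 6  # hypothetical partition count

as_string = "42".encode("utf-8")  # bytes a String key would carry
as_long = struct.pack(">q", 42)   # 8-byte big-endian bytes a Long key would carry

for label, key in (("string", as_string), ("long", as_long)):
    print(label, key.hex(), partition_for(key, NUM_PARTITIONS))
```

The two keys may or may not land on the same partition for a given partition count. The point is that nothing ties them together once the byte representation differs.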

Values, in contrast, tend to undergo schema evolution, so teams typically pick Avro, Protobuf, or JSON Schema together with Schema Registry. A simple String/Long key paired with an Avro/Protobuf value is a classic real-world combination.

Using StringConverter for the key and AvroConverter for the value is a common pairing.
ByteArrayConverter passes bytes through without conversion. Useful when an external system expects raw binary.
JsonConverter emits schema-embedded JSON when schemas.enable=true and plain schemaless JSON when false.
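The difference between the two JsonConverter modes is easiest to see in the bytes themselves. The following Python sketch imitates the two output shapes: an envelope with "schema" and "payload" fields when schemas are enabled, and the bare payload otherwise. The record, the field names, and the schema name orders.Value are hypothetical, and the helper only mimics the output shape, not the converter's actual code.

```python
import json

# Hypothetical record, used only for illustration.
record = {"id": 42, "status": "PAID"}

# Schema description in the shape used for a Connect struct.
schema = {
    "type": "struct",
    "name": "orders.Value",  # hypothetical schema name
    "optional": False,
    "fields": [
        {"field": "id", "type": "int64", "optional": False},
        {"field": "status", "type": "string", "optional": True},
    ],
}


def to_json_bytes(value, schema=None, schemas_enable=True):
    """Imitate the two output shapes: schema envelope vs. bare payload."""
    body = {"schema": schema, "payload": value} if schemas_enable else value
    return json.dumps(body, separators=(",", ":")).encode("utf-8")


with_schema = to_json_bytes(record, schema, schemas_enable=True)
without_schema = to_json_bytes(record, schemas_enable=False)

print(with_schema.decode())     # envelope: {"schema": {...}, "payload": {...}}
print(without_schema.decode())  # bare payload only
```

Note how the envelope repeats the full schema in every record, which is the main size cost of schemas.enable=true.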

Per-connector override (overrides the worker default)

{
  "name": "jdbc-source-custom",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "tasks.max": "1",
    "connection.url": "jdbc:postgresql://db:5432/app",
    "mode": "incrementing",
    "incrementing.column.name": "id",
    "topic.prefix": "db_",

    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://schema-registry:8081"
  }
}
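How the override resolves can be modeled in a few lines of Python. This is a simplified model, not Connect's code: it assumes that whichever config names the converter class (connector first, worker as fallback) also supplies that converter's prefixed settings, so worker-level settings such as value.converter.schemas.enable are not inherited once the connector overrides value.converter. Verify that assumption against your Connect version. The function name and property dictionaries are illustrative.

```python
def effective_converter(side, worker_props, connector_props):
    """Return (class_name, settings) for side in {"key", "value"}."""
    class_key = f"{side}.converter"
    prefix = class_key + "."
    # Assumption: the config that names the class also supplies its settings.
    source = connector_props if class_key in connector_props else worker_props
    settings = {
        name[len(prefix):]: value
        for name, value in source.items()
        if name.startswith(prefix)
    }
    return source[class_key], settings


worker = {
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "true",
}

# This connector overrides only the value converter.
connector = {
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://schema-registry:8081",
}

for side in ("key", "value"):
    print(side, effective_converter(side, worker, connector))
```

In this model the key falls back to the worker's StringConverter, while the value uses AvroConverter with only the settings declared on the connector.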

Schema Registry Integration and Schema Management (from Connect's perspective)

Confluent's AvroConverter, ProtobufConverter, and JsonSchemaConverter integrate with Schema Registry and serialize Connect's internal data (Schema + Value) per record with the appropriate schema ID. Downstream consumers can then safely reconstruct the data with the matching deserializer, such as KafkaAvroDeserializer.
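The "schema ID per record" is carried in a small header in front of the serialized payload: one magic byte (0) followed by the schema ID as a 4-byte big-endian integer. The Python sketch below builds and parses that header. The helper names and the payload bytes are hypothetical, and the sketch ignores the extra message-index bytes that the Protobuf format places after the ID.

```python
import struct

MAGIC_BYTE = 0


def frame(schema_id: int, payload: bytes) -> bytes:
    """Prefix a serialized payload with the magic byte and schema ID."""
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + payload


def unframe(data: bytes):
    """Split framed bytes into (schema_id, payload)."""
    if len(data) < 5 or data[0] != MAGIC_BYTE:
        raise ValueError("not Schema Registry framed data")
    (schema_id,) = struct.unpack(">I", data[1:5])
    return schema_id, data[5:]


framed = frame(7, b"\x02\x54")  # hypothetical schema ID and payload bytes
print(framed.hex())
print(unframe(framed))
```

Because the header only holds an ID, a consumer has to be able to reach Schema Registry (or a cache of it) to resolve the actual schema.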

JsonConverter does not depend on Schema Registry, but when schemas.enable=true the JSON on Kafka contains schema information, which improves interoperability between Connect deployments. Ordinary consumers handling that JSON directly need to understand its structure.

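When using Schema Registry, set schema.registry.url on every converter that uses it, with the matching key.converter. or value.converter. prefix. Converters such as StringConverter do not need it.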
Manage schema evolution policy (BACKWARD, etc.) through Schema Registry's compatibility settings.
On the consumer side, configure the corresponding deserializer (e.g. KafkaAvroDeserializer). For a Connect Sink, the matching Converter restores the data automatically.
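A small pre-deployment check can catch a missing URL before the connector fails at runtime. The Python sketch below is a hypothetical helper, not part of Connect: it inspects a flat properties dictionary and reports which prefixed schema.registry.url settings are missing for converters that depend on Schema Registry.

```python
# Converter classes that depend on Schema Registry.
SR_CONVERTERS = {
    "io.confluent.connect.avro.AvroConverter",
    "io.confluent.connect.protobuf.ProtobufConverter",
    "io.confluent.connect.json.JsonSchemaConverter",
}


def missing_registry_urls(props):
    """Return the schema.registry.url properties that should be set but are not."""
    problems = []
    for side in ("key", "value"):
        converter_class = props.get(f"{side}.converter")
        url_key = f"{side}.converter.schema.registry.url"
        if converter_class in SR_CONVERTERS and not props.get(url_key):
            problems.append(url_key)
    return problems


config = {
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    # value.converter.schema.registry.url deliberately left out
}

print(missing_registry_urls(config))
```

With a String key and an Avro value, only the value-side URL is reported, which matches the rule that each converter is configured on its own.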

Minimal consumer configuration for Avro

bootstrap.servers=broker:9092
group.id=orders-app
key.deserializer=org.apache.kafka.common.serialization.StringDeserializer
value.deserializer=io.confluent.kafka.serializers.KafkaAvroDeserializer
schema.registry.url=http://schema-registry:8081
auto.offset.reset=earliest

Real-World Patterns: Picking Formats and Selection Criteria

Choose a format based on observability, interoperability, schema evolution, and performance. The cost of migrating later, while the system is live, also matters.

Below are policies frequently adopted in the field.

Keep keys human-readable and stable (String/Long). Avoid binary keys whenever possible.
Use Avro, Protobuf, or JSON Schema for values. Prefer Avro/Protobuf when you care most about schema evolution and cross-language compatibility.
For short-term observability or a PoC, start quickly with JsonConverter (schemas.enable=true), then migrate to Avro/Protobuf in stages.

Small pipeline example (keys as String, values migrating to JSON Schema)

# Worker defaults
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=true

# Later, migrate only the connector for the target topic to JSON Schema
value.converter=io.confluent.connect.json.JsonSchemaConverter
value.converter.schema.registry.url=http://schema-registry:8081

Common Pitfalls and How to Debug Them

Format mismatches and missing schema configuration are the classic culprits. Inspect the bytes safely and check both the Converter side and the Serializer side.

Record keys directly affect partitioning and downstream logic. Changing the key representation changes its hash, so old and new records for the same logical key may end up on different partitions.

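Garbled text: a format mismatch, such as a producer writing with KafkaAvroSerializer while Connect reads the topic with StringConverter, which renders the Avro bytes as text. Use kcat with -f to inspect the representation.
SMT failures with JsonConverter while schemas.enable=false. Some SMTs (such as SetSchemaMetadata) require Connect's internal representation to carry a schema and fail on schemaless records.
Missing Schema Registry URL on one side (key or value). The URL is set per converter (key.converter.schema.registry.url, value.converter.schema.registry.url), so configuring one side does not cover the other. With a String key and an Avro value, only the value converter needs the URL, but both converters should still be configured explicitly.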
NULL values (tombstones) appear regardless of format. Design Sink-side logic assuming the Converter/Deserializer can handle NULLs.
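When it is unclear what actually landed on a topic, a quick heuristic over the raw bytes narrows things down. The Python sketch below is a hypothetical debugging helper, not a definitive detector: a leading zero byte can also occur in ordinary binary keys, and a numeric string parses as JSON, so treat the result as a hint. It also handles the tombstone case explicitly.

```python
import json


def describe(raw):
    """Guess how a record key or value was encoded (heuristic only)."""
    if raw is None:
        return "tombstone (null)"
    # Schema Registry framing: magic byte 0 plus a 4-byte schema ID.
    if len(raw) >= 5 and raw[0] == 0:
        schema_id = int.from_bytes(raw[1:5], "big")
        return f"Schema Registry framed, schema id {schema_id}"
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return "opaque binary"
    try:
        doc = json.loads(text)
    except ValueError:
        return "plain UTF-8 string"
    if isinstance(doc, dict) and set(doc) == {"schema", "payload"}:
        return "JSON with embedded schema (schemas.enable=true)"
    return "schemaless JSON"


samples = [
    None,
    b"\x00\x00\x00\x00\x07\x02a",
    b"order-42",
    b'{"id": 1}',
    b'{"schema": null, "payload": 1}',
]
for sample in samples:
    print(sample, "->", describe(sample))
```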

Inspecting messages with kcat (String key, Avro value)

# Keys printed as-is (plain strings), values decoded as Avro via Schema Registry
kcat -b broker:9092 -t orders -C -f 'key=%k\tval=%s\n' \
  -s value=avro -r http://schema-registry:8081

CCDAK Exam Checkpoints

The exam frequently tests layer distinctions: which setting goes where. Picture the system diagram and quickly identify which boundary a given action happens at.

Kafka Connect configures key.converter and value.converter separately. Worker defaults can be overridden per connector.
Clients (Producer/Consumer) configure key.serializer/value.serializer and key.deserializer/value.deserializer.
For formats that use Schema Registry (Avro/Protobuf/JSON Schema), set schema.registry.url on the Converter in Connect, prefixed with key.converter. or value.converter. as appropriate.
Understand the behavioral difference between schemas.enable=true and false in JsonConverter. true emits schema-embedded JSON, suited for Connect-to-Connect integration.
Key representation affects partitioning and compatibility. Keep your key format stable.

Test Your Understanding

CCDAK

Question 1

On a Kafka Connect Source Connector, you want to write string keys and schema-managed values to Kafka. Which combination of settings is appropriate?

A. key.converter=StringConverter, value.converter=one of Avro/Protobuf/JSON Schema, plus schema.registry.url on each converter as needed
B. Set key.serializer=StringSerializer and value.serializer=KafkaAvroSerializer in the Connect worker properties
C. Configuring the Avro deserializer on the consumer side is enough; no source-side configuration is needed
D. Setting value.converter=JsonConverter with schemas.enable=false removes the need for Schema Registry and still guarantees schema evolution

Correct answer: A

Connect uses Converters. Use StringConverter for the key and an Avro/Protobuf/JSON Schema converter for the value, and supply schema.registry.url on the converter for formats that use Schema Registry. Serializers are a client (application) concept and cannot substitute for Connect's Converters. JSON with schemas.enable=false provides no guarantee of schema evolution.

Frequently Asked Questions

Do key.converter and value.converter have to be the same class?

No. Typically the key uses StringConverter or LongConverter while the value uses an Avro/Protobuf/JSON Schema converter. Pick them independently based on your requirements.

What is the difference between schemas.enable=true and false in JsonConverter?

true emits JSON with an embedded schema (good for Connect-to-Connect interop and SMTs), while false emits plain JSON without a schema. The latter is easier for ordinary consumers to handle directly, but the application has to manage type information and schema evolution itself.

Where do I configure Schema Registry compatibility?

Set the compatibility level (such as BACKWARD) on the Schema Registry server, either per subject or globally. Converters and serializers honor that setting when registering and validating schemas.
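As a rough illustration, compatibility can be set through Schema Registry's REST API with a PUT to /config (global) or /config/{subject} (per subject). The Python sketch below only builds the request so it can be inspected without a running registry. The helper name and the subject db_orders-value are hypothetical, the subject assuming the default topic-name-based naming for a topic called db_orders.

```python
import json
import urllib.request


def compatibility_request(base_url, level, subject=None):
    """Build a PUT request that sets the compatibility level."""
    path = f"/config/{subject}" if subject else "/config"
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps({"compatibility": level}).encode("utf-8"),
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="PUT",
    )


req = compatibility_request(
    "http://schema-registry:8081", "BACKWARD", subject="db_orders-value"
)
print(req.get_method(), req.full_url)
print(req.data.decode())
# To apply it, send the request with urllib.request.urlopen(req)
# against a reachable Schema Registry.
```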


Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
