# Confluent Schema Registry Skill

Expert knowledge of Confluent Schema Registry for managing Avro, Protobuf, and JSON Schema schemas in Kafka ecosystems.
## What I Know

### Schema Formats

**Avro (Most Popular):**

- Binary serialization format
- Schema evolution support
- Smaller message size vs JSON
- Self-describing: each message carries its schema ID (embedded as a prefix in the message payload, not in a Kafka header)
- Best for: High-throughput applications, data warehousing
**Protobuf (Google Protocol Buffers):**

- Binary serialization
- Strong typing with .proto files
- Language-agnostic (Java, Python, Go, C++, etc.)
- Efficient encoding
- Best for: Polyglot environments, gRPC integration
**JSON Schema:**

- Human-readable text format
- Easy debugging
- Widely supported
- Larger message size
- Best for: Development, debugging, REST APIs
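For concreteness, here is the same minimal User record sketched in each format (field names are illustrative):

```text
// Avro (JSON schema definition)
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": "string"}
  ]
}

// Protobuf (.proto file)
syntax = "proto3";
message User {
  int64 id = 1;
  string name = 2;
}

// JSON Schema
{
  "type": "object",
  "properties": {
    "id": {"type": "integer"},
    "name": {"type": "string"}
  },
  "required": ["id", "name"]
}
```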
### Compatibility Modes

| Mode | Producer Can | Consumer Can | Use Case |
|------|--------------|--------------|----------|
| BACKWARD | Remove fields, add optional fields | Read old data with new schema | Most common; safe for consumers |
| FORWARD | Add fields, remove optional fields | Read new data with old schema | Safe for producers |
| FULL | Add/remove optional fields only | Bi-directional compatibility | Producers and consumers upgrade independently |
| NONE | Any change | Must coordinate upgrades | Development only, NOT production |
| BACKWARD_TRANSITIVE | BACKWARD across all versions | Read any old data | Strictest backward compatibility |
| FORWARD_TRANSITIVE | FORWARD across all versions | Read any new data | Strictest forward compatibility |
| FULL_TRANSITIVE | FULL across all versions | Complete bi-directional compatibility | Strictest overall |

**Default:** BACKWARD (recommended for production)
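To pin a subject's compatibility mode explicitly rather than relying on the global default, you can call the registry's per-subject config endpoint. A minimal sketch using Node's built-in fetch; the registry URL and subject name are placeholders:

```javascript
// Set the compatibility mode for a single subject via the Schema Registry REST API.
// Assumes Node 18+ (global fetch) and a registry at localhost:8081 (placeholder).
async function setCompatibility(subject, mode) {
  const res = await fetch(`http://localhost:8081/config/${subject}`, {
    method: 'PUT',
    headers: { 'Content-Type': 'application/vnd.schemaregistry.v1+json' },
    body: JSON.stringify({ compatibility: mode }),
  });
  if (!res.ok) throw new Error(`Failed to set compatibility: ${res.status}`);
  return res.json(); // e.g., { compatibility: 'BACKWARD' }
}

// Usage: await setCompatibility('users-value', 'BACKWARD');
```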
### Schema Evolution Strategies

**Adding Fields:**

```json
// V1
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": "string"}
  ]
}

// V2 - BACKWARD compatible (added optional field with default)
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": "string"},
    {"name": "email", "type": ["null", "string"], "default": null}
  ]
}
```
**Removing Fields (BACKWARD compatible):**

```json
// V1
{"name": "address", "type": "string"}

// V2 - field removed from the schema; consumers on the new schema
// simply ignore it when reading old data
```
**Changing Field Types (Breaking Change!):**

```json
// ❌ BREAKING - Cannot change string to int
{"name": "age", "type": "string"} → {"name": "age", "type": "int"}

// ✅ SAFE - Use union types
{"name": "age", "type": ["string", "int"], "default": "unknown"}
```
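Before committing to any of these changes, the registry can tell you whether a candidate schema is compatible with the latest registered version. A minimal sketch against the REST compatibility endpoint; the registry URL and subject are placeholders:

```javascript
// Check whether a candidate schema is compatible with the latest version
// of a subject, without registering it. Assumes Node 18+ (global fetch).
async function isCompatible(subject, avroSchema) {
  const res = await fetch(
    `http://localhost:8081/compatibility/subjects/${subject}/versions/latest`,
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/vnd.schemaregistry.v1+json' },
      body: JSON.stringify({ schema: JSON.stringify(avroSchema) }),
    }
  );
  const { is_compatible } = await res.json();
  return is_compatible;
}

// Usage: if (await isCompatible('users-value', candidateV2)) { /* safe to register */ }
```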
## When to Use This Skill

Activate me when you need help with:

- Schema evolution strategies ("How do I evolve my Avro schema?")
- Compatibility mode selection ("Which compatibility mode for production?")
- Schema validation ("Validate my Avro schema")
- Best practices ("Schema Registry best practices")
- Schema registration ("Register Avro schema with Schema Registry")
- Debugging schema issues ("Schema compatibility error")
- Format comparison ("Avro vs Protobuf vs JSON Schema")
## Best Practices

### 1. Always Use Compatible Evolution

✅ **DO:**

- Add optional fields with defaults
- Remove optional fields
- Use union types for flexibility
- Test schema changes in staging first

❌ **DON'T:**

- Change field types
- Remove required fields
- Rename fields (instead, add a new field and deprecate the old one)
- Use NONE compatibility in production
### 2. Schema Naming Conventions

**Hierarchical Namespaces:**

```text
com.company.domain.EntityName
com.acme.ecommerce.Order
com.acme.ecommerce.OrderLineItem
```
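In Avro, the namespace lives in the schema itself via the `namespace` attribute; a hypothetical Order schema following the convention above (fields are illustrative):

```json
{
  "type": "record",
  "namespace": "com.acme.ecommerce",
  "name": "Order",
  "fields": [
    {"name": "orderId", "type": "long"},
    {"name": "total", "type": "double"}
  ]
}
```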
**Subject Naming (Kafka topics):**

- `<topic-name>-value` — for record values
- `<topic-name>-key` — for record keys
- Example: `orders-value`, `orders-key`
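A trivial helper capturing this convention, which matches the registry's default TopicNameStrategy (the function name is illustrative, not part of any library):

```javascript
// Derive the Schema Registry subject name for a topic under the default
// TopicNameStrategy: "<topic>-key" for keys, "<topic>-value" for values.
function subjectFor(topic, isKey = false) {
  return `${topic}-${isKey ? 'key' : 'value'}`;
}

subjectFor('orders');       // 'orders-value'
subjectFor('orders', true); // 'orders-key'
```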
### 3. Schema Registry Configuration

**Producer (with Avro):**

```javascript
const { Kafka } = require('kafkajs');
const { SchemaRegistry, SchemaType } = require('@kafkajs/confluent-schema-registry');

const kafka = new Kafka({ clientId: 'my-app', brokers: ['kafka:9092'] });
const producer = kafka.producer();

const registry = new SchemaRegistry({
  host: 'https://schema-registry:8081',
  auth: { username: 'SR_API_KEY', password: 'SR_API_SECRET' },
});

// Register schema (the library expects the schema definition as a JSON string)
const schema = {
  type: 'record',
  name: 'User',
  fields: [
    { name: 'id', type: 'long' },
    { name: 'name', type: 'string' },
  ],
};

const { id } = await registry.register({
  type: SchemaType.AVRO,
  schema: JSON.stringify(schema),
});

// Encode message with schema
const payload = await registry.encode(id, { id: 1, name: 'John Doe' });

await producer.connect();
await producer.send({ topic: 'users', messages: [{ value: payload }] });
```
**Consumer (with Avro):**

```javascript
// Reuses the `kafka` client and `registry` from the producer example
const consumer = kafka.consumer({ groupId: 'user-processor' });

await consumer.connect();
await consumer.subscribe({ topic: 'users' });

await consumer.run({
  eachMessage: async ({ message }) => {
    // Decode message (the schema ID is read from the message's byte prefix)
    const decodedMessage = await registry.decode(message.value);
    console.log(decodedMessage); // { id: 1, name: 'John Doe' }
  },
});
```
### 4. Schema Validation Workflow

**Before registering:**

1. Validate schema syntax (Avro JSON, .proto, JSON Schema)
2. Check compatibility with existing versions
3. Test with sample data
4. Register in dev/staging first
5. Deploy to production after validation
**CLI Validation:**

```bash
# Check compatibility (before registering)
curl -X POST http://localhost:8081/compatibility/subjects/users-value/versions/latest \
  -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  -d '{"schema": "{...}"}'

# Register schema
curl -X POST http://localhost:8081/subjects/users-value/versions \
  -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  -d '{"schema": "{...}"}'
```
## Common Issues & Solutions

### Issue 1: Schema Compatibility Error

**Error:**

```text
Schema being registered is incompatible with an earlier schema
```

**Root Cause:** The new schema violates the subject's compatibility mode (e.g., adding a required field without a default under BACKWARD mode).

**Solution:**

1. Check the current compatibility mode:
   ```bash
   curl http://localhost:8081/config/users-value
   ```
2. Fix the schema to be compatible OR change the mode (carefully!)
3. Validate before registering:
   ```bash
   curl -X POST http://localhost:8081/compatibility/subjects/users-value/versions/latest \
     -H "Content-Type: application/vnd.schemaregistry.v1+json" \
     -d '{"schema": "{...}"}'
   ```
### Issue 2: Schema Not Found

**Error:**

```text
Subject 'users-value' not found
```

**Root Cause:** Schema not registered yet OR wrong subject name.

**Solution:**

1. List all subjects: `curl http://localhost:8081/subjects`
2. Register the schema if missing
3. Check the subject naming convention (`<topic>-key` or `<topic>-value`)
### Issue 3: Message Deserialization Failed

**Error:**

```text
Unknown magic byte!
```

**Root Cause:** The message was not encoded with Schema Registry (missing magic byte + schema ID prefix).

**Solution:**

1. Ensure the producer uses a Schema Registry encoder
2. Check the message format: `[magic_byte(1) + schema_id(4) + payload]`
3. Use the @kafkajs/confluent-schema-registry library
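The Confluent wire format puts a zero magic byte and a 4-byte big-endian schema ID in front of the serialized payload, so a quick sanity check on a raw message is straightforward. A minimal sketch (the function name is illustrative):

```javascript
// Inspect the Confluent wire format: byte 0 is the magic byte (always 0),
// bytes 1-4 are the schema ID as a big-endian 32-bit integer.
function inspectWireFormat(buffer) {
  if (buffer.length < 5 || buffer.readUInt8(0) !== 0) {
    throw new Error('Not a Schema Registry-encoded message (bad magic byte)');
  }
  return {
    schemaId: buffer.readInt32BE(1),
    payload: buffer.subarray(5), // format-specific serialized data
  };
}

// Usage: const { schemaId } = inspectWireFormat(message.value);
```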
## Schema Evolution Decision Tree

```text
Need to change schema?
├─ Adding new field?
│  ├─ Required field? → Add with default value (BACKWARD)
│  └─ Optional field? → Add with default null (BACKWARD)
│
├─ Removing field?
│  ├─ Required field? → ❌ BREAKING CHANGE (coordinate upgrade)
│  └─ Optional field? → ✅ BACKWARD compatible
│
├─ Changing field type?
│  ├─ Compatible types (e.g., int → long)? → Use union types
│  └─ Incompatible types? → ❌ BREAKING CHANGE (add new field, deprecate old)
│
└─ Renaming field?
   └─ ❌ BREAKING CHANGE → Add new field + mark old as deprecated
```
## Avro vs Protobuf vs JSON Schema Comparison

| Feature | Avro | Protobuf | JSON Schema |
|---------|------|----------|-------------|
| Encoding | Binary | Binary | Text (JSON) |
| Message Size | Small (up to ~90% smaller) | Small (up to ~80% smaller) | Large (baseline) |
| Human Readable | No | No | Yes |
| Schema Evolution | Excellent | Good | Fair |
| Language Support | Java, Python, C++, and more | 20+ languages | Universal |
| Performance | Very fast | Very fast | Slower |
| Debugging | Harder | Harder | Easy |
| Best For | Data warehousing, ETL | Polyglot, gRPC | REST APIs, dev |

**Recommendation:**

- Production: Avro (best balance)
- Polyglot teams: Protobuf
- Development/debugging: JSON Schema
## References

- Schema Registry REST API: https://docs.confluent.io/platform/current/schema-registry/develop/api.html
- Avro Specification: https://avro.apache.org/docs/current/spec.html
- Protobuf Guide: https://developers.google.com/protocol-buffers
- JSON Schema Spec: https://json-schema.org/
Invoke me when you need schema management, evolution strategies, or compatibility guidance!