| 1 | +# Confluent Schema Registry Instrumentation |
| 2 | + |
| 3 | +This instrumentation module logs and counts Confluent Schema Registry operations (schema registration, compatibility checks, serialization, and deserialization) in Kafka applications. |
| 4 | + |
| 5 | +## Features |
| 6 | + |
| 7 | +This instrumentation captures: |
| 8 | + |
| 9 | +### Producer Operations |
| 10 | +- **Schema Registration**: Tracks when schemas are registered with the Schema Registry |
| 11 | + - Subject name |
| 12 | + - Schema ID assigned |
| 13 | + - Success/failure status |
| 14 | + - Compatibility check results |
| 15 | +- **Serialization**: Logs every message serialization with: |
| 16 | + - Topic name |
| 17 | + - Key schema ID (if applicable) |
| 18 | + - Value schema ID |
| 19 | + - Success/failure status |
| 20 | + |
| 21 | +### Consumer Operations |
| 22 | +- **Deserialization**: Tracks every message deserialization with: |
| 23 | + - Topic name |
| 24 | + - Key schema ID (if present in message) |
| 25 | + - Value schema ID (extracted from Confluent wire format) |
| 26 | + - Success/failure status |
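
The value schema ID mentioned above comes from the Confluent wire format, where a payload starts with a magic byte `0` followed by a 4-byte big-endian schema ID. A minimal sketch of that extraction follows; the class and method names are hypothetical and are not taken from this module.

```java
import java.nio.ByteBuffer;
import java.util.OptionalInt;

// Hypothetical helper for illustration; not part of the instrumentation itself.
final class WireFormatSketch {
    private static final byte MAGIC_BYTE = 0x0;
    private static final int HEADER_SIZE = 5; // 1 magic byte + 4-byte schema ID

    // Returns the schema ID, or empty if the payload does not look like Confluent wire format.
    static OptionalInt extractSchemaId(byte[] payload) {
        if (payload == null || payload.length < HEADER_SIZE || payload[0] != MAGIC_BYTE) {
            return OptionalInt.empty();
        }
        // ByteBuffer reads big-endian by default, which matches the wire format.
        return OptionalInt.of(ByteBuffer.wrap(payload, 1, 4).getInt());
    }
}
```

A payload beginning with the bytes `00 00 00 00 7B` would yield schema ID 123, as in the log examples in this document. Returning an empty result for short or non-matching payloads keeps the sketch safe for messages that were not produced with a Schema Registry serializer.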
| 27 | + |
| 28 | +### Schema Registry Client Operations |
| 29 | +- **Schema Registration** (`register()` method) |
| 30 | + - Successful registrations with schema ID |
| 31 | + - Compatibility failures with error details |
| 32 | +- **Compatibility Checks** (`testCompatibility()` method) |
| 33 | + - Pass/fail status |
| 34 | + - Error messages for incompatible schemas |
| 35 | +- **Schema Retrieval** (`getSchemaById()` method) |
| 36 | + - Schema ID lookups during deserialization |
| 37 | + |
| 38 | +## Metrics Collected |
| 39 | + |
| 40 | +The `SchemaRegistryMetrics` class tracks: |
| 41 | + |
| 42 | +- `schemaRegistrationSuccess` - Count of successful schema registrations |
| 43 | +- `schemaRegistrationFailure` - Count of failed schema registrations (compatibility issues) |
| 44 | +- `schemaCompatibilitySuccess` - Count of successful compatibility checks |
| 45 | +- `schemaCompatibilityFailure` - Count of failed compatibility checks |
| 46 | +- `serializationSuccess` - Count of successful message serializations |
| 47 | +- `serializationFailure` - Count of failed serializations |
| 48 | +- `deserializationSuccess` - Count of successful message deserializations |
| 49 | +- `deserializationFailure` - Count of failed deserializations |
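
As a rough illustration of how success/failure counters like these can be kept thread-safe, the sketch below uses `LongAdder`. It is an assumption about one reasonable design, not the actual `SchemaRegistryMetrics` source; `CounterSketch` and `recordSerialization` are hypothetical names.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch; the real SchemaRegistryMetrics class may be implemented differently.
final class CounterSketch {
    private static final LongAdder serializationSuccess = new LongAdder();
    private static final LongAdder serializationFailure = new LongAdder();

    // Increments exactly one of the two counters per serialization attempt.
    static void recordSerialization(boolean succeeded) {
        (succeeded ? serializationSuccess : serializationFailure).increment();
    }

    static long getSerializationSuccessCount() {
        return serializationSuccess.sum();
    }

    static long getSerializationFailureCount() {
        return serializationFailure.sum();
    }
}
```

`LongAdder` is chosen here because serializers are often called from many producer threads at once, and it handles contended increments better than a single `AtomicLong`.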
| 50 | + |
| 51 | +## Log Output Examples |
| 52 | + |
| 53 | +### Successful Producer Operation |
| 54 | +``` |
| 55 | +[Schema Registry] Schema registered successfully - Subject: myTopic-value, Schema ID: 123, Is Key: false, Topic: myTopic |
| 56 | +[Schema Registry] Produce to topic 'myTopic', schema for key: none, schema for value: 123, serializing: VALUE |
| 57 | +``` |
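
The subject `myTopic-value` in the log above matches Confluent's default subject naming (`TopicNameStrategy`), which appends `-key` or `-value` to the topic name. A small sketch of that mapping, assuming the default strategy is in use (`SubjectNameSketch` is a hypothetical name):

```java
// Hypothetical helper illustrating the default TopicNameStrategy naming.
final class SubjectNameSketch {
    // Builds the subject under which a topic's key or value schema is registered.
    static String subjectFor(String topic, boolean isKey) {
        return topic + (isKey ? "-key" : "-value");
    }
}
```

With a different subject name strategy configured, the subject in the log line would not follow this pattern.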
| 58 | + |
| 59 | +### Failed Schema Registration (Incompatibility) |
| 60 | +``` |
| 61 | +[Schema Registry] Schema registration FAILED - Subject: myTopic-value, Is Key: false, Topic: myTopic, Error: Schema being registered is incompatible with an earlier schema |
| 62 | +[Schema Registry] Schema compatibility check FAILED - Subject: myTopic-value, Error: Schema being registered is incompatible with an earlier schema |
| 63 | +[Schema Registry] Serialization FAILED for topic 'myTopic', VALUE - Error: Schema being registered is incompatible with an earlier schema |
| 64 | +``` |
| 65 | + |
| 66 | +### Consumer Operation |
| 67 | +``` |
| 68 | +[Schema Registry] Retrieved schema from registry - Schema ID: 123, Type: Schema |
| 69 | +[Schema Registry] Consume from topic 'myTopic', schema for key: none, schema for value: 123, deserializing: VALUE |
| 70 | +``` |
| 71 | + |
| 72 | +## Supported Serialization Formats |
| 73 | + |
| 74 | +- **Avro** (via `KafkaAvroSerializer`/`KafkaAvroDeserializer`) |
| 75 | +- **Protobuf** (via `KafkaProtobufSerializer`/`KafkaProtobufDeserializer`) |
| 76 | + |
| 77 | +## Implementation Details |
| 78 | + |
| 79 | +### Instrumented Classes |
| 80 | + |
| 81 | +1. **CachedSchemaRegistryClient** - The main Schema Registry client |
| 82 | + - `register(String subject, Schema schema)` - Schema registration |
| 83 | + - `testCompatibility(String subject, Schema schema)` - Compatibility testing |
| 84 | + - `getSchemaById(int id)` - Schema retrieval |
| 85 | + |
| 86 | +2. **AbstractKafkaSchemaSerDe and subclasses** - Serializers |
| 87 | + - `serialize(String topic, Object data)` - Message serialization |
| 88 | + - `serialize(String topic, Headers headers, Object data)` - With headers (Kafka 2.1+) |
| 89 | + |
| 90 | +3. **AbstractKafkaSchemaSerDe and subclasses** - Deserializers |
| 91 | + - `deserialize(String topic, byte[] data)` - Message deserialization |
| 92 | + - `deserialize(String topic, Headers headers, byte[] data)` - With headers (Kafka 2.1+) |
| 93 | + |
| 94 | +### Context Management |
| 95 | + |
| 96 | +The `SchemaRegistryContext` class uses `ThreadLocal` storage to pass context between: |
| 97 | +- Serializer → Schema Registry Client (for logging topic information) |
| 98 | +- Deserializer → Schema Registry Client (for logging topic information) |
| 99 | + |
| 100 | +This allows the instrumentation to correlate schema operations with the topics they're associated with. |
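
The `ThreadLocal` hand-off described above can be sketched roughly as follows. This is an illustration of the general pattern, not the actual `SchemaRegistryContext` code; `TopicContextSketch`, `withTopic`, and `currentTopic` are hypothetical names.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for SchemaRegistryContext; names are illustrative only.
final class TopicContextSketch {
    private static final ThreadLocal<String> CURRENT_TOPIC = new ThreadLocal<>();

    // Read by code running deeper in the call stack (e.g. a registry client hook).
    static String currentTopic() {
        return CURRENT_TOPIC.get();
    }

    // Sets the topic for the duration of body, then always clears it.
    static <T> T withTopic(String topic, Supplier<T> body) {
        CURRENT_TOPIC.set(topic);
        try {
            return body.get();
        } finally {
            CURRENT_TOPIC.remove(); // avoid leaking the topic to later work on this thread
        }
    }
}
```

In this sketch, a serializer hook would wrap the call in `withTopic(topic, ...)`, and a client hook running on the same thread would call `currentTopic()` when building its log line. Clearing the value in `finally` matters because producer and consumer threads are typically reused.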
| 101 | + |
| 102 | +## Usage |
| 103 | + |
| 104 | +This instrumentation activates automatically when both of the following are true: |
| 105 | +1. Confluent Schema Registry client (version 7.0.0+) is present on the classpath |
| 106 | +2. The Datadog Java agent is attached to the JVM |
| 107 | + |
| 108 | +No configuration or code changes are required. |
| 109 | + |
| 110 | +## Metrics Access |
| 111 | + |
| 112 | +To access metrics programmatically: |
| 113 | + |
| 114 | +```java |
| 115 | +import datadog.trace.instrumentation.confluentschemaregistry.SchemaRegistryMetrics; |
| 116 | + |
| 117 | +// Get current counts |
| 118 | +long registrationFailures = SchemaRegistryMetrics.getSchemaRegistrationFailureCount(); |
| 119 | +long compatibilityFailures = SchemaRegistryMetrics.getSchemaCompatibilityFailureCount(); |
| 120 | +long serializationFailures = SchemaRegistryMetrics.getSerializationFailureCount(); |
| 121 | + |
| 122 | +// Print summary |
| 123 | +SchemaRegistryMetrics.printSummary(); |
| 124 | +``` |
| 125 | + |
| 126 | +## Monitoring Schema Compatibility Issues |
| 127 | + |
| 128 | +The primary use case for this instrumentation is detecting and monitoring schema compatibility issues that cause production failures. By tracking the `schemaRegistrationFailure` and `schemaCompatibilityFailure` metrics, you can: |
| 129 | + |
| 130 | +1. **Alert on schema compatibility failures** before they impact production |
| 131 | +2. **Track the rate of schema-related errors** per topic |
| 132 | +3. **Identify problematic schema changes** that break compatibility |
| 133 | +4. **Monitor serialization/deserialization failure rates** as a proxy for schema issues |
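
One way to turn the cumulative failure counts into an alert signal is to poll a counter periodically and look at the change since the last poll. The sketch below assumes nothing about the metrics class beyond a `long`-returning getter; `FailureDeltaWatcher` is a hypothetical helper, not part of this module.

```java
import java.util.function.LongSupplier;

// Hypothetical polling helper; works with any monotonically increasing counter.
final class FailureDeltaWatcher {
    private final LongSupplier counter;
    private long lastSeen;

    FailureDeltaWatcher(LongSupplier counter) {
        this.counter = counter;
        this.lastSeen = counter.getAsLong(); // start from the current value
    }

    // Returns how many new failures were recorded since the previous poll.
    synchronized long poll() {
        long current = counter.getAsLong();
        long delta = current - lastSeen;
        lastSeen = current;
        return delta;
    }
}
```

For example, a watcher could be created with `new FailureDeltaWatcher(SchemaRegistryMetrics::getSchemaRegistrationFailureCount)` and polled from a scheduled task, raising an alert whenever `poll()` returns a value above a chosen threshold.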
| 134 | + |
| 135 | +## Future Enhancements |
| 136 | + |
| 137 | +Potential additions: |
| 138 | +- JSON Schema serializer support (currently excluded due to dependency issues) |
| 139 | +- Schema evolution tracking |
| 140 | +- Schema version diff logging |
| 141 | +- Integration with Datadog APM for schema-related span tags |