Open
Description
When deserializing Avro data with an enum value that doesn't exist in the reader's schema, the library violates the Avro specification.
Expected Behavior
According to the Avro specification:
if the writer’s symbol is not present in the reader’s enum and the reader has a default value, then that value is used, otherwise an error is signalled.
Therefore, when deserializing, if the writer’s enum symbol is missing from the reader’s enum:
- If the reader has a default enum value, that default should be used.
- If no default is defined, an error should be thrown.
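The two rules above can be sketched as a small standalone function. This is only an illustration of the expected resolution logic, not the library's actual code: `resolveEnumSymbol` and its parameters are hypothetical names, and the choice of `UnexpectedValueException` is an assumption. Note that, as far as the Avro specification describes it, the default consulted here is the one declared on the enum type itself, not the field-level default.

```php
<?php

/**
 * Hypothetical helper (not part of the library): resolves a writer's enum
 * symbol against the reader's enum, following the rule quoted from the spec.
 *
 * @param string[] $readerSymbols symbols declared in the reader's enum
 * @param ?string  $readerDefault the reader enum's "default", or null if none
 */
function resolveEnumSymbol(string $writerSymbol, array $readerSymbols, ?string $readerDefault): string
{
    // The symbol is known to the reader: use it unchanged.
    if (in_array($writerSymbol, $readerSymbols, true)) {
        return $writerSymbol;
    }

    // Unknown symbol, but the reader's enum declares a default: fall back to it.
    if ($readerDefault !== null) {
        return $readerDefault;
    }

    // Unknown symbol and no default: signal an error.
    // The exception type is an assumption made for this sketch.
    throw new UnexpectedValueException(
        sprintf('Enum symbol "%s" is not present in the reader\'s enum and no default is defined.', $writerSymbol)
    );
}

// Usage with the symbols from the reproduction in this report:
$readerSymbols = ['Unknown', 'First', 'Second'];
$known    = resolveEnumSymbol('First', $readerSymbols, 'Unknown'); // known symbol is kept
$fallback = resolveEnumSymbol('Third', $readerSymbols, 'Unknown'); // unknown symbol falls back to the default
```

Calling the same function with `null` as the default and an unknown symbol would throw instead of returning the writer's value, which is the second case this issue asks for.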
Actual Behavior
The library currently returns the writer’s enum value even when it does not exist in the reader’s schema, ignoring the reader’s default value if present. Moreover, it does not throw an error when no default value is defined, violating the spec.
How to Reproduce
final class DefaultChoiceDTO
{
    public function __construct(
        public string $choice = "Unknown"
    ) {
    }
}
$normalizers = [
    new ObjectNormalizer(),
    new PropertyNormalizer(),
    new GetSetMethodNormalizer(),
];
$encoders = [
    new AvroSerDeEncoder($this->recordSerializer),
];
$this->serializer = new Serializer($normalizers, $encoders);
public function testAddEnumSymbolsWithDefaultForwardCompatible(): void
{
    // V1 Schema (reader) - enum with 3 symbols and default
    $schemaV1 = AvroSchema::parse(json_encode([
        "type" => "record",
        "name" => "Event",
        "namespace" => "app.tests.dto",
        "fields" => [[
            "name" => "choice",
            "type" => [
                "type" => "enum",
                "name" => "Choices",
                "symbols" => ["Unknown", "First", "Second"],
                "default" => "Unknown"
            ],
            "default" => "Unknown"
        ]]
    ]));

    // V2 Schema (writer) - enum with 4 symbols
    $schemaV2 = AvroSchema::parse(json_encode([
        "type" => "record",
        "name" => "Event",
        "namespace" => "app.tests.dto",
        "fields" => [[
            "name" => "choice",
            "type" => [
                "type" => "enum",
                "name" => "Choices",
                "symbols" => ["Unknown", "First", "Second", "Third"],
                "default" => "Unknown"
            ],
            "default" => "Unknown"
        ]]
    ]));

    // "Third" exists only in the writer's (V2) enum
    $data = new \App\Tests\DTO\DefaultChoiceDTO("Third");

    // Serialize with V2 schema
    // Note: $subject is not defined in this snippet; it is presumably set elsewhere in the test case.
    $serialized = $this->serializer->serialize(
        $data,
        AvroSerDeEncoder::FORMAT_AVRO,
        [
            AvroSerDeEncoder::CONTEXT_ENCODE_SUBJECT => $subject,
            AvroSerDeEncoder::CONTEXT_ENCODE_WRITERS_SCHEMA => $schemaV2,
        ]
    );

    // Deserialize with V1 schema
    $deserialized = $this->serializer->deserialize(
        $serialized,
        \App\Tests\DTO\DefaultChoiceDTO::class,
        AvroSerDeEncoder::FORMAT_AVRO,
        [AvroSerDeEncoder::CONTEXT_DECODE_READERS_SCHEMA => $schemaV1]
    );

    // This assertion fails:
    // Expected: "Unknown" (the default)
    // Actual: "Third" (the writer's value)
    $this->assertEquals("Unknown", $deserialized->choice);
}