@jdarais
Contributor

jdarais commented Jan 8, 2026

I was playing around a bit with the latest avro-rs (thanks for taking my last PR!) and noticed that in some cases the output of serde_json::to_string(&schema) contains the logicalType field twice. It's not noticeable if you read the serialized JSON back with serde_json, since it just ignores the duplicate key (which is why all the existing tests comparing schemas as structured JSON values still pass), but some JSON parsers might fail to parse such a schema.

I eventually found that the duplicate comes from the fact that in some cases logicalType shows up in the inner fixed schema's extra attributes, so it can be written once when we call fixed_schema.serialize_to_map::<S>(map), and then a second time when we call map.serialize_entry("logicalType", "decimal").
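To make the mechanism concrete, here is a minimal std-only sketch of how the key can end up written twice, and what skipping it on the serialization side looks like. It is a simplified model, not the avro-rs code: `write_entry`, `serialize_decimal_naive`, `serialize_decimal_dedup`, and `count_key` are hypothetical names, `write_entry` stands in for serde's `serialize_entry`, and all values are treated as strings for brevity.

```rust
// Simplified model of the duplicate-key problem. None of these functions
// exist in avro-rs; they only imitate the shape of the map serialization.

/// Appends one `"key":"value"` pair to a JSON object body under construction.
fn write_entry(out: &mut String, key: &str, value: &str) {
    if !out.is_empty() {
        out.push(',');
    }
    out.push_str(&format!("\"{key}\":\"{value}\""));
}

/// Buggy shape: the inner fixed schema's extra attributes are written as-is,
/// then the decimal wrapper writes `logicalType` a second time.
fn serialize_decimal_naive(inner_attrs: &[(&str, &str)]) -> String {
    let mut body = String::new();
    write_entry(&mut body, "type", "fixed");
    for (k, v) in inner_attrs {
        write_entry(&mut body, k, v);
    }
    write_entry(&mut body, "logicalType", "decimal");
    format!("{{{body}}}")
}

/// Serialization-side fix: skip `logicalType` while writing the inner
/// attributes, so the wrapper's entry is the only one emitted.
fn serialize_decimal_dedup(inner_attrs: &[(&str, &str)]) -> String {
    let mut body = String::new();
    write_entry(&mut body, "type", "fixed");
    for (k, v) in inner_attrs.iter().filter(|(k, _)| *k != "logicalType") {
        write_entry(&mut body, k, v);
    }
    write_entry(&mut body, "logicalType", "decimal");
    format!("{{{body}}}")
}

/// Counts how many times `key` appears as an object key in `json`.
fn count_key(json: &str, key: &str) -> usize {
    let needle = format!("\"{key}\":");
    json.matches(needle.as_str()).count()
}
```

With inner attributes `[("size", "16"), ("logicalType", "decimal")]`, the naive version emits `logicalType` twice and the dedup version emits it once.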

I'm not sure if the Serialize implementation for Schema is the place to fix it, or if we should be preventing logicalType from showing up in the inner fixed schema's attributes in the first place, but here's a PR that fixes it on the serialization side that you can merge if it looks good to you.

I added a few unit tests that fail with the latest from main, and pass with the changes in this PR.

@Kriskras99
Contributor

I would add a check that the logical type of the inner schema is as expected.

I also think that the real problem is in schema deserialisation, as that's probably where the duplicate attributes are created. This doesn't need to be fixed in this PR but is something we should look into.

@martin-g
Member

martin-g commented Jan 8, 2026

The bug is in the parsing of the schemas. The logicalType should not be stored in the custom attributes.
The fixes in the non-test code in this PR should be reverted.

@Kriskras99
Contributor

It's probably still a good idea to check whether the inner type has a logicalType attribute, and log a warning if that is the case.
Maybe we should check that the custom attributes don't contain any "reserved" keywords.
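A check like that could look roughly like the sketch below. `RESERVED_KEYS` and `reserved_keys_in` are hypothetical names, and the key list is only illustrative; the actual set of reserved keys would depend on the schema type. The caller would log a warning for each key returned.

```rust
use std::collections::BTreeMap;

// Illustrative list only (an assumption, not taken from avro-rs).
const RESERVED_KEYS: [&str; 5] = ["type", "name", "namespace", "size", "logicalType"];

/// Returns the reserved keys found among the custom attributes.
/// The result is sorted because `BTreeMap` iterates keys in order.
fn reserved_keys_in(attributes: &BTreeMap<String, String>) -> Vec<&str> {
    attributes
        .keys()
        .map(String::as_str)
        .filter(|k| RESERVED_KEYS.contains(k))
        .collect()
}
```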

@jdarais
Contributor Author

jdarais commented Jan 9, 2026

Ah, makes sense. It looks like excluding "logicalType" from custom attributes would be fairly easy to do. Would we ever expect "logicalType" to show up in the custom attributes, for example if it isn't a valid logical type, or if the logical type is ignored because the rest of the schema is not valid for the given logical type? Or should logicalType always be excluded from custom attributes, regardless of whether it is successfully interpreted as a logical type value?

@jdarais
Contributor Author

jdarais commented Jan 11, 2026

New update that prevents logicalType from being parsed as a custom attribute in the first place. I believe this should work, unless there are cases where we do want to store logicalType as a custom attribute (e.g. if it can't be interpreted as a valid logical type and would otherwise be discarded).
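As a rough illustration of the parse-side approach, the sketch below routes `logicalType` into its own field and never into the custom attributes. It is a simplified model rather than the actual change: `ParsedFixed` and `parse_fixed_fields` are hypothetical, values are plain strings, and it takes the "always exclude" option, which is exactly the open question above.

```rust
use std::collections::BTreeMap;

/// Hypothetical parse result for a fixed schema (not an avro-rs type).
struct ParsedFixed {
    logical_type: Option<String>,
    attributes: BTreeMap<String, String>,
}

/// Splits raw schema fields into known fields and custom attributes.
/// `logicalType` is always kept out of `attributes`, whether or not it
/// later turns out to be a valid logical type (an assumption of this sketch).
fn parse_fixed_fields(fields: &[(&str, &str)]) -> ParsedFixed {
    let mut parsed = ParsedFixed {
        logical_type: None,
        attributes: BTreeMap::new(),
    };
    for (key, value) in fields {
        match *key {
            // Known structural keys are handled elsewhere in a real parser.
            "type" | "name" | "size" => {}
            "logicalType" => parsed.logical_type = Some(value.to_string()),
            other => {
                parsed.attributes.insert(other.to_string(), value.to_string());
            }
        }
    }
    parsed
}
```

Because the attributes never contain `logicalType`, a serializer that writes the attributes and then the logical type cannot emit the key twice.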

@Kriskras99 Kriskras99 merged commit 6fc1f96 into apache:main Jan 12, 2026
21 checks passed
@martin-g martin-g added this to the 0.22.0 milestone Jan 12, 2026
@Kriskras99
Contributor

Thanks for your contribution @jdarais!
