-
Notifications
You must be signed in to change notification settings - Fork 44
fix duplicate serialization of logicalType key for logical type schemas based on fixed type #395
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I would add a check that the logical type of the inner schema is as expected. I also think that the real problem is in schema deserialisation, as that's probably where the duplicate attributes are created. This doesn't need to be fixed in this PR but is something we should look into. |
|
The bug in the parsing of the schemas. The logicalType should not be stored in the custom attributes. |
|
It's probably still a good idea to check if the inner type has a logicaltype attribute, and log a warning if that is the case. |
|
Ah, makes sense. It looks like excluding "logicalType" from custom attributes would be fairly easy to do. Would we expect that "logicalType" ever show up in the custom attributes, for example if it isn't a valid logical type, or the logical type is ignored because the rest of the schema is not valid for the given logical type? Or should logicalType just always be excluded from custom attributes, regardless of whether it is successfully interpreted as a logical type value? |
|
New update that prevents |
Co-authored-by: Kriskras99 <[email protected]>
|
Thanks for your contribution @jdarais! |
I was playing around a bit with the latest avro-rs (thanks for taking my last PR!) and I noticed that in some cases, the output of
serde_json::to_string(&schema)would result in thelogicalTypefield being written twice. It's not noticeable if you're reading the serialized json back using serde_json, since it just ignores the duplicate key, (which is why all the existing tests comparing schemas as structured json values still pass,) but it's possible that some json parsers would fail to parse such a schema.I eventually found that the duplicate was coming from the fact that in some cases,
logicalTypeshows up in the inner fixed schema's extra attributes, so it could be written once when we callfixed_schema.serialize_to_map::<S>(map), and then a second time when we callmap.serialize_entry("logicalType", "decimal').I'm not sure if the
Serializeimplementation forSchemais the place to fix it, or if we should be preventinglogicalTypefrom showing up in the inner fixed schema's attributes in the first place, but here's a PR that fixes it on the serialization side that you can merge if it looks good to you.I added a few unit tests that fail with the latest from main, and pass with the changes in this PR.