Skip to content

feat: support unknown fields#471

Merged
anthony-swirldslabs merged 13 commits intomainfrom
456-unknownFields
Apr 29, 2025
Merged

feat: support unknown fields#471
anthony-swirldslabs merged 13 commits intomainfrom
456-unknownFields

Conversation

@anthony-swirldslabs
Copy link
Copy Markdown
Contributor

@anthony-swirldslabs anthony-swirldslabs commented Apr 17, 2025

Description:
Introducing support for unknown fields. The default behavior is unchanged and PBJ skips unknown fields by default. To activate the new behavior, the parseUnknownFields argument of the Codec.parse() method needs to be set to true. This enables the parser to collect all the known fields in a map in the model class. Later on, the codec can use this map to serialize all the unknown fields if they're present. Also, applications are able to examine the unknown fields by calling model.getUnknownFields() to get an unmodifiable map with the fields.

Revision 2:

  • Addressing comments - mainly replacing the map with a list, adding docs, and an extra round-trip test.
  • Fixing minor bugs: in protobufSize() to account for unknown fields, and in copyBuilder() to account for them as well.

Revision 3:

  • Adding unknown fields support to equals/hashCode/toString/compareTo
  • Sorting the fields when parsing to ensure determinism of the above, as well as of the write()
  • Optimizing the initial size of the unknown fields array to help speed up parsing
  • Adding a comprehensive round-trip test that uses the Everything model and hence verifies the round trip for all the types that PBJ supports.
    Also note: in Google Protobuf, the JSON format officially doesn't support unknown fields. So PBJ won't support them in its JSONCodec as well. See https://protobuf.dev/programming-guides/proto3/#retaining

Revision 4:

  • Updated the javadoc for all public methods in the UnknownField record.

Related issue(s):

Fixes #456

Notes for reviewer:
A new integ test is added.

Checklist

  • Documented (Code comments, README, etc.)
  • Tested (unit, integration, etc.)

Signed-off-by: Anthony Petrov <anthony@swirldslabs.com>
Signed-off-by: Anthony Petrov <anthony@swirldslabs.com>
@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 17, 2025

JUnit Test Report

   68 files  ±0     68 suites  ±0   2m 45s ⏱️ -5s
1 273 tests ±0  1 270 ✅ ±0   3 💤 ±0  0 ❌ ±0 
7 128 runs  ±0  7 109 ✅ ±0  19 💤 ±0  0 ❌ ±0 

Results for commit 8a1ebfa. ± Comparison against base commit 82551d2.

♻️ This comment has been updated with latest results.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 17, 2025

Integration Test Report

    397 files  + 3      397 suites  +3   14m 28s ⏱️ - 2m 3s
114 798 tests +17  114 798 ✅ +17  0 💤 ±0  0 ❌ ±0 
115 026 runs  +17  115 026 ✅ +17  0 💤 ±0  0 ❌ ±0 

Results for commit 8a1ebfa. ± Comparison against base commit 82551d2.

♻️ This comment has been updated with latest results.

Signed-off-by: Anthony Petrov <anthony@swirldslabs.com>
Signed-off-by: Anthony Petrov <anthony@swirldslabs.com>
Signed-off-by: Anthony Petrov <anthony@swirldslabs.com>
Signed-off-by: Anthony Petrov <anthony@swirldslabs.com>
Signed-off-by: Anthony Petrov <anthony@swirldslabs.com>
Copy link
Copy Markdown
Member

@jasperpotts jasperpotts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also model generator, toString, equals, compareTo and hashCode should all include unknown fields.

Also do we test unknown oneOf values?

Signed-off-by: Anthony Petrov <anthony@swirldslabs.com>
Signed-off-by: Anthony Petrov <anthony@swirldslabs.com>
Signed-off-by: Anthony Petrov <anthony@swirldslabs.com>
@anthony-swirldslabs
Copy link
Copy Markdown
Contributor Author

Also model generator, toString, equals, compareTo and hashCode should all include unknown fields.

Done in latest commits.

Also do we test unknown oneOf values?

The "oneOf" is a model-level concept. On the wire, oneOf values are no different from regular fields, and the wire format doesn't tell if it's a standalone field or a part of a oneOf. One has to know the latest model in order to interpret the value as a oneOf. In case of unknown fields, we don't know the latest model by definition, and therefore, we cannot treat oneOfs specially. They are just added to the unknown fields list as regular fields.

Signed-off-by: Anthony Petrov <anthony@swirldslabs.com>
jasperpotts
jasperpotts previously approved these changes Apr 26, 2025
Copy link
Copy Markdown
Member

@jasperpotts jasperpotts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all good, just please check the handling of unknown fields with default values and think through stable hashcode and equals needed for use of PBJ model objects as keys in DB. While needing to be able to add fields to those objects over time and the keys still match. Maybe a unit test to make sure that is true and stays true would be good.

@jasperpotts
Copy link
Copy Markdown
Member

It would be great to have Artem and Joseph review this PR as well because there are so many edge details to remember like stable hashcode.

Signed-off-by: Anthony Petrov <anthony@swirldslabs.com>
Signed-off-by: Anthony Petrov <anthony@swirldslabs.com>
@anthony-swirldslabs anthony-swirldslabs merged commit 8c7c852 into main Apr 29, 2025
10 checks passed
@anthony-swirldslabs anthony-swirldslabs deleted the 456-unknownFields branch April 29, 2025 17:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support unknown fields

5 participants