fix(avro): use standard SchemaNormalization for canonicalization and recompute stale hashes#7550

Open
AndreaScarselliAx wants to merge 11 commits into Apicurio:main from AndreaScarselliAx:feat/avro-standard-canonicalization

Conversation

@AndreaScarselliAx
Contributor

Summary

Replace Apicurio Registry's custom Avro schema normalization logic with Apache Avro's standard SchemaNormalization.toParsingForm(), and add a database upgrader to recompute stale canonical
hashes for existing data.

Problem

In an earlier contribution, I added a custom Avro canonicalizer. Later, while running v2, we found a bug: a ccompat lookup between the following two schemas returned a 40403 error.

{
    "type": "record",
    "name": "Application",
    "namespace": "nl.andrea.general",
    "doc": "Identification of an application",
    "fields": [
        {
            "name": "name",
            "type": "string",
            "doc": "The name of the application"
        },
        {
            "name": "version",
            "type": ["null", "string"],
            "doc": "(Optional) The application version"
        }
    ]
}
{
    "type": "record",
    "name": "Application",
    "namespace": "nl.andrea.general",
    "doc": "Identification of an application",
    "fields": [
        {
            "name": "name",
            "type": "string",
            "doc": "The name of the application"
        },
        {
            "name": "version",
            "type": ["null", "string"],
            "doc": "(Optional) The application version",
            "default": null
        }
    ]
}

These two schemas are semantically equivalent; they differ only in the explicit "default": null on the version field. The custom canonicalizer was incorrectly producing different hashes for them. We found that SchemaNormalization.toParsingForm() from the Avro library handles this correctly out of the box, and switching to it prevents this class of bug entirely.
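To illustrate the idea, here is a minimal Python sketch of the Parsing Canonical Form reduction that SchemaNormalization.toParsingForm() performs. It is deliberately simplified (no nested-namespace resolution or string normalization, and SHA-256 stands in for whatever fingerprint the registry actually stores), but it shows why the two schemas collide: attributes such as doc, default, and aliases are stripped, so both reduce to the same form.

```python
import hashlib
import json

# Attributes retained by Avro's Parsing Canonical Form, in its prescribed
# order; everything else ("doc", "default", "aliases", ...) is stripped.
CANONICAL_KEYS = ["name", "type", "fields", "symbols", "items", "values", "size"]

def to_parsing_form(schema):
    """Simplified sketch of Avro Parsing Canonical Form."""
    if isinstance(schema, list):          # union: normalize each branch
        return [to_parsing_form(s) for s in schema]
    if isinstance(schema, str):           # primitive or named reference
        return schema
    out = {}
    for key in CANONICAL_KEYS:
        if key in schema:
            if key == "name" and "namespace" in schema:
                # FULLNAMES: fold the namespace into the name
                out[key] = schema["namespace"] + "." + schema[key]
            elif key in ("type", "fields", "items", "values"):
                out[key] = to_parsing_form(schema[key])
            else:
                out[key] = schema[key]
    return out

def fingerprint(schema):
    canonical = json.dumps(to_parsing_form(schema), separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

fields_a = [
    {"name": "name", "type": "string", "doc": "The name of the application"},
    {"name": "version", "type": ["null", "string"],
     "doc": "(Optional) The application version"},
]
fields_b = [dict(fields_a[0]),
            dict(fields_a[1], default=None)]  # only difference: default null

schema_a = {"type": "record", "name": "Application",
            "namespace": "nl.andrea.general",
            "doc": "Identification of an application", "fields": fields_a}
schema_b = dict(schema_a, fields=fields_b)

# Both schemas reduce to the same canonical form, hence the same hash.
assert fingerprint(schema_a) == fingerprint(schema_b)
```

The custom canonicalizer kept "default" in its output, so the two inputs produced different hashes and the lookup missed.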

Changes

AvroContentCanonicalizer — replaced ~130 lines of custom normalization with SchemaNormalization.toParsingForm(). This is the standard Avro canonical form.

AvroCanonicalHashUpgrader — new DB upgrader (version 102) that recomputes canonical hashes for all existing Avro content. Without this, schemas registered before this change would retain stale hashes and still fail ccompat lookups against newly registered equivalent schemas.
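A rough sketch of what such an upgrader does, using sqlite3 and invented table/column names (content, canonicalHash, artifactType are illustrative; Apicurio's actual schema, and the real canonicalization via SchemaNormalization.toParsingForm(), will differ):

```python
import hashlib
import json
import sqlite3

def canonicalize(content: str) -> str:
    # Placeholder: the real upgrader delegates to the Avro library's
    # canonical form; minified JSON merely stands in for it here.
    return json.dumps(json.loads(content), separators=(",", ":"))

def recompute_avro_hashes(conn):
    """Walk all Avro rows and rewrite their canonical content hash."""
    rows = conn.execute(
        "SELECT contentId, content FROM content WHERE artifactType = 'AVRO'"
    ).fetchall()
    for content_id, content in rows:
        new_hash = hashlib.sha256(canonicalize(content).encode()).hexdigest()
        conn.execute(
            "UPDATE content SET canonicalHash = ? WHERE contentId = ?",
            (new_hash, content_id),
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE content (contentId INTEGER, artifactType TEXT,"
             " content TEXT, canonicalHash TEXT)")
conn.execute("INSERT INTO content VALUES"
             " (1, 'AVRO', '{\"type\": \"string\"}', 'stale')")
recompute_avro_hashes(conn)
new_hash = conn.execute("SELECT canonicalHash FROM content").fetchone()[0]
assert new_hash != "stale"
```

Without this pass, a pre-existing row keeps the hash computed by the old canonicalizer, so an equivalent schema registered after the change hashes differently and the lookup fails.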

Test fixes — AvroCompatibilityTest was incorrectly canonicalizing schemas before passing them to the compatibility checker, which stripped aliases and defaults needed for correct compatibility evaluation. Tests now parse schemas directly, matching production behavior. One assertion was also corrected: renaming a field with an alias is backward compatible per the Avro spec.

AndreaScarselliAx and others added 8 commits March 12, 2026 13:31
…EnhancedAvroContentCanonicalizer to AvroContentCanonicalizer
…ization

Replace 130 lines of custom normalization logic in AvroContentCanonicalizer
with Apache Avro's SchemaNormalization.toParsingForm(), which fixes known
bugs (nullable unions without default, self-referencing maps, enum defaults)
and reduces maintenance burden.

Add a DB upgrader (version 102) that recomputes canonical hashes for all
existing Avro content to match the new canonicalization output.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@carlesarnal
Member

@AndreaScarselliAx do you mind addressing the conflicts first? You'll need to bump the version of your upgrader.

@AndreaScarselliAx
Contributor Author

Hi @carlesarnal, thanks for the feedback. I have completed the rebase.


@carlesarnal carlesarnal left a comment


Overall looks good

ContentWrapperDto data = ContentWrapperDto.builder()
.content(ContentHandle.create(entity.contentEntity.contentBytes))
.contentType(entity.contentEntity.contentType).references(references)
.artifactType(entity.contentEntity.contentType).build();
Member


This is wrong: the content type is not the same as the artifact type. The artifact type would be something like JSON or AVRO, whereas the content type is the actual MIME type.


return ContentWrapperDto.builder().content(ContentHandle.create(entity.contentBytes))
.contentType(entity.contentType)
.references(RegistryContentUtils.deserializeReferences(entity.serializedReferences))
Member


Same here regarding contentType/artifactType.

Contributor Author


Fixed!


private final ArtifactTypeUtilProviderFactory factory = new DefaultArtifactTypeUtilProviderImpl(true);

@Override
Member


@EricWittmann what do you think about transactionality here? The DB version has already been bumped, so if anything fails mid-process, some rows will be upgraded and some won't. This should be shielded behind a transaction, no?

Member


All DB operations done during DB upgrade should be part of the same Tx - so if anything fails the entire upgrade rolls back. Is that not how it always works?
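The concern can be sketched with sqlite3: when the per-row updates run inside a single transaction, a mid-process failure rolls everything back and no row is left half-upgraded (illustrative only; Apicurio's upgrade framework manages its own transactions, and the table name here is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE content (contentId INTEGER, canonicalHash TEXT)")
conn.executemany("INSERT INTO content VALUES (?, 'stale')", [(1,), (2,), (3,)])
conn.commit()

def failing_upgrade(conn):
    # Rewrites rows one by one, then crashes before finishing.
    for content_id in (1, 2):
        conn.execute("UPDATE content SET canonicalHash = 'new'"
                     " WHERE contentId = ?", (content_id,))
    raise RuntimeError("crash before row 3")

try:
    failing_upgrade(conn)
    conn.commit()
except RuntimeError:
    conn.rollback()   # single transaction: partial work is discarded

hashes = [h for (h,) in conn.execute("SELECT canonicalHash FROM content")]
assert hashes == ["stale", "stale", "stale"]
```

If each UPDATE were committed individually instead, the crash would leave rows 1 and 2 rehashed and row 3 stale, which is exactly the mixed state the review is worried about.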

@carlesarnal
Member

@AndreaScarselliAx I'm a bit surprised by no test failing due to the error mentioned in the review. Would you be able to add a new test to cover that case? If you need some assistance, just let me know!
