Gate dict gdal_metadata behind the experimental rich-tag opt-in #3327
Merged
A fresh DataArray carrying attrs['gdal_metadata'] as a dict could write arbitrary GDAL metadata to the on-disk TIFF without allow_experimental_codecs, because _validate_write_rich_tag_optin only checked gdal_metadata_xml and extra_tags. _extract_rich_tags builds the GDAL XML from a dict gdal_metadata, so add it to the gate (dict only; non-dict values are ignored by the writer). The round-trip exemption for attrs carrying the contract marker is unchanged. Update pack/scale-offset test setups that wrote a fresh gdal_metadata dict to pass the opt-in, and note the dict case on the writer.gdal_metadata_xml row in the release contract.
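The gate described above can be sketched as follows. This is a minimal stand-in, not the code from the PR: the function name `validate_rich_tag_optin`, its signature, the "present means non-None" rule for `gdal_metadata_xml` and `extra_tags`, the "marker key exists" rule for the exemption, and the error wording are all assumptions made for illustration. Only the attr names, the marker key, the dict-only trigger, and the early return for round-trip attrs come from the description.

```python
from typing import Any, Mapping

# Marker key named in this PR; attrs carrying it come from a read-back array.
CONTRACT_MARKER = "_xrspatial_geotiff_contract"


def validate_rich_tag_optin(
    attrs: Mapping[str, Any],
    allow_experimental_codecs: bool = False,
) -> None:
    """Hypothetical stand-in for _validate_write_rich_tag_optin."""
    # Round-trip exemption: return before any trigger is evaluated.
    # Assumption: the exemption keys on the marker being present at all.
    if CONTRACT_MARKER in attrs:
        return

    triggered = []
    # Assumption: "present" means the key exists with a non-None value.
    if attrs.get("gdal_metadata_xml") is not None:
        triggered.append("gdal_metadata_xml")
    if attrs.get("extra_tags") is not None:
        triggered.append("extra_tags")
    # The check this PR adds: dict only, because the writer ignores
    # non-dict gdal_metadata values and so nothing would reach disk.
    if isinstance(attrs.get("gdal_metadata"), dict):
        triggered.append("gdal_metadata")

    if triggered and not allow_experimental_codecs:
        # Assumption: the real message wording differs.
        raise ValueError(
            "attrs " + ", ".join(triggered)
            + " require allow_experimental_codecs=True"
        )
```

With this shape, a non-dict `gdal_metadata` passes silently, a dict raises unless the flag is set, and any attrs carrying the contract marker skip the checks entirely.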
brendancol
commented
Jun 14, 2026
brendancol
left a comment
Contributor
Author
PR Review: Gate dict gdal_metadata behind the experimental rich-tag opt-in
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
None.
Nits (optional improvements)
- The gate fires on a dict attrs['gdal_metadata'] even when attrs['gdal_metadata_xml'] is also present. In that case _extract_rich_tags (_attrs.py:1474-1478) prefers the XML and never builds from the dict, so the dict alone wouldn't reach disk. This is harmless: with gdal_metadata_xml present the gate already fires for it, so the write is blocked either way, and a message listing both attrs is accurate. No change needed; noting it so the interaction is on record.
- A test for the "both gdal_metadata_xml and dict gdal_metadata present, no flag" case would pin the message wording, but the existing gdal_metadata_xml gate test already covers the reject path. Low value.
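The precedence mentioned in the first nit can be illustrated with a short sketch. It is not the code at _attrs.py:1474-1478, which is not shown here: `pick_gdal_xml` is a hypothetical helper, and the flat name-to-value dict layout is an assumption. The `<GDALMetadata>`/`<Item name="...">` element shape is the standard GDAL metadata XML layout.

```python
import xml.etree.ElementTree as ET
from typing import Any, Mapping, Optional


def pick_gdal_xml(attrs: Mapping[str, Any]) -> Optional[str]:
    """Hypothetical stand-in for the GDAL-metadata branch of _extract_rich_tags."""
    # An explicit XML string wins; the dict is never consulted in that case.
    xml = attrs.get("gdal_metadata_xml")
    if xml is not None:
        return xml

    # Otherwise build XML from a dict. Non-dict values are ignored.
    meta = attrs.get("gdal_metadata")
    if isinstance(meta, dict):
        root = ET.Element("GDALMetadata")
        # Assumption: a flat {name: value} mapping; the real writer may
        # also handle per-band or domain-scoped items.
        for name, value in meta.items():
            item = ET.SubElement(root, "Item", name=str(name))
            item.text = str(value)
        return ET.tostring(root, encoding="unicode")

    return None
```

When both attrs are set, the returned string is the supplied XML unchanged, which is why the dict alone would not reach disk in that case.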
What looks good
- The trigger isinstance(attrs.get('gdal_metadata'), dict) mirrors _extract_rich_tags exactly, so the gate fires for precisely the inputs that would otherwise write GDAL XML to disk. Non-dict values stay ungated, matching the writer.
- The round-trip exemption is untouched: the _xrspatial_geotiff_contract early return runs before the triggered checks, so read-back arrays still write flag-free, and there's a direct test for it.
- The gate is shared by the eager/dask CPU writer and the GPU writer, so all four backends enforce it.
- Test fallout was handled at the source: pack and scale-offset test helpers that build a fresh gdal_metadata dict now pass the opt-in, rather than weakening the gate. Full geotiff suite is green.
Checklist
- Matches the write trigger in _extract_rich_tags
- All four backends enforce the gate (shared validator)
- Round-trip exemption preserved and tested
- Edge cases covered (non-dict ignored, accepted with flag)
- No premature materialization or copies (validation-only change)
- Benchmark not needed (input validation)
- README feature matrix not applicable (no new function)
- Docstrings/contract doc updated (release contract row)
brendancol
commented
Jun 14, 2026
brendancol
left a comment
Contributor
Author
Follow-up review
Addressed the review nits:
- Added test_validate_write_rich_tag_optin_names_both_xml_and_gdal_metadata to pin the rejection message when both gdal_metadata_xml and a dict gdal_metadata are present without the opt-in (the second nit).
- The first nit (gate also flags the dict when xml is present) is dismissed as a non-defect: _extract_rich_tags prefers the XML, the write is already blocked by the gdal_metadata_xml trigger, and listing both attrs in the message is accurate. No code change.
No remaining blockers or suggestions. The gate logic, round-trip exemption, and backend coverage are unchanged from the first pass.
Closes #3320
Summary
to_geotiff gated attrs['gdal_metadata_xml'] and attrs['extra_tags'] behind allow_experimental_codecs=True, but not attrs['gdal_metadata']. Since _extract_rich_tags builds GDAL XML from a dict gdal_metadata and writes it to disk, a fresh DataArray could write arbitrary GDAL metadata without opting in. This closes that hole.
- Add a gdal_metadata check to _validate_write_rich_tag_optin, firing only when the value is a dict (a non-dict value is ignored by the writer, so it stays ungated).
- The round-trip exemption (attrs carrying _xrspatial_geotiff_contract) is unchanged, so canonical round-trips stay flag-free.
- _extract_rich_tags is untouched.
Backend coverage
The gate lives in _validate_write_rich_tag_optin, shared by the eager/dask CPU writer and the GPU writer, so all four backends enforce it.
Test plan
- _validate_write_rich_tag_optin: dict gdal_metadata rejected without the flag, non-dict ignored, accepted with the flag, round-trip marker still exempt.
- to_geotiff: fresh DataArray with dict gdal_metadata raises ValueError without the opt-in and writes with it; a read-back array writes flag-free.
- Pack and scale-offset test setups that wrote a fresh gdal_metadata dict updated to pass the opt-in.