
Feature/metadata improvements #23

Merged
fbriol merged 10 commits into develop from feature/metadata_improvements
May 27, 2025

Conversation

Thomas-Z (Collaborator) commented Mar 2, 2025

These changes are related to the addition of dimension information in the zcollection metadata.
The motivations behind these changes are as follows:

  • Prevent the creation of unreadable zcollections when a new partition is added with dimensions that differ from the original ones.
  • Improve validation of inserted data.
  • Clean up code that templated existing variables to obtain dimension sizes.
  • Enable new features made possible by this additional metadata.

These changes will require existing collections to be updated before new data can be written.
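
To illustrate why an update is required, a writer could refuse to modify a collection whose metadata lacks the new fields while still allowing reads. The sketch below is not zcollection code: the metadata keys `version` and `dimensions`, the exception, and the `check_writable` function are hypothetical names chosen for the example.

```python
from typing import Any, List, Mapping


class OutdatedCollectionError(RuntimeError):
    """Raised when a collection must be updated before it can be written."""


def check_writable(metadata: Mapping[str, Any]) -> None:
    """Refuse writes when the metadata lacks the (hypothetical) new fields.

    Assumption: an updated collection stores a "version" entry and a
    "dimensions" entry in its metadata; an older one has neither.
    """
    missing: List[str] = [
        key for key in ("version", "dimensions") if key not in metadata
    ]
    if missing:
        raise OutdatedCollectionError(
            f"collection metadata lacks {missing}; "
            "update the collection before writing new data")
```

A non-updated collection would then stay usable in read-only mode, with the check called only on the write path.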

The following elements are implemented in this initial commit:

  • Additional information in the collection metadata file:
    • zcollection version number
    • Known dimension sizes and chunking
  • Data validation on insert:
    • Checks on dimension properties
    • Checks on variable data types and fill values
  • Support for dropping immutable variables
  • Addition of a Collection.add_dimension method
  • Update of the update_deprecated_collection function
  • Handling of non-updated read-only collections
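
The validation on insert listed above can be pictured with a small standalone sketch. It does not reproduce the code in this pull request: `VariableSpec` and `validate_insert` are hypothetical names, and the convention that a dimension stored with size `None` is unconstrained (for example, the partitioned axis) is an assumption made for the example.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class VariableSpec:
    """Hypothetical per-variable metadata (not the zcollection class)."""
    dimensions: Tuple[str, ...]
    dtype: str
    fill_value: Optional[float] = None


def _same_fill(a: Optional[float], b: Optional[float]) -> bool:
    """Compare fill values, treating NaN as equal to NaN."""
    if a is None or b is None:
        return a is b
    return a == b or (math.isnan(a) and math.isnan(b))


def validate_insert(
    known_dims: Dict[str, Optional[int]],
    known_vars: Dict[str, VariableSpec],
    new_dims: Dict[str, int],
    new_vars: Dict[str, VariableSpec],
) -> List[str]:
    """Return the list of problems found; an empty list means the data pass."""
    problems: List[str] = []
    # Dimension checks: the name must be known and a fixed size must match.
    for name, size in new_dims.items():
        if name not in known_dims:
            problems.append(f"unknown dimension {name!r}")
        elif known_dims[name] is not None and known_dims[name] != size:
            problems.append(
                f"dimension {name!r}: size {size}, expected {known_dims[name]}")
    # Variable checks: dimensions, data type and fill value must match.
    for name, spec in new_vars.items():
        ref = known_vars.get(name)
        if ref is None:
            problems.append(f"unknown variable {name!r}")
            continue
        if spec.dimensions != ref.dimensions:
            problems.append(f"variable {name!r}: dimensions differ")
        if spec.dtype != ref.dtype:
            problems.append(f"variable {name!r}: dtype differs")
        if not _same_fill(spec.fill_value, ref.fill_value):
            problems.append(f"variable {name!r}: fill value differs")
    return problems
```

Collecting every problem before failing, rather than raising on the first mismatch, lets a caller report all inconsistencies of a rejected partition at once.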

The following tasks remain to be done:

  • Complete the list of changes
  • Add tests:
    • For Collection.add_dimension
    • Reading subsets of view variables (only reference, only view, mix)
    • Adding immutable variables
    • Removing immutable variables
  • Restrict the ability to update an immutable variable
  • Restrict the ability to add immutable variables to a view
  • Finalize and fully test the deprecated collection and view update functions

Thomas-Z added the enhancement label Mar 2, 2025
Thomas-Z requested a review from fbriol March 2, 2025 17:19
Thomas-Z self-assigned this Mar 2, 2025
Thomas-Z changed the base branch from main to develop March 2, 2025 17:20
Thomas-Z linked an issue Mar 2, 2025 that may be closed by this pull request
Thomas Zilio added 9 commits March 9, 2025 17:10
Adding tests
Refactoring and cleaning.
Making collection only containing immutable variables readable.
Making view readable even if they do not have any declared variable (just reading the reference).
Consolidating tests and refactoring.
Adding tests related to immutable variables presence in the dataset provided to update and map methods.
Fixing drop_partitions tests (with timedelta)
Normalizing zcol/zview naming in tests.
…ate() methods.

Adding the "filler" variable concept to differentiate variables added by the system and the one added by the user during insertion.
Fixing partition order output of _normalize_partitions
Removing "over typing".
… the default value of delayed (False if distributed is False, True otherwise).
fbriol marked this pull request as ready for review May 27, 2025 19:27
fbriol merged commit ba81a8f into develop May 27, 2025
2 checks passed

Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

ValueError when loading a 'bad' set of variables from a view

2 participants