@Lestropie Lestropie commented Jun 23, 2025

This content is being posted as a draft Pull Request on a forked repository
in order to demonstrate the volume of changes generated
during the OHBM 2025 Hackathon.
There is further work to be done in augmenting / adding more datasets
before it can be submitted as a Pull Request to the bids-examples repository
within the bids-standard organisation:

  • Add key-value metadata to more datasets

    Particularly for those complex examples intended to demonstrate the correct order of metadata association,
    it would be advantageous to add metadata fields to the JSONs
    (even if some of those examples were first documented elsewhere without any explicit metadata)
    to make sure that metadata precedence rules are being upheld.
    Ensure that for any datasets where such data is added, the exported override information is also consistent with manual expectation.

  • Add override metadata to applicable datasets

    In datasets that explicitly demonstrate the effects of key-value overloading,
    manually generate JSON files containing the expected overloads,
    then make sure that the output of the -o option matches those expectations.

  • Expand list of datasets

    The set of example datasets constructed thus far focuses on the handling of JSON key-value metadata.
    JSON key-value files are however not the only form of metadata associated with data files in BIDS.
    Additional datasets should be constructed to demonstrate / validate functionality such as:

    • Multiple applicable metadata files of a type other than .json

      (e.g. bvec/bval files present both at BIDS root and as sidecars)

    • Metadata files of every metadata extension, all present in a single dataset.

    • A metadata file that would be deemed applicable to some data file
      based exclusively on the entities & suffix of its file name,
      but that is ineligible for inheritance based on its location in the file system.

    • More realistic example of complex inheritance

      An appealing choice here is complex multi-echo fMRI data:
      Most metadata fields will be identical across all files;
      however "EchoTime" will vary between "_echo-#" files,
      and "Units" will vary between "_part-mag" and "_part-phase".

    • Tied inheritance for a metadata type where only a unique nearest file can be selected.

      I'm thinking of a scenario where a DWI does not have sidecar .bvec / .bval,
      there are two applicable .bvec / .bval pairs via inheritance,
      but they are in the same directory and have the same number of entities
      and therefore cannot be disambiguated.
      (The case where they differ in entity count and therefore can be disambiguated
      should also be included).

  • Use .bids-validator-config.json / .SKIP_VALIDATION to stop the BIDS Validator from complaining
    about presence of unrecognised metadata fields / illegitimate filesystem structures.

    While the examples could technically be modified to use valid BIDS fields,
    in some of these circumstances the metadata key names are used to communicate purpose
    in the context of Inheritance Principle demonstration / validation.
    The downside of this however is that attempting to run the BIDS validator on these datasets,
    including through the run_tests.sh command, will fail;
    unless this file can be specified in such a way that these Inheritance Principle-specific datasets
    don't otherwise upset the validator.
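
To make the tied .bvec / .bval scenario in the list above concrete, here is a minimal Python sketch of the selection logic I have in mind. It is not taken from any existing tool: the `Candidate` type, the `select_nearest` function and the file paths are all hypothetical, and the ranking (deeper directory first, then larger entity count) is my assumption about how "nearest" would be decided.

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    path: str        # hypothetical path relative to the dataset root
    depth: int       # number of directories between the dataset root and the file
    n_entities: int  # number of entities in the file name


def select_nearest(candidates: Sequence[Candidate]) -> Optional[Candidate]:
    """Pick the single nearest applicable metadata file.

    Assumed ranking: deeper directory wins, then larger entity count.
    Returns None if there are no candidates; raises ValueError if the
    best rank is shared by more than one candidate.
    """
    if not candidates:
        return None
    best_rank = max((c.depth, c.n_entities) for c in candidates)
    best = [c for c in candidates if (c.depth, c.n_entities) == best_rank]
    if len(best) > 1:
        raise ValueError("ambiguous: " + ", ".join(c.path for c in best))
    return best[0]


# Same directory, same entity count: cannot be disambiguated.
tied = [
    Candidate("sub-01/sub-01_acq-a_dwi.bvec", depth=1, n_entities=2),
    Candidate("sub-01/sub-01_run-1_dwi.bvec", depth=1, n_entities=2),
]

# Same directory, different entity count: the file with more entities is selected.
untied = [
    Candidate("sub-01/sub-01_dwi.bvec", depth=1, n_entities=1),
    Candidate("sub-01/sub-01_acq-a_dwi.bvec", depth=1, n_entities=2),
]
```

Calling `select_nearest(tied)` raises `ValueError`, whereas `select_nearest(untied)` returns the `acq-a` candidate. A dataset for each of these two cases would let a tool's behaviour be checked against both outcomes.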

@Lestropie Lestropie self-assigned this Jun 23, 2025
Datasets relate to restrictions on the directories in which it is permissible to place metadata files depending on their applicability.
@Lestropie
Owner Author

One possible concern here is the sheer number of new datasets being proposed for addition. If the relevant tool under investigation were to undergo the requisite augmentations to be able to process each subject directory individually, then there would likely be several "datasets" that could be merged together. It however needs to be possible to process each scenario individually to ensure comprehensiveness of testing given that many tests involve checking for non-conformity.

@Lestropie
Owner Author

Both pybids and the BIDS Validator should be run against these new datasets, modifying BIDSVersion to simulate all versions of the specification. Any discrepancies between their outcomes and the expectations as defined in the test site here should be reported forward.
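
As a rough sketch of the BIDSVersion modification step, something like the following could be used. The function name `set_bids_version` is hypothetical, it assumes each dataset has a `dataset_description.json` at its root, and how pybids and the validator are then invoked is deliberately left out.

```python
import json
from pathlib import Path


def set_bids_version(dataset_root, version):
    """Rewrite BIDSVersion in dataset_description.json.

    Returns the value that was previously stored (or None if absent),
    so that the caller can restore it once testing is finished.
    """
    path = Path(dataset_root) / "dataset_description.json"
    description = json.loads(path.read_text(encoding="utf-8"))
    previous = description.get("BIDSVersion")
    description["BIDSVersion"] = version
    path.write_text(json.dumps(description, indent=2) + "\n", encoding="utf-8")
    return previous
```

The intended use would be to loop over the specification versions of interest, call `set_bids_version` for each, run the tools, and finally restore the original value using the returned string.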

These datasets exemplify what happens when there are entity-linked file collections intersecting across multiple entities.
Utilise tailored metadata key-values to highlight not only which metadata files are inherited from for any given data file, but also the value that should be inherited when multiple applicable key-value metadata files contain different values for the same key.
Also add dataset "ipi1195v2", which has the filesystem structure of the example shown in Issue #1195, but includes key-value metadata that highlights the ambiguity that arises if trying to permit metadata overloading when there is full flexibility in applicable metadata files at a single filesystem hierarchy level.
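
For reference, the key-value precedence that these tailored metadata values are meant to expose can be sketched in Python as below. This is an illustration under my own assumptions rather than the logic of any particular tool: `merge_inherited` is a hypothetical name, the input is assumed to be already ordered from least to most specific applicable file, and the example values are invented. The "overridden" report is only an illustration and is not claimed to match the format of the -o output.

```python
from typing import Any


def merge_inherited(ordered_contents):
    """Merge JSON metadata dictionaries ordered from least to most specific.

    Later (more specific) files replace values from earlier ones.
    Returns the merged metadata, plus every key that took more than one
    distinct value together with the sequence of values it was given.
    """
    merged: dict[str, Any] = {}
    seen: dict[str, list[Any]] = {}
    for contents in ordered_contents:
        for key, value in contents.items():
            seen.setdefault(key, []).append(value)
            merged[key] = value
    overridden = {
        key: values
        for key, values in seen.items()
        if any(v != values[0] for v in values[1:])
    }
    return merged, overridden


# Invented contents, least specific first.
example = [
    {"RepetitionTime": 2.0, "Units": "arbitrary"},  # e.g. a file at the dataset root
    {"EchoTime": 0.012},                            # e.g. a file shared by one echo
    {"Units": "rad"},                               # e.g. a sidecar for a phase image
]
merged, overridden = merge_inherited(example)
```

Here `merged["Units"]` ends up as `"rad"`, and `overridden` reports that "Units" was given two different values. The ambiguity in "ipi1195v2" is precisely that no such ordering of the input can be established when the competing files sit at the same filesystem level.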
@Lestropie
Owner Author

Deprecated, now posted as PR to origin: bids-standard#504

@Lestropie Lestropie closed this Jul 8, 2025