Description
Status/TODOs:
- already in effect and RECOMMENDED in BIDS 1.0 as of 1.10.1 thanks to
- change to REQUIRED for BIDS 2.0. May be
- define how it could be overridden "explicitly" (in some `overrides.json` or a field within `dataset_description.json`); although it could also be done via bids-validator config, kinda to suppress
It is a next step to the discussion which happened in
On a recent road-trip with @effigies we briefly discussed it and so far did not see a show stopper but it would require more minds to analyze.
ATM one of the problems of the inheritance principle is unclear semantics when a value is modified down the hierarchy: the order can be unclear in case of multiple "candidate" files, it is unclear how to "remove" a value, etc.
And overall it is cumbersome for a human to "gather" the final value, since for a file down the hierarchy someone needs to go through all possibly inherited files to arrive at it. But what if we take my suggestion in the aforementioned issue further:
- retain ability to "chain" candidates for metadata from higher to lower levels as in current inheritance principle
- completely disallow overloading the value at lower (deeper in hierarchy) levels

Corollaries:
- if present at different levels (e.g. entire dataset and then specific sidecar .json) - value must be identical/consistent across all levels of inheritance, or otherwise not given at any higher level
- if particular subject/session has some different value from the others as defined at higher (dataset) level, we need to remove that value from higher level and define at lower (e.g. subject/session) level
It will be a (now doable) job for a validator to ensure that all duplicated (across levels, if any) metadata is consistent.
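To make the validator job above a bit more concrete, here is a minimal sketch of such a consistency check. It is not taken from bids-validator; the function name `find_overloads` and the input shape (a list of already parsed sidecars, ordered from dataset root down to the most specific file) are assumptions made only for illustration.

```python
from typing import Any


def find_overloads(
    chain: list[tuple[str, dict[str, Any]]],
) -> dict[str, list[tuple[str, Any]]]:
    """Return keys that are defined with differing values in one chain.

    `chain` holds (file path, parsed JSON content) pairs, ordered from the
    dataset root down to the most specific sidecar (hypothetical layout).
    """
    seen: dict[str, list[tuple[str, Any]]] = {}
    for path, metadata in chain:
        for key, value in metadata.items():
            seen.setdefault(key, []).append((path, value))
    # A key is "overloaded" if any later definition differs from the first.
    return {
        key: entries
        for key, entries in seen.items()
        if any(value != entries[0][1] for _, value in entries[1:])
    }
```

An empty result would mean every duplicated key is consistent across levels, which is exactly what the proposed rule would require.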
As a result we would give users the convenience that a top-level metadata file is "guaranteed" to hold correct metadata for all subjects/sessions, which is not the case currently since a value can be changed anywhere along the order of inheritance.
- FWIW, we already do something like that in heudiconv, where top level `task-*_bold.json` files collate all identical values across subject/sessions -- makes it easy to see what is common (e.g. scanner ID etc)
- Conceptually it is what we have in BIDS ATM, e.g. `participants.tsv` summarizes metadata across participants and we expect it to be consistent with possible other phenotypic information to be found in subject/sessions.
- It would somewhat ease composition of datasets as in "Allow composition of a BIDS dataset (dataset level) from smaller (subj or subj/ses) level" (#59). Again -- metadata present on a higher level would remain consistent with the lower, which would be easier to achieve (copy) and ensure (validator).
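The "collate identical values at the top level" idea from the points above could look roughly like the sketch below. This is not heudiconv's actual code; `pull_up_common` and the dict-of-dicts input are hypothetical names and shapes chosen for illustration.

```python
from typing import Any


def pull_up_common(
    sidecars: dict[str, dict[str, Any]],
) -> tuple[dict[str, Any], dict[str, dict[str, Any]]]:
    """Split lower-level sidecars into common metadata and per-file remainders.

    `sidecars` maps a sidecar path to its parsed JSON content (assumed shape).
    """
    contents = list(sidecars.values())
    if not contents:
        return {}, {}
    # Keep only keys present in every sidecar with an identical value.
    common = {
        key: value
        for key, value in contents[0].items()
        if all(key in other and other[key] == value for other in contents[1:])
    }
    # Whatever differs stays at the lower level and is absent from the top.
    remainders = {
        path: {k: v for k, v in metadata.items() if k not in common}
        for path, metadata in sidecars.items()
    }
    return common, remainders
```

Under the proposed rule, `common` would be safe to write to a top-level file, while anything that varies never appears at the higher level.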
Attn @Lestropie as he has spent the most time improving the Inheritance principle definition, and @dorahermes who is an active proponent and user of it: do you think such a "simplification" (removal of "value overload") would simplify inheritance and keep it usable? Or maybe I do not see some common use case that such an additional "restriction" would disallow?
I think it might be worth writing a checker and applying it across all openneuro datasets to see if we run into such data "overloads". What would be a tool/functionality which already implements the inheritance principle "closest to the bible", e.g. which would pretty much return a list of lists of .json/.tsv files in their "inherited" bundles? (specific code examples would be welcome)
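As a starting point for such a checker, below is a deliberately simplified sketch of collecting the "inherited" bundle of `.json` files for one data file. It is not an existing tool and does not claim to be "closest to the bible": it assumes a candidate applies when it sits in a directory between the dataset root and the data file, has the same suffix, and carries only entities the data file also has. The names `parse_name` and `inheritance_bundle` are hypothetical.

```python
from pathlib import Path


def parse_name(name: str) -> tuple[set[str], str]:
    """Split 'sub-01_task-rest_bold.json' into ({'sub-01', 'task-rest'}, 'bold')."""
    stem = name.split(".", 1)[0]
    *entities, suffix = stem.split("_")
    return set(entities), suffix


def inheritance_bundle(root: Path, data_file: Path) -> list[Path]:
    """List candidate .json sidecars for `data_file`, dataset root first."""
    entities, suffix = parse_name(data_file.name)
    # Walk every directory level from the dataset root down to the data file.
    directories = [root]
    for part in data_file.parent.relative_to(root).parts:
        directories.append(directories[-1] / part)
    bundle = []
    for directory in directories:
        for candidate in sorted(directory.glob("*.json")):
            cand_entities, cand_suffix = parse_name(candidate.name)
            # Simplified rule: same suffix, and no entity the data file lacks.
            if cand_suffix == suffix and cand_entities <= entities:
                bundle.append(candidate)
    return bundle
```

Calling this for every data file in a dataset would give the list of lists asked about above. Handling `.tsv` files and the finer points of the current inheritance rules would still need to be added.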
Edits:
- might cause trouble with "Allow composition of a BIDS dataset (dataset level) from smaller (subj or subj/ses) level" (#59) since we do need to overload the value. But this file isn't really subject to the inheritance principle in its current formulation, although it is information pertinent to all files
- @candleindark proposed: "make it that common metadata MUST be pulled up to the higher level"
- pros: would make it easier for human "users" of datasets.
- cons: might make it more difficult to curate such datasets since shifting metadata "down" would be needed more often.