Replace "inheritance" with "summarization" principle

Status/TODOs:
- [x] already in effect and RECOMMENDED in BIDS 1.0 as of [1.10.1](https://bids-specification.readthedocs.io/en/v1.10.1/common-principles.html#the-inheritance-principle) thanks to
   - https://github.com/bids-standard/bids-specification/pull/1834 
- [ ] change to REQUIRED for BIDS 2.0. Maybe:
  - [ ] define how it could be overridden "explicitly" (in some `overrides.json` or a field within `dataset_description.json`); although this could also be done via bids-validator config to suppress the check

It is a next step to the discussion which happened in
- https://github.com/bids-standard/bids-2-devel/issues/36

On a recent road trip, @effigies and I briefly discussed it and so far did not see a showstopper, but it needs more minds to analyze.

ATM one of the problems with the inheritance principle is its unclear semantics when a value is modified further down the hierarchy: the order can be ambiguous when there are multiple "candidate" files, it is unclear how to "remove" a value, etc.
And overall it is cumbersome for a human to "gather" the final value, since for a file down the hierarchy someone needs to go through all possibly inherited files to arrive at it. But what if we take my [suggestion in the aforementioned issue](https://github.com/bids-standard/bids-2-devel/issues/36#issuecomment-1967996359) further:

- retain ability to "chain" candidates for metadata from higher to lower levels as in current inheritance principle
- **completely disallow overloading a value at lower (deeper in the hierarchy) levels**. Corollaries:
  - if present at different levels (e.g. the entire dataset and then a specific sidecar .json), the **value must be identical/consistent across all levels of inheritance, or otherwise not given at any higher level**
  - if a particular subject/session has a value different from the one defined for the others at a higher (dataset) level, we need to remove that value from the higher level and define it at the lower (e.g. subject/session) level

It will be a (now doable) job for a validator to ensure that all duplicated (across levels, if any) metadata is consistent.
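
To make the validator's job concrete, here is a minimal sketch of such a consistency check. It assumes the chain of metadata files applicable to one data file has already been resolved and loaded, ordered from the top level down; the function name `find_overloads` and the example paths are hypothetical, not part of any existing validator.

```python
from typing import Any


def find_overloads(
    chain: list[tuple[str, dict[str, Any]]],
) -> dict[str, list[tuple[str, Any]]]:
    """Return keys whose values differ between levels of one chain.

    `chain` is an assumed, already resolved list of (path, metadata)
    pairs ordered from the dataset root down to the sidecar.
    """
    seen: dict[str, list[tuple[str, Any]]] = {}
    for path, metadata in chain:
        for key, value in metadata.items():
            seen.setdefault(key, []).append((path, value))
    # A key is an "overload" if any later occurrence differs from the first.
    return {
        key: occurrences
        for key, occurrences in seen.items()
        if any(value != occurrences[0][1] for _, value in occurrences[1:])
    }


# Hypothetical example: RepetitionTime is overloaded at the subject level.
chain = [
    ("task-rest_bold.json", {"RepetitionTime": 2.0, "Manufacturer": "Siemens"}),
    ("sub-01/func/sub-01_task-rest_bold.json", {"RepetitionTime": 2.5}),
]
overloads = find_overloads(chain)
```

Under the proposed principle, any non-empty result would be a validation error; a key repeated with an identical value at several levels would pass.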

As a result, users would get the convenience that a top-level metadata file is "guaranteed" to hold correct metadata across all subjects/sessions, which is not the case currently since a value can be changed along the order of inheritance.

- FWIW, we already do something like that in heudiconv, where top level `task-*_bold.json` files collate all identical values across subject/sessions -- makes it easy to see what is common (e.g. scanner ID etc)
- Conceptually this is what we already have in BIDS ATM, e.g. `participants.tsv` summarizes metadata across participants, and we expect it to be consistent with any other phenotypic information found in subjects/sessions.
  - Hence I think it also relates to BEP036 (Phenotypic Data Guidelines), attn @surchs @ericearl (I just now created the @bids-standard/bep036 team), where the idea is circulating to be able to "segregate" metadata into the subject/session level while keeping it consistent at the top level (under the `phenotype/` folder).
- It would somewhat allow for easier composition of #59. Again: metadata present at a higher level would remain consistent with the lower levels, which would be easier to achieve (copy) and ensure (validator).
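
The "summarization" direction (collating what is common into a higher-level file) can be sketched as below. This is only an illustration of the idea under the assumption that per-file metadata is already loaded into dicts; it is not how heudiconv implements it, and `summarize` is a hypothetical name.

```python
from typing import Any


def summarize(per_file: list[dict[str, Any]]) -> dict[str, Any]:
    """Keep only key/value pairs that are identical in every input dict."""
    if not per_file:
        return {}
    first, *rest = per_file
    return {
        key: value
        for key, value in first.items()
        if all(key in other and other[key] == value for other in rest)
    }


# Hypothetical sidecar contents for two subjects.
common = summarize([
    {"Manufacturer": "Siemens", "RepetitionTime": 2.0},
    {"Manufacturer": "Siemens", "RepetitionTime": 2.5},
])
```

Here only `Manufacturer` would be eligible for the top-level file, since `RepetitionTime` differs between the two inputs and so has to stay at the lower level.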

Attn @Lestropie, as he has spent the most time improving the Inheritance principle definition, and @dorahermes, who is an active proponent and user of it: do you think such a "simplification" (removal of "value overload") of inheritance would simplify things and remain usable? Or maybe I am missing some common use case that such an additional "restriction" would disallow?

I think it might be worth writing a checker and applying it across all OpenNeuro datasets to see whether we run into such data "overloads". Which tool/functionality already implements the inheritance principle "closest to the bible", e.g. one that would pretty much return a list of lists of .json/.tsv files in their "inherited" bundles? (specific code examples would be welcome)
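
As a starting point for such a checker, here is a rough sketch of collecting the "bundle" of candidate `.json` files for one data file. It is a deliberately simplified approximation, not a faithful implementation of the inheritance principle as specified: it assumes a candidate applies when its suffix matches and its entities are a subset of the data file's entities, and it ignores `.tsv` files and any ambiguity rules. The names `split_name` and `candidate_sidecars` are hypothetical.

```python
from pathlib import Path


def split_name(name: str) -> tuple[set[str], str]:
    """Split 'sub-01_task-rest_bold.nii.gz' into its entities and suffix."""
    stem = name.split(".", 1)[0]
    *entities, suffix = stem.split("_")
    return set(entities), suffix


def candidate_sidecars(root: Path, data_file: Path) -> list[Path]:
    """List candidate .json files from the dataset root down to the file."""
    entities, suffix = split_name(data_file.name)
    # Directories to inspect, ordered from the top level to the deepest.
    levels = [root]
    for part in data_file.parent.relative_to(root).parts:
        levels.append(levels[-1] / part)
    found = []
    for level in levels:
        for candidate in sorted(level.glob(f"*{suffix}.json")):
            cand_entities, cand_suffix = split_name(candidate.name)
            # Simplifying assumption: same suffix, entities are a subset.
            if cand_suffix == suffix and cand_entities <= entities:
                found.append(candidate)
    return found
```

Running this for every data file in a dataset would give the list of bundles, which could then be loaded and checked for conflicting values.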

Edits: 
- might cause trouble with #59 since there we do need to overload a value. But this file isn't really subject to the inheritance principle in its current formulation, although it holds information pertinent to all files
- @candleindark proposed: "make it that common metadata MUST be pulled up to the higher level"
  - pros: would make it easier for human "users" of datasets.
  - cons: might make it more difficult to curate such datasets since shifting metadata "down" would be needed more often. 
