Commit 7dd01e6

Minimal example for study dataset
1 parent b0db166 commit 7dd01e6

2 files changed: 100 additions and 10 deletions

src/modality-agnostic-files/provenance.md

Lines changed: 96 additions & 6 deletions
@@ -71,7 +71,7 @@ For the most part, this metadata consists of **provenance records** of 4 types:
7171

7272
Provenance records are described as JSON objects in BIDS. They are stored inside **provenance files** (see [Provenance files](#provenance-files)).
7373

74-
Additionally, provenance metadata of entities can be stored inside:
74+
Additionally, **provenance metadata** of entities can be stored as regular BIDS metadata inside:
7575

7676
- sidecar JSON files (see [Provenance of a BIDS file](#provenance-of-a-bids-file));
7777
- `dataset_description.json` files (see [Provenance of a BIDS dataset](#provenance-of-a-bids-dataset)).
@@ -476,9 +476,11 @@ The uniqueness of this identifier MUST be used to distinguish any activity, soft
476476
- `bids::prov#fedora-uldfv058` - a Fedora based environment described inside the current dataset.
477477
- `bids:preprocessing:prov#fmriprep-r4kzzMt8` - the fMRIPrep software described inside the `preprocessing` dataset.
478478

479-
## Minimal example
479+
## Minimal examples
480480

481-
Here is a comprehensive example that considers the following dataset:
481+
### Provenance of a BIDS raw dataset
482+
483+
Consider the following BIDS raw dataset:
482484

483485
<!-- This block generates a file tree.
484486
A guide for using macros can be found at
@@ -504,7 +506,7 @@ A guide for using macros can be found at
504506
}
505507
) }}
506508

507-
The following provenance record is defined in `prov/prov-dcm2niix_soft.json`. As mentioned in the [Consistency and uniqueness of Ids](#consistency-and-uniqueness-of-ids) section, its identifier SHOULD start with `bids:<dataset>:prov#` (here, `bids::` refers to the current dataset).
509+
Here are the contents of the `prov/prov-dcm2niix_soft.json` file:
508510

509511
```JSON
510512
{
@@ -518,7 +520,9 @@ The following provenance record is defined in `prov/prov-dcm2niix_soft.json`. As
518520
}
519521
```
520522

521-
The previously described software record is referred to in the `prov/prov-dcm2niix_act.json` file:
523+
A software package is described using a provenance record inside the `Software` array. As mentioned in the [Consistency and uniqueness of identifiers](#consistency-and-uniqueness-of-identifiers) section, its identifier SHOULD start with `bids:<dataset>:prov#` (here, `bids::` refers to the current dataset).
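
The identifier convention described above can be checked mechanically. The following is a minimal sketch in Python, assuming the identifier always has the shape `bids:<dataset>:prov#<local id>`; the `ProvId` type and the `parse_prov_id` function are hypothetical helpers written for this illustration and are not part of BIDS or of any BIDS tooling.

```python
import re
from typing import NamedTuple


class ProvId(NamedTuple):
    """Hypothetical container for the two parts of a provenance identifier."""
    dataset: str   # empty string stands for the current dataset ("bids::")
    local_id: str  # the part after "prov#"


# Assumed shape: bids:<dataset>:prov#<local id>
_PROV_ID = re.compile(r"^bids:(?P<dataset>[^:#]*):prov#(?P<local_id>.+)$")


def parse_prov_id(identifier: str) -> ProvId:
    """Split an identifier into its dataset name and local part."""
    match = _PROV_ID.match(identifier)
    if match is None:
        raise ValueError(f"not a bids:<dataset>:prov# identifier: {identifier!r}")
    return ProvId(match["dataset"], match["local_id"])


# Identifiers taken from the examples in this document.
print(parse_prov_id("bids::prov#fedora-uldfv058"))
print(parse_prov_id("bids:preprocessing:prov#fmriprep-r4kzzMt8"))
```

An empty dataset name is kept as an empty string so that callers can tell "the current dataset" apart from a named one.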
524+
525+
Here are the contents of the `prov/prov-dcm2niix_act.json` file:
522526

523527
```JSON
524528
{
@@ -533,10 +537,96 @@ The previously described software record is referred to in the `prov/prov-dcm2ni
533537
}
534538
```
535539

536-
The previously described activity record is referred to in the `sub-001/anat/sub-001_T1w.json` sidecar JSON file:
540+
An activity is described using a provenance record inside the `Activities` array. Note that the identifier of the previously described software package is used here to describe that the software package was associated with this activity.
541+
542+
Here are the contents of the `sub-001/anat/sub-001_T1w.json` file:
537543

538544
```JSON
539545
{
540546
"GeneratedBy": "bids::prov#conversion-00f3a18f"
541547
}
542548
```
549+
550+
The provenance metadata `GeneratedBy` indicates that the `sub-001/anat/sub-001_T1w.nii.gz` file was generated by the previously described activity.
551+
552+
### Provenance of a BIDS study dataset
553+
554+
Consider the following BIDS study dataset:
555+
556+
<!-- This block generates a file tree.
557+
A guide for using macros can be found at
558+
https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
559+
-->
560+
{{ MACROS___make_filetree_example(
561+
{
562+
"study-1": {
563+
"sourcedata": {
564+
"raw": {
565+
"sub-01": {},
566+
"sub-02": {},
567+
"...": ""
568+
}
569+
},
570+
"derivatives": {
571+
"fmriprep": {}
572+
},
573+
"prov": {
574+
"prov-fmriprep_act.json": "",
575+
"prov-fmriprep_ent.json": ""
576+
},
577+
"dataset_description.json": "",
578+
"...": ""
579+
}
580+
}
581+
) }}
582+
583+
Here are the contents of the `dataset_description.json` file for the study dataset:
584+
585+
```JSON
586+
{
587+
...
588+
"DatasetLinks": {
589+
"raw": "sourcedata/raw",
590+
"fmriprep": "derivatives/fmriprep"
591+
}
592+
}
593+
```
594+
595+
The `DatasetLinks` object defines the dataset names `raw` and `fmriprep`, so that the two nested datasets can be referred to using BIDS URIs.
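
To illustrate how the `DatasetLinks` names could be used, here is a minimal Python sketch that maps a BIDS URI of the form `bids:<dataset>:<path>` to a path relative to the study dataset root. The function name `resolve_bids_uri` is hypothetical, the URI `bids:raw:sub-01` is only an illustration built from the file tree above, and treating a URI without a path part (such as `bids:raw`) as the root of the named dataset is an assumption of this sketch.

```python
from pathlib import PurePosixPath

# Values copied from the "DatasetLinks" object shown above.
dataset_links = {
    "raw": "sourcedata/raw",
    "fmriprep": "derivatives/fmriprep",
}


def resolve_bids_uri(uri: str, links: dict[str, str]) -> PurePosixPath:
    """Map a BIDS URI to a path relative to the current dataset root."""
    scheme, separator, rest = uri.partition(":")
    if scheme != "bids" or not separator:
        raise ValueError(f"not a BIDS URI: {uri!r}")
    name, _, relative = rest.partition(":")
    # An empty dataset name ("bids::...") refers to the current dataset.
    base = PurePosixPath(".") if name == "" else PurePosixPath(links[name])
    # Assumption: a URI without a path part points at the dataset root.
    return base / relative if relative else base


print(resolve_bids_uri("bids:raw:sub-01", dataset_links))
print(resolve_bids_uri("bids:fmriprep", dataset_links))
```

A dataset name that is not declared in `DatasetLinks` raises a `KeyError` here, which makes a missing declaration visible instead of silently producing a wrong path.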
596+
597+
Here are the contents of the `prov/prov-fmriprep_ent.json` file:
598+
599+
```JSON
600+
{
601+
"Entities": [
602+
{
603+
"Id": "bids:raw",
604+
"Label": "Raw data"
605+
},
606+
{
607+
"Id": "bids:fmriprep",
608+
"Label": "Preprocessed data",
609+
"GeneratedBy": "bids::prov#preprocessing-00f3a18f"
610+
},
611+
]
612+
}
613+
```
614+
615+
Two entities are described inside the `Entities` array, using one provenance record per nested dataset.
616+
617+
Here are the contents of the `prov/prov-fmriprep_act.json` file:
618+
619+
```JSON
620+
{
621+
"Activities": [
622+
{
623+
"Id": "bids::prov#preprocessing-00f3a18f",
624+
"Label": "Preprocessing with fMRIprep",
625+
"Command": "docker run -v sourcedata/raw:/data:ro -v derivatives/fmriprep:/out poldracklab/fmriprep:1.1.4 /data /out",
626+
"Used": "bids:raw"
627+
}
628+
]
629+
}
630+
```
631+
632+
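An activity is described using a provenance record inside the `Activities` array. Its `Used` field refers to the raw dataset, and the `GeneratedBy` field of the `bids:fmriprep` entity refers back to this activity.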
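
The link between the two provenance files can be followed programmatically. The sketch below is a minimal Python illustration that copies the `Id`, `GeneratedBy` and `Used` values from the records above (other fields are left out for brevity) and looks up which entities a given entity was generated from. The helper names `as_list` and `inputs_of` are hypothetical, and accepting either a single identifier or a list of identifiers in `GeneratedBy` and `Used` is an assumption of this sketch.

```python
# Records copied from prov/prov-fmriprep_ent.json and
# prov/prov-fmriprep_act.json, reduced to the fields used here.
entities = [
    {"Id": "bids:raw"},
    {"Id": "bids:fmriprep", "GeneratedBy": "bids::prov#preprocessing-00f3a18f"},
]
activities = [
    {"Id": "bids::prov#preprocessing-00f3a18f", "Used": "bids:raw"},
]


def as_list(value):
    """Accept a missing value, a single identifier, or a list of identifiers."""
    if value is None:
        return []
    return list(value) if isinstance(value, list) else [value]


def inputs_of(entity_id, entities, activities):
    """Return the identifiers used by the activities that generated an entity."""
    entity_by_id = {entity["Id"]: entity for entity in entities}
    activity_by_id = {activity["Id"]: activity for activity in activities}
    inputs = []
    for activity_id in as_list(entity_by_id[entity_id].get("GeneratedBy")):
        inputs.extend(as_list(activity_by_id[activity_id].get("Used")))
    return inputs


print(inputs_of("bids:fmriprep", entities, activities))  # ['bids:raw']
print(inputs_of("bids:raw", entities, activities))       # []
```

The raw dataset has no `GeneratedBy` field, so the lookup returns an empty list for it.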

src/schema/objects/metadata.yaml

Lines changed: 4 additions & 4 deletions
@@ -62,7 +62,7 @@ Activities:
6262
name: Id
6363
description: |
6464
Identifier for the activity
65-
(see the [Consistency and uniqueness of Ids](SPEC_ROOT/modality-agnostic-files/provenance.md#consistency-and-uniqueness-of-ids)
65+
(see the [Consistency and uniqueness of identifiers](SPEC_ROOT/modality-agnostic-files/provenance.md#consistency-and-uniqueness-of-identifiers)
6666
section).
6767
type: string
6868
format: uri
@@ -1268,7 +1268,7 @@ Entities:
12681268
name: Id
12691269
description: |
12701270
Identifier for the entity
1271-
(see the [Consistency and uniqueness of Ids](SPEC_ROOT/modality-agnostic-files/provenance.md#consistency-and-uniqueness-of-ids)
1271+
(see the [Consistency and uniqueness of identifiers](SPEC_ROOT/modality-agnostic-files/provenance.md#consistency-and-uniqueness-of-identifiers)
12721272
section).
12731273
type: string
12741274
format: uri
@@ -1327,7 +1327,7 @@ Environments:
13271327
name: Id
13281328
description: |
13291329
Identifier for the environment.
1330-
(see the [Consistency and uniqueness of Ids](SPEC_ROOT/modality-agnostic-files/provenance.md#consistency-and-uniqueness-of-ids)
1330+
(see the [Consistency and uniqueness of identifiers](SPEC_ROOT/modality-agnostic-files/provenance.md#consistency-and-uniqueness-of-identifiers)
13311331
section).
13321332
type: string
13331333
format: uri
@@ -3715,7 +3715,7 @@ Software:
37153715
name: Id
37163716
description: |
37173717
Identifier for the software package
3718-
(see the [Consistency and uniqueness of Ids](SPEC_ROOT/modality-agnostic-files/provenance.md#consistency-and-uniqueness-of-ids)
3718+
(see the [Consistency and uniqueness of identifiers](SPEC_ROOT/modality-agnostic-files/provenance.md#consistency-and-uniqueness-of-identifiers)
37193719
section).
37203720
type: string
37213721
format: uri
