Commit 7f5e0a7

Merge pull request #1781 from Remi-Gau/enh/1748
[ENH] Add `matches` and `source_entity` to glossary and file templates
2 parents 3b6d2f7 + 4fa41b7 commit 7f5e0a7

File tree

9 files changed

+217
-48
lines changed

src/metaschema.json

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -243,6 +243,16 @@
243243
},
244244
"additionalProperties": false
245245
},
246+
"metaentities": {
247+
"type": "object",
248+
"patternProperties": {
249+
"^[a-zA-Z0-9_]+$": {
250+
"$ref": "#/definitions/termTypes/general",
251+
"unevaluatedProperties": false
252+
},
253+
"additionalProperties": false
254+
}
255+
},
246256
"modalities": {
247257
"type": "object",
248258
"patternProperties": {
@@ -288,6 +298,7 @@
288298
"files",
289299
"formats",
290300
"metadata",
301+
"metaentities",
291302
"modalities",
292303
"suffixes"
293304
],
@@ -430,6 +441,10 @@
430441
},
431442
"additionalProperties": false
432443
},
444+
"metaentities": {
445+
"type": "array",
446+
"items": { "type": "string" }
447+
},
433448
"common_principles": {
434449
"type": "array",
435450
"items": { "type": "string" }
@@ -486,6 +501,7 @@
486501
"json",
487502
"sidecars",
488503
"tabular_data",
504+
"metaentities",
489505
"common_principles",
490506
"dataset_metadata",
491507
"directories",

src/modality-specific-files/physiological-recordings.md

Lines changed: 7 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -14,23 +14,13 @@ JSON file for storing metadata fields (see below).
1414
- [`7t_trt`](https://github.com/bids-standard/bids-examples/tree/master/7t_trt)
1515
- [`ds210`](https://github.com/bids-standard/bids-examples/tree/master/ds210)
1616

17-
Template:
18-
19-
```Text
20-
sub-<label>/[ses-<label>/]
21-
<datatype>/
22-
<matches>[_recording-<label>]_physio.tsv.gz
23-
<matches>[_recording-<label>]_physio.json
24-
```
25-
26-
For the template directory name, `<datatype>` can correspond to any data
27-
recording modality, for example `func`, `anat`, `dwi`, `meg`, `eeg`, `ieeg`,
28-
or `beh`.
29-
30-
In the template filenames, the `<matches>` part corresponds to task filename
31-
before the suffix.
32-
For example for the file `sub-control01_task-nback_run-1_bold.nii.gz`,
33-
`<matches>` would correspond to `sub-control01_task-nback_run-1`.
17+
{{ MACROS___make_filename_template(
18+
"raw",
19+
placeholders=True,
20+
show_entities=["recording"],
21+
suffixes=["physio"]
22+
)
23+
}}
3424

3525
!!! warning "Caution"
3626

src/modality-specific-files/task-events.md

Lines changed: 6 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -11,17 +11,12 @@ are supported (in contrast to "block" designs) - each "block of events" can be
1111
represented by an individual row in the `events.tsv` file (with a long
1212
duration).
1313

14-
Template:
15-
16-
```Text
17-
sub-<label>/[ses-<label>]
18-
<data_type>/
19-
<matches>_events.tsv
20-
<matches>_events.json
21-
```
22-
23-
Where `<matches>` corresponds to task filename. For example:
24-
`sub-control01_task-nback`.
14+
{{ MACROS___make_filename_template(
15+
"raw",
16+
placeholders=True,
17+
suffixes=["events"]
18+
)
19+
}}
2520

2621
Each task events file REQUIRES a corresponding task data file.
2722
It is also possible to have a single `events.tsv` file describing events

src/schema/README.md

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -369,6 +369,7 @@ The namespaces are:
369369
| --------------------------- | ----------------------------------------------------------------------------------- | ---------------- |
370370
| `objects.common_principles` | Terms that are used throughout BIDS | General terms |
371371
| `objects.modalities` | Broad categories of data represented in BIDS, roughly matching recording instrument | General terms |
372+
| `objects.metaentities` | Placeholders and wildcards to reduce verbosity of some templates in BIDS | General terms |
372373
| `objects.entities` | Name-value pairs appearing in filenames | Name/value terms |
373374
| `objects.metadata` | Name-value pairs appearing in JSON files | Name/value terms |
374375
| `objects.columns` | Column headings and values appearing in TSV files | Name/value terms |
@@ -505,6 +506,12 @@ The convention can be summed up in the following rules:
505506
| `display_name` | Human-friendly name |
506507
| `description` | Term definition |
507508

509+
- `objects.metaentities`
510+
| Field | Description |
511+
| -------------- | ------------------- |
512+
| `display_name` | Human-friendly name |
513+
| `description` | Term definition |
514+
508515
- `objects.entities`
509516

510517
| Field | Description |
@@ -1004,6 +1011,9 @@ EventsMissing:
10041011
- `rules.common_principles` - This file contains a list of terms that appear in `objects.common_principles`
10051012
that determines the order they appear in the specification
10061013

1014+
- `rules.metaentities` - This file contains a list of terms that appear in `objects.metaentities`
1015+
that determines the order they appear in the specification
1016+
10071017
### One-off rules
10081018

10091019
- `rules.modalities` - The keys in this file are the modalities, the values objects with the following field:
Lines changed: 53 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,53 @@
1+
---
2+
# This file describes metaentities (generic placeholder designations for one or more entities)
3+
# present in BIDS filenames.
4+
# WARNING: The metaentities are presented here in alphabetical order!
5+
# The appropriate order of entities in filenames is defined in `rules/entities.yaml`,
6+
# rather than this file (`objects/entities.yaml`).
7+
#
8+
# Example of wildcard "any" that could be used to denote all available data types:
9+
#
10+
# any:
11+
# name: any
12+
# display_name: any
13+
# description: |
14+
# `any` is used as a wildcard in BIDS filenames templates to denote
15+
# that any acceptable value for the corresponding entity match the
16+
# template pattern.
17+
18+
# For example, if employed in replacement of the `data_type` entity:
19+
20+
# ```Text
21+
# sub-<label>/
22+
# [ses-<label>/]
23+
# <any>/
24+
# sub-<label>_[ses-<label>]_events.<extension>
25+
# ```
26+
27+
# this indicates that the filename pattern `sub-<label>_[ses-<label>]_events.<extension>`
28+
# can be placed under any of the valid `data_type` directories.
29+
30+
matches:
31+
display_name: matches
32+
description: |
33+
`matches` is used as a placeholder in BIDS filenames templates to denote that several files
34+
share exactly the same sequence of entities and labels in their basename.
35+
36+
For example, in the following filename template:
37+
38+
```Text
39+
<matches>_bold.nii.gz
40+
<matches>_events.tsv
41+
<matches>_events.json
42+
```
43+
44+
`<matches>` could correspond to `sub-control01_task-nback_run-1`.
45+
46+
source_entities:
47+
display_name: source entities
48+
description: |
49+
`source_entities` is used as a placeholder in BIDS derivatives filenames templates.
50+
51+
`source_entities` MUST be the entire source filename, with the omission of
52+
the source suffix and extension.
53+
One exception to this rule is filename entities that are no longer relevant.

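The `<matches>` description above can be illustrated with a small sketch. It is not part of this commit or of `bidsschematools`; the helper names `split_bids_basename` and `share_matches` are hypothetical, and the sketch assumes that the extension starts at the first `.` and that the suffix is the last `_`-separated token before it.

```python
def split_bids_basename(filename):
    """Split a BIDS-style filename into (entity prefix, suffix, extension).

    Hypothetical helper for illustration only. It assumes the extension
    begins at the first "." and the suffix is the last "_"-separated token.
    """
    stem, dot, ext = filename.partition(".")
    prefix, _, suffix = stem.rpartition("_")
    return prefix, suffix, dot + ext


def share_matches(filenames):
    """Return True if every filename has the same entity prefix,
    that is, the same value for the `<matches>` placeholder."""
    return len({split_bids_basename(name)[0] for name in filenames}) == 1


files = [
    "sub-control01_task-nback_run-1_bold.nii.gz",
    "sub-control01_task-nback_run-1_events.tsv",
    "sub-control01_task-nback_run-1_events.json",
]
prefix, suffix, extension = split_bids_basename(files[0])
```

Under these assumptions, `prefix` plays the role of `<matches>` for the three files in the example from the description, and `share_matches(files)` checks that they all agree on it.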
src/schema/rules/metaentities.yaml

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
---
2+
# This file simply defines the order in which the meta-entities and placeholders are presented in the specification.
3+
# The actual term definitions appear in `objects/metaentities.yaml`.
4+
- matches
5+
- source_entities

tools/schemacode/src/bidsschematools/data/tests/test_rules.py

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -91,6 +91,7 @@ def test_rule_objects(schema_obj):
9191
"files",
9292
"formats",
9393
"metadata",
94+
"metaentities",
9495
"modalities",
9596
]:
9697
# But other object types are referenced via their keys

tools/schemacode/src/bidsschematools/render/text.py

Lines changed: 56 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -21,6 +21,7 @@
2121
"files": "files and directories",
2222
"formats": "format",
2323
"metadata": "metadata",
24+
"metaentities": "meta-entity",
2425
"top_level_files": "top level file",
2526
"suffixes": "suffix",
2627
}
@@ -150,12 +151,6 @@ def make_glossary(schema, src_path=None):
150151
obj_desc = obj_def.get("description", None)
151152
if obj_desc is None:
152153
raise ValueError(f"{obj_marker} has no description.")
153-
# A backslash before a newline means continue a string
154-
obj_desc = obj_desc.replace("\\\n", "")
155-
# Two newlines should be respected
156-
obj_desc = obj_desc.replace("\n\n", "<br>")
157-
# Otherwise a newline corresponds to a space
158-
obj_desc = obj_desc.replace("\n", " ")
159154

160155
text += f'\n<a name="{obj_marker}"></a>'
161156
text += f"\n## {obj_key}\n\n"
@@ -183,7 +178,9 @@ def make_glossary(schema, src_path=None):
183178
levels = [level["name"] if isinstance(level, dict) else level for level in levels]
184179
text += f"**Allowed values**: `{'`, `'.join(levels)}`\n\n"
185180

186-
text += f"**Description**:\n{obj_desc}\n\n"
181+
# Convert description into markdown and append to text
182+
obj_desc = MarkdownIt().render(f"**Description**:\n{obj_desc}")
183+
text += f"{obj_desc}\n\n"
187184

188185
reduced_obj_def = {k: v for k, v in obj_def.items() if k not in keys_to_drop}
189186

@@ -235,6 +232,9 @@ def make_filename_template(
235232
src_path=None,
236233
n_dupes_to_combine=6,
237234
pdf_format=False,
235+
placeholders=False,
236+
empty_dirs=None,
237+
show_entities=tuple(),
238238
**kwargs,
239239
):
240240
"""Create codeblocks containing example filename patterns for a given datatype.
@@ -262,6 +262,20 @@ def make_filename_template(
262262
If False, the filename template will use HTML and include hyperlinks.
263263
This works on the website.
264264
Default is False.
265+
placeholders : bool, optional
266+
If True, placeholder meta-entities will replace keyword-value entities in the
267+
filename.
268+
If ``dstype`` is ``"raw"``, the placeholder meta-entity is ``<matches>``.
269+
If ``dstype`` is ``"derivatives"``, the placeholder meta-entity is ``<source_entities>``.
270+
Default is False.
271+
empty_dirs: bool, optional
272+
If False, empty datatype directories are not included. If ``placeholders`` is True,
273+
this option is set False.
274+
Default is True.
275+
show_entities: tuple, optional
276+
If ``placeholders`` is ``False`` this argument is ignored.
277+
When using placeholders, this argument can be set to a list or tuple of entity
278+
names that will be "extracted" out of the placeholder.
265279
266280
Other Parameters
267281
----------------
@@ -313,19 +327,39 @@ def make_filename_template(
313327
for datatype in rule.datatypes:
314328
file_groups.setdefault(datatype, []).append(rule)
315329

330+
if empty_dirs is None:
331+
empty_dirs = not placeholders
332+
333+
entity_list = schema.rules.entities
334+
start_string = ""
335+
if placeholders:
336+
metaentity_name = "matches" if dstype == "raw" else "source_entities"
337+
start_string = (
338+
lt
339+
+ utils._link_with_html(
340+
metaentity_name,
341+
html_path=GLOSSARY_PATH + ".html",
342+
heading=f"{metaentity_name}-metaentities",
343+
pdf_format=pdf_format,
344+
)
345+
+ gt
346+
)
347+
entity_list = show_entities
348+
316349
for datatype in sorted(file_groups):
350+
group_lines = []
317351
datatype_string = utils._link_with_html(
318352
datatype,
319353
html_path=GLOSSARY_PATH + ".html",
320354
heading=f"{datatype.lower()}-datatypes",
321355
pdf_format=pdf_format,
322356
)
323-
lines.append(f"\t\t{datatype_string}/")
357+
group_lines.append(f"\t\t{datatype_string}/")
324358

325359
# Unique filename patterns
326360
for group in file_groups[datatype]:
327-
ent_string = ""
328-
for ent in schema.rules.entities:
361+
ent_string = start_string
362+
for ent in entity_list:
329363
if ent not in group.entities:
330364
continue
331365

@@ -413,12 +447,18 @@ def make_filename_template(
413447
pdf_format=pdf_format,
414448
)
415449

416-
lines.extend(
450+
group_lines.extend(
417451
f"\t\t\t{ent_string}_{suffix}{extension}"
418452
for suffix in sorted(suffixes)
419453
for extension in sorted(extensions)
420454
)
421455

456+
# If the datatype does not have any files, skip
457+
if not empty_dirs and len(group_lines) == 1:
458+
continue
459+
460+
lines.extend(group_lines)
461+
422462
paragraph = "\n".join(lines)
423463
if pdf_format:
424464
codeblock = f"Template:\n```Text\n{paragraph}\n```"
@@ -459,6 +499,11 @@ def append_filename_template_legend(text, pdf_format=False):
459499
"""
460500

461501
legend = f"""{info_str}
502+
- `<matches>` is a placeholder to denote an arbitrary (and valid) sequence of entities
503+
and labels at the beginning of the filename (only BIDS "raw").
504+
- `<source_entities>` is a placeholder to denote an arbitrary sequence of entities and labels
505+
at the beginning of the filename matching a source file from which the file derives
506+
(only BIDS-Derivatives).
462507
- Filename entities or directories between square brackets
463508
(for example, `[_ses-<label>]`) are OPTIONAL.
464509
- Some entities may only allow specific values,

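The interaction between `placeholders` and `empty_dirs` in the `make_filename_template` hunks above can be sketched in isolation. This is a simplified stand-in, not the function from `text.py`: `render_groups` is a hypothetical name, and `file_groups` is assumed here to map each datatype to a plain list of already-formatted filename patterns, with no HTML links and no schema object.

```python
def render_groups(file_groups, placeholders=False, empty_dirs=None):
    """Collect template lines per datatype, optionally skipping empty ones.

    Simplified illustration: `file_groups` maps a datatype name to a list
    of filename pattern strings (an assumption made for this sketch).
    """
    # As in the diff, None means "derive the value from `placeholders`".
    if empty_dirs is None:
        empty_dirs = not placeholders

    lines = []
    for datatype in sorted(file_groups):
        # The first line of each group is always the datatype directory.
        group_lines = [f"\t\t{datatype}/"]
        group_lines.extend(f"\t\t\t{pattern}" for pattern in file_groups[datatype])

        # A group holding only the directory line has no files: skip it
        # unless empty directories were requested.
        if not empty_dirs and len(group_lines) == 1:
            continue

        lines.extend(group_lines)
    return lines


groups = {"func": ["<matches>_events.tsv"], "anat": []}
with_placeholders = render_groups(groups, placeholders=True)
without_placeholders = render_groups(groups)
```

Buffering each datatype in `group_lines` before extending `lines` is what lets a directory be dropped after the fact; appending straight to `lines`, as the pre-change code did, leaves no cheap way to retract the directory line.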