Commit a8de39d

feat(schema): Require participants.tsv to be a superset of sub_dirs/participants
1 parent 2f9b5f6 commit a8de39d

1 file changed, +11 -3 lines changed

src/schema/rules/checks/dataset.yaml

Lines changed: 11 additions & 3 deletions
@@ -24,7 +24,11 @@ ParticipantIDMismatch:
   selectors:
     - path == '/participants.tsv'
   checks:
-    - allequal(sorted(columns.participant_id), sorted(dataset.subjects.sub_dirs))
+    - |
+      allequal(
+        sorted(intersects(columns.participant_id, dataset.subjects.sub_dirs)),
+        sorted(dataset.subjects.sub_dirs)
+      )

 # 51
 PhenotypeSubjectsMissing:
@@ -34,10 +38,14 @@ PhenotypeSubjectsMissing:
       A phenotype/ .tsv file lists subjects that were not found in the dataset.
     level: error
   selectors:
-    - path == '/dataset_description.json'
+    - path == '/participants.tsv'
     - type(dataset.subjects.phenotype) != 'null'
   checks:
-    - allequal(sorted(dataset.subjects.phenotype), sorted(dataset.subjects.sub_dirs))
+    - |
+      allequal(
+        sorted(intersects(columns.participant_id, dataset.subjects.phenotype)),
+        sorted(dataset.subjects.phenotype)
+      )

 # 214
 SamplesTSVMissing:
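
The effect of the rewritten checks can be sketched in Python. This is an illustration only, not the validator's implementation: the helpers `intersects` and `allequal` below are hypothetical stand-ins for the schema expression functions of the same names, and `intersects` is assumed to return the elements shared by both arrays. Under that assumption, comparing the sorted intersection with the sorted second array passes exactly when every entry of the second array also appears in the `participant_id` column, so extra rows in participants.tsv no longer trigger the error.

```python
def intersects(a, b):
    """Hypothetical stand-in: unique elements of `a` that also occur in `b`.

    Assumption: the schema's `intersects` yields the shared elements.
    """
    b_set = set(b)
    seen = set()
    shared = []
    for item in a:
        if item in b_set and item not in seen:
            seen.add(item)
            shared.append(item)
    return shared


def allequal(a, b):
    """Hypothetical stand-in: element-wise equality of two sequences."""
    return list(a) == list(b)


def old_check(participant_ids, sub_dirs):
    # Previous rule: the column and the directory list must match exactly.
    return allequal(sorted(participant_ids), sorted(sub_dirs))


def new_check(participant_ids, sub_dirs):
    # New rule: every subject directory must be listed in the column;
    # additional participant_id rows are tolerated (superset).
    return allequal(
        sorted(intersects(participant_ids, sub_dirs)),
        sorted(sub_dirs),
    )


# Example data (made up for illustration).
participants = ["sub-01", "sub-02", "sub-03"]  # participants.tsv column
sub_dirs = ["sub-01", "sub-02"]                # sub-* directories on disk

print(old_check(participants, sub_dirs))  # extra row fails the old rule
print(new_check(participants, sub_dirs))  # superset passes the new rule
print(new_check(["sub-01"], sub_dirs))    # a missing subject still fails
```

The sketch assumes the second array has no duplicate entries, which holds for directory names. The same comparison applies to the `PhenotypeSubjectsMissing` check with `dataset.subjects.phenotype` in place of `dataset.subjects.sub_dirs`.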
