neuromechanist
Member

@neuromechanist neuromechanist commented Jan 30, 2025

The emg-example dataset contains the examples discussed during the biweekly meetings of the BEP042 team @bids-standard/bep042. Each subject is one of the examples from the metadata document.

Here are some key metadata discussed which might be different from other modalities:

  1. channels.tsv optionally includes placement and group columns borrowed from Motion and iEEG, respectively.
  2. channels.tsv also optionally includes a signal_electrode column, indicating which electrode(s) were used (in conjunction with the reference) to make the channel.
  3. Multiple coordinate systems can be added to the channels.tsv and electrodes.tsv to adequately describe the channel or electrode placement.
  4. The coordinate systems can be optionally linked together using Anchor electrodes.
  5. A loc_reference column is optionally added to indicate which coordinate system is used for [x,y,z] columns in electrodes.tsv or the placement column in channels.tsv.
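
As a rough illustration of the optional columns listed above, the following sketch reads a channels.tsv and reports which of them are present. The file content, channel names, and the helper `optional_columns_present` are hypothetical and only meant to show the idea; they are not taken from the example datasets.

```python
import csv
import io

# Hypothetical channels.tsv content. The column names follow the list above;
# the row values are invented for illustration only.
CHANNELS_TSV = (
    "name\ttype\tunits\tsignal_electrode\tgroup\tplacement\n"
    "EMG1\tEMG\tuV\tE1\tgrid1\tn/a\n"
    "EMG2\tEMG\tuV\tE2\tgrid1\tn/a\n"
)

# Optional EMG-related columns named in this comment.
OPTIONAL_EMG_COLUMNS = ("placement", "group", "signal_electrode", "loc_reference")


def optional_columns_present(tsv_text):
    """Return the optional EMG columns found in the TSV header, in list order."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    header = reader.fieldnames or []
    return [col for col in OPTIONAL_EMG_COLUMNS if col in header]


print(optional_columns_present(CHANNELS_TSV))
```

Because all of these columns are optional, a channels.tsv without any of them would simply yield an empty list here.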

We appreciate your input and feedback on the examples. You can comment on specific examples here, on the original issue bids-standard/bids-specification#1371, or, if needed, on the PR bids-standard/bids-specification#1998.

@neuromechanist
Member Author

The pointer to the PR1998 schema is in place now. The dev CI is failing only because support for EMGCoordinateSystem and EMGCoordinateUnits in coordsystem.json is pending implementation.

@effigies
Contributor

@neuromechanist @drammock I believe the remaining errors are problems in the examples, not shortcomings in the schema/validator.

@neuromechanist
Member Author

neuromechanist commented Sep 19, 2025

Thanks much @effigies,
I resolved some issues with the examples:

  1. coordsystem.json SHOULD NOT have any entities except for sub-<id> and optionally space-<label>.
  2. electrodes.tsv is RECOMMENDED to omit unnecessary entities, like task (following EEG).
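
A minimal sketch of the first rule, assuming the usual `key-value_..._suffix.ext` filename layout. The helper names `parse_entities` and `unexpected_entities` are hypothetical and this is not how the validator implements the check.

```python
# Entities permitted on coordsystem.json according to rule 1 above.
ALLOWED_ENTITIES = {"sub", "space"}


def parse_entities(filename):
    """Split a BIDS-style filename into an entity dict and its suffix."""
    stem = filename.split(".", 1)[0]
    parts = stem.split("_")
    entities = {}
    for part in parts[:-1]:  # the last underscore-separated part is the suffix
        key, _, value = part.partition("-")
        entities[key] = value
    return entities, parts[-1]


def unexpected_entities(filename):
    """Return the entity keys that rule 1 would flag for a coordsystem file."""
    entities, _suffix = parse_entities(filename)
    return sorted(set(entities) - ALLOWED_ENTITIES)


print(unexpected_entities("sub-01_space-leftForearm_coordsystem.json"))
print(unexpected_entities("sub-01_task-typing_coordsystem.json"))
```

The first call reports no unexpected entities, while the second flags `task`.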

What remains is that space-<label> is not yet allowed for coordsystem.json, and the values in the coordinate_system column of electrodes.tsv are not yet associated with the <label> of space-<label>. I think @drammock wanted to add these to bids-standard/bids-specification#1998.

@effigies
Contributor

Sorry, I thought I'd pushed my changes to the schema. I've merged with @drammock's recent changes and pushed.

@effigies
Contributor

Remaining issues:

  • emg_CustomBipolarFace:
 	[ERROR] EMG_COORD_SYS_MISMATCH Some values in the coordinate_system column of *_electrodes.tsv are not present as
space entities in *_coordsystem.json files.

		/sub-01/emg/sub-01_electrodes.tsv
  • emg_TwoWristbands:
	[ERROR] TSV_ADDITIONAL_COLUMNS_MUST_DEFINE Additional TSV columns must be defined in the associated JSON sidecar for this file type
		placement
		/sub-01/emg/sub-01_task-typing_channels.tsv

	Please visit https://neurostars.org/search?q=TSV_ADDITIONAL_COLUMNS_MUST_DEFINE for existing conversations about this issue.

	[ERROR] JSON_KEY_REQUIRED A JSON flle is missing a key listed as required.
		EMGCoordinateUnits
		/sub-01/emg/sub-01_space-leftForearm_coordsystem.json - Field description: Units of the coordinates of `EMGCoordinateSystem`.

		/sub-01/emg/sub-01_space-rightForearm_coordsystem.json - Field description: Units of the coordinates of `EMGCoordinateSystem`.


	Please visit https://neurostars.org/search?q=JSON_KEY_REQUIRED for existing conversations about this issue.

@neuromechanist
Member Author

emg_CustomBipolarFace:

[ERROR] EMG_COORD_SYS_MISMATCH Some values in the coordinate_system column of *_electrodes.tsv are not present as
space entities in *_coordsystem.json files.
		/sub-01/emg/sub-01_electrodes.tsv

Down to one.
This is interesting. The coordsystem.json does not have space-<label> since there is only one file there. electrodes.tsv also does not have a coordinate_system column. I think this should validate. WDYT @drammock?
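
The expectation described here can be sketched roughly as follows. This is only an illustration of the intended logic, not the validator's implementation (which is driven by schema expressions); the function names and the sample rows are hypothetical.

```python
def space_labels(coordsystem_filenames):
    """Collect the <label> of every space-<label> entity in the given filenames."""
    labels = set()
    for name in coordsystem_filenames:
        stem = name.rsplit("/", 1)[-1].split(".", 1)[0]
        for part in stem.split("_"):
            if part.startswith("space-"):
                labels.add(part[len("space-"):])
    return labels


def missing_spaces(electrode_rows, coordsystem_filenames):
    """Return coordinate_system values that have no matching space-<label> file."""
    available = space_labels(coordsystem_filenames)
    used = {
        row["coordinate_system"]
        for row in electrode_rows
        if row.get("coordinate_system") not in (None, "", "n/a")
    }
    return sorted(used - available)


# No coordinate_system column and a single coordsystem.json without space-<label>:
# nothing is referenced, so nothing can be missing.
print(missing_spaces([{"name": "E1"}], ["sub-01_coordsystem.json"]))

# Two systems referenced, only one file present: the other is reported.
rows = [{"coordinate_system": "leftForearm"}, {"coordinate_system": "rightForearm"}]
print(missing_spaces(rows, ["sub-01_space-leftForearm_coordsystem.json"]))
```

Under this reading, a dataset like emg_CustomBipolarFace, with no coordinate_system column at all, yields an empty list and would not trigger EMG_COORD_SYS_MISMATCH.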

@drammock

> I think this should validate. WDYT @drammock?

I agree that it seems valid. I cannot reproduce this failure locally:

$ cd ../validator/
$ git log --oneline -1
79248e84 (HEAD -> feat/add-keys-to-assoc-coordsys, origin/feat/add-keys-to-assoc-coordsys) changelog
$ cd ../examples/
$ git log --oneline -1
14abe5f8 (HEAD -> emg_examples, neuromechanist/emg_examples) fixes to channels.tsv and electrodes.tsv in some datasets? (#3)
$ cd ../spec/
$ git log --oneline -1
a55ff9ef (HEAD -> emg, origin/emg) schema: Check for coordsystem or coordsystems
$ uv run bst export > src/schema.json
$ DATASET=emg_CustomBipolarFace /opt/bids/validator/local-run --schema file:///opt/bids/spec/src/schema.json /opt/bids/examples/$DATASET
        [WARNING] SIDECAR_KEY_RECOMMENDED A data file's JSON sidecar is missing a key listed as recommended.
                StimulusPresentation
                /sub-01/emg/sub-01_task-jumping_events.tsv - Field description: Object containing key-value pairs related to the software used to present
the stimuli during the experiment.


                EpochLength
                /sub-01/emg/sub-01_task-jumping_recording-highDensity_emg.edf - Field description: Duration of individual epochs in seconds (for example, `1`)
in case of epoched data.
If recording was continuous or discontinuous, leave out the field.

                /sub-01/emg/sub-01_task-jumping_recording-bipolar_emg.edf - Field description: Duration of individual epochs in seconds (for example, `1`)
in case of epoched data.
If recording was continuous or discontinuous, leave out the field.


        Please visit https://neurostars.org/search?q=SIDECAR_KEY_RECOMMENDED for existing conversations about this issue.


          Summary:                         Available Tasks:        Available Modalities:
          18 Files, 132 kB                 jumping                 emg                  
          1 - Subjects 1 - Sessions                                                     

        If you have any questions, please post on https://neurostars.org/tags/bids.

@effigies
Contributor

Pushed bids-standard/bids-specification@b1fea70, which resolved that issue. LMK if you disagree with the logic and think it needs a different solution.

The only other one I can think of is requiring coordsystem.json to have an identifier (e.g., EMGCoordinateSystemName). Then we would switch to checking that ParentCoordinateSystem matches against those values, and require that space-<label> (when present) have the same value. That is not dramatically more complicated than what is currently there, from a validator perspective.

I also want to draw your attention to https://github.com/bids-standard/bids-validator/pull/268/files#r2365617071:

> The EMG examples do not have this case, but with multiple coordsystem.jsons, you could imagine having some at the root and some at the leaf.
>
> We currently strongly assume [...] that the first [directory] level found is the level of the association, which has worked for single-file associations. This probably needs to be relaxed, but the logic will be more intricate. I think you would need to take the first coordsystem.json with a given space entity, not the full collection.

For now this could be noted as a limitation in the validator implementation.

Does this limitation seem acceptable in the short term?

@drammock

> Pushed bids-standard/bids-specification@b1fea70, which resolved that issue. LMK if you disagree with the logic and think it needs a different solution.

That solution looks good to me.

> [...] Does this limitation seem acceptable in the short term?

Is the limitation that you can't collate coordsystem.json files at different directory levels? Or is it that multiple coordsystem.json files with the same space entity can't be collated? The latter seems fine; I doubt it will come up. The former should be fine in the short term but should be fixed eventually.

@neuromechanist do you agree?

@effigies
Contributor

effigies commented Sep 21, 2025

> Is the limitation that you can't collate coordsystem.json files at different directory levels? Or is it that multiple coordsystem.json files with the same space entity can't be collated?

The first. Or at least, we don't currently look past the first directory that has responsive files. In the case of sidecars, we get all responsive files, because we need to aggregate their contents. Associations have been just "find the first file".

I suppose an interesting question here is: Should /space-A_coordsystem.json and /sub-01/emg/sub-01_space-A_coordsystem.json have their contents aggregated like a sidecar, or should /sub-01/emg/sub-01_space-A_coordsystem.json take precedence?
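
To make the two options concrete, here is a small sketch of both behaviours for the files named above. It assumes an in-memory mapping of paths to JSON contents; the helper names and the field values are hypothetical, and this is not the validator's code.

```python
from pathlib import PurePosixPath


def matching_files(files, data_file, space):
    """Yield contents of coordsystem files for `space`, nearest directory first."""
    wanted = f"space-{space}"
    for directory in PurePosixPath(data_file).parents:
        for path, content in files.items():
            p = PurePosixPath(path)
            parts = p.name.split(".", 1)[0].split("_")
            if p.parent == directory and parts[-1] == "coordsystem" and wanted in parts:
                yield content


def first_match(files, data_file, space):
    """Precedence: the file nearest to the data file wins outright."""
    return next(matching_files(files, data_file, space), None)


def aggregated(files, data_file, space):
    """Sidecar-like aggregation: root values first, nearer files override them."""
    merged = {}
    for content in reversed(list(matching_files(files, data_file, space))):
        merged.update(content)
    return merged


# Hypothetical contents for the two files named in the question.
FILES = {
    "/space-A_coordsystem.json": {"EMGCoordinateSystem": "Other", "EMGCoordinateUnits": "mm"},
    "/sub-01/emg/sub-01_space-A_coordsystem.json": {"EMGCoordinateUnits": "cm"},
}
DATA_FILE = "/sub-01/emg/sub-01_task-typing_emg.edf"

print(first_match(FILES, DATA_FILE, "A"))
print(aggregated(FILES, DATA_FILE, "A"))
```

With precedence, only the subject-level content is seen, so `EMGCoordinateSystem` from the root file is lost; with aggregation, the root file supplies it and the subject-level file overrides only `EMGCoordinateUnits`.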

@neuromechanist
Member Author

I believe that most of the coordsystem.json files for EMG should/will live under the dataset root, not at the subject level, as they don't carry subject- or session-specific keys by default. This is different from MEG and EEG, where specific fiducial points can be optimally defined in coordsystem.json. In those cases, most examples I saw have this file under session or modality.

So, I think we might need, at least for one example if not most, to move the coordsystem.json files to the root to account for this case.

> I suppose an interesting question here is: Should /space-A_coordsystem.json and /sub-01/emg/sub-01_space-A_coordsystem.json have their contents aggregated like a sidecar, or should /sub-01/emg/sub-01_space-A_coordsystem.json take precedence?

Indeed, this is interesting, as this is not formally a sidecar, but hopefully should be treated as a sidecar. IMO, coordsystem.json for ephys is a glorified electrodes.json. It would be great if we merged it into electrodes.json in BIDS2.0, in favor of having one fewer file 😁.

@drammock

> I suppose an interesting question here is: Should /space-A_coordsystem.json and /sub-01/emg/sub-01_space-A_coordsystem.json have their contents aggregated like a sidecar, or should /sub-01/emg/sub-01_space-A_coordsystem.json take precedence?

I can imagine a case where there's a file at the root that specifies the grid coordinate system, and for a few subjects there's also a subject-specific file that overrides the anchor electrode details (for example if subject anatomy dictated slightly different placement across subjects). Supporting that would be ideal, but in such cases it would be possible (and not that hard) to just put everything in a subject specific file for all subjects. I'm content if the 80/20 rule says that we don't allow the "override" case for now.

@neuromechanist
Member Author

neuromechanist commented Sep 21, 2025

I moved the coordsystem.json for the TwoWristbands example to root to cover this use case as well.

All LGTM, thanks to both @drammock, @effigies. Moving out of Draft 🎉.

@neuromechanist neuromechanist marked this pull request as ready for review September 21, 2025 16:28
This commit adds comprehensive EMG-BIDS example datasets demonstrating various EMG recording scenarios including custom bipolar placements, high-density grids, multi-body part recordings, and independent recording modules.

Co-authored-by: Daniel McCloy <[email protected]>
Co-authored-by: Chris Markiewicz <[email protected]>
@effigies
Contributor

@neuromechanist BIDS examples should not include full data files, but should be truncated to zero bytes. If header data is needed for some validation rules, then one dataset could have file headers.
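
For reference, truncating data files while leaving metadata untouched could look roughly like this. It is a sketch under assumptions: the set of extensions and the function name `truncate_data_files` are hypothetical, and this is not an existing script in this repository.

```python
from pathlib import Path

# Assumption: extensions that count as binary data files in an example dataset.
DATA_SUFFIXES = {".edf"}


def truncate_data_files(dataset_root, suffixes=DATA_SUFFIXES):
    """Truncate matching data files to zero bytes; return the paths touched."""
    truncated = []
    for path in sorted(Path(dataset_root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            path.write_bytes(b"")  # keeps the file, drops its contents
            truncated.append(path)
    return truncated
```

JSON and TSV files are left alone, since the validator needs their contents. This overwrites files in place, so it should only be run on a copy or a version-controlled checkout.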

If you have real data to use for examples, the best thing would be to host it elsewhere, and have a script for updating the example as any changes are made. The dataset_listing.tsv allows links to the real data to be shared.

(I do see that we have not consistently enforced this, and as a result this repository is now 162MB on a fresh clone.)

@drammock

> BIDS examples should not include full data files, but should be truncated to zero bytes

That is probably my fault; I filled some files with 1 kB of random data to make the validator warning go away. @neuromechanist can you fix? I'm away from the office this week.

@neuromechanist
Member Author

Well, I changed the original 1 kB files to be a little larger (around 100 kB on average), so that they have actual EDF file headers and half a second's worth of "simulated" data respecting the channel count and sampling frequency.

This has the benefit that the examples are standalone and work properly with tools, rather than linking to ad hoc cloud storage.

Happy to move them outside. LMK.

@effigies
Contributor

Okay, let's leave them as-is for now.

But please note that the contents of these files should generally not be depended on, except when necessary to test whether a validator or query tool correctly implements the BIDS spec; tools that use BIDS datasets should maintain their own test data.

We may, in the future, decide to pare down this repository, and it would be ideal not to break anybody's workflow if we did.
