
ENH: Array data in .tsv cells #1446

@sappelhoff

Potentially to be dealt with:

Discussion originated in this thread:

Currently, cells in BIDS .tsv files may take on numeric or string values (see tabular data).

Here I list some examples:

  • a string n/a, reflecting a missing value or that a value is not applicable for the present cell
  • a string mystr, that is further specified in the column_name.Levels object in a .json file that accompanies the .tsv file, where column_name refers to the name of the column in the .tsv file under which the string mystr occurs
  • floating point numbers, reflecting time since acquisition for the onset column in events.tsv
    • often associated with a unit, defined in column_name.Units, analogously to the mystr example above
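The cell types listed above could be read roughly as follows. This is only a minimal sketch: the function name `parse_scalar_cell`, the column name `my_column`, and the sidecar contents are hypothetical and not taken from the specification.

```python
# Hypothetical sidecar content, mirroring the Levels/Units examples above.
sidecar = {
    "my_column": {"Levels": {"mystr": "description of what mystr means"}},
    "onset": {"Units": "s"},
}


def parse_scalar_cell(cell: str):
    """Interpret one .tsv cell as None (n/a), a float, or a plain string."""
    if cell == "n/a":
        return None  # missing or not applicable
    try:
        return float(cell)  # numeric cell, e.g. an onset
    except ValueError:
        return cell  # string cell, e.g. a level such as "mystr"


value = parse_scalar_cell("mystr")
description = sidecar["my_column"]["Levels"].get(value)
```

Here a string cell is looked up in the `Levels` object of its column, while a numeric cell would be paired with the `Units` entry of its column.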

However, in some cases it would be desirable to define an array of values in a single .tsv cell, for example when a given event (a row in an events.tsv file) is associated with multiple channels from a neural recording (e.g., EEG, iEEG, NIRS, ...):

This event at onset X and with duration Y is associated with channels A, B, and C

Currently, there is no formal way to specify such an array in BIDS. We think that it could be helpful for several data modalities (now and in the future) to have such a concept to work with.

In bids-standard/bids-examples#324 (comment), @effigies suggests that we use JSON arrays for this case (see his comment for advantages over more basic structures, like a comma-separated string).

An array is an ordered collection of values. An array begins with [ (left bracket) and ends with ] (right bracket). Values are separated by , (comma).
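To make the idea concrete, here is a rough sketch of how an events.tsv with a JSON array cell might be read. The column name `channels` and all values are made up for illustration, and nothing here is part of BIDS at this point.

```python
import csv
import io
import json

# Hypothetical events.tsv content: one event tied to channels A, B, and C,
# and one event with no associated channels.
tsv_text = (
    "onset\tduration\tchannels\n"
    '1.5\t0.2\t["A", "B", "C"]\n'
    "3.0\t0.1\tn/a\n"
)

# QUOTE_NONE keeps the double quotes inside the JSON array untouched.
reader = csv.DictReader(
    io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
)

channels_per_event = []
for row in reader:
    cell = row["channels"]
    # "n/a" keeps its usual meaning; anything else is decoded as JSON.
    channels_per_event.append(None if cell == "n/a" else json.loads(cell))
```

One point this illustrates is that any standard JSON parser can decode the cell, so no custom splitting rules are needed.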

I think if we were to adopt JSON arrays as a valid input for .tsv cells, we would benefit from (at least initially) restricting the kinds of values that may be listed in a given JSON array. JSON defines values as follows:

A value can be a string in double quotes, or a number, or true or false or null, or an object or an array. These structures can be nested.

For our present purposes, I think it would suffice to confine ourselves to:

  • string
  • number
  • boolean (true/false)
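A validator for this restriction could look something like the sketch below. The function name `parse_array_cell` is hypothetical, and the check assumes that a cell must be a flat JSON array of strings, numbers, and booleans only.

```python
import json


def parse_array_cell(cell: str) -> list:
    """Decode a .tsv cell as a flat JSON array of str, number, or bool."""
    value = json.loads(cell)  # raises a ValueError subclass on invalid JSON
    if not isinstance(value, list):
        raise ValueError(f"expected a JSON array, got: {cell!r}")
    for item in value:
        # bool is a subclass of int in Python, so true/false pass this check;
        # null (None), nested arrays, and objects are rejected.
        if not isinstance(item, (str, int, float)):
            raise ValueError(f"unsupported array element: {item!r}")
    return value
```

With this check, `["A", 1, 2.5, true]` would be accepted, while `[null]`, `[[1]]`, and `{"a": 1}` would be rejected.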

What are your opinions? Do you see other use cases apart from annotations in ephys data (in events.tsv files)? Please comment and advise.

Metadata

Labels: enhancement (New feature or request), opinions wanted (Please read and offer your opinion on this matter)