Description
Describe the bug
SpikeGLX may write data using "folder per probe" organization. Here is the description, from the SpikeGLX User Manual:
Note that there is an option on the
Save Tab called Folder per probe.
If this is set, there is still a run folder for each g-index run-name_gN.
However, inside that there is also a subfolder for each probe that contains
all the t-indices for that g-index and that probe. A probe subfolder is
named like run-name_gN/run-name_gN_imecM.
For example:
- run_dir/ (example: 3-1-2021/)
    - gate_dir_g0/ (example: 3-1-2021_g0/)
    ...
    - gate_dir_gN/
        - probe_dir_imec0/ (example: 3-1-2021_g0_imec0/)
        ...
        - probe_dir_imecN/
            - trigger_file_gN_imecN_t0.lf.bin
            - trigger_file_gN_imecN_t0.lf.meta
            - trigger_file_gN_imecN_t0.ap.bin
            - trigger_file_gN_imecN_t0.ap.meta
            ...
            - trigger_file_gN_imecN_tN.lf.bin
            - trigger_file_gN_imecN_tN.lf.meta
            - trigger_file_gN_imecN_tN.ap.bin
            - trigger_file_gN_imecN_tN.ap.meta
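For reference, the probe subfolder naming described above can be sketched as a small helper. This is only an illustration of the layout as quoted from the manual; the function name `probe_dir` and its parameters are hypothetical and are not part of SpikeGLX or neo.

```python
from pathlib import PurePosixPath


def probe_dir(run_name: str, gate: int, probe: int) -> PurePosixPath:
    """Build run_dir/gate_dir/probe_dir for the folder-per-probe layout.

    Hypothetical helper: it only mirrors the naming pattern
    run-name/run-name_gN/run-name_gN_imecM described in the manual excerpt.
    """
    gate_dir = f"{run_name}_g{gate}"
    return PurePosixPath(run_name) / gate_dir / f"{gate_dir}_imec{probe}"


# The probe folder from the example in this issue (gate 0, probe 1).
print(probe_dir("3-1-2021", 0, 1))
```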
This issue applies when using the folder-per-probe layout and reading probes with a device index greater than 0. For example, trying to read the following file:
3-1-2021/3-1-2021_g0/3-1-2021_g0_imec1/3-1-2021_g0_imec1.ap.bin
will result in the following error:
AssertionError: stream_id imec1.ap is not in [np.str_('imec0.ap'), np.str_('imec0.lf')]
It should be checking for membership in [np.str_('imec1.ap'), np.str_('imec1.lf')], as that is the only probe in the directory.
The incorrect stream IDs parsed for this directory are generated in neo.rawio.spikeglxrawio.scan_files(), on line 405.
stream_name = f"{device_kind}{device_index}{stream_kind}"
This creates the string {imec}{0}{.ap} because device_index is 0, when it should be 1.
device_index is 0 because of lines 388-398 in neo.rawio.spikeglxrawio.scan_files():
# TODO: handle one box case
info_list_imec = [info for info in info_list if info.get("device") != "nidq"]
unique_probe_tuples = {get_probe_tuple(info) for info in info_list_imec}
sorted_probe_keys = sorted(unique_probe_tuples)
probe_tuple_to_probe_index = {key: idx for idx, key in enumerate(sorted_probe_keys)}
for info in info_list:
    if info.get("device") == "nidq":
        info["device_index"] = ""  # TODO: Handle multi nidq case, maybe use meta["typeNiEnabled"]
    else:
        info["device_index"] = probe_tuple_to_probe_index[get_probe_tuple(info)]
The problem is that you cannot enumerate the sorted probe keys on line 392 and expect that enumeration index to be the correct device index on line 398. With folder-per-probe organization, each probe folder contains only one probe, imecN, so the enumeration index will always be 0, but the device index should be N.
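A minimal, self-contained sketch of the failure mode, assuming a folder that contains only imec1 files. The `get_probe_tuple` below is a hypothetical stand-in (the real helper in neo may build its key from other fields), and the `info` dicts are simplified to the keys used here.

```python
def get_probe_tuple(info):
    # Hypothetical stand-in for neo's helper: key each probe by its device name.
    return (info["device"],)


# Simplified info dicts, as if scanned from .../3-1-2021_g0_imec1/ only.
info_list = [
    {"device": "imec1", "device_kind": "imec", "stream_kind": ".ap"},
    {"device": "imec1", "device_kind": "imec", "stream_kind": ".lf"},
]

unique_probe_tuples = {get_probe_tuple(info) for info in info_list}
sorted_probe_keys = sorted(unique_probe_tuples)
# With a single probe in the folder, enumerate() can only ever yield index 0.
probe_tuple_to_probe_index = {key: idx for idx, key in enumerate(sorted_probe_keys)}

for info in info_list:
    info["device_index"] = probe_tuple_to_probe_index[get_probe_tuple(info)]

stream_names = [
    f"{info['device_kind']}{info['device_index']}{info['stream_kind']}"
    for info in info_list
]
print(stream_names)  # ['imec0.ap', 'imec0.lf'] rather than the imec1 streams
```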
One way to solve this could be to parse the device index directly from the probe name or the info["device"] field, both of which should be "imecN":
info["device_index"] = int(re.split(r"(^[^\d]+)", info["device"])[-1])
If this index cannot be assigned directly to info["device_index"], it could at least be saved and used to offset the probe index from the probe_tuple_to_probe_index dict. Someone with access to a wider variety of SpikeGLX outputs (non-folder-per-probe layouts, OneBox, etc.) than I do should double-check!
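A slightly more defensive version of the same idea, as a sketch only. The function name `parse_device_index` is hypothetical, and it assumes device names have the form of a non-digit prefix followed by digits (e.g. "imec1"). It returns None for names without a trailing number, whereas the one-liner above would raise ValueError on int("") for a name like "nidq".

```python
import re

# Non-digit prefix followed by a trailing integer, e.g. "imec1" -> ("imec", "1").
_DEVICE_RE = re.compile(r"^(?P<kind>[^\d]+)(?P<index>\d+)$")


def parse_device_index(device):
    """Return the trailing integer of a device name, or None if there is none.

    Hypothetical helper, not part of neo.
    """
    match = _DEVICE_RE.match(device)
    return int(match.group("index")) if match else None


for name in ("imec0", "imec1", "imec12", "nidq"):
    print(name, parse_device_index(name))
```

Whether None is the right fallback for nidq or OneBox streams is an open question that would need checking against real recordings.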
This bug appears to have been introduced in #1608.
Environment:
- OS: Linux
- Python version: 3.13.2
- Neo version: 0.14.0
- NumPy version: 2.1.3
Additional context
One other thing I noticed while tracking down this bug: On lines 530-534 in neo.rawio.spikeglxrawio.extract_stream_info, device and stream_kind are parsed twice each in quick succession: first from a regex, and then by just splitting the filename. Since they're obtained two different ways, might not hurt to assert that they match. Or just not bother recalculating them the second time, since they're already in hand? Nbd, just something that caught my eye.
Thank you!