Commit b4832ea

author
Thinh Nguyen
committed
Merge branch 'main' of https://github.com/datajoint/element-array-ephys into no-curation
2 parents 4951b39 + 088093d commit b4832ea

7 files changed

+92
-140
lines changed

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -1,3 +1,6 @@
1+
# User data
2+
.DS_Store
3+
14
# Byte-compiled / optimized / DLL files
25
__pycache__/
36
*.py[cod]

README.md

Lines changed: 28 additions & 15 deletions
@@ -1,5 +1,4 @@
11
# DataJoint Element - Array Electrophysiology Element
2-
DataJoint Element for array electrophysiology.
32

43
This repository features DataJoint pipeline design for extracellular array electrophysiology,
54
with ***Neuropixels*** probe and ***kilosort*** spike sorting method.
@@ -13,12 +12,16 @@ ephys pipeline.
1312

1413
See [Background](Background.md) for the background information and development timeline.
1514

16-
## The Pipeline Architecture
15+
## Element architecture
1716

1817
![element-array-ephys diagram](images/attached_array_ephys_element.svg)
1918

2019
As the diagram depicts, the array ephys element starts immediately downstream from ***Session***,
21-
and also requires some notion of ***Location*** as a dependency for ***InsertionLocation***.
20+
and also requires some notion of ***Location*** as a dependency for ***InsertionLocation***. We
21+
provide an [example workflow](https://github.com/datajoint/workflow-array-ephys/) with a
22+
[pipeline script](https://github.com/datajoint/workflow-array-ephys/blob/main/workflow_array_ephys/pipeline.py)
23+
that models (a) combining this Element with the corresponding [Element-Session](https://github.com/datajoint/element-session)
24+
, and (b) declaring a ***SkullReference*** table to provide Location.
2225

2326
### The design of probe
2427

@@ -45,14 +48,24 @@ This ephys element features automatic ingestion for spike sorting results from t
4548
+ ***WaveformSet*** - A set of spike waveforms for units from a given CuratedClustering
4649

4750
## Installation
48-
```
49-
pip install element-array-ephys
50-
```
5151

52-
If you already have an older version of ***element-array-ephys*** installed using `pip`, upgrade with
53-
```
54-
pip install --upgrade element-array-ephys
55-
```
52+
+ Install `element-array-ephys`
53+
```
54+
pip install element-array-ephys
55+
```
56+
57+
+ Upgrade `element-array-ephys` previously installed with `pip`
58+
```
59+
pip install --upgrade element-array-ephys
60+
```
61+
62+
+ Install `element-interface`
63+
64+
+ `element-interface` is a dependency of `element-array-ephys`; however, it is not listed in `requirements.txt`.
65+
66+
```
67+
pip install "element-interface @ git+https://github.com/datajoint/element-interface"
68+
```
5669
5770
## Usage
5871
@@ -65,12 +78,12 @@ To activate the `element-array-ephys`, ones need to provide:
6578
+ schema name for the ephys module
6679
6780
2. Upstream tables
68-
+ Session table
69-
+ SkullReference table (Reference table for InsertionLocation, specifying the skull reference)
81+
+ Session table: A set of keys identifying a recording session (see [Element-Session](https://github.com/datajoint/element-session)).
82+
+ SkullReference table: A reference table for InsertionLocation, specifying the skull reference (see [example pipeline](https://github.com/datajoint/workflow-array-ephys/blob/main/workflow_array_ephys/pipeline.py)).
7083
71-
3. Utility functions
72-
+ get_ephys_root_data_dir()
73-
+ get_session_directory()
84+
3. Utility functions. See [example definitions here](https://github.com/datajoint/workflow-array-ephys/blob/main/workflow_array_ephys/paths.py)
85+
+ get_ephys_root_data_dir(): Returns your root data directory.
86+
+ get_session_directory(): Returns the path of the session data relative to the root.
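
As a minimal sketch of the two utility functions listed above (not the workflow's actual definitions, which live in the linked `paths.py`), one could write something like the following. The `EPHYS_ROOT_DATA_DIRS` list, the directory paths, and the `subject`/`session_id` key fields are hypothetical and only illustrate the expected return types.

```python
import pathlib

# Hypothetical configuration: candidate root directories on this machine.
EPHYS_ROOT_DATA_DIRS = ["/data/ephys", "/mnt/archive/ephys"]


def get_ephys_root_data_dir() -> list:
    """Return the possible root data directories, in search order."""
    return [pathlib.Path(p) for p in EPHYS_ROOT_DATA_DIRS]


def get_session_directory(session_key: dict) -> str:
    """Return the session directory relative to a root, as a POSIX-style string.

    Assumes a hypothetical <subject>/session<session_id> folder layout.
    """
    rel = pathlib.PurePosixPath(session_key["subject"]) / f"session{session_key['session_id']}"
    return rel.as_posix()
```

Returning a relative path keeps stored paths portable across machines; the element then resolves it against whichever root directory actually contains the data.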
7487
7588
For more detail, check the docstring of the `element-array-ephys`:
7689

element_array_ephys/__init__.py

Lines changed: 1 addition & 74 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,4 @@
11
import datajoint as dj
2-
import pathlib
3-
import uuid
4-
import hashlib
52
import logging
63
import os
74

@@ -12,74 +9,4 @@
129
def get_logger(name):
1310
log = logging.getLogger(name)
1411
log.setLevel(os.getenv('LOGLEVEL', 'INFO'))
15-
return log
16-
17-
18-
def find_full_path(root_directories, relative_path):
19-
"""
20-
Given a relative path, search and return the full-path
21-
from provided potential root directories (in the given order)
22-
:param root_directories: potential root directories
23-
:param relative_path: the relative path to find the valid root directory
24-
:return: full-path (pathlib.Path object)
25-
"""
26-
relative_path = _to_Path(relative_path)
27-
28-
if relative_path.exists():
29-
return relative_path
30-
31-
# turn to list if only a single root directory is provided
32-
if isinstance(root_directories, (str, pathlib.Path)):
33-
root_directories = [_to_Path(root_directories)]
34-
35-
for root_dir in root_directories:
36-
if (_to_Path(root_dir) / relative_path).exists():
37-
return _to_Path(root_dir) / relative_path
38-
39-
raise FileNotFoundError('No valid full-path found (from {})'
40-
' for {}'.format(root_directories, relative_path))
41-
42-
43-
def find_root_directory(root_directories, full_path):
44-
"""
45-
Given multiple potential root directories and a full-path,
46-
search and return one directory that is the parent of the given path
47-
:param root_directories: potential root directories
48-
:param full_path: the full path to search the root directory
49-
:return: root_directory (pathlib.Path object)
50-
"""
51-
full_path = _to_Path(full_path)
52-
53-
if not full_path.exists():
54-
raise FileNotFoundError(f'{full_path} does not exist!')
55-
56-
# turn to list if only a single root directory is provided
57-
if isinstance(root_directories, (str, pathlib.Path)):
58-
root_directories = [_to_Path(root_directories)]
59-
60-
try:
61-
return next(_to_Path(root_dir) for root_dir in root_directories
62-
if _to_Path(root_dir) in set(full_path.parents))
63-
64-
except StopIteration:
65-
raise FileNotFoundError('No valid root directory found (from {})'
66-
' for {}'.format(root_directories, full_path))
67-
68-
69-
def _to_Path(path):
70-
"""
71-
Convert the input "path" into a pathlib.Path object
72-
Handles one odd Windows/Linux incompatibility of the "\\"
73-
"""
74-
return pathlib.Path(str(path).replace('\\', '/'))
75-
76-
77-
def dict_to_uuid(key):
78-
"""
79-
Given a dictionary `key`, returns a hash string as UUID
80-
"""
81-
hashed = hashlib.md5()
82-
for k, v in sorted(key.items()):
83-
hashed.update(str(k).encode())
84-
hashed.update(str(v).encode())
85-
return uuid.UUID(hex=hashed.hexdigest())
12+
return log
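
The helpers removed from this file are imported from `element_interface.utils` in `ephys.py` after this commit. As a reminder of what `dict_to_uuid` does, the sketch below reproduces the removed logic in standalone form; whether `element-interface` keeps exactly this implementation is an assumption. The example key fields (`subject`, `session_id`) are hypothetical.

```python
import hashlib
import uuid


def dict_to_uuid(key: dict) -> uuid.UUID:
    """Hash a dictionary into a UUID, independent of key insertion order."""
    hashed = hashlib.md5()
    # Sorting the items makes the hash depend only on content, not on order.
    for k, v in sorted(key.items()):
        hashed.update(str(k).encode())
        hashed.update(str(v).encode())
    return uuid.UUID(hex=hashed.hexdigest())


a = dict_to_uuid({"subject": "subject1", "session_id": 1})
b = dict_to_uuid({"session_id": 1, "subject": "subject1"})
print(a == b)  # True: both dictionaries hold the same items
```

Because keys and values are passed through `str()`, values that differ in type but share a string form (for example `1` and `"1"`) hash to the same UUID.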

element_array_ephys/ephys.py

Lines changed: 31 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -6,8 +6,11 @@
66
import importlib
77
from decimal import Decimal
88

9+
from element_interface.utils import find_root_directory, find_full_path, dict_to_uuid
10+
911
from .readers import spikeglx, kilosort, openephys
10-
from . import probe, find_full_path, find_root_directory, dict_to_uuid, get_logger
12+
from . import probe, get_logger
13+
1114

1215
log = get_logger(__name__)
1316

@@ -52,7 +55,6 @@ def activate(ephys_schema_name, probe_schema_name=None, *, create_schema=True,
5255
global _linking_module
5356
_linking_module = linking_module
5457

55-
# activate
5658
probe.activate(probe_schema_name, create_schema=create_schema,
5759
create_tables=create_tables)
5860
schema.activate(ephys_schema_name, create_schema=create_schema,
@@ -63,9 +65,10 @@ def activate(ephys_schema_name, probe_schema_name=None, *, create_schema=True,
6365

6466
def get_ephys_root_data_dir() -> list:
6567
"""
66-
All data paths, directories in DataJoint Elements are recommended to be stored as
67-
relative paths, with respect to some user-configured "root" directory,
68-
which varies from machine to machine (e.g. different mounted drive locations)
68+
All data paths, directories in DataJoint Elements are recommended to be
69+
stored as relative paths, with respect to some user-configured "root"
70+
directory, which varies from machine to machine (e.g. different mounted
71+
drive locations)
6972
7073
get_ephys_root_data_dir() -> list
7174
This user-provided function retrieves the possible root data directories
@@ -91,7 +94,7 @@ def get_session_directory(session_key: dict) -> str:
9194
Retrieve the session directory containing the
9295
recorded Neuropixels data for a given Session
9396
:param session_key: a dictionary of one Session `key`
94-
:return: a string for full path to the session directory
97+
:return: a string for relative or full path to the session directory
9598
"""
9699
return _linking_module.get_session_directory(session_key)
97100

@@ -224,21 +227,22 @@ class EphysFile(dj.Part):
224227
"""
225228

226229
def make(self, key):
227-
sess_dir = find_full_path(get_ephys_root_data_dir(),
230+
session_dir = find_full_path(get_ephys_root_data_dir(),
228231
get_session_directory(key))
229232
inserted_probe_serial_number = (ProbeInsertion * probe.Probe & key).fetch1('probe')
230233

231234
# search session dir and determine acquisition software
232235
for ephys_pattern, ephys_acq_type in zip(['*.ap.meta', '*.oebin'],
233236
['SpikeGLX', 'Open Ephys']):
234-
ephys_meta_filepaths = list(sess_dir.rglob(ephys_pattern))
237+
ephys_meta_filepaths = list(session_dir.rglob(ephys_pattern))
235238
if ephys_meta_filepaths:
236239
acq_software = ephys_acq_type
237240
break
238241
else:
239242
raise FileNotFoundError(
240243
f'Ephys recording data not found!'
241-
f' Neither SpikeGLX nor Open Ephys recording files found')
244+
f' Neither SpikeGLX nor Open Ephys recording files found'
245+
f' in {session_dir}')
242246

243247
supported_probe_types = probe.ProbeType.fetch('probe_type')
244248

@@ -277,12 +281,13 @@ def make(self, key):
277281
'recording_duration': (spikeglx_meta.recording_duration
278282
or spikeglx.retrieve_recording_duration(meta_filepath))})
279283

280-
root_dir = find_root_directory(get_ephys_root_data_dir(), meta_filepath)
284+
root_dir = find_root_directory(get_ephys_root_data_dir(),
285+
meta_filepath)
281286
self.EphysFile.insert1({
282287
**key,
283288
'file_path': meta_filepath.relative_to(root_dir).as_posix()})
284289
elif acq_software == 'Open Ephys':
285-
dataset = openephys.OpenEphys(sess_dir)
290+
dataset = openephys.OpenEphys(session_dir)
286291
for serial_number, probe_data in dataset.probes.items():
287292
if str(serial_number) == inserted_probe_serial_number:
288293
break
@@ -313,8 +318,7 @@ def make(self, key):
313318
'recording_datetime': probe_data.recording_info['recording_datetimes'][0],
314319
'recording_duration': np.sum(probe_data.recording_info['recording_durations'])})
315320

316-
root_dir = find_root_directory(
317-
get_ephys_root_data_dir(),
321+
root_dir = find_root_directory(get_ephys_root_data_dir(),
318322
probe_data.recording_info['recording_files'][0])
319323
self.EphysFile.insert([{**key,
320324
'file_path': fp.relative_to(root_dir).as_posix()}
@@ -661,16 +665,16 @@ class Curation(dj.Manual):
661665
curation_id: int
662666
---
663667
curation_time: datetime # time of generation of this set of curated clustering results
664-
curation_output_dir: varchar(255) # output directory of the curated results, relative to clustering root data directory
668+
curation_output_dir: varchar(255) # output directory of the curated results, relative to root data directory
665669
quality_control: bool # has this clustering result undergone quality control?
666670
manual_curation: bool # has manual curation been performed on this clustering result?
667671
curation_note='': varchar(2000)
668672
"""
669673

670674
def create1_from_clustering_task(self, key, curation_note=''):
671675
"""
672-
A convenient function to create a new corresponding "Curation"
673-
for a particular "ClusteringTask"
676+
A function to create a new corresponding "Curation" for a particular
677+
"ClusteringTask"
674678
"""
675679
if key not in Clustering():
676680
raise ValueError(f'No corresponding entry in Clustering available'
@@ -684,8 +688,10 @@ def create1_from_clustering_task(self, key, curation_note=''):
684688
# Synthesize curation_id
685689
curation_id = dj.U().aggr(self & key, n='ifnull(max(curation_id)+1,1)').fetch1('n')
686690
self.insert1({**key, 'curation_id': curation_id,
687-
'curation_time': creation_time, 'curation_output_dir': output_dir,
688-
'quality_control': is_qc, 'manual_curation': is_curated,
691+
'curation_time': creation_time,
692+
'curation_output_dir': output_dir,
693+
'quality_control': is_qc,
694+
'manual_curation': is_curated,
689695
'curation_note': curation_note})
690696

691697

@@ -835,9 +841,9 @@ def yield_unit_waveforms():
835841
spikeglx_meta_filepath = get_spikeglx_meta_filepath(key)
836842
neuropixels_recording = spikeglx.SpikeGLX(spikeglx_meta_filepath.parent)
837843
elif acq_software == 'Open Ephys':
838-
sess_dir = find_full_path(get_ephys_root_data_dir(),
839-
get_session_directory(key))
840-
openephys_dataset = openephys.OpenEphys(sess_dir)
844+
session_dir = find_full_path(get_ephys_root_data_dir(),
845+
get_session_directory(key))
846+
openephys_dataset = openephys.OpenEphys(session_dir)
841847
neuropixels_recording = openephys_dataset.probes[probe_serial_number]
842848

843849
def yield_unit_waveforms():
@@ -884,12 +890,13 @@ def get_spikeglx_meta_filepath(ephys_recording_key):
884890
except FileNotFoundError:
885891
# if not found, search in session_dir again
886892
if not spikeglx_meta_filepath.exists():
887-
sess_dir = find_full_path(get_ephys_root_data_dir(),
888-
get_session_directory(ephys_recording_key))
893+
session_dir = find_full_path(get_ephys_root_data_dir(),
894+
get_session_directory(
895+
ephys_recording_key))
889896
inserted_probe_serial_number = (ProbeInsertion * probe.Probe
890897
& ephys_recording_key).fetch1('probe')
891898

892-
spikeglx_meta_filepaths = [fp for fp in sess_dir.rglob('*.ap.meta')]
899+
spikeglx_meta_filepaths = [fp for fp in session_dir.rglob('*.ap.meta')]
893900
for meta_filepath in spikeglx_meta_filepaths:
894901
spikeglx_meta = spikeglx.SpikeGLXMeta(meta_filepath)
895902
if str(spikeglx_meta.probe_SN) == inserted_probe_serial_number:
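
The hunks above rely on two path helpers: `find_full_path` resolves a relative session directory against the candidate roots, and `find_root_directory` recovers the root so that file paths can be stored relative to it. The sketch below is a simplified standalone version based on the code removed from `__init__.py` in this commit (it omits the backslash normalization done by `_to_Path`); it is not necessarily identical to what `element_interface.utils` provides. The directory and file names in the demo are hypothetical.

```python
import pathlib
import tempfile


def find_full_path(root_directories, relative_path):
    """Return the first existing root / relative_path among the candidate roots."""
    relative_path = pathlib.Path(relative_path)
    if relative_path.exists():
        return relative_path
    # Accept a single root as well as a list of roots.
    if isinstance(root_directories, (str, pathlib.Path)):
        root_directories = [root_directories]
    for root_dir in root_directories:
        candidate = pathlib.Path(root_dir) / relative_path
        if candidate.exists():
            return candidate
    raise FileNotFoundError(
        f"No valid full path found (from {root_directories}) for {relative_path}")


def find_root_directory(root_directories, full_path):
    """Return the candidate root that is a parent of full_path."""
    full_path = pathlib.Path(full_path)
    if not full_path.exists():
        raise FileNotFoundError(f"{full_path} does not exist!")
    if isinstance(root_directories, (str, pathlib.Path)):
        root_directories = [root_directories]
    for root_dir in root_directories:
        if pathlib.Path(root_dir) in full_path.parents:
            return pathlib.Path(root_dir)
    raise FileNotFoundError(
        f"No valid root directory found (from {root_directories}) for {full_path}")


# Demo with a temporary directory tree (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp) / "root_b"
    meta_file = root / "subject1" / "session1" / "run_g0_t0.imec0.ap.meta"
    meta_file.parent.mkdir(parents=True)
    meta_file.touch()
    roots = [pathlib.Path(tmp) / "root_a", root]  # root_a does not exist

    session_dir = find_full_path(roots, "subject1/session1")
    root_dir = find_root_directory(roots, meta_file)
    # Same pattern as in make(): store the path relative to the root.
    print(meta_file.relative_to(root_dir).as_posix())
```

Searching the roots in order means the first matching root wins, so the order of the list returned by `get_ephys_root_data_dir()` matters when the same relative path exists under more than one root.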
