@dbirman @arjunsridhar12345 @dyf this is to summarize and track progress on upgrading the ephys pipeline to aind-data-schema 2.0.
Required changes
- Update DataProcess instantiation in the Preprocessing, Spike Sorting, Postprocessing, Curation, and Visualization steps (e.g. here) (see preprocessing, spike sorting, postprocessing, curation, visualization)
- Update the Collect Results capsule to upgrade to/generate 2.0-compliant JSON files (here)
- Update parsing of session/rig to instantiate NWB Devices in NWB Ecephys and NWB Units
- Update QC and QC collector capsules to generate 2.0-compliant quality_control.json files
Issues
Collect Results
The Collect Results capsule uses the existing processing.json and data_description.json from the input asset. If these are not 2.0, we need a way to convert them to 2.0 first, so that the pipeline remains compatible with existing data assets. We could use the aind-metadata-upgrader once it can upgrade to 2.0.
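Concretely, the capsule could gate on the schema_version field before touching either file. A minimal sketch, assuming we key off schema_version in the raw JSON; the upgrader hand-off is left as a placeholder since aind-metadata-upgrader does not yet target 2.0, and the helper name/paths are illustrative:

```python
import json
from pathlib import Path

from packaging.version import Version


def load_metadata_for_v2(json_path: Path) -> dict:
    """Load a processing/data_description JSON, flagging files that still need a 2.0 upgrade."""
    with open(json_path, "r") as f:
        metadata = json.load(f)

    schema_version = Version(metadata.get("schema_version", "0.0.0"))
    if schema_version < Version("2.0.0"):
        # Placeholder: hand off to aind-metadata-upgrader once a 2.0 target exists,
        # e.g. metadata = upgrade(metadata)  # hypothetical entry point
        raise NotImplementedError(
            f"{json_path.name} is schema {schema_version}; no upgrade path to 2.0 yet"
        )
    return metadata
```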
To give more context, the processing.json is created by aind-data-transfer and logs the compression data process. The current behavior of the pipeline is to extend that Processing object and append the ephys-generated processes. The data_description.json, instead, is used to instantiate the DerivedDataDescription (which will be replaced by DataDescription.from_raw). In both cases, 2.0 processing/data description files are needed.
For the processing.json, we could actually create a new Processing object that just logs the processing steps of the ephys pipeline (since it ends up in the result asset anyway).
For the data_description.json, I think it would be best to use DataDescription.from_raw, so that all metadata are propagated (e.g., funder, investigators, etc.).
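A rough sketch of that route, assuming the input data_description.json has already been converted to 2.0; the import path follows the current 1.x layout, and the from_raw signature shown (raw description plus a process_name) is an assumption to be checked against the released 2.0 API:

```python
from pathlib import Path

from aind_data_schema.core.data_description import DataDescription

# Load the (already 2.0) raw data description from the input asset
raw_dd = DataDescription.model_validate_json(
    Path("/data/ecephys_session/data_description.json").read_text()  # hypothetical path
)

# Build the derived data description so funder, investigators, etc. are carried over;
# the classmethod comes from the 2.0 plan above, but its arguments are assumed here
derived_dd = DataDescription.from_raw(raw_dd, process_name="sorted")

Path("/results/data_description.json").write_text(derived_dd.model_dump_json(indent=2))
```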
NWB
Currently, there is a function that parses the session+rig info to instantiate NWB Devices (this function should be moved to aind-nwb-utils @arjunsridhar12345, since both the Ecephys and Units capsules currently have their own copy). It doesn't use aind-data-schema, but parses the JSON files directly. We'll definitely need to update this function; it could support both 1.0 and 2.0 session/rig schemas to stay compatible with all data assets.
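As a starting point, a minimal sketch of a version-aware parser that could live in aind-nwb-utils; it keeps the current approach of reading the JSON directly and dispatches on schema_version. The 1.x field paths (ephys_assemblies/probes) reflect my reading of the current rig schema, and the 2.0 branch is left as a stub until the upgraded schema layout is settled:

```python
import json
from pathlib import Path

from packaging.version import Version
from pynwb.device import Device


def get_ephys_devices(rig_json_path: Path) -> list[Device]:
    """Build NWB Devices from a rig/instrument JSON, for both 1.x and 2.0 schemas."""
    with open(rig_json_path, "r") as f:
        rig = json.load(f)

    version = Version(rig.get("schema_version", "0.0.0"))
    devices = []
    if version < Version("2.0.0"):
        # 1.x rig.json: probes are nested under ephys assemblies (assumed field paths)
        for assembly in rig.get("ephys_assemblies", []):
            for probe in assembly.get("probes", []):
                manufacturer = (probe.get("manufacturer") or {}).get("name", "unknown")
                devices.append(
                    Device(
                        name=probe["name"],
                        description=str(probe.get("probe_model", "")),
                        manufacturer=manufacturer,
                    )
                )
    else:
        # 2.0: fill in once the upgraded rig/instrument field paths are known
        raise NotImplementedError("2.0 rig/instrument parsing not implemented yet")
    return devices
```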