resolve confusing state of Processing schema #1651

@tmchartrand

We attempted to improve Processing in 2.0, but it has become clear there are many inconsistencies in the current usage that still need to be resolved, likely requiring breaking changes in 3.0:

  • can a helper method for creating a derived asset from an existing record make sure that input_data references in processing correctly refer to that record?
  • should asset references use s3 paths, asset names, or both?
  • how do we distinguish references to assets that are intermediate results in a computation graph (and inputs to later steps) from those that are separate inputs to those later steps?
  • how should we distinguish on-rig or other processing that occurs prior to the "raw" asset from processing that creates a "derived" asset? (and is it necessary to record a "copy data" process from aind_data_transfer?)
  • how should we distinguish processing that updates the metadata only (version upgrades)?
  • do we want to (a) keep the process dependency graph but make it easier to create, (b) instead make that information easier to access elsewhere (CO API or nextflow files), or (c) accept that input/output information will be incomplete for more complex process workflows?
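
To make the first and third questions concrete, here is a minimal sketch of what such a helper could look like. Everything in it is hypothetical: `AssetRecord`, `ProcessStep`, `depends_on`, and `derive_asset` are stand-ins invented for illustration and are not the current aind-data-schema models or API. It assumes assets are referenced by name, with the location carried separately, and that intermediate results are referenced by step name rather than by asset.

```python
from dataclasses import dataclass, field, replace
from typing import List, Tuple


@dataclass
class AssetRecord:
    """Hypothetical stand-in for a metadata record (not the real schema)."""
    name: str
    location: str  # e.g. an S3 path; kept separate from the name


@dataclass
class ProcessStep:
    """Hypothetical stand-in for one entry in Processing."""
    name: str
    # Separate (external) input assets, referenced by asset name.
    input_data: List[str] = field(default_factory=list)
    # Earlier steps whose intermediate results this step consumes.
    depends_on: List[str] = field(default_factory=list)


def derive_asset(
    source: AssetRecord,
    derived_name: str,
    derived_location: str,
    steps: List[ProcessStep],
) -> Tuple[AssetRecord, List[ProcessStep]]:
    """Create a derived record and make sure the steps refer to `source`.

    Assumption: any step with no upstream step reads from the source record.
    """
    seen = set()
    checked = []
    for step in steps:
        # Intermediate references must point at steps that already ran.
        unknown = [d for d in step.depends_on if d not in seen]
        if unknown:
            raise ValueError(
                f"step {step.name!r} depends on unknown steps: {unknown}"
            )
        inputs = list(step.input_data)
        if not step.depends_on and source.name not in inputs:
            inputs.insert(0, source.name)
        # Copy rather than mutate the caller's step objects.
        checked.append(replace(step, input_data=inputs))
        seen.add(step.name)
    return AssetRecord(derived_name, derived_location), checked
```

With this split, a step that lists only `depends_on` is consuming an intermediate result, while anything in `input_data` is a separate input asset, so the two cases in the third question stay distinguishable without inspecting paths.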
