Allow class_derivations to be a list for multiple target instances #111

@amc-corey-cox

Summary

Enable class_derivations to be specified as a list, allowing multiple instances of the same target class to be created. This pattern is needed at two levels:

  1. Top-level (TransformationSpecification): Multiple source tables → same target class
  2. Slot-level (SlotDerivation): Single source row → multiple nested objects in a multivalued slot

Use Cases

Use Case 1: Multiple source tables to same target class

Different source tables from a study need to map to the same target class (e.g., condition data from multiple visits all becoming Condition instances).

class_derivations:
  - Condition:
      populated_from: pht004031  # Visit 1 conditions
      slot_derivations: ...
  - Condition:
      populated_from: pht004047  # Visit 2 conditions
      slot_derivations: ...

This is currently achieved by an external wrapper in dm-bip that iterates over the source tables, but it should be supported natively.
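
A minimal sketch of how a transformer might consume the list form at the top level, assuming each list entry is a single-key mapping from the target class name to its derivation. The function `derive_instances`, the in-memory `tables` dictionary, the `@type` marker, and the column names `col_a` / `col_b` are all hypothetical and only illustrate the idea; they are not the project's actual API.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def derive_instances(
    class_derivations: List[Dict[str, Dict[str, Any]]],
    tables: Dict[str, List[Row]],
) -> List[Dict[str, Any]]:
    """Create one target instance per source row, for every list entry."""
    instances = []
    for entry in class_derivations:
        # Assumption: each list entry holds exactly one class name.
        for class_name, derivation in entry.items():
            slots = derivation.get("slot_derivations", {})
            for row in tables.get(derivation["populated_from"], []):
                instance = {"@type": class_name}
                for slot_name, slot_spec in slots.items():
                    instance[slot_name] = row.get(slot_spec["populated_from"])
                instances.append(instance)
    return instances


# Hypothetical spec and data; slot and column names are placeholders.
spec = [
    {"Condition": {"populated_from": "pht004031",
                   "slot_derivations": {"code": {"populated_from": "col_a"}}}},
    {"Condition": {"populated_from": "pht004047",
                   "slot_derivations": {"code": {"populated_from": "col_b"}}}},
]
tables = {
    "pht004031": [{"col_a": "asthma"}],
    "pht004047": [{"col_b": "copd"}, {"col_b": "asthma"}],
}
conditions = derive_instances(spec, tables)
```

Both entries name the same target class, so rows from both tables end up as Condition instances in one result list, with no external wrapper needed.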

Use Case 2: Nested object construction (MeasurementObservationSet)

A single source row needs to produce multiple nested objects. For example, one blood pressure row creates a MeasurementObservationSet containing two MeasurementObservation objects (systolic and diastolic).

class_derivations:
  MeasurementObservationSet:
    populated_from: pht001450
    slot_derivations:
      associated_participant:
        populated_from: phv00098771
      observations:
        class_derivations:
          - MeasurementObservation:
              slot_derivations:
                observation_type:
                  expr: "'OMOP:4154790'"  # Diastolic
                value_quantity:
                  class_derivations:
                    - Quantity:
                        slot_derivations:
                          value_decimal:
                            populated_from: phv00099392
                          unit:
                            expr: "'mm[Hg]'"
          - MeasurementObservation:
              slot_derivations:
                observation_type:
                  expr: "'OMOP:4152194'"  # Systolic
                value_quantity:
                  class_derivations:
                    - Quantity:
                        slot_derivations:
                          value_decimal:
                            populated_from: phv00099391
                          unit:
                            expr: "'mm[Hg]'"
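
The nested case could be evaluated roughly as follows. This is a sketch under stated assumptions, not the existing transformer: `derive_slot`, `derive_object`, and the `observation` helper are hypothetical names, `expr` handling is reduced to quoted literals via `ast.literal_eval`, and the row values are made up.

```python
import ast
from typing import Any, Dict

Row = Dict[str, Any]


def derive_slot(slot_spec: Dict[str, Any], row: Row) -> Any:
    """Resolve one slot: nested objects, a source column, or a literal."""
    if "class_derivations" in slot_spec:
        # List form: one nested object per entry, all built from the same row.
        return [
            derive_object(class_name, derivation, row)
            for entry in slot_spec["class_derivations"]
            for class_name, derivation in entry.items()
        ]
    if "populated_from" in slot_spec:
        return row.get(slot_spec["populated_from"])
    if "expr" in slot_spec:
        # Simplification: only quoted literals such as "'mm[Hg]'" are handled.
        return ast.literal_eval(slot_spec["expr"])
    return None


def derive_object(class_name: str, derivation: Dict[str, Any], row: Row) -> Dict[str, Any]:
    """Build one target object by resolving each of its slot derivations."""
    obj = {"@type": class_name}
    for slot_name, slot_spec in derivation.get("slot_derivations", {}).items():
        obj[slot_name] = derive_slot(slot_spec, row)
    return obj


def observation(code: str, column: str) -> Dict[str, Any]:
    """Hypothetical helper that builds one MeasurementObservation entry."""
    return {"MeasurementObservation": {"slot_derivations": {
        "observation_type": {"expr": repr(code)},
        "value_quantity": {"class_derivations": [{"Quantity": {"slot_derivations": {
            "value_decimal": {"populated_from": column},
            "unit": {"expr": "'mm[Hg]'"},
        }}}]},
    }}}


spec = {"slot_derivations": {
    "associated_participant": {"populated_from": "phv00098771"},
    "observations": {"class_derivations": [
        observation("OMOP:4154790", "phv00099392"),  # Diastolic
        observation("OMOP:4152194", "phv00099391"),  # Systolic
    ]},
}}
row = {"phv00098771": "P001", "phv00099392": 80.0, "phv00099391": 120.0}  # made-up values
result = derive_object("MeasurementObservationSet", spec, row)
```

In this sketch every `class_derivations` list yields a list, so `value_quantity` comes back as a one-element list. A real implementation would presumably consult the target schema to decide whether a slot is multivalued before unwrapping.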

Current Limitations

  • YAML mappings cannot contain duplicate keys, so the dict form cannot hold two MeasurementObservation: entries
  • object_derivations was created as a workaround but adds unnecessary complexity
  • External iteration wrappers are required for multi-source patterns

Implementation Notes

  • Schema change: class_derivations should accept both dict and list forms
  • The transformer should iterate over the list and create instances for each entry
  • Same pattern works at both TransformationSpecification and SlotDerivation levels
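
One way to accept both forms without touching the rest of the transformer is to normalize them into a single shape up front. The sketch below assumes list entries are single-key mappings, as in the examples above; `normalize_class_derivations` is a hypothetical helper name, not an existing function.

```python
from typing import Any, Dict, List, Optional, Tuple, Union

Derivation = Dict[str, Any]
ClassDerivations = Union[Dict[str, Derivation], List[Dict[str, Derivation]]]


def normalize_class_derivations(
    value: Optional[ClassDerivations],
) -> List[Tuple[str, Derivation]]:
    """Return (class_name, derivation) pairs for either the dict or list form."""
    if value is None:
        return []
    if isinstance(value, dict):
        # Existing dict form: class names are unique by construction.
        return [(name, body or {}) for name, body in value.items()]
    if isinstance(value, list):
        pairs = []
        for index, entry in enumerate(value):
            if not isinstance(entry, dict) or len(entry) != 1:
                raise ValueError(
                    f"class_derivations[{index}] must be a single-key mapping"
                )
            name, body = next(iter(entry.items()))
            pairs.append((name, body or {}))
        return pairs
    raise TypeError("class_derivations must be a mapping or a list")


dict_form = {"Condition": {"populated_from": "pht004031"}}
list_form = [
    {"Condition": {"populated_from": "pht004031"}},
    {"Condition": {"populated_from": "pht004047"}},
]
from_dict = normalize_class_derivations(dict_form)
from_list = normalize_class_derivations(list_form)
```

Because both forms reduce to the same list of pairs, the same helper could serve at the TransformationSpecification and SlotDerivation levels, and existing dict-form specifications keep working unchanged.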


Labels: enhancement (New feature or request)