Closed
Description
Multimodal inputs occur in many settings. They require a more complex structure for summary_variables than a single numpy array, because different inputs usually differ in shape or even in number of dimensions. They may also call for different summary networks, each tailored to its input. This can already be implemented manually with the current API, but streamlining it would help downstream tasks. To achieve this, we need:
- a transform to introduce the additional nesting in the adapter, e.g., `fuse(keys, into="summary_variables")`, which creates a dict with the provided keys and corresponding values inside the `summary_variables` entry.
- a `FusionSummaryNetwork`, which takes a `dict` of keys and corresponding summary networks, as well as an additional network that combines their outputs into a summary of a chosen dimension.
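
To make the two pieces concrete, here is a minimal NumPy-only sketch. It is not the library's API: `fuse` is written as a plain function on dicts, and `MeanPoolLinear` and the constructor signature of `FusionSummaryNetwork` (including the `head` argument) are hypothetical stand-ins chosen for illustration.

```python
import numpy as np


def fuse(data, keys, into="summary_variables"):
    """Hypothetical transform: move `keys` into a nested dict stored under `into`."""
    out = {k: v for k, v in data.items() if k not in keys}
    out[into] = {k: data[k] for k in keys}
    return out


class MeanPoolLinear:
    """Stand-in network: average over all axes between batch and features, then project."""

    def __init__(self, in_dim, out_dim, rng):
        self.w = rng.normal(size=(in_dim, out_dim))

    def __call__(self, x):
        axes = tuple(range(1, x.ndim - 1))
        pooled = x.mean(axis=axes) if axes else x
        return pooled @ self.w


class FusionSummaryNetwork:
    """Hypothetical sketch: one network per key, plus a head that combines their outputs."""

    def __init__(self, networks, head):
        self.networks = networks  # dict: key -> summary network
        self.head = head          # maps concatenated outputs to the summary dimension

    def __call__(self, inputs):
        parts = [net(inputs[key]) for key, net in self.networks.items()]
        return self.head(np.concatenate(parts, axis=-1))


rng = np.random.default_rng(0)
data = {
    "time_series": rng.normal(size=(4, 50, 2)),  # (batch, time, channels)
    "image": rng.normal(size=(4, 8, 8, 3)),      # (batch, height, width, channels)
    "parameters": rng.normal(size=(4, 5)),
}

fused = fuse(data, keys=["time_series", "image"])
network = FusionSummaryNetwork(
    networks={
        "time_series": MeanPoolLinear(2, 6, rng),
        "image": MeanPoolLinear(3, 6, rng),
    },
    head=MeanPoolLinear(12, 10, rng),  # 6 + 6 concatenated features -> 10
)
summary = network(fused["summary_variables"])
print(summary.shape)  # (4, 10)
```

The nested dict lets inputs with different numbers of dimensions travel together under one entry, and each one only has to agree with the others on the batch axis.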
Open questions:
- Do we want to allow for transforms in the second-level dict, for example by adding an `adapter` argument to `fuse` that operates on the inner variables?
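
One possible shape for this option, sketched under assumptions: `fuse` is again a plain function on dicts, and the hypothetical `adapter` argument is any callable that maps the inner dict to a new inner dict. How this would interact with the actual adapter class is left open.

```python
import numpy as np


def fuse(data, keys, into="summary_variables", adapter=None):
    """Hypothetical transform with an optional inner `adapter` applied to the nested dict."""
    inner = {k: data[k] for k in keys}
    if adapter is not None:
        inner = adapter(inner)  # transforms only see the second-level variables
    out = {k: v for k, v in data.items() if k not in keys}
    out[into] = inner
    return out


def standardize_image(inner):
    """Example inner transform: standardize the image, leave other entries untouched."""
    inner = dict(inner)  # shallow copy so the caller's dict is not modified
    image = inner["image"]
    inner["image"] = (image - image.mean()) / image.std()
    return inner


data = {
    "image": np.arange(16.0).reshape(1, 4, 4, 1),
    "time_series": np.zeros((1, 10, 2)),
    "parameters": np.ones((1, 3)),
}
fused = fuse(data, keys=["image", "time_series"], adapter=standardize_image)
```

With this design, transforms on the outer dict and on the inner dict stay separate, at the cost of one more argument on `fuse`.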