Allow __init__ logic to work equally for graph-based and eager-based backends, specifically re-parameterization like weight norm #250

@albertz

We want to support multiple backends for nn, such as RETURNN, TensorFlow and PyTorch (see rwth-i6/returnn#1264).
This implies that we need to design our API so that it works with both eager-mode and graph-mode backends.

This issue originates from a comment in rwth-i6/returnn#1264.

It's best explained using the weight norm (#91) example, or more generally any re-parameterization. The current weight norm code only makes sense for graph mode, as it symbolically redefines the parameter in terms of some other symbolic formula. This formula thus needs to be evaluated again every time some computation is done with the model. This works naturally in graph mode. For eager mode, this must be more explicit and done whenever the actual parameter is accessed, as the parameter is not a symbolic formula.
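To make the eager-mode problem concrete, here is a minimal pure-Python sketch. It is not the nn or RETURNN API: `Param`, `weight_norm`, `ComputedOnce` and `ComputedOnAccess` are hypothetical names for illustration, and the formula `g * v / ||v||` is assumed as the usual weight norm re-parameterization. It shows that a value computed once from concrete numbers goes stale after a parameter update, while a value computed on access stays current.

```python
import math


class Param:
    """Hypothetical stand-in for a trainable parameter (plain list of floats)."""

    def __init__(self, values):
        self.values = list(values)


def weight_norm(v, g):
    """Return g * v / ||v|| for a vector v (list of floats) and a scalar g."""
    norm = math.sqrt(sum(x * x for x in v))
    return [g * x / norm for x in v]


class ComputedOnce:
    """Eager style: the formula is evaluated a single time in __init__."""

    def __init__(self):
        self.v = Param([3.0, 4.0])
        self.g = Param([1.0])
        # This is a snapshot of concrete numbers, not a symbolic formula.
        self.weight = weight_norm(self.v.values, self.g.values[0])


class ComputedOnAccess:
    """Eager style: the formula is re-evaluated whenever the weight is read."""

    def __init__(self):
        self.v = Param([3.0, 4.0])
        self.g = Param([1.0])

    @property
    def weight(self):
        return weight_norm(self.v.values, self.g.values[0])


once = ComputedOnce()
lazy = ComputedOnAccess()

# Simulate a parameter update (e.g. an optimizer step) on both modules.
for module in (once, lazy):
    module.g.values[0] = 2.0

stale_weight = once.weight  # still reflects g == 1.0
fresh_weight = lazy.weight  # reflects g == 2.0
```

In graph mode, the assignment in `ComputedOnce.__init__` would store a symbolic expression instead of numbers, so the same code would be correct there.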

Note that this is different for code in __call__. That code should work regardless of whether it is executed in graph mode or eager mode, and any control flow logic is already wrapped.

However, in __init__, there is an important difference. In both modes, __init__ is executed only once. With symbolic computation, representing some value based on a parameter (for example a weight-normalized parameter) this way is totally fine and the right thing to do. With eager execution, however, executing it only once is not helpful. E.g. in PyTorch, weight normalization uses _forward_pre_hooks to calculate it again before every forward call.
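The pre-forward hook idea can be sketched roughly as follows. This is a pure-Python illustration, not the PyTorch API: `Module`, `pre_hooks`, `Dot` and `recompute_weight` are hypothetical names, and the only point is that the derived weight is recomputed before every call instead of once in __init__.

```python
import math


def recompute_weight(module):
    """Hook: recompute weight = g * v / ||v|| from the current parameters."""
    norm = math.sqrt(sum(x * x for x in module.v))
    module.weight = [module.g * x / norm for x in module.v]


class Module:
    """Hypothetical minimal base class with pre-forward hooks."""

    def __init__(self):
        self.pre_hooks = []

    def __call__(self, x):
        # Run every hook before the actual forward computation.
        for hook in self.pre_hooks:
            hook(self)
        return self.forward(x)


class Dot(Module):
    """Dot product of the input with a weight-normalized vector."""

    def __init__(self):
        super().__init__()
        self.v = [3.0, 4.0]
        self.g = 1.0
        self.weight = None  # filled in by the hook before each forward call
        self.pre_hooks.append(recompute_weight)

    def forward(self, x):
        return sum(w * xi for w, xi in zip(self.weight, x))


layer = Dot()
out_before = layer([1.0, 1.0])  # uses g == 1.0

layer.g = 2.0  # simulate a parameter update
out_after = layer([1.0, 1.0])  # the hook picks up g == 2.0
```

The cost of this design is that the re-parameterization has to be registered explicitly as a hook, which is exactly the kind of eager-specific logic that plain symbolic code in __init__ does not need.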

So far we only defined parameters in __init__, and maybe their initial values (nn.init.ParamInit) or maybe things like weight decay. This is fine for both eager and symbolic mode.

However, for any computation depending on a parameter which can potentially change, we need to think about this. It's not clear yet how to solve this. This becomes relevant for example for weight norm (#91).
