Different ways to import parameters
There are multiple ways in which parameters can be imported from another model or checkpoint.
The most flexible option is a custom external script that operates on the checkpoint files directly. See the scripts tf_avg_checkpoints.py or tf_inspect_checkpoint.py as examples. It should be straightforward to write your own custom logic.
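As a rough illustration of such a script (this is not one of the scripts above; paths, variable names and the renaming rule are placeholders), one could read all tensors from an existing TensorFlow checkpoint, transform or rename them, and write a new checkpoint that can then be loaded:

import tensorflow as tf

# Read all tensors from an existing checkpoint (placeholder path).
reader = tf.train.load_checkpoint("old-model/model.checkpoint")
var_shapes = reader.get_variable_to_shape_map()

with tf.Graph().as_default() as graph, tf.compat.v1.Session(graph=graph) as session:
    new_vars = []
    for name in sorted(var_shapes):
        value = reader.get_tensor(name)
        # Example transformation: rename params to match the target model naming
        # (the renaming rule here is just a placeholder).
        new_name = name.replace("old_scope", "new_scope")
        new_vars.append(tf.Variable(initial_value=value, name=new_name))
    session.run(tf.compat.v1.global_variables_initializer())
    # Write out the new checkpoint.
    tf.compat.v1.train.Saver(var_list=new_vars).save(session, "new-model/model.checkpoint")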
The parameters must match exactly. In a new training (first epoch), instead of random initialization, it would load the given model checkpoint.
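As a hedged sketch, assuming this refers to the import_model_train_epoch1 config option (the paths below are placeholders), a training config could contain:

task = "train"
model = "net-model/network"  # where this training stores its own checkpoints
import_model_train_epoch1 = "/path/to/pretrained-model/checkpoint"  # used instead of random init in epoch 1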
In training (task="train"
), if no existing model is found (model specified by model
config option), it would use load
.
In non-training (task!="train"
, e.g. search, forwarding, eval etc), it would use load
.
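For example, a minimal config sketch for a non-training task (placeholder path):

task = "search"
load = "/path/to/other-model/checkpoint"  # checkpoint to load for the search task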
The param init uses get_initializer and can in principle use any initializer (e.g. load_txt_file_initializer), or even custom initializing code, which could import other parameters from a checkpoint or elsewhere.
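For example, a hedged sketch of a layer using such an initializer string; whether load_txt_file_initializer can be referenced like this in the init string, and its exact keyword arguments, are assumptions here:

"hidden": {
    "class": "linear", "activation": "tanh", "n_out": 512,
    "forward_weights_init": "load_txt_file_initializer(filename='/path/to/weights.txt')",
},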
The reuse_params layer option is intended to reuse params from other layers, i.e. to share params. But it can also be used to override the variable creation (get_variable) with a custom function, which can again do arbitrary things, like setting a custom initializer, or defining a variable as a fixed constant, or whatever.
Example:
"layer": {
...,
"reuse_params": {"map": {"W": {"custom": my_custom_variable_creater}}}
}
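A hedged sketch of what such a custom function could look like; the exact keyword arguments that RETURNN passes to the callback (e.g. name, shape, dtype, getter) are assumptions here, so they are all taken from **kwargs:

import numpy
import tensorflow as tf

def my_custom_variable_creator(**kwargs):
    # Hedged sketch of a reuse_params "custom" callback; the exact kwargs passed
    # by RETURNN are an assumption (see ReuseParams in returnn.tf.layers.base).
    shape = kwargs["shape"]
    dtype = kwargs.get("dtype", tf.float32)
    # E.g. define the parameter as a fixed constant; the values could instead be
    # loaded from a checkpoint or any other source.
    return tf.constant(numpy.zeros(shape), dtype=dtype)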
Via returnn-common, layers can define their own custom name scope, thus allowing them to match some other model checkpoint format. (This is currently not implemented but planned.)