When starting a fresh training, the network construction already runs twice because the two constructed networks apparently differ.
The diff is:
dict diff:
['encoder'] dict diff:
['encoder'] ['subnetwork'] dict diff:
['encoder'] ['subnetwork'] ['layers'] dict diff:
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] dict diff:
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] ['0'] dict diff:
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] ['0'] ['subnetwork'] dict diff:
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] ['0'] ['subnetwork'] ['self_att'] dict diff:
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] ['0'] ['subnetwork'] ['self_att'] ['subnetwork'] dict diff:
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] ['0'] ['subnetwork'] ['self_att'] ['subnetwork'] ['RelPosSelfAttention._rel_shift'] dict diff:
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] ['0'] ['subnetwork'] ['self_att'] ['subnetwork'] ['RelPosSelfAttention._rel_shift'] ['subnetwork'] dict diff:
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] ['0'] ['subnetwork'] ['self_att'] ['subnetwork'] ['RelPosSelfAttention._rel_shift'] ['subnetwork'] ['reshape'] dict diff:
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] ['0'] ['subnetwork'] ['self_att'] ['subnetwork'] ['RelPosSelfAttention._rel_shift'] ['subnetwork'] ['reshape'] ['extra_deps'] list diff len: len self: 1, len other: 2
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] ['0'] ['subnetwork'] ['self_att'] ['subnetwork'] ['RelPosSelfAttention._rel_shift'] ['subnetwork'] ['reshape_0'] dict diff:
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] ['0'] ['subnetwork'] ['self_att'] ['subnetwork'] ['RelPosSelfAttention._rel_shift'] ['subnetwork'] ['reshape_0'] ['extra_deps'] list diff len: len self: 1, len other: 2
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] ['0'] ['subnetwork'] ['self_att'] ['subnetwork'] ['linear_pos'] dict diff:
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] ['0'] ['subnetwork'] ['self_att'] ['subnetwork'] ['linear_pos'] ['subnetwork'] dict diff:
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] ['0'] ['subnetwork'] ['self_att'] ['subnetwork'] ['linear_pos'] ['subnetwork'] ['dot'] dict diff:
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] ['0'] ['subnetwork'] ['self_att'] ['subnetwork'] ['linear_pos'] ['subnetwork'] ['dot'] ['from'] list diff:
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] ['0'] ['subnetwork'] ['self_att'] ['subnetwork'] ['linear_pos'] ['subnetwork'] ['dot'] ['from'] [0] self: 'base:relative_positional_encoding' != other: 'base:relative_positional_encoding/sin'
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] ['0'] ['subnetwork'] ['self_att'] ['subnetwork'] ['relative_positional_encoding'] dict diff:
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] ['0'] ['subnetwork'] ['self_att'] ['subnetwork'] ['relative_positional_encoding'] ['subnetwork'] dict diff:
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] ['0'] ['subnetwork'] ['self_att'] ['subnetwork'] ['relative_positional_encoding'] ['subnetwork'] ['output'] dict diff:
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] ['0'] ['subnetwork'] ['self_att'] ['subnetwork'] ['relative_positional_encoding'] ['subnetwork'] ['output'] ['from'] self: 'sin' != other: 'concat'
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] ['0'] ['subnetwork'] ['self_att'] ['subnetwork'] ['relative_positional_encoding'] ['subnetwork'] ['output'] ['out_shape'] set diff:
['encoder'] ['subnetwork'] ['layers'] ['subnetwork'] ['0'] ['subnetwork'] ['self_att'] ['subnetwork'] ['relative_positional_encoding'] ['subnetwork'] ['output'] ['out_shape'] Dim{F'conformer-enc-default-out-dim'(512)} not in other
The network code can be found under:
https://github.com/rwth-i6/i6_experiments/blob/main/users/rossenbach/experiments/librispeech/librispeech_100_attention/rc_conformer_2023/rc_networks/conformer_aed_trial.py
The network is constructed via:
def get_network(epoch, **kwargs):
    nn.reset_default_root_name_ctx()
    net = construct_network(epoch=epoch, **network_kwargs)
    return nn.get_returnn_config().get_net_dict_raw_dict(net)
But within construct_network, epoch is not used:
def construct_network(
    epoch: int,
    audio_features: nn.Data,
    bpe_labels: nn.Data,
    **kwargs
):
    net = ConformerAEDModel(
        bpe_size=bpe_labels.sparse_dim,
        audio_feature_dim=audio_features.dim_tags[audio_features.feature_dim_axis],
        **kwargs
    )
    out = net(
        audio_features=nn.get_extern_data(audio_features),
        audio_time=audio_features.dim_tags[audio_features.time_dim_axis],
        bpe_labels=nn.get_extern_data(bpe_labels),
        bpe_time=bpe_labels.dim_tags[bpe_labels.time_dim_axis]
    )
    out.mark_as_default_output()
    return net
The full log can be found under:
https://gist.github.com/JackTemaki/bc24ac9d5ced81c823a0b94fa0871720
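Since epoch is not used, two calls with the same arguments should give the same net dict. A minimal sketch of how that could be checked outside of training is below. It is plain Python and does not use any returnn_common API. `diff_net_dicts`, `check_deterministic` and `make_leaky_get_network` are hypothetical names introduced for illustration only, and the "leaky" builder is just a stand-in that mimics state surviving between two constructions. It is not a claim about the actual cause here.

```python
from typing import Any, Callable, Dict, List


def diff_net_dicts(a: Any, b: Any, path: str = "") -> List[str]:
    """Recursively compare two nested structures, returning readable differences."""
    if isinstance(a, dict) and isinstance(b, dict):
        diffs: List[str] = []
        for key in sorted(set(a) | set(b), key=str):
            sub_path = f"{path}[{key!r}]"
            if key not in a:
                diffs.append(f"{sub_path} missing in first")
            elif key not in b:
                diffs.append(f"{sub_path} missing in second")
            else:
                diffs.extend(diff_net_dicts(a[key], b[key], sub_path))
        return diffs
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        if len(a) != len(b):
            return [f"{path} length differs: {len(a)} != {len(b)}"]
        diffs = []
        for i, (x, y) in enumerate(zip(a, b)):
            diffs.extend(diff_net_dicts(x, y, f"{path}[{i}]"))
        return diffs
    if a != b:
        return [f"{path} {a!r} != {b!r}"]
    return []


def check_deterministic(
    get_network: Callable[..., Dict[str, Any]], epoch: int = 1
) -> List[str]:
    """Build the net dict twice for the same epoch and return any differences."""
    first = get_network(epoch=epoch)
    second = get_network(epoch=epoch)
    return diff_net_dicts(first, second)


def make_leaky_get_network() -> Callable[..., Dict[str, Any]]:
    """Hypothetical stand-in: state kept between calls changes the second result."""
    calls = {"n": 0}

    def get_network(epoch: int, **kwargs) -> Dict[str, Any]:
        calls["n"] += 1
        source = "sin" if calls["n"] == 1 else "concat"
        return {"output": {"class": "copy", "from": source}}

    return get_network


def stable_get_network(epoch: int, **kwargs) -> Dict[str, Any]:
    """Hypothetical stand-in: same result on every call."""
    return {"output": {"class": "copy", "from": "concat"}}


if __name__ == "__main__":
    print(check_deterministic(make_leaky_get_network()))
    print(check_deterministic(stable_get_network))
```

Passing the real get_network from the config instead of the stand-ins would show whether the second construction differs even without RETURNN's own comparison in the loop.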