Runtime checks for declare_same_as #1200

@albertz

A bit more meta: With all our logic for dim tags, which should make it easier to avoid reshape problems and other shape problems, why do we still frequently run into them? The answer is probably too much unnecessary complexity, and thus bugs, in some parts. But which parts exactly? What can we remove? How can we improve this situation?

Related: rwth-i6/returnn#975

To answer this question: The problem here was that we actually did manual dim math. In LearnedRelativePositionalEncoding, we had:

    out_spatial_dim = spatial_dim - 1 + spatial_dim

...
      remaining_dim = spatial_dim - self.clipping
...
      cond.true, out_spatial_dim_ = nn.concat(
        (left, remaining_dim),
        (self.pos_emb, self.clipped_spatial_dim),
        (right, remaining_dim))
      out_spatial_dim_.declare_same_as(out_spatial_dim)

And:

    self.clipped_spatial_dim = nn.SpatialDim(
      f"{nn.NameCtx.current_ctx().get_abs_name()}:learned-rel-pos",
      dimension=2 * clipping + 1)

I.e.:

out_spatial_dim_
  == spatial_dim - clipping + 2 * clipping + 1 + spatial_dim - clipping
  == 2 * spatial_dim + 1
  != 2 * spatial_dim - 1
  == out_spatial_dim

But declare_same_as simply overwrote this mismatch without any complaint.
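
The arithmetic above can be double-checked with plain integers. This is only a sketch: `concat_len` and `declared_len` are hypothetical helper names, not part of RETURNN or returnn_common, and the sample values are chosen arbitrarily with `spatial_len > clipping`.

```python
# Plain-integer stand-in for the dim math above; no RETURNN types involved.
# concat_len and declared_len are hypothetical helpers for illustration only.


def concat_len(spatial_len: int, clipping: int) -> int:
    """Length of concat(left, pos_emb, right) along the concatenated axis."""
    remaining = spatial_len - clipping  # remaining_dim
    clipped = 2 * clipping + 1  # clipped_spatial_dim
    return remaining + clipped + remaining


def declared_len(spatial_len: int) -> int:
    """Length of out_spatial_dim = spatial_dim - 1 + spatial_dim."""
    return spatial_len - 1 + spatial_len


for spatial_len, clipping in [(10, 3), (25, 8), (100, 16)]:
    got = concat_len(spatial_len, clipping)
    want = declared_len(spatial_len)
    # The clipping terms cancel, so the two sizes always differ by 2.
    print(spatial_len, clipping, got, want, got - want)
```

The difference does not depend on `clipping`, which matches the derivation: the concatenated size is `2 * spatial_dim + 1`, while the declared size is `2 * spatial_dim - 1`.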

Maybe in this case, we could have detected this statically. But in general, there will always be cases which we cannot detect at compilation time, only at runtime.

At some point, we planned to actually add such runtime checks for declare_same_as. Maybe this is a reminder. Such checks would have made this problem much easier to detect.

Originally posted by @albertz in rwth-i6/returnn_common#238 (comment)
