RF native failed to compile / load, inconsistent behavior to pure Python, dividing a tensor of type int #1749

@albertz

Description

PyExtModCompiler call: g++ -shared -O2 -std=c++11 -fno-strict-overflow -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -I /rwthfs/rz/cluster/home/az668407/setups/combined/2021-05-31/tools/returnn/returnn/frontend/_native -I /usr/include/python3.12 -fPIC -v -D_GLIBCXX_USE_CXX11_ABI=0 -g /w0/tmp/slurm_az668407.60282320/az668407/returnn_py_ext_mod_cache/_returnn_frontend_native/b20035631a/_returnn_frontend_native.cc -o /w0/tmp/slurm_az668407.60282320/az668407/returnn_py_ext_mod_cache/_returnn_frontend_native/b20035631a/_returnn_frontend_native.so
RETURNN frontend _native backend: Error while getting module:
/lib64/libstdc++.so.6: version `GLIBCXX_3.4.30' not found (required by /w0/tmp/slurm_az668407.60282320/az668407/returnn_py_ext_mod_cache/_returnn_frontend_native/b20035631a/_returnn_frontend_native.so)
This is optional (although very recommended), so we continue without it.

So the compilation (or just the load of the native module) fails with "/lib64/libstdc++.so.6: version `GLIBCXX_3.4.30' not found".
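A standard way to diagnose this (not from the issue itself) is to list which GLIBCXX symbol versions the system libstdc++ actually provides; the load fails because GLIBCXX_3.4.30 is not in that list. The paths below are common defaults and may differ per system:

```shell
# Locate the system libstdc++ (the /lib64 path matches the error message;
# other paths are common fallbacks), then list the GLIBCXX symbol versions
# embedded in it.  grep -a treats the shared object as text.
lib=$(find /lib64 /usr/lib64 /usr/lib -name 'libstdc++.so.6' 2>/dev/null | head -n 1)
grep -ao 'GLIBCXX_3\.4\.[0-9]*' "$lib" | sort -t. -k3 -n -u | tail -n 3
```

If the highest version printed is below 3.4.30, the g++ used for compilation is newer than the libstdc++ the dynamic loader picks up at import time, which is a typical cluster-environment mismatch.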

That then causes the following error:

...
  File "/rwthfs/rz/cluster/home/az668407/setups/combined/2021-05-31/tools/returnn/returnn/torch/frontend/_backend.py", line 1489, in TorchBackend.reduce
    line: correction_factor = rf.masked_fraction_of_shape(axis, inverse=True)
    locals:
      correction_factor = <local> None
      axis = <local> [Dim{B}, Dim{'⌈((-199+time)+-200)/160⌉'[B]}]
  File "/rwthfs/rz/cluster/home/az668407/setups/combined/2021-05-31/tools/returnn/returnn/frontend/dims.py", line 283, in masked_fraction_of_shape
    line: return (num_elems_masked / num_elems_total) if not inverse else (num_elems_total / num_elems_masked)
    locals:
      num_elems_masked = <local> Tensor{'reduce_sum', [], dtype='int64'}
      num_elems_total = <local> Tensor{'mul', [], dtype='int32'}
      inverse = <local> True
  File "/rwthfs/rz/cluster/home/az668407/setups/combined/2021-05-31/tools/returnn/returnn/tensor/_tensor_op_overloads.py", line 84, in _TensorOpOverloadsMixin.__truediv__
    line: return _rf().combine(self, "/", other)
    locals:
      self = <local> Tensor{'mul', [], dtype='int32'}
      other = <local> Tensor{'reduce_sum', [], dtype='int64'}
  File "/rwthfs/rz/cluster/home/az668407/setups/combined/2021-05-31/tools/returnn/returnn/frontend/math_.py", line 211, in combine
    line: raise ValueError(
              "Dividing a Tensor of type int by an integer is disallowed. Please convert the Tensor to float."
          )
ValueError: Dividing a Tensor of type int by an integer is disallowed. Please convert the Tensor to float.

...
Module call stack:
(Model.__call__) (root)
(BatchNorm.__call__) feature_batch_norm
(BatchNorm.__call__.<locals>.<lambda>) feature_batch_norm

This particular symptom / error was also described in #1637 (comment). The underlying issue is that the optimized native RF code behaves differently from the pure Python RF code: the native code allows such int/int tensor division, while the pure Python fallback rejects it.
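The pure-Python guard that produces the error above can be illustrated with a minimal sketch. The `Tensor` class here is a simplified stand-in, not the actual RETURNN implementation; only the dtype check is modeled:

```python
class Tensor:
    """Simplified stand-in for a RETURNN-style tensor (dtype handling only)."""

    def __init__(self, dtype: str):
        self.dtype = dtype

    def __truediv__(self, other: "Tensor") -> "Tensor":
        # The pure-Python RF path rejects true division of int tensors,
        # as in the ValueError from returnn/frontend/math_.py above.
        if self.dtype.startswith("int") and other.dtype.startswith("int"):
            raise ValueError(
                "Dividing a Tensor of type int by an integer is disallowed. "
                "Please convert the Tensor to float."
            )
        return Tensor("float32")


a = Tensor("int32")   # like num_elems_total in the traceback
b = Tensor("int64")   # like num_elems_masked in the traceback
try:
    a / b
except ValueError as exc:
    print("raised:", exc)

# After an explicit cast to float, the division goes through:
print((Tensor("float32") / b).dtype)  # -> float32
```

The native code path skips this check, so whether a model runs or crashes here depends on whether the native module compiled, which is exactly the inconsistency this issue is about.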
