Conversation

@Semyon1104
Collaborator

No description provided.

}
Shape new_shape(new_dims);

std::vector<float> output_data(input_data->size());
Member

Please check for other data types.

}

if (perm_.empty()) {
  perm_.resize(shape.dims());
Member

Why do we enumerate shapes starting from the end?

Collaborator Author

I wanted the default to be reverse permutation, but it's probably unnecessary.

@allnes
Member

allnes commented Aug 14, 2025

Currently, if the layer is constructed with an empty perm_, run() fills perm_ once using the first input's rank. That means a later call on the same instance with a tensor of a different rank will throw in validate_perm. Prefer computing a local permutation per call when perm_ is empty, e.g.:

void TransposeLayer::run(const Tensor& input, Tensor& output) {
  const auto& shape = input.get_shape();

  std::vector<int64_t> perm = perm_;           // copy the member; never mutate it
  if (perm.empty()) {
    perm.resize(shape.dims());
    std::iota(perm.rbegin(), perm.rend(), 0);  // reverse permutation; needs <numeric>
  }

  validate_perm(shape /*, perm if you refactor validate_perm to take it */);
  // ... use 'perm' below instead of 'perm_'
}

This keeps the layer usable across inputs of different ranks without surprising statefulness.

@allnes
Member

allnes commented Aug 14, 2025

The current mapping recomputes old_indices with repeated % and / operations per element. That's fine for clarity and likely OK at your tensor sizes, but if this shows up in a profile, consider (a sketch follows below):
• Precomputing input/output strides and computing new_index as a dot product with the mapped coordinates; and/or
• Special-casing 2D transpose to a cache-friendlier kernel.
Not blockers.
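
A minimal sketch of the stride-based variant. The function name, the free-function signature, and the perm convention (output axis i takes input axis perm[i]) are illustrative assumptions, not the PR's actual code:

#include <cstddef>
#include <cstdint>
#include <vector>

// Transpose 'in' (shape 'dims', row-major) into 'out' using precomputed
// strides and an incrementing coordinate counter instead of % and / per
// element. 'out' is assumed to be pre-sized to in.size().
void transpose_with_strides(const std::vector<float>& in,
                            std::vector<float>& out,
                            const std::vector<int64_t>& dims,
                            const std::vector<int64_t>& perm) {
  const std::size_t rank = dims.size();

  // Row-major strides of the input shape.
  std::vector<int64_t> in_strides(rank, 1);
  for (std::size_t i = rank; i-- > 1;)
    in_strides[i - 1] = in_strides[i] * dims[i];

  // Output dims: output axis i takes input axis perm[i].
  std::vector<int64_t> out_dims(rank);
  for (std::size_t i = 0; i < rank; ++i) out_dims[i] = dims[perm[i]];

  // Walk output elements in order; 'coord' is the output coordinate.
  std::vector<int64_t> coord(rank, 0);
  for (std::size_t flat = 0; flat < in.size(); ++flat) {
    int64_t src = 0;
    for (std::size_t i = 0; i < rank; ++i)
      src += coord[i] * in_strides[perm[i]];
    out[flat] = in[src];
    // Increment the multi-dimensional counter, last axis fastest.
    for (std::size_t i = rank; i-- > 0;) {
      if (++coord[i] < out_dims[i]) break;
      coord[i] = 0;
    }
  }
}

The output index never needs a division: it is just the loop counter, and the input index is a dot product of the output coordinate with the permuted input strides.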

@allnes
Member

allnes commented Aug 14, 2025

run assumes float (input.as<float>(), std::vector<float>). If the framework expects multi-dtype tensors, you'll want either a templated kernel or a small type dispatch in run (sketched below). Tests currently cover only float. Not necessarily required for this PR, but worth tracking.
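
A hedged sketch of the dispatch idea. Type, get_type(), and the Type::k* tags are assumed framework names; the kernel is the existing float loop made templated:

// Templated kernel: the current float path with float replaced by T.
template <typename T>
void transpose_kernel(const Tensor& input, Tensor& output,
                      const std::vector<int64_t>& perm);

void TransposeLayer::run(const Tensor& input, Tensor& output) {
  switch (input.get_type()) {  // hypothetical dtype tag on Tensor
    case Type::kFloat:
      transpose_kernel<float>(input, output, perm_);
      break;
    case Type::kInt32:
      transpose_kernel<int32_t>(input, output, perm_);
      break;
    default:
      throw std::invalid_argument("TransposeLayer: unsupported dtype");  // needs <stdexcept>
  }
}

This keeps run small, and new dtypes only extend the switch (or a visit over a variant, if the framework already has one).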

@allnes
Member

allnes commented Aug 14, 2025

Reserve capacity for new_dims (new_dims.reserve(shape.dims())).
The out-of-bounds check before writing is defensive; it should be impossible to hit after correct shape math. Keep it or replace it with an assertion instead; a sketch of both follows below.
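
A sketch of both suggestions together; new_dims, shape, perm_, output_data, and input_data come from the hunk above, while Shape::operator[] and the index names are illustrative assumptions:

std::vector<size_t> new_dims;
new_dims.reserve(shape.dims());  // avoid reallocations while building the shape
for (size_t i = 0; i < shape.dims(); ++i) {
  new_dims.push_back(shape[perm_[i]]);  // assumed Shape::operator[]
}

// ... inside the copy loop, instead of a runtime bounds branch:
assert(new_index < output_data.size());  // invariant check; needs <cassert>
output_data[new_index] = (*input_data)[old_index];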

@allnes
Member

allnes commented Aug 14, 2025

Tests to add (nice-to-have; a sketch follows below):
• 1D default permutation (rank-1 with an empty perm should be a no-op).
• Multiple runs with the same instance and different ranks (catches the mutation issue above).
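
A hedged sketch of those two tests, assuming GoogleTest and a hypothetical make_tensor helper; adjust to the project's actual fixtures and constructors:

TEST(TransposeLayer, Rank1DefaultPermIsNoOp) {
  Tensor input = make_tensor<float>({4}, {1.f, 2.f, 3.f, 4.f});  // hypothetical helper
  Tensor output;
  TransposeLayer layer;  // empty perm_
  layer.run(input, output);
  EXPECT_EQ(output.get_shape(), input.get_shape());  // assumes Shape has operator==
}

TEST(TransposeLayer, ReusableAcrossDifferentRanks) {
  TransposeLayer layer;  // empty perm_
  Tensor out2d, out3d;
  layer.run(make_tensor<float>({2, 3}, std::vector<float>(6, 0.f)), out2d);
  // Must not throw: perm should be derived per call, not cached from the 2D run.
  EXPECT_NO_THROW(
      layer.run(make_tensor<float>({2, 3, 4}, std::vector<float>(24, 0.f)), out3d));
}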

@aobolensk aobolensk merged commit acba1bb into main Aug 15, 2025
19 checks passed