Mousius commented Jul 23, 2025

  • Adds bgemv T, based on the sbgemv T kernel
  • Adds bgemv N, slightly altered so that it does not use Y as an accumulator: the output is bf16, and accumulating in it would lose precision
  • Enables BGEMM_GEMV_FORWARD to proxy BGEMM to BGEMV with the new kernels
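
The precision concern behind the bgemv N change can be illustrated with a small sketch. This is not the kernel from this PR: the function names (`to_bf16`, `gemv_n_accumulate_in_y`, `gemv_n_wide_accumulator`) are hypothetical, bf16 is emulated by rounding the float32 encoding to its top 16 bits, and a Python float stands in for whatever wider accumulator a real kernel would use. It only shows why storing every partial sum back into a bf16 Y can stall, while accumulating in a wider type and rounding once does not.

```python
import struct


def to_bf16(x: float) -> float:
    """Round a finite value to the nearest bfloat16 (ties to even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # bf16 keeps the top 16 bits of the float32 encoding; round the rest away.
    bias = 0x7FFF + ((bits >> 16) & 1)
    bits = ((bits + bias) >> 16) << 16
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def gemv_n_accumulate_in_y(a, x, m, n):
    """y = A*x, column by column, rounding each partial sum into bf16 y."""
    y = [0.0] * m
    for j in range(n):
        for i in range(m):
            y[i] = to_bf16(y[i] + a[i][j] * x[j])
    return y


def gemv_n_wide_accumulator(a, x, m, n):
    """y = A*x with a wider accumulator (assumed), rounded to bf16 once."""
    acc = [0.0] * m
    for j in range(n):
        for i in range(m):
            acc[i] += a[i][j] * x[j]
    return [to_bf16(v) for v in acc]


# A 1 x 512 matrix of ones times a vector of ones; the exact result is 512.
m, n = 1, 512
a = [[1.0] * n for _ in range(m)]
x = [1.0] * n
narrow = gemv_n_accumulate_in_y(a, x, m, n)
wide = gemv_n_wide_accumulator(a, x, m, n)
print(narrow, wide)  # [256.0] [512.0]
```

With only 8 significand bits, bf16 cannot represent 257, so once the running sum in Y reaches 256 every further `+ 1.0` rounds back to 256. The wide-accumulator variant reaches 512, which bf16 represents exactly.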

Mousius changed the title from "Optimized BGEMV for NEOVERSEN2, NEOVERSEV1 and NEOVERSEV2 targets" to "Optimized BGEMV for NEOVERSEV1 target" Jul 23, 2025
martin-frbg added this to the 0.3.31 milestone Jul 23, 2025
martin-frbg merged commit 392d381 into OpenMathLib:develop Jul 23, 2025
81 of 88 checks passed