Use WASM f32x4 relaxed min/max for relaxed simd build (microsoft#24324)
### Description
Use wasm_f32x4_relaxed_max and wasm_f32x4_relaxed_min in WASM relaxed
SIMD build.
### Motivation and Context
This PR replaces wasm_f32x4_min/max with the relaxed SIMD counterparts
wasm_f32x4_relaxed_min/max in WASM relaxed SIMD build.
According to the [relaxed SIMD
proposal](https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md#relaxed-min-and-max),
wasm_f32x4_relaxed_min/max allow implementation-defined behavior for
NaN propagation and for -0.0 vs +0.0. This lets WASM runtimes use
minps/maxps on x64 platforms, which improves performance.
For example, for wasm_f32x4_max -> wasm_f32x4_relaxed_max:
- wasm_f32x4_max: [implementation in
  V8](https://source.chromium.org/chromium/chromium/src/+/main:v8/src/codegen/shared-ia32-x64/macro-assembler-shared-ia32-x64.cc;l=231)
- wasm_f32x4_relaxed_max: maxps
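
To illustrate where the two variants can differ, here is a minimal scalar sketch of a single lane. It is not code from this PR: `strict_max_lane`, `maxps_lane`, and `check_ordinary_inputs_agree` are hypothetical helpers. `maxps_lane` assumes the documented x64 maxps rule (the second operand is returned when either input is NaN or both are zeros), which is one result a runtime may choose for the relaxed op; other runtimes or platforms may behave differently.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical one-lane model of wasm_f32x4_max:
// NaN always propagates, and +0.0 is treated as greater than -0.0.
float strict_max_lane(float a, float b) {
    if (std::isnan(a) || std::isnan(b)) {
        return std::numeric_limits<float>::quiet_NaN();
    }
    if (a == 0.0f && b == 0.0f) {
        return std::signbit(a) ? b : a;  // prefer +0.0 when the signs differ
    }
    return a > b ? a : b;
}

// Hypothetical one-lane model of x64 maxps (assumed semantics):
// when the comparison is false (NaN input or equal zeros), the second
// operand is returned unchanged.
float maxps_lane(float a, float b) {
    return a > b ? a : b;
}

// For ordinary inputs (no NaN, no signed-zero pair) both models agree,
// which is why the relaxed variant is safe for kernels that do not
// depend on those edge cases.
void check_ordinary_inputs_agree() {
    const float samples[] = {-3.5f, -1.0f, 0.5f, 2.0f, 100.0f};
    for (float a : samples) {
        for (float b : samples) {
            assert(strict_max_lane(a, b) == maxps_lane(a, b));
        }
    }
}
```

In this model, `maxps_lane(NaN, 1.0f)` yields `1.0f` while `strict_max_lane(NaN, 1.0f)` yields NaN, and `maxps_lane(0.0f, -0.0f)` yields `-0.0f` while the strict version yields `+0.0f`. The extra instructions in the non-relaxed V8 lowering exist to cover exactly these cases.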
This change affects kernel functions that rely on MlasMaximumFloat32x4
and MlasMinimumFloat32x4, including various activations and reduced
min/max kernels. In the mlas micro bench "COMPUTESOFTMAXINPLACE...", this
change provides a performance improvement of up to 60% on x64 devices.