Commit 04e0b50

Use WASM f32x4 relaxed min/max for relaxed simd build (microsoft#24324)
### Description

Use wasm_f32x4_relaxed_max and wasm_f32x4_relaxed_min in the WASM relaxed SIMD build.

### Motivation and Context

This PR replaces wasm_f32x4_min/max with their relaxed SIMD counterparts, wasm_f32x4_relaxed_min/max, in the WASM relaxed SIMD build.

According to the [relaxed SIMD proposal](https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md#relaxed-min-and-max), wasm_f32x4_relaxed_min/max allow implementation-defined behavior for NaN propagation and for -0.0 vs. +0.0. This lets WASM runtimes use minps/maxps on x64 platforms, which improves performance.

For example, for wasm_f32x4_max -> wasm_f32x4_relaxed_max:

- wasm_f32x4_max: [implementation in V8](https://source.chromium.org/chromium/chromium/src/+/main:v8/src/codegen/shared-ia32-x64/macro-assembler-shared-ia32-x64.cc;l=231)
- wasm_f32x4_relaxed_max: maxps

This change affects kernel functions that rely on MlasMaximumFloat32x4 and MlasMinimumFloat32x4, including various activations and reduced min/max kernels. In the mlas micro benchmark "COMPUTESOFTMAXINPLACE...", this change provides a performance improvement of up to 60% on x64 devices.
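To make the semantic difference described above concrete, here is a minimal scalar sketch of one lane of each operation. `StrictMaxLane` and `MaxpsLane` are hypothetical names introduced only for illustration; they are not part of MLAS or of any WASM runtime. The second function models the x64 `maxps` behavior, which is one outcome a relaxed max is permitted to produce, not the only one.

```cpp
#include <cmath>
#include <limits>

// Hypothetical scalar model of one lane of wasm_f32x4_max:
// any NaN input yields NaN, and +0.0 is treated as greater than -0.0.
inline float StrictMaxLane(float a, float b) {
    if (std::isnan(a) || std::isnan(b)) {
        return std::numeric_limits<float>::quiet_NaN();
    }
    if (a == 0.0f && b == 0.0f) {
        // Both inputs are zeros: the result is -0.0 only if both are -0.0.
        return (std::signbit(a) && std::signbit(b)) ? -0.0f : 0.0f;
    }
    return a > b ? a : b;
}

// Hypothetical scalar model of one lane of x64 maxps(a, b):
// the second operand is returned when either input is NaN
// or when both inputs are zeros (of either sign).
inline float MaxpsLane(float a, float b) {
    return a > b ? a : b;
}
```

The two models agree on ordinary inputs and differ only for NaN and signed-zero inputs, which is why the strict operation needs extra fix-up instructions on x64 while the relaxed one can map to a single instruction.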
1 parent 18f91e5 commit 04e0b50

1 file changed (+7, -0)

onnxruntime/core/mlas/lib/mlasi.h

Lines changed: 7 additions & 0 deletions
@@ -1455,6 +1455,9 @@ MlasConvDepthwiseFloat_CHW(
 #endif
 #elif defined(MLAS_TARGET_WASM_SIMD)
 #define MLAS_WASM_SIMD_INTRINSICS
+#if defined(MLAS_TARGET_WASM_RELAXED_SIMD)
+#define MLAS_WASM_RELAXED_SIMD_INTRINSICS
+#endif
 #elif defined(MLAS_TARGET_LARCH64)
 #define MLAS_LSX_INTRINSICS
 #endif
@@ -2265,6 +2268,8 @@ MlasMaximumFloat32x4(MLAS_FLOAT32X4 Vector1, MLAS_FLOAT32X4 Vector2)
 #elif defined(MLAS_VSX_INTRINSICS)
     // Don't use vec_max to avoid undefined behavior if NAN
     return vec_sel(Vector2, Vector1, vec_cmpgt(Vector1, Vector2));
+#elif defined(MLAS_WASM_RELAXED_SIMD_INTRINSICS)
+    return wasm_f32x4_relaxed_max(Vector1, Vector2);
 #elif defined(MLAS_WASM_SIMD_INTRINSICS)
     return wasm_f32x4_max(Vector1, Vector2);
 #elif defined(MLAS_LSX_INTRINSICS)
@@ -2285,6 +2290,8 @@ MlasMinimumFloat32x4(MLAS_FLOAT32X4 Vector1, MLAS_FLOAT32X4 Vector2)
 #elif defined(MLAS_VSX_INTRINSICS)
     // Don't use vec_min to avoid undefined behavior if NAN
     return vec_sel(Vector2, Vector1, vec_cmpgt(Vector2, Vector1));
+#elif defined(MLAS_WASM_RELAXED_SIMD_INTRINSICS)
+    return wasm_f32x4_relaxed_min(Vector1, Vector2);
 #elif defined(MLAS_WASM_SIMD_INTRINSICS)
     return wasm_f32x4_min(Vector1, Vector2);
 #elif defined(MLAS_LSX_INTRINSICS)
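
The ordering of the branches in the hunks above matters: the relaxed check has to come before the plain SIMD check, because a relaxed SIMD build also defines the plain SIMD macro. A standalone sketch of the same dispatch pattern follows. `Max4` and `Min4` are hypothetical helpers, not MLAS functions, and the sketch assumes Clang's `__wasm_relaxed_simd__` and `__wasm_simd128__` feature macros rather than the MLAS_* macros, whose definition site is not shown in this diff. A scalar fallback is included so the snippet also compiles on non-WASM targets.

```cpp
#if defined(__wasm_simd128__)
#include <wasm_simd128.h>
#endif

// Hypothetical helper: lane-wise maximum of two 4-float arrays.
inline void Max4(const float* a, const float* b, float* out) {
#if defined(__wasm_relaxed_simd__)
    // Relaxed build: NaN and signed-zero results are implementation-defined.
    wasm_v128_store(out, wasm_f32x4_relaxed_max(wasm_v128_load(a), wasm_v128_load(b)));
#elif defined(__wasm_simd128__)
    // Plain SIMD build: strict NaN-propagating maximum.
    wasm_v128_store(out, wasm_f32x4_max(wasm_v128_load(a), wasm_v128_load(b)));
#else
    // Scalar fallback for non-WASM targets.
    for (int i = 0; i < 4; ++i) out[i] = a[i] > b[i] ? a[i] : b[i];
#endif
}

// Hypothetical helper: lane-wise minimum of two 4-float arrays.
inline void Min4(const float* a, const float* b, float* out) {
#if defined(__wasm_relaxed_simd__)
    wasm_v128_store(out, wasm_f32x4_relaxed_min(wasm_v128_load(a), wasm_v128_load(b)));
#elif defined(__wasm_simd128__)
    wasm_v128_store(out, wasm_f32x4_min(wasm_v128_load(a), wasm_v128_load(b)));
#else
    for (int i = 0; i < 4; ++i) out[i] = a[i] < b[i] ? a[i] : b[i];
#endif
}
```

For inputs without NaNs or mixed-sign zeros, all three branches should produce the same result, so callers such as softmax or activation kernels that never feed such values are unaffected by the switch.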
