[CUDA EP] Add hardswish op and add bf16 support for hardsigmoid by Stonesjtu · Pull Request #25562 · microsoft/onnxruntime

Stonesjtu · 2025-07-28T14:21:13Z

Description

Add HardSwish operator which is x*HardSigmoid(x)
Add bf16 support for HardSigmoid

Motivation and Context

HardSwish is currently implemented as HardSigmoid + Add in the CUDA EP.
A fused HardSwish should take half the time of HardSigmoid + Add.

Stonesjtu · 2025-07-28T14:44:14Z

@microsoft-github-policy-service agree

Stonesjtu · 2025-07-29T07:38:27Z

Can anyone help trigger the CI?
@jywu-msft can you review this PR or assign the responsible reviewers?

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Stonesjtu · 2025-07-31T09:58:37Z

@justinchuby The new tests regarding HardSwish pass locally. Can you trigger the CI again?

linking the fusion pass: microsoft/onnxscript#2472

Stonesjtu · 2025-08-04T02:18:41Z

The CI failed for OpenVINO, CoreML (arm64), and Android-NN-API, which should be unrelated to this PR. I disabled the HardSwish tests for non-CUDA EPs.

Stonesjtu · 2025-08-11T12:56:04Z

@justinchuby CI should pass, can you review this PR?

justinchuby · 2025-08-11T15:08:33Z

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

azure-pipelines · 2025-08-11T15:08:59Z

Azure Pipelines successfully started running 5 pipeline(s).

Stonesjtu · 2025-08-14T05:15:02Z

@justinchuby please trigger the CI

justinchuby · 2025-08-14T15:05:32Z

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

azure-pipelines · 2025-08-14T15:05:54Z

Azure Pipelines successfully started running 5 pipeline(s).

justinchuby · 2025-08-16T03:56:48Z

@Stonesjtu could you fix the documentation according to https://aiinfra.visualstudio.com/PublicPackages/_build/results?buildId=908666&view=logs&j=7f366e99-16b2-52cc-e1ff-653af284e397&t=834305f1-2220-521d-a5bb-dfba0f922108&l=5484 ?

Stonesjtu · 2025-08-16T10:31:40Z

@justinchuby Thanks. Doc is updated as shown in the Azure CI.

justinchuby · 2025-08-16T23:56:22Z

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

azure-pipelines · 2025-08-16T23:56:41Z

Azure Pipelines successfully started running 5 pipeline(s).

Copilot

Pull Request Overview

This PR adds support for the HardSwish operator and extends bf16 (BFloat16) support for HardSigmoid in the CUDA execution provider. The motivation is to provide a fused HardSwish implementation that should be twice as fast as the current approach of using HardSigmoid + Add.

- Adds HardSwish operator implementation with support for float, double, MLFloat16, and BFloat16 types
- Extends HardSigmoid operator to support BFloat16 data type
- Updates versioning for both operators to support opset 22 with the new BFloat16 support

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| onnxruntime/test/providers/cpu/activation/activation_op_test.cc | Adds unit tests for HardSwish operator |
| onnxruntime/core/providers/cuda/cuda_execution_provider.cc | Registers HardSwish and updated HardSigmoid kernels with proper versioning |
| onnxruntime/core/providers/cuda/activation/activations_impl.h | Adds HardSwish to activation operations list |
| onnxruntime/core/providers/cuda/activation/activations_impl.cu | Implements HardSwish CUDA kernel function |
| onnxruntime/core/providers/cuda/activation/activations.h | Declares HardSwish class template |
| onnxruntime/core/providers/cuda/activation/activations.cc | Defines HardSwish operator registration macros |
| docs/OperatorKernels.md | Updates documentation for HardSwish and HardSigmoid operator support |

justinchuby · 2025-08-21T03:30:47Z

@Stonesjtu could you merge from main? Sorry for the inconvenience, but we need the latest change to unblock the iPhone simulator pipeline.

justinchuby · 2025-08-21T14:26:45Z

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

azure-pipelines · 2025-08-21T14:27:08Z

Azure Pipelines successfully started running 5 pipeline(s).

…osoft#25562)

### Description

Add HardSwish operator which is x*HardSigmoid(x)
Add bf16 support for HardSigmoid

### Motivation and Context

HardSwish is implemented as HardSigmoid + Add in CUDA EP currently.
A fused HardSwish should take half the time of HardSigmoid + Add.

---------

Co-authored-by: kaiyu <kaiyu@bytedance.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

kaiyu added 4 commits July 28, 2025 09:35

Add HardSwish implementation for CUDA EP

dff6a63

Add operator tests

d015495

Refine ending versions

2e0e2c2

Add unit tests

b5670e7

Stonesjtu changed the title ~~Add hardswish op for CUDA EP~~ [CUDA EP] Add hardswish op and add bf16 support for harsigmoid Jul 30, 2025

justinchuby added the "core runtime" label (issues related to core runtime) Jul 30, 2025

justinchuby requested a review from Copilot July 30, 2025 15:41

Stonesjtu and others added 3 commits July 31, 2025 10:36

Update onnxruntime/test/providers/cpu/activation/activation_op_test.cc

3aa8189

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Apply suggestions from code review

1529e41

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Remove fp16 and bf16 tests due to inf/nan diff

d2aa269

Stonesjtu changed the title ~~[CUDA EP] Add hardswish op and add bf16 support for harsigmoid~~ [CUDA EP] Add hardswish op and add bf16 support for hardsigmoid Jul 31, 2025

kaiyu added 2 commits August 1, 2025 03:25

Make linter happy

fb4fb2d

Eanble HardSwish test only for CUDA Build

677d55b

justinchuby previously approved these changes Aug 11, 2025

Revert the unnecessary operator kernel doc

3968359

Stonesjtu dismissed justinchuby’s stale review via 3968359 August 12, 2025 02:47

Stonesjtu requested a review from justinchuby August 12, 2025 02:48

Update operator docs

55fd709

justinchuby reviewed Aug 17, 2025

onnxruntime/core/providers/cuda/activation/activations.cc (resolved)

justinchuby approved these changes Aug 18, 2025

justinchuby requested review from Copilot and tianleiwu August 18, 2025 15:37

Copilot AI reviewed Aug 18, 2025

onnxruntime/core/providers/cuda/activation/activations.cc (resolved)

Merge branch 'main' into cuda-hardswish

8d16632

justinchuby merged commit 21404e3 into microsoft:main Aug 21, 2025
86 checks passed

Stonesjtu deleted the cuda-hardswish branch October 21, 2025 06:19

3 participants