Conversation

@kimm240 (Contributor) commented Nov 27, 2025

Currently, the FuseReductionEpilogue primitive only supports Bias
(addition) and BiasReLU (addition + ReLU) epilogue patterns. However,
clipping operations (min(max(x, lower), upper)) are commonly used in
deep learning models and would benefit from the same fusion optimization.

This commit extends FuseReductionEpilogue to support Clipping patterns
by:

  1. Adding EpilogueType::Clipping to the enum to distinguish clipping
     patterns from other epilogue types.

  2. Adding clipping_lower_ and clipping_upper_ members to
     ReductionEpilogueFuser to store the clipping bounds extracted from
     the epilogue pattern.

  3. Extending AnalyzeEpiloguePattern to detect clipping patterns:
     - min(max(temp, lower), upper)
     - max(min(temp, upper), lower)
     - all commutative variants of min/max at each level

  4. Updating BiasReLU pattern matching to handle the max(0, x) form in
     addition to max(x, 0) for better commutativity support.

  5. Modifying CreateFusedReductionBlock to apply clipping to the init
     value: init = min(max(0, lower), upper)

  6. Updating BufferReplacer to apply clipping per iteration:
     value = min(max(value, lower), upper)

  7. Adding validation in BodyPatternAllowFusion to ensure temp appears
     exactly once in clipping patterns.

  8. Creating comprehensive test coverage with 8 test cases:
     - basic fusion test
     - numerical correctness verification
     - multiple epilogue blocks test
     - 5 commutative-variant tests
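The commutative matching in step 3 can be sketched in plain Python. This is an illustrative model only, not TVM's C++ implementation: the expression classes (`Var`, `Const`, `Min`, `Max`) and `match_clipping` are hypothetical stand-ins for the TIR nodes that `AnalyzeEpiloguePattern` actually inspects.

```python
# Hypothetical sketch of commutative min/max clipping detection.
# Var, Const, Min, Max, match_clipping are illustrative names, not TVM's API.
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Min:
    a: object
    b: object


@dataclass(frozen=True)
class Max:
    a: object
    b: object


def match_clipping(expr, temp):
    """Return (lower, upper) if expr clips `temp`, else None.

    Accepts min(max(temp, lower), upper) and max(min(temp, upper), lower),
    with the operands of each min/max in either order.
    """
    # Outer min: one operand is the upper bound, the other max(temp, lower).
    if isinstance(expr, Min):
        for inner, upper in ((expr.a, expr.b), (expr.b, expr.a)):
            if isinstance(inner, Max) and isinstance(upper, Const):
                for x, lower in ((inner.a, inner.b), (inner.b, inner.a)):
                    if x == temp and isinstance(lower, Const):
                        return (lower.value, upper.value)
    # Outer max: one operand is the lower bound, the other min(temp, upper).
    if isinstance(expr, Max):
        for inner, lower in ((expr.a, expr.b), (expr.b, expr.a)):
            if isinstance(inner, Min) and isinstance(lower, Const):
                for x, upper in ((inner.a, inner.b), (inner.b, inner.a)):
                    if x == temp and isinstance(upper, Const):
                        return (lower.value, upper.value)
    return None


t = Var("temp")
print(match_clipping(Min(Max(t, Const(0.0)), Const(6.0)), t))  # (0.0, 6.0)
print(match_clipping(Max(Const(0.0), Min(Const(6.0), t)), t))  # (0.0, 6.0)
```

Trying each (operand, bound) ordering at both the outer and inner node covers all four commutative variants of each form without enumerating them explicitly.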

This implementation follows the same per-iteration semantics as BiasReLU,
where clipping is applied at each reduction step rather than
post-reduction. This semantic change is documented in the docstring with
a warning about potential numerical differences.

The test suite verifies that all commutative forms of clipping patterns
are correctly recognized and that the fused implementation produces
numerically identical results to the per-iteration reference
implementation.

@gemini-code-assist (Contributor)
Summary of Changes

Hello @kimm240, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the FuseReductionEpilogue primitive by introducing support for Clipping operations. Previously limited to Bias and BiasReLU, this change expands the optimization's applicability to common deep learning patterns involving value clamping. A key aspect of this update is the adoption of per-iteration semantics for non-linear epilogues, which is a deliberate design choice with documented implications for numerical results. The implementation includes robust pattern matching for commutative variants and comprehensive testing to ensure correctness and stability.

Highlights

  • Clipping Pattern Support: The FuseReductionEpilogue primitive now supports Clipping patterns, specifically min(max(x, lower), upper) and its commutative variants, allowing for fusion of clipping operations into reduction blocks.
  • Per-Iteration Semantics for Non-Linear Epilogues: For non-linear epilogues like BiasReLU and the newly added Clipping, the fusion changes the computation semantics from post-reduction application to per-iteration application. This is explicitly documented and verified by tests, acknowledging potential numerical differences.
  • Enhanced Pattern Matching: The AnalyzeEpiloguePattern function has been extended to detect all commutative variants of min/max for clipping and to handle max(0, x) forms for BiasReLU, improving robustness.
  • Updated Reduction Block Creation: The CreateFusedReductionBlock and BufferReplacer logic have been updated to correctly apply the BiasReLU and Clipping operations to the initial value and per-iteration updates within the fused reduction block.
  • Comprehensive Test Coverage: New test cases have been added to validate the Clipping pattern fusion, including basic fusion, numerical correctness (matching per-iteration reference), handling of multiple epilogue blocks, and recognition of various commutative forms.

@gemini-code-assist bot left a comment
Code Review

This is a great pull request that extends FuseReductionEpilogue to support clipping patterns, a common and important operation in deep learning models. The implementation is clean, well-structured, and includes comprehensive test coverage for correctness and various commutative patterns. The documentation is also thoughtfully updated with a clear warning about the semantic change for non-linear epilogues, which is crucial for users. I have a couple of suggestions to make the fusion logic even more robust and general.

hyun gyu kim added 3 commits November 27, 2025 11:07
The FuseReductionEpilogue primitive currently supports fusing bias addition
epilogues into reduction blocks. This commit extends the primitive to also
support ReLU activation functions in epilogue blocks, enabling fusion of
patterns like max(temp + bias, 0) into the reduction computation.

The implementation adds an EpilogueType enumeration to distinguish between
Bias and BiasReLU patterns. The AnalyzeEpiloguePattern method is extended
to detect ReLU patterns by checking for MaxNode expressions with zero
constants.

This commit also adds comprehensive tests in
test_tir_schedule_fuse_reduction_epilogue_relu.py, following the same
patterns as the existing bias tests. The tests verify structural equality,
numerical correctness with per-iteration ReLU semantics, and multiple
epilogue block scenarios. All tests pass successfully.
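The per-iteration ReLU semantics mentioned above can be sketched the same way as for clipping. This is an illustrative model only (not the TVM implementation), and the placement of the bias in the init value is an assumption for the sketch:

```python
def relu(x):
    return max(x, 0.0)


def post_reduction_relu(values, bias):
    # Epilogue as written in the source program: max(temp + bias, 0),
    # applied once after the full reduction.
    return relu(sum(values) + bias)


def per_iteration_relu(values, bias):
    # Per-iteration semantics of the fused block: ReLU clamps the
    # accumulator at every step, so intermediate negatives are lost.
    # Applying the bias to the init value is an assumption of this sketch.
    acc = relu(0.0 + bias)
    for v in values:
        acc = relu(acc + v)
    return acc


values, bias = [-5.0, 3.0], 0.0
print(post_reduction_relu(values, bias))  # 0.0 (sum is -2.0, clamped to 0.0)
print(per_iteration_relu(values, bias))   # 3.0 (-5.0 is clamped away before +3.0)
```

This is the same non-linearity issue the tests exercise: the fused result matches a per-iteration reference, not the post-reduction program.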
@kimm240 force-pushed the fix/fuse-reduction-epilogue-clipping-upstream branch from 9d4a68a to b074e04 on November 27, 2025 02:22
hyun gyu kim added 4 commits November 27, 2025 11:29
…ktrace

These submodules were incorrectly added but not defined in .gitmodules,
causing CI failures. They should not be tracked as submodules.
@kimm240 (Contributor, Author) commented Nov 27, 2025

@wrongtest-intellif
This PR implements the extension for other epilogue forms (ReLU/Clipping) as discussed in the previous PR #18418 review conversation.
