Conversation

@timmoon10
Collaborator

Description

This PR adds a basic usage guide for the op fuser and includes it in the autogenerated API docs.

It is ready as-is, but if reviews take a while I may expand it with a guide on creating custom fused ops.

Type of change

  • Documentation change (change only to the documentation, either a fix or a new content)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Infra/Build change
  • Code refactoring

Changes

  • Add basic usage guide for op fuser
  • Include TE ops in autogenerated API docs
  • Debug TE ops docstrings

Checklist:

  • I have read and followed the contributing guidelines
  • The functionality is complete
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@timmoon10 timmoon10 added the documentation Improvements or additions to documentation label Dec 3, 2025

greptile-apps bot commented Dec 3, 2025

Greptile Overview

Greptile Summary

This PR adds comprehensive documentation for the operation fuser API, a "bottom-up" alternative to Transformer Engine's monolithic modules that allows users to construct and fuse individual operations flexibly.

Key Changes

New Documentation (251 lines)

  • Created docs/examples/op_fuser/op_fuser.rst with detailed usage guide covering:
    • Motivation for the op fuser approach (flexibility vs. monolithic modules)
    • Basic usage with Sequential and FusibleOperation
    • Quantization workflows with FP8/FP4
    • Branching operations (AddExtraInput, MakeExtraOutput)
    • Implementation details (BasicOperation, FusedOperation, OperationFuser)
    • Common misconceptions (not a kernel compiler, not a graph compiler)
  • Includes three diagrams showing operation fusion examples

API Documentation Updates

  • Added "Operation fuser" section to docs/api/pytorch.rst with 26+ operations:
    • Container classes: Sequential, FusibleOperation
    • Basic operations: Linear, BasicLinear, LayerNorm, RMSNorm, Bias, etc.
    • Activation functions: GELU, SwiGLU, ReLU, GEGLU, etc.
    • Utility operations: Quantize, Reshape, AddExtraInput, MakeExtraOutput
    • Distributed operations: AllGather, AllReduce, ReduceScatter

Docstring Fixes (15 files)

  • Fixed RST formatting: single backticks (`) changed to double backticks (``) for inline code
  • Fixed hyperlink formatting: restored the space between link text and URL (e.g., `link <url>`__ instead of `link<url>`__)
  • Standardized boolean/None literals: `False` → ``False``, `None` → ``None``
  • Improved docstring structure and readability

Minor Fixes

  • Fixed .gitignore: changed *.DS_Store to .DS_Store (more accurate pattern)

Confidence Score: 5/5

  • This PR is completely safe to merge - it contains only documentation improvements with no code logic changes
  • Perfect confidence score because: (1) All changes are documentation-only - new RST guide, API reference updates, and docstring formatting fixes, (2) No functional code changes that could introduce bugs, (3) Docstring fixes improve documentation quality and RST rendering, (4) The new op fuser guide is comprehensive and well-structured with clear examples
  • No files require special attention - all changes are documentation improvements

Important Files Changed

File Analysis

  • docs/examples/op_fuser/op_fuser.rst (5/5): New comprehensive documentation guide for the op fuser API with examples and diagrams
  • docs/api/pytorch.rst (5/5): Added "Operation fuser" section with 26 fusible operations to the API documentation
  • .gitignore (5/5): Fixed pattern from *.DS_Store to .DS_Store
  • transformer_engine/pytorch/ops/basic/activation.py (5/5): Fixed RST hyperlink formatting (restored missing space between link text and URL) and improved docstring structure
  • transformer_engine/pytorch/ops/basic/basic_linear.py (5/5): Fixed docstring formatting: single to double backticks, False/None literal formatting

Sequence Diagram

sequenceDiagram
    participant User
    participant Sequential
    participant OperationFuser
    participant BasicOps
    participant FusedOps
    
    User->>Sequential: forward(input)
    Sequential->>Sequential: _make_module_groups()
    Sequential->>OperationFuser: __call__(input)
    OperationFuser->>OperationFuser: maybe_fuse_ops()
    OperationFuser->>BasicOps: Analyze fusion opportunities
    OperationFuser->>FusedOps: Create fused operations
    FusedOps->>BasicOps: fuser_forward()
    BasicOps-->>FusedOps: output
    FusedOps-->>OperationFuser: output
    OperationFuser-->>Sequential: output
    Sequential-->>User: output


timmoon10 and others added 2 commits December 2, 2025 22:03
Review suggestion from @greptile-apps

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: Tim Moon <[email protected]>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: Tim Moon <[email protected]>

@greptile-apps greptile-apps bot left a comment


Additional Comments (1)

  1. transformer_engine/pytorch/ops/basic/activation.py, line 387 (link)

    syntax: Extra space before period.

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

19 files reviewed, 1 comment


@pggPL pggPL self-requested a review December 17, 2025 12:47
@timmoon10
Collaborator Author

/te-ci core pytorch

Signed-off-by: Tim Moon <[email protected]>

@greptile-apps greptile-apps bot left a comment


No files reviewed, no comments


Comment on lines +40 to +54
At the most basic level, the operation fuser API involves two classes
in the ``transformer_engine.pytorch.ops`` submodule:

- ``FusibleOperation``: An abstract base class for tensor operations.
Examples include ``Linear``, ``LayerNorm``, and ``AllReduce``. It is
a subclass of ``torch.nn.Module``, so it can hold trainable
parameters and can be called to perform the operation's forward
pass.
- ``Sequential``: A container of modules in sequential order. It has a
very similar interface as ``torch.nn.Sequential``. If it contains
any ``FusibleOperation`` s, then it may attempt to fuse them in the
forward and backward passes.

Thus, using the operation fuser simply involves constructing
``FusibleOperation`` s and passing them into a ``Sequential``.
Member

Who is the intended audience of this documentation? On one hand it seems it is the user (since you show examples of how things could be written), on the other you also include the details of the implementation.

Comment on lines +151 to +153
This is an expert technique. Quantizer configurations can be quite
complicated, so the ``Quantize`` operation's quantizers may be
suboptimal.
Member

Not sure what that means - any examples?

Collaborator Author

@timmoon10 timmoon10 Jan 15, 2026

For MXFP8, it's not safe for the quantize op to produce an MXFP8Tensor with swizzled scales, since there's no way to know whether it will be consumed by a GEMM or by something else.

the block has been split into two sections, each with one branching
operation.

Implementation details
Member

Yeah, I think this file should be split into 2 (maybe 3) separate sections - one primarily user facing with the sections describing how to use sequential, maybe second one showing how to define your own fusion with a user-provided kernel, and then the third one showing those internal implementation details.

Comment on lines +246 to +251
- **The op fuser is not interchangeable with the monolithic TE
modules**: Modules like ``Linear``, ``LayerNormLinear``, and
``TransformerLayer`` support a wide range of features and advanced
workflows, which makes them challenging to decompose into simple
operations that work with the fuser. They are also carefully
hand-tuned to achieve maximum performance.
Member
We would like to get to the point where the sequential is the default, right? So while right now this is true, it may not be in the future.


Labels

2.12.0 documentation Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants