[PyTorch] Documentation for op fuser API #2447
Signed-off-by: Tim Moon <[email protected]>
Greptile Overview
Greptile Summary
This PR adds comprehensive documentation for the operation fuser API, a "bottom-up" alternative to Transformer Engine's monolithic modules that allows users to construct and fuse individual operations flexibly.
Key Changes
- New Documentation (251 lines)
- API Documentation Updates
- Docstring Fixes (15 files)
- Minor Fixes
Confidence Score: 5/5
Important Files Changed
File Analysis
Sequence Diagram
sequenceDiagram
participant User
participant Sequential
participant OperationFuser
participant BasicOps
participant FusedOps
User->>Sequential: forward(input)
Sequential->>Sequential: _make_module_groups()
Sequential->>OperationFuser: __call__(input)
OperationFuser->>OperationFuser: maybe_fuse_ops()
OperationFuser->>BasicOps: Analyze fusion opportunities
OperationFuser->>FusedOps: Create fused operations
FusedOps->>BasicOps: fuser_forward()
BasicOps-->>FusedOps: output
FusedOps-->>OperationFuser: output
OperationFuser-->>Sequential: output
Sequential-->>User: output
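The flow in the diagram (group the modules, look for fusion opportunities within each group, then run the resulting ops) can be illustrated with a toy model in plain Python. This is only a sketch of the general idea, not Transformer Engine's implementation: `ToyOp`, `make_groups`, `fuse`, and `forward` are hypothetical names invented for this example, and the "scale then shift" fusion rule is an assumption chosen for simplicity.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ToyOp:
    """Hypothetical stand-in for a fusible operation (not a TE class)."""
    name: str
    fn: Callable[[float], float]

    def __call__(self, x: float) -> float:
        return self.fn(x)


def make_groups(modules: List[object]) -> List[object]:
    """Collect runs of consecutive ToyOps into lists; keep other callables as-is."""
    groups: List[object] = []
    for module in modules:
        if isinstance(module, ToyOp):
            if groups and isinstance(groups[-1], list):
                groups[-1].append(module)
            else:
                groups.append([module])
        else:
            groups.append(module)
    return groups


def fuse(ops: List[ToyOp]) -> List[ToyOp]:
    """Replace each adjacent ("scale", "shift") pair with a single fused op."""
    fused: List[ToyOp] = []
    i = 0
    while i < len(ops):
        if i + 1 < len(ops) and ops[i].name == "scale" and ops[i + 1].name == "shift":
            first, second = ops[i], ops[i + 1]
            # Bind the pair as default arguments so the lambda keeps these two ops.
            fused.append(ToyOp("scale_shift", lambda x, f=first, s=second: s(f(x))))
            i += 2
        else:
            fused.append(ops[i])
            i += 1
    return fused


def forward(modules: List[object], x: float) -> float:
    """Run fusible groups through the toy fuser and other callables directly."""
    for group in make_groups(modules):
        if isinstance(group, list):
            for op in fuse(group):
                x = op(x)
        else:
            x = group(x)
    return x


modules = [
    ToyOp("scale", lambda x: 2.0 * x),
    ToyOp("shift", lambda x: x + 1.0),
    abs,  # a non-fusible callable splits the ops into two groups
    ToyOp("scale", lambda x: 3.0 * x),
]
result = forward(modules, -4.0)
```

Here the first two ops form one group and are replaced by a single fused op, `abs` runs on its own, and the last op forms a second group with nothing to fuse.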
Review suggestion from @greptile-apps Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Signed-off-by: Tim Moon <[email protected]>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Signed-off-by: Tim Moon <[email protected]>
Additional Comments (1)
- transformer_engine/pytorch/ops/basic/activation.py, line 387: syntax: Extra space before period.
19 files reviewed, 1 comment
/te-ci core pytorch
No files reviewed, no comments
At the most basic level, the operation fuser API involves two classes
in the ``transformer_engine.pytorch.ops`` submodule:

- ``FusibleOperation``: An abstract base class for tensor operations.
  Examples include ``Linear``, ``LayerNorm``, and ``AllReduce``. It is
  a subclass of ``torch.nn.Module``, so it can hold trainable
  parameters and can be called to perform the operation's forward
  pass.
- ``Sequential``: A container of modules in sequential order. It has a
  very similar interface as ``torch.nn.Sequential``. If it contains
  any ``FusibleOperation`` s, then it may attempt to fuse them in the
  forward and backward passes.

Thus, using the operation fuser simply involves constructing
``FusibleOperation`` s and passing them into a ``Sequential``.
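As a rough illustration of the excerpt above, constructing ops and passing them into a ``Sequential`` might look like the sketch below. The helper names `build_norm_linear` and `run_example` are hypothetical, the constructor arguments (a hidden size for ``LayerNorm``, input and output feature counts for ``Linear``) are assumed to mirror the corresponding ``torch.nn`` modules, and running it is assumed to require Transformer Engine and a CUDA device.

```python
def build_norm_linear(hidden_size: int = 1024, ffn_size: int = 4096):
    """Build a LayerNorm -> Linear block from op fuser operations.

    Hypothetical helper: the constructor arguments are assumed to match
    torch.nn.LayerNorm and torch.nn.Linear.
    """
    # Imported lazily so this file can be loaded without Transformer Engine.
    import transformer_engine.pytorch as te

    return te.ops.Sequential(
        te.ops.LayerNorm(hidden_size),
        te.ops.Linear(hidden_size, ffn_size),
    )


def run_example():
    """Call the block like any torch.nn.Module (assumes a CUDA device)."""
    import torch

    block = build_norm_linear()
    x = torch.randn(32, 1024, device="cuda")
    # The container decides whether adjacent operations can be fused.
    return block(x)
```

Whether the two operations are actually fused is up to the container; the calling code is the same either way.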
Who is the intended audience of this documentation? On one hand it seems it is the user (since you show examples of how things could be written), on the other you also include the details of the implementation.
This is an expert technique. Quantizer configurations can be quite
complicated, so the ``Quantize`` operation's quantizers may be
suboptimal.
Not sure what that means - any examples?
For MXFP8, it's not safe for the quantize op to produce an MXFP8Tensor with swizzled scales. There's no way to know if it will be consumed by a GEMM or by something else.
the block has been split into two sections, each with one branching
operation.

Implementation details
Yeah, I think this file should be split into 2 (maybe 3) separate sections - one primarily user facing with the sections describing how to use sequential, maybe second one showing how to define your own fusion with a user-provided kernel, and then the third one showing those internal implementation details.
- **The op fuser is not interchangeable with the monolithic TE
  modules**: Modules like ``Linear``, ``LayerNormLinear``, and
  ``TransformerLayer`` support a wide range of features and advanced
  workflows, which makes them challenging to decompose into simple
  operations that work with the fuser. They are also carefully
  hand-tuned to achieve maximum performance.
We would like to get to the point where the sequential is the default, right? So while right now this is true, it may not be in the future.
Description
This PR adds a basic usage guide for the op fuser and includes it in the autogenerated API docs.
It is ready as-is, but if reviews take a while I may expand it with a guide on creating custom fused ops.