Slightly improving 4-bit mat mul performance by better engaging ALU pipes through splitting the uint to float conversion operation. #15447

trivedivivek · 2025-10-29T18:09:41Z

Summary: This diff makes a slight improvement to the performance of 4-bit matrix multiplication by better utilizing the ALU pipes. This is achieved by splitting the uint to float conversion operation.

Differential Revision: D85779855
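The summary above does not show the shader change itself, so the following is only a minimal sketch of what "splitting the uint to float conversion" could look like, written in Python for readability rather than in the backend's shader language. The helper names (`pack_nibbles`, `unpack_fused`, `unpack_split`), the low-nibble-first packing order, and the zero point of 8 are all assumptions made for illustration and are not taken from this PR. The sketch only demonstrates that a fused unpack-and-convert and a split version (integer extraction first, float conversion second) produce the same values; it cannot show the ALU pipe utilization effect, which depends on the GPU.

```python
def pack_nibbles(values):
    """Pack eight unsigned 4-bit values into one 32-bit word, lowest nibble first.

    The packing order is an assumption for this sketch, not the PR's layout.
    """
    assert len(values) == 8 and all(0 <= v <= 15 for v in values)
    word = 0
    for i, v in enumerate(values):
        word |= v << (4 * i)
    return word


def unpack_fused(word, zero_point=8):
    """Extract, offset, and convert each nibble in a single expression."""
    return [float(((word >> (4 * i)) & 0xF) - zero_point) for i in range(8)]


def unpack_split(word, zero_point=8):
    """Same result, with the conversion separated from the integer work."""
    # Step 1: integer-only work (shift and mask).
    nibbles = [(word >> (4 * i)) & 0xF for i in range(8)]
    # Step 2: uint-to-float conversion as its own pass.
    as_float = [float(n) for n in nibbles]
    # Step 3: float-only work (apply the hypothetical zero point).
    return [f - float(zero_point) for f in as_float]


word = pack_nibbles([0, 1, 2, 3, 12, 13, 14, 15])
fused = unpack_fused(word)
split = unpack_split(word)
```

On a GPU, separating the steps this way may give the compiler more freedom to schedule integer and floating-point instructions on different execution units, but whether that is the mechanism behind this PR's speedup is not stated in the conversation.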

pytorch-bot · 2025-10-29T18:09:44Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15447

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 4745c30 with merge base c230e94:

NEW FAILURE - The following job has failed:

pull / unittest / macos / macos-job (gh)
export/tests/test_target_recipes.py::TestTargetRecipes::test_linear_model

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-codesync · 2025-10-29T18:09:50Z

@trivedivivek has exported this pull request. If you are a Meta employee, you can view the originating Diff in D85779855.

…U pipes by splitting uint to float conversion operation. (pytorch#15447) Summary: This diff makes a slight improvement to the performance of 4-bit matrix multiplication by better utilizing the ALU pipes. This is achieved by splitting the `uint` to `float` conversion operation. Differential Revision: D85779855

…U pipes by splitting uint to float conversion operation. (pytorch#15447) Summary: This diff makes a slight improvement to the performance of 4-bit matrix multiplication by better utilizing the ALU pipes. This is achieved by splitting the `uint` to `float` conversion operation. Reviewed By: SS-JIA Differential Revision: D85779855

trivedivivek requested a review from SS-JIA as a code owner October 29, 2025 18:09

meta-cla bot added the CLA Signed label (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.) Oct 29, 2025

meta-codesync bot added the fb-exported and meta-exported labels Oct 29, 2025

trivedivivek added the release notes: vulkan label (Changes to the Vulkan backend delegate) Oct 29, 2025

trivedivivek force-pushed the export-D85779855 branch from 9db3ed8 to 35be9f2 on October 30, 2025 03:04

trivedivivek force-pushed the export-D85779855 branch from 35be9f2 to 88efba0 on October 30, 2025 14:18

trivedivivek force-pushed the export-D85779855 branch from 88efba0 to ff1b33a on October 30, 2025 14:19

SS-JIA approved these changes Oct 30, 2025

trivedivivek force-pushed the export-D85779855 branch from ff1b33a to 22f63f2 on October 31, 2025 05:24

trivedivivek force-pushed the export-D85779855 branch from 22f63f2 to 5639672 on October 31, 2025 05:25

trivedivivek force-pushed the export-D85779855 branch from 5639672 to 484eea6 on October 31, 2025 13:30

trivedivivek force-pushed the export-D85779855 branch from 484eea6 to 3dd5453 on October 31, 2025 23:09

trivedivivek force-pushed the export-D85779855 branch from 3dd5453 to 5d5f0cc on October 31, 2025 23:12

trivedivivek force-pushed the export-D85779855 branch from 5d5f0cc to 4b3acd1 on October 31, 2025 23:12

trivedivivek force-pushed the export-D85779855 branch from 4b3acd1 to 0871896 on November 1, 2025 00:05

trivedivivek force-pushed the export-D85779855 branch from 0871896 to 4745c30 on November 1, 2025 00:06

meta-codesync bot merged commit 7d18005 into pytorch:main Nov 1, 2025
145 of 147 checks passed
