feat: bf16 x mxfp4 cutlass fused moe for gpt-oss of hopper #23368
Check failure on line 46 in vllm/model_executor/layers/quantization/mxfp4.py
Ruff (E501)
Check failure on line 185 in vllm/model_executor/layers/quantization/mxfp4.py
Ruff (E501)
Check failure on line 393 in vllm/model_executor/layers/quantization/mxfp4.py
Ruff (E501)
Check failure on line 394 in vllm/model_executor/layers/quantization/mxfp4.py
Ruff (E501)
Check failure on line 395 in vllm/model_executor/layers/quantization/mxfp4.py
Ruff (E501)
Check failure on line 396 in vllm/model_executor/layers/quantization/mxfp4.py
Ruff (E501)
Check failure on line 397 in vllm/model_executor/layers/quantization/mxfp4.py
Ruff (E501)
Check failure on line 398 in vllm/model_executor/layers/quantization/mxfp4.py
Ruff (E501)
Check failure on line 436 in vllm/model_executor/layers/quantization/mxfp4.py
Ruff (E501)
Check failure on line 683 in vllm/model_executor/layers/quantization/mxfp4.py
Ruff (F821)