misc: fix some B200 GEMM bench #1883
Summary of Changes
Hello @Edenzzzz, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request addresses a critical runtime error in the B200 GEMM benchmark by correctly specifying a missing layout parameter, ensuring the benchmark can execute as intended. Additionally, it enhances the TGV GEMM benchmark by integrating TFLOPS calculations into the output, offering a more comprehensive view of performance for various GEMM implementations.
Code Review
This pull request fixes a benchmark for B200 GEMM by adding the required scale_major_mode parameter and enhances another benchmark by adding TFLOPS reporting. The changes are correct and improve the benchmarks. I have one suggestion to improve code readability by replacing magic numbers with named constants.
for m, n, k, has_bias, description in test_cases:
    print(f"\n--- {description}: M={m}, N={n}, K={k}, has_bias={has_bias} ---")

    flops = m * n * k * 2 / 1e12
To improve readability and maintainability, it's a good practice to avoid using magic numbers. Consider defining 2 and 1e12 as named constants with descriptive names.
Suggested change:
- flops = m * n * k * 2 / 1e12
+ FLOPS_PER_MAC = 2
+ TFLOPS_SCALE = 1e12
+ flops = (m * n * k * FLOPS_PER_MAC) / TFLOPS_SCALE
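For context, a minimal sketch of how the named constants could feed a TFLOPS figure. The helper name gemm_tflops and the choice of milliseconds for the timing input are assumptions for illustration, not code from this PR, and the count ignores the bias add.

```python
FLOPS_PER_MAC = 2    # one multiply plus one add per multiply-accumulate
TFLOPS_SCALE = 1e12  # floating-point operations per teraFLOP


def gemm_tflops(m: int, n: int, k: int, elapsed_ms: float) -> float:
    """Achieved TFLOPS for an (m x k) @ (k x n) GEMM.

    Hypothetical helper: assumes elapsed_ms is the measured kernel
    time in milliseconds and counts only the multiply-accumulates.
    """
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    # Total work in units of 1e12 floating-point operations.
    total_tflop = (m * n * k * FLOPS_PER_MAC) / TFLOPS_SCALE
    # Convert milliseconds to seconds before dividing.
    return total_tflop / (elapsed_ms / 1e3)


if __name__ == "__main__":
    print(f"{gemm_tflops(4096, 4096, 4096, 1.0):.2f} TFLOPS")
```

Note that m * n * k * 2 / 1e12 on its own is an operation count in teraFLOPs; it only becomes a throughput once divided by the elapsed time in seconds.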
Thanks for the fix. LGTM
@Edenzzzz, to address the CI failures, I suggest you try rebasing your branch or merging from main.
📌 Description
Before this change, the benchmark couldn't run because the layout parameter was missing.

After
🔍 Related Issues
🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.
✅ Pre-commit Checks
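- I have installed pre-commit by running pip install pre-commit (or used your preferred method).
- I have installed the hooks with pre-commit install.
- I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.
🧪 Tests
- All tests are passing (unittest, etc.).
Reviewer Notes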