ci/cd: add nightly build and CI for flashinfer-python, flashinfer-jit-cache, flashinfer-cubin #1872
Summary of Changes
Hello @yzh119, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly upgrades the CI/CD infrastructure for FlashInfer's Python, JIT cache, and Cubin packages, primarily to support robust nightly builds. Key improvements include a more granular versioning system that incorporates Git commit hashes and development suffixes, a wheel index generation script that handles multiple package types and CUDA versions, and a streamlined approach to managing cubin artifacts. These changes improve traceability, make releases easier to identify, and make the distribution pipeline for FlashInfer components more efficient.
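To illustrate the versioning idea in the summary, here is a minimal sketch of how a nightly version string might combine a base version, a development suffix, and a Git commit hash. The function names `get_git_hash` and `build_version` and the `<base>.dev<N>+g<hash>` layout are assumptions chosen for illustration (the layout is valid under PEP 440); they are not taken from this PR's build scripts.

```python
import re
import subprocess
from typing import Optional


def get_git_hash(repo_dir: str = ".") -> Optional[str]:
    """Return the short commit hash of HEAD, or None if git is unavailable."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "--short", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # Not a git checkout, or git is not installed.
        return None
    return out.stdout.strip() or None


def build_version(
    base: str,
    dev_suffix: Optional[str] = None,
    git_hash: Optional[str] = None,
) -> str:
    """Compose <base>[.dev<N>][+g<hash>] (hypothetical layout, PEP 440 compatible)."""
    version = base
    if dev_suffix:
        if not re.fullmatch(r"\d+", dev_suffix):
            raise ValueError("dev suffix must be numeric, e.g. a build date")
        version += f".dev{dev_suffix}"
    if git_hash:
        # The local version segment records the exact commit for traceability.
        version += f"+g{git_hash}"
    return version


if __name__ == "__main__":
    print(build_version("0.4.0", "20251001", get_git_hash()))
```

Keeping the string composition separate from the `git` call makes the formatting testable without a repository, and a single shared helper of this kind could be imported by every build script.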
Code Review
This pull request adds nightly build and CI capabilities, including embedding git version information into the packages. The changes are generally good, but there are a few important issues to address.
Most critically, there's a leftover line of debugging code in flashinfer/aot.py that will break the AOT build. I've also identified a high-severity issue in scripts/update_whl_index.py where the generated HTML for the wheel index is not valid according to PEP 503. Additionally, there's an opportunity to improve maintainability by refactoring a duplicated _get_git_version function used across several build scripts.
Please see the detailed comments for suggestions on how to fix these issues.
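For context on the PEP 503 point, the sketch below shows what a conforming project page could look like. PEP 503 requires one anchor per file, with the filename as the anchor text and the file URL as the `href`, and recommends a `#sha256=<hex>` fragment; project names are normalized by collapsing runs of `-`, `_`, and `.` into a single `-` and lowercasing. The function names, the `<h1>` heading, the URL, and the wheel filename are hypothetical; this is not the code in scripts/update_whl_index.py, and the review excerpt here does not say which part of the generated HTML was invalid.

```python
import html
import re
from typing import Iterable, Tuple


def normalize_name(name: str) -> str:
    """Normalize a project name as PEP 503 specifies."""
    return re.sub(r"[-_.]+", "-", name).lower()


def render_project_page(project: str, files: Iterable[Tuple[str, str, str]]) -> str:
    """Render a project page; files are (filename, url, sha256_hex) tuples."""
    name = html.escape(normalize_name(project))
    lines = [
        "<!DOCTYPE html>",
        "<html>",
        f"<head><title>Links for {name}</title></head>",
        "<body>",
        f"<h1>Links for {name}</h1>",
    ]
    for filename, url, sha256_hex in files:
        # The hash travels as a URL fragment so installers can verify the download.
        href = html.escape(f"{url}#sha256={sha256_hex}", quote=True)
        lines.append(f'<a href="{href}">{html.escape(filename)}</a><br/>')
    lines += ["</body>", "</html>"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical wheel name and URL, for illustration only.
    wheel = "flashinfer_python-0.0.0-py3-none-any.whl"
    print(render_project_page(
        "flashinfer_python",
        [(wheel, f"https://example.invalid/{wheel}", "0" * 64)],
    ))
```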
The binary bloat issue (introduced in #1659) is fixed in 06ffe13.
The cubloaty tool helps identify that most of the binary size comes from debug info.
Before the commit:
Non-Code Sections (Debug Info, Metadata, etc.)
╭───────────────────────────┬─────────────────┬──────────────╮
│ Section Type              │ Total Size      │ % of Total   │
├───────────────────────────┼─────────────────┼──────────────┤
│ debug info                │ 159.4MB         │ 82.9%        │
│ metadata                  │ 3.5MB           │ 1.8%         │
│ data sections             │ 798.6KB         │ 0.4%         │
├───────────────────────────┼─────────────────┼──────────────┤
│ SUBTOTAL                  │ 163.7MB         │ 85.1%        │
╰───────────────────────────┴─────────────────┴──────────────╯
After the commit, the ratio of non-code sections drops from 85% to 21%.
/bot run
os.makedirs(jit_env.FLASHINFER_CSRC_DIR, exist_ok=True)


class MissingJITCacheError(RuntimeError):
Move to utils where other errors are declared?
It introduces some annoying circular dependencies. Maybe we can refactor the structure after this PR?
LGTM!
📌 Description
Duplicate of #1867, created from flashinfer/nightly to get write permission.
🔍 Related Issues
🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.
✅ Pre-commit Checks
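I have installed pre-commit by running pip install pre-commit (or used your preferred method).
I have installed the hooks with pre-commit install.
I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.
🧪 Tests
All tests are passing (unittest, etc.).
Reviewer Notes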