feat: Enable exporting unmerged HF LoRA adapter #6225
base: main
Conversation
Summary of Changes
This pull request allows users to export unmerged LoRA adapters from Megatron Core models into a format compatible with HuggingFace's PEFT library. This streamlines the use of LoRA fine-tuned models across frameworks, providing flexibility and reducing the need for full model retraining. The changes add a dedicated conversion module and integrate it into the existing model export pipeline, specifically targeting GPT models.
Code Review
This pull request introduces a feature to export unmerged LoRA adapters from Megatron Core to HuggingFace PEFT format. The implementation is well-structured, with the core logic encapsulated in a new file. However, I've identified a critical issue in the conversion script that could lead to exported adapter files being overwritten. Additionally, there are a couple of high-severity bugs in the LoRA weight conversion logic that would prevent exporting adapters with biases and would handle other parameters inconsistently. I've provided detailed comments and suggestions to address these issues.
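The core of such an export is renaming the adapter's weight keys from the trainer's naming scheme to PEFT's `lora_A`/`lora_B` convention. The sketch below is illustrative only: the actual module names in this PR (here assumed to be `adapter.linear_in`/`adapter.linear_out`) and the exact PEFT prefix may differ from what the conversion module implements.

```python
# Hypothetical sketch of LoRA key renaming for a PEFT-style export.
# "adapter.linear_in"/"adapter.linear_out" are assumed source names,
# not necessarily what this PR's conversion module uses.

def to_peft_key(src_key: str) -> str:
    """Rename one LoRA weight key to the PEFT convention."""
    key = src_key
    # Down-projection -> lora_A, up-projection -> lora_B
    key = key.replace(".adapter.linear_in.weight", ".lora_A.weight")
    key = key.replace(".adapter.linear_out.weight", ".lora_B.weight")
    # PEFT prefixes every adapter key with "base_model.model."
    return "base_model.model." + key

print(to_peft_key("model.layers.0.self_attn.q_proj.adapter.linear_in.weight"))
# -> base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight
```

A mapping like this is also where the review's concerns bite: bias tensors and non-LoRA parameters need their own explicit handling, or they fall through the renaming untouched.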
Hello! Thanks for your PR. Could you please change the comments to English? 😊
Oh, I forgot to change the comments.
@Jintao-Huang I verified that none of the known bugs remain and pushed the changes along with conversion test code. FYI: I tested on a Qwen3 A3B MoE model.
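A conversion test for an unmerged export typically reduces to a numerical round-trip check: the merged weight must equal the base weight plus the scaled LoRA delta, `base + (alpha/r) * B @ A`. The sketch below uses toy NumPy data and assumed rank/alpha values; it is not the PR's actual test code.

```python
# Minimal round-trip sanity check for an unmerged-LoRA export,
# using toy data (shapes and alpha/r are illustrative assumptions).
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 16, 4, 8
base = rng.standard_normal((d, d))
A = rng.standard_normal((r, d))   # lora_A: (rank, in_features)
B = rng.standard_normal((d, r))   # lora_B: (out_features, rank)

delta = (alpha / r) * (B @ A)     # scaled LoRA update
merged = base + delta             # what a merged export would store
recovered = merged - delta        # subtracting the delta must give base back

assert np.allclose(recovered, base)
print("round-trip OK")
```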
Hello! I will review it as soon as possible, today or tomorrow. Please wait. 😊
Could you use pre-commit to format the code so it passes the lint tests? (use py310/py311)
@Jintao-Huang I ran pre-commit and committed the changes.
@Jintao-Huang Are there any further actions I need to take?
PR type
PR information
This PR is related to #5204.

How to Test
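The PR's own test steps are not shown here, but an exported unmerged adapter can be checked for PEFT compatibility by inspecting its `adapter_config.json`. The field names below follow `peft.LoraConfig`; the values are illustrative assumptions, not this PR's actual defaults.

```python
# Hypothetical sketch of the adapter_config.json a PEFT-compatible
# export would write. Values (r, alpha, target_modules) are assumed.
import json

adapter_config = {
    "peft_type": "LORA",
    "r": 8,                       # LoRA rank (assumed)
    "lora_alpha": 32,             # scaling factor (assumed)
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "bias": "none",
}
print(json.dumps(adapter_config, indent=2))
```

Given a directory containing this config plus the adapter weights, the export should then load with PEFT's standard entry point, e.g. `PeftModel.from_pretrained(base_model, adapter_dir)`.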