
Conversation

@jason9693 commented on Oct 20, 2025

PR type

  • Bug Fix
  • New Feature
  • Document Updates
  • More Models or Datasets Support

PR information

This PR is related to #5204.

  • Enables exporting an HF/PEFT-compatible LoRA adapter (a loading sketch follows this list).
  • MoE fused weights are not supported (per vLLM's recommendation; see the linked issue).
  • Currently supported only for GPT (Megatron model).
  • Test code (WIP).
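
As a quick illustration of "HF compatible": the exported adapter directory should be loadable with HuggingFace PEFT against the original backbone. The sketch below uses placeholder paths and is not code from this PR.

# Minimal sketch: attach the exported (unmerged) LoRA adapter with HuggingFace PEFT.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_path = "Qwen/Qwen3-30B-A3B"     # placeholder: HF backbone the adapter was trained on
adapter_path = "/path/to/checkpoint_peft"  # placeholder: output_dir from `swift export ... --merge_lora false`

tokenizer = AutoTokenizer.from_pretrained(base_model_path)
base_model = AutoModelForCausalLM.from_pretrained(base_model_path, torch_dtype=torch.bfloat16)

# Attach the adapter without merging it into the base weights.
model = PeftModel.from_pretrained(base_model, adapter_path)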

How to Test

  1. Train with a LoRA script:
# 2 * 62GiB, 5.10s/it
PYTORCH_CUDA_ALLOC_CONF='expandable_segments:True' \
NPROC_PER_NODE=2 \
CUDA_VISIBLE_DEVICES=0,1 \
megatron sft \
    --load qwen/Qwen3-30B-A3B-mcore \
    --model_type qwen3_moe \
    --dataset  'Qwen3_dataset_0801/data.jsonl' \
    --columns '{"conversations": "messages", "tools": "tools"}' \
    --agent_template hermes \
    --loss_scale ignore_empty_think \
    --train_type lora \
    --lora_rank 8 \
    --lora_alpha 32 \
    --target_modules linear_qkv linear_proj \
    --split_dataset_ratio 0.01 \
...
  2. Convert the LoRA checkpoint without the merge_lora option (i.e. --merge_lora false):
model_path=${1}

CUDA_VISIBLE_DEVICES=0 \
swift export \
    --mcore_adapters ${model_path} \
    --to_hf true \
    --torch_dtype bfloat16 \
    --output_dir ${model_path}_peft \
    --merge_lora false  # IMPORTANT: keep the adapter unmerged so a PEFT adapter is exported
  3. Test the conversion (an illustrative sanity check is sketched after the command below):
python tests/megatron/test_lora_export.py
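
For reference, independent of the contents of tests/megatron/test_lora_export.py (not shown here), a quick manual check could simply open the standard PEFT files in the exported directory. The paths and expected key names below are assumptions, not code from this PR.

# Illustrative check of an exported PEFT adapter directory (not the PR's test code).
import json
import os
from safetensors.torch import load_file

adapter_dir = "/path/to/checkpoint_peft"  # placeholder: the `swift export` output_dir

# adapter_config.json should record the base model and the converted target modules.
with open(os.path.join(adapter_dir, "adapter_config.json")) as f:
    config = json.load(f)
print(config.get("base_model_name_or_path"), config.get("target_modules"))

# adapter_model.safetensors should hold lora_A / lora_B tensors under HF-style key names
# (e.g. remapped away from Megatron's `decoder` / `self_attention` naming).
weights = load_file(os.path.join(adapter_dir, "adapter_model.safetensors"))
for name, tensor in list(weights.items())[:5]:
    print(name, tuple(tensor.shape))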
  4. Load on vLLM:
vllm serve ${BACKBONE_MODEL_PATH} --enable-lora --lora-modules test1=${MODEL_PATH} test2=${MODEL_PATH} \
--enable-auto-tool-choice --tool-call-parser hermes --max_lora_rank 64 --max_loras 2 --host 0.0.0.0 --port 8000  > ${SERVE_LOG_PATH} 2>&1 &
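
Once the server is up, the adapter can be exercised through vLLM's OpenAI-compatible API by passing one of the names registered via --lora-modules (here test1) as the model. A minimal request sketch, assuming the port from the command above:

# Query the LoRA adapter "test1" through the OpenAI-compatible endpoint.
import requests

response = requests.post(
    "http://localhost:8000/v1/chat/completions",  # server started above on port 8000
    json={
        "model": "test1",  # must match a name from --lora-modules
        "messages": [{"role": "user", "content": "Hello! Briefly introduce yourself."}],
        "max_tokens": 64,
    },
)
print(response.json()["choices"][0]["message"]["content"])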

@gemini-code-assist (Contributor)

Summary of Changes

Hello @jason9693, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant enhancement by allowing users to export unmerged LoRA adapters from Megatron Core models into a format compatible with HuggingFace's PEFT library. This feature streamlines the process of using LoRA fine-tuned models across different frameworks, providing flexibility and reducing the need for full model retraining. The changes involve adding a dedicated conversion module and integrating it into the existing model export pipeline, specifically targeting GPT models.

Highlights

  • HuggingFace LoRA Adapter Export: This pull request enables the export of unmerged LoRA adapters from Megatron Core models to a HuggingFace PEFT compatible format, allowing for greater interoperability.
  • New Conversion Module: A new module, mcore2hf_lora.py, has been introduced to handle the complex logic of converting various Megatron Core LoRA types (e.g., fused QKV, MLP projections, MLA attention) into their HuggingFace PEFT equivalents.
  • Conditional LoRA Handling: The conversion utility now includes logic to either merge LoRA weights into the base model or export them as a separate adapter, based on command-line arguments (--to-hf and --merge-lora).
  • GPT Model Specific: The LoRA adapter conversion functionality is currently implemented and supported exclusively for GPT (Megatron model) architectures.
  • Module Key Remapping: The conversion process includes remapping of module keys (e.g., decoder to model, self_attention to self_attn) to align with HuggingFace's naming conventions.
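
To make the "module key remapping" point concrete: the conversion rewrites Megatron-style parameter names into HF/PEFT-style names. The snippet below is only an illustrative sketch of that idea; the actual mapping table lives in mcore2hf_lora.py, and the linear_proj -> o_proj entry is an assumed example.

# Illustrative sketch of Megatron-Core -> HF key remapping (not the PR's actual mapping table).
MCORE_TO_HF = {
    "decoder": "model",             # from the highlights above
    "self_attention": "self_attn",  # from the highlights above
    "linear_proj": "o_proj",        # assumed example for the attention output projection
}

def remap_key(mcore_key: str) -> str:
    """Rewrite a Megatron Core parameter name into an HF/PEFT-style name."""
    hf_key = mcore_key
    for old, new in MCORE_TO_HF.items():
        hf_key = hf_key.replace(old, new)
    return hf_key

print(remap_key("decoder.layers.0.self_attention.linear_proj.lora_A.weight"))
# -> model.layers.0.self_attn.o_proj.lora_A.weight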
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist bot left a comment (Contributor)


Code Review

This pull request introduces a feature to export unmerged LoRA adapters from Megatron Core to HuggingFace PEFT format. The implementation is well-structured, with the core logic encapsulated in a new file. However, I've identified a critical issue in the conversion script that could lead to exported adapter files being overwritten. Additionally, there are a couple of high-severity bugs in the LoRA weight conversion logic that would prevent exporting adapters with biases and would handle other parameters inconsistently. I've provided detailed comments and suggestions to address these issues.

@Jintao-Huang (Collaborator)

Hello! Thanks for your PR. Could you please change the comments to English? 😊

@jason9693 (Author) commented on Oct 21, 2025

Hello! Thanks for your PR. Could you please change the comments to English? 😊

Oh, I forgot to change the comments.
I'll change this ASAP!

@jason9693 (Author) commented on Oct 21, 2025

@Jintao-Huang I have addressed all of the known remaining bugs and pushed the changes along with conversion test code.
You can run the test using this file.

FYI: I tested this on the Qwen3-30B-A3B MoE model.

@Jintao-Huang (Collaborator)

Hello! I will review it as soon as possible, today or tomorrow. Please wait. 😊

@Jintao-Huang (Collaborator)

Could you use pre-commit to format the code to pass the lint tests?

(use Python 3.10 or 3.11)

pip install pre-commit
pre-commit run --all-files

@jason9693 (Author)

@Jintao-Huang I just ran pre-commit and committed the changes.

@jason9693 (Author)

@Jintao-Huang Are there any further actions I need to take?

@jason9693 changed the title from "Enable for exporting unmerged HF Lora Adapter" to "feat: Enable for exporting unmerged HF Lora Adapter" on Oct 24, 2025