
LoRA support in vllm and hf backends #56

Merged

vicky-xef merged 51 commits into main from vicky/lora on Jan 30, 2026

Conversation

@vicky-xef (Contributor) commented Dec 8, 2025

By default, no LoRA adapter is loaded.

In vllm.py:

  • Added a lora_request attribute (initialized to None, meaning LoRA is not used). This value is passed to every engine.generate() and engine.add_request() call.
  • Added an add_new_lora() method that hashes a LoRA name to an id (kept in the lora_name_to_ids dictionary). vLLM requires both the LoRA name and the id.
  • Added a set_lora() method that loads a LoRA adapter by updating the lora_request attribute.
  • Added a clear_lora() method that resets lora_request to None, disabling LoRA usage.
  • The user should set enable_lora=True in the engine_opts. (A sketch of the resulting interface follows this list.)
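
A minimal sketch of this interface, assuming a backend class that wraps a vLLM engine. Class and helper names below are illustrative, not the PR's exact code, and LoRARequest field names can vary across vLLM versions (newer releases use lora_path, older ones lora_local_path):

```python
import hashlib

from vllm.lora.request import LoRARequest  # the engine must be built with enable_lora=True


class VLLMBackendSketch:  # hypothetical name for the PR's backend class
    def __init__(self):
        self.lora_request = None    # None means "no LoRA"; passed to generate()/add_request()
        self.lora_name_to_ids = {}  # adapter name -> (integer id, path)

    def add_new_lora(self, lora_path, lora_name):
        # Hash the name to a stable positive integer id; vLLM needs both name and id.
        lora_id = int(hashlib.sha256(lora_name.encode()).hexdigest(), 16) % (2**31 - 2) + 1
        self.lora_name_to_ids[lora_name] = (lora_id, lora_path)

    def set_lora(self, lora_name):
        # Activate a previously added adapter for subsequent requests.
        lora_id, lora_path = self.lora_name_to_ids[lora_name]
        self.lora_request = LoRARequest(
            lora_name=lora_name, lora_int_id=lora_id, lora_path=lora_path
        )

    def clear_lora(self):
        # Disable LoRA for subsequent requests.
        self.lora_request = None
```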

In hf.py:

  • Added an add_new_lora() method that loads a LoRA adapter.
  • Added a set_lora() method that activates a loaded LoRA adapter.
  • Added a clear_lora() method that deactivates the active LoRA adapter. (A sketch follows this list.)
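
A minimal sketch of the matching hf.py interface, assuming the backend holds a transformers model with the PEFT integration available; the class name is hypothetical:

```python
class HFBackendSketch:  # hypothetical name for the PR's backend class
    def __init__(self, model):
        self.model = model  # e.g. an AutoModelForCausalLM, with peft installed

    def add_new_lora(self, lora_path, lora_name):
        # Load the adapter weights under a name without activating them.
        self.model.load_adapter(lora_path, adapter_name=lora_name)

    def set_lora(self, lora_name):
        # Activate a previously loaded adapter.
        self.model.set_adapter(lora_name)
        self.model.enable_adapters()

    def clear_lora(self):
        # Deactivate all adapters so the base model runs unmodified.
        self.model.disable_adapters()
```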

@vicky-xef requested a review from benlebrun on December 8, 2025 12:03

codecov bot commented Dec 8, 2025

Codecov Report

❌ Patch coverage is 96.77419% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| genlm/backend/llm/vllm.py | 93.75% | 1 Missing ⚠️ |


@vicky-xef marked this pull request as draft on December 8, 2025 12:59
@vicky-xef marked this pull request as ready for review on December 15, 2025 10:25
"""
self.lora_request = None

def set_lora(self, lora_path, lora_name="current_lora", lora_id=1):
Member

Is there a reason why the method signature is different between VLLM and HF? That should be avoided since both of these classes are supposed to implement the same interface.

Contributor Author

At HF there are two methods: load_lora loads LoRA weights and set_lora actually activates them (someone may need to load two different LoRAs, activate the first one, then deactivate it and activate the other one afterwards, etc.). At vLLM, we only need to pass the LoRA adapter in the vLLM request, which is why we only need the set_lora method. Should I name them differently?
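
A hypothetical usage of that two-step HF flow (method names as in this comment; load_lora was later renamed add_new_lora, and llm stands for the backend instance):

```python
llm.load_lora("path/to/adapter_a", lora_name="a")  # load first adapter
llm.load_lora("path/to/adapter_b", lora_name="b")  # load second adapter
llm.set_lora("a")    # activate the first adapter
# ... generate ...
llm.set_lora("b")    # swap to the second adapter
llm.clear_lora()     # back to the base model
```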

@postylem (Member) commented Dec 17, 2025

To make the method signatures match across the vLLM and HF backends, a potential solution would be to add a def load_lora(self, lora_path, lora_name='...') method in vllm (that is, with the same signature as in the hf backend); here in vllm this method would simply associate the passed lora_path value with the passed lora_name key. That way you could then def set_lora(self, lora_name='...') just like in the hf backend. It would not need to take the other args and would simply use the path associated with lora_name, provided you 'loaded' it already.
Would something like that make sense, @vicky-xef?
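
A sketch of that proposal for the vllm backend, assuming a simple name-to-path registry; self._lora_paths and the fixed id are illustrative, and LoRARequest field names can vary by vLLM version:

```python
def load_lora(self, lora_path, lora_name="current_lora"):
    # In the vLLM backend, "loading" just records the adapter's path.
    self._lora_paths[lora_name] = lora_path

def set_lora(self, lora_name="current_lora"):
    # Same signature as the hf backend: look up the registered path.
    self.lora_request = LoRARequest(
        lora_name=lora_name, lora_int_id=1, lora_path=self._lora_paths[lora_name]
    )
```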

Contributor Author

I renamed load_lora to add_new_lora and added it to vllm, where it generates a hashed id for the LoRA adapter (the id, LoRA name, and LoRA path are all required in the vLLM request). So now both vllm and hf have a common interface.
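
A hypothetical illustration of the resulting common interface (backend instances and adapter names are placeholders):

```python
for llm in (vllm_backend, hf_backend):  # both now expose the same three methods
    llm.add_new_lora("path/to/adapter", lora_name="my_adapter")
    llm.set_lora("my_adapter")
    # ... generate with the adapter active ...
    llm.clear_lora()
```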

@vicky-xef requested a review from benlebrun on January 22, 2026 12:16
@benlebrun (Member) left a comment

Looks awesome @vicky-xef !

@vicky-xef merged commit c305a35 into main on Jan 30, 2026
5 checks passed