[0.34.0-dlc][python] Support adapters in async vLLM handler#2901
ethnzhng merged 20 commits into 0.34.0-dlc
Co-authored-by: Ubuntu <ubuntu@ip-172-31-41-102.us-west-2.compute.internal>
Co-authored-by: Suma Kasa <sumakasa@amazon.com>
@ethnzhng What happens to existing LoRA adapters if the engine, worker, or container restarts?
Upon engine restart, previously registered adapters which were loaded from a location other than
Adapters saved in /opt/ml/model/adapters will be restored.
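For readers following the restart discussion, here is a minimal sketch of what restoring adapters from a directory could look like. It is not the code from this PR: `discover_adapters`, `restore_adapters`, and the `register` callback are hypothetical names, and the one-subdirectory-per-adapter layout is an assumption.

```python
from pathlib import Path


def discover_adapters(adapters_dir):
    """Map adapter name -> path for each subdirectory of adapters_dir.

    Hypothetical helper; assumes each adapter lives in its own subdirectory.
    """
    root = Path(adapters_dir)
    if not root.is_dir():
        # Nothing was persisted, so there is nothing to restore.
        return {}
    return {p.name: str(p) for p in sorted(root.iterdir()) if p.is_dir()}


def restore_adapters(adapters_dir, register):
    """Call register(name, path) for every discovered adapter.

    `register` stands in for whatever the engine uses to load an adapter.
    Returns the names that were restored, in sorted order.
    """
    restored = []
    for name, path in discover_adapters(adapters_dir).items():
        register(name, path)
        restored.append(name)
    return restored
```

On startup, something like `restore_adapters("/opt/ml/model/adapters", register=load_fn)` would re-register whatever is saved there, with `load_fn` being a placeholder for the engine's own loading call.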
Description
huggingface.py handler
Type of change
Please delete options that are not relevant.
Checklist:
pytest tests.py -k "TestCorrectnessLmiDist" -m "lmi_dist"
Feature/Issue validation/testing
Please describe the Unit or Integration tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.