Commit 7df8f67

Document how to profile multiple LoRA adapters (#676)
1 parent 9612fbe commit 7df8f67

2 files changed: +60 -5 lines changed

src/c++/perf_analyzer/genai-perf/README.md

Lines changed: 7 additions & 5 deletions
@@ -345,14 +345,16 @@ Show the help message and exit.
 ##### `-m <list>`
 ##### `--model <list>`

-The name of the model to benchmark. (default: `None`)
+The names of the models to benchmark.
+A single model is recommended, unless you are
+[profiling multiple LoRA adapters](docs/lora.md). (default: `None`)

 ##### `--model-selection-strategy {round_robin, random}`

-When multiple model are specified, this is how a specific model
-should be assigned to a prompt. round_robin means that ith prompt in the
-list gets assigned to i mod len(models). random means that assignment is
-uniformly random (default: `round_robin`)
+When multiple models are specified, this is how a specific model
+is assigned to a prompt. Round robin means that each model receives
+a request in order. Random means that assignment is uniformly random
+(default: `round_robin`)

 ##### `--backend {tensorrtllm,vllm}`

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
+<!--
+Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+* Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in the
+documentation and/or other materials provided with the distribution.
+* Neither the name of NVIDIA CORPORATION nor the names of its
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
+EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+-->
+
+# Profiling Multiple LoRA Adapters
+GenAI-Perf allows you to profile multiple LoRA adapters on top of a base model.
+
+## Selecting LoRA Adapters
+To do this, list multiple adapters after the model name option `-m`:
+
+```bash
+genai-perf -m lora_adapter1 lora_adapter2 lora_adapter3
+```
+
+## Choosing a Strategy for Selecting Models
+When profiling with multiple models, you can specify how the models should be
+assigned to prompts using the `--model-selection-strategy` option:
+
+```bash
+genai-perf \
+-m lora_adapter1 lora_adapter2 lora_adapter3 \
+--model-selection-strategy round_robin
+```
+
+This setup will cycle through the lora_adapter1, lora_adapter2, and
+lora_adapter3 models in a round-robin manner for each prompt.
+
+For more details on additional options and configurations, refer to the
+[Command Line Options section](../README.md#command-line-options) in the README.
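
The two `--model-selection-strategy` values described in this change can be illustrated with a minimal Python sketch. This is not GenAI-Perf's actual implementation: the `assign_models` helper and its `seed` parameter are hypothetical names introduced here. The round-robin branch follows the rule stated in the removed README text (the ith prompt is assigned to model `i mod len(models)`), and the random branch picks a model uniformly for each prompt.

```python
import random


def assign_models(prompts, models, strategy="round_robin", seed=None):
    """Pair each prompt with a model name (hypothetical helper, for illustration only)."""
    if not models:
        raise ValueError("at least one model name is required")

    if strategy == "round_robin":
        # The ith prompt goes to model i mod len(models), so models repeat in order.
        return [(prompt, models[i % len(models)]) for i, prompt in enumerate(prompts)]

    if strategy == "random":
        # Each prompt independently gets a uniformly chosen model.
        rng = random.Random(seed)
        return [(prompt, rng.choice(models)) for prompt in prompts]

    raise ValueError(f"unknown strategy: {strategy}")


adapters = ["lora_adapter1", "lora_adapter2", "lora_adapter3"]
for prompt, model in assign_models(["p0", "p1", "p2", "p3"], adapters):
    print(prompt, "->", model)
```

With three adapters and four prompts, the round-robin branch wraps around, so the fourth prompt is paired with `lora_adapter1` again.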
