Document how to profile multiple LoRA adapters (#676)

the-david-oy · web-flow · commit 7df8f6736c59 · 2024-05-24T15:13:09.000-07:00
diff --git a/src/c++/perf_analyzer/genai-perf/README.md b/src/c++/perf_analyzer/genai-perf/README.md
@@ -345,14 +345,16 @@ Show the help message and exit.
 ##### `-m <list>`
 ##### `--model <list>`
 
-The name of the model to benchmark. (default: `None`)
+The names of the models to benchmark.
+A single model is recommended, unless you are
+[profiling multiple LoRA adapters](docs/lora.md). (default: `None`)
 
 ##### `--model-selection-strategy {round_robin, random}`
 
-When multiple model are specified, this is how a specific model
-should be assigned to a prompt.  round_robin means that ith prompt in the
-list gets assigned to i mod len(models).  random means that assignment is
-uniformly random (default: `round_robin`)
+When multiple models are specified, this determines how a model
+is assigned to each prompt. `round_robin` assigns models in the order
+listed, one per prompt, wrapping around after the last model. `random`
+assigns a model uniformly at random. (default: `round_robin`)
 
 ##### `--backend {tensorrtllm,vllm}`
 
diff --git a/src/c++/perf_analyzer/genai-perf/docs/lora.md b/src/c++/perf_analyzer/genai-perf/docs/lora.md
@@ -0,0 +1,53 @@
+<!--
+Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+ * Redistributions of source code must retain the above copyright
+   notice, this list of conditions and the following disclaimer.
+ * Redistributions in binary form must reproduce the above copyright
+   notice, this list of conditions and the following disclaimer in the
+   documentation and/or other materials provided with the distribution.
+ * Neither the name of NVIDIA CORPORATION nor the names of its
+   contributors may be used to endorse or promote products derived
+   from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
+EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+-->
+
+# Profiling Multiple LoRA Adapters
+GenAI-Perf allows you to profile multiple LoRA adapters on top of a base model.
+
+## Selecting LoRA Adapters
+To profile multiple adapters, list them after the model name option `-m`:
+
+```bash
+genai-perf -m lora_adapter1 lora_adapter2 lora_adapter3
+```
+
+## Choosing a Strategy for Selecting Models
+When profiling with multiple models, you can specify how the models should be
+assigned to prompts using the `--model-selection-strategy` option:
+
+```bash
+genai-perf \
+    -m lora_adapter1 lora_adapter2 lora_adapter3 \
+    --model-selection-strategy round_robin
+```
+
+This setup cycles through `lora_adapter1`, `lora_adapter2`, and
+`lora_adapter3` in order, assigning one model to each prompt.
+
+For more details on additional options and configurations, refer to the
+[Command Line Options section](../README.md#command-line-options) in the README.
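
Note on the two selection strategies: the removed README text defined round robin as assigning the i-th prompt to model `i mod len(models)`. The sketch below illustrates that rule alongside uniform random assignment. It is not GenAI-Perf's implementation; the function name `assign_models` and its parameters (including `seed`) are hypothetical and exist only for this illustration.

```python
import random


def assign_models(models, num_prompts, strategy="round_robin", seed=None):
    """Return one model name per prompt (hypothetical helper, not GenAI-Perf code)."""
    if not models:
        raise ValueError("at least one model name is required")

    if strategy == "round_robin":
        # The i-th prompt gets model i mod len(models), so models take turns in order.
        return [models[i % len(models)] for i in range(num_prompts)]

    if strategy == "random":
        # Each prompt independently gets a uniformly random model.
        rng = random.Random(seed)  # seed is an assumption, added for repeatability
        return [rng.choice(models) for _ in range(num_prompts)]

    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    adapters = ["lora_adapter1", "lora_adapter2", "lora_adapter3"]
    for i, name in enumerate(assign_models(adapters, 5)):
        print(f"prompt {i} -> {name}")
```

With three adapters and five prompts, round robin wraps back to the first adapter on the fourth prompt, so each adapter receives a near-equal share of requests. Random assignment only evens out over a large number of prompts.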