Commit b556110

PA Migration: Doc Updates (#925)
1 parent 0094dba

5 files changed: +13, -13 lines

docs/README.md

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 <!--
-Copyright (c) 2020-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+Copyright (c) 2020-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
@@ -47,4 +47,4 @@ The User Guide describes how to configure Model Analyzer, choose launch and sear

 The following resources are recommended:

-- [Perf Analyzer](https://github.com/triton-inference-server/client/blob/main/src/c++/perf_analyzer/README.md): Perf Analyzer is a CLI application built to generate inference requests and measures the latency of those requests and throughput of the model being served.
+- [Perf Analyzer](https://github.com/triton-inference-server/perf_analyzer/blob/main/README.md): Perf Analyzer is a CLI application built to generate inference requests and measures the latency of those requests and throughput of the model being served.

docs/config.md

Lines changed: 4 additions & 4 deletions
@@ -1,5 +1,5 @@
 <!--
-Copyright (c) 2020-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+Copyright (c) 2020-2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
@@ -718,7 +718,7 @@ but profile `model_2` using GPU.
 This field allows the user to pass `perf_analyzer` any CLI options it needs to
 execute properly. Refer to [the
 `perf_analyzer`
-docs](https://github.com/triton-inference-server/client/blob/main/src/c++/perf_analyzer/README.md)
+docs](https://github.com/triton-inference-server/perf_analyzer/blob/main/README.md)
 for more information on these options.

 ### Global options to apply to all instances of Perf Analyzer
@@ -779,7 +779,7 @@ perf_analyzer_flags:
 If a model configuration has variable-sized dimensions in the inputs section,
 then the `shape` option of the `perf_analyzer_flags` option must be specified.
 More information about this can be found in the
-[Perf Analyzer documentation](https://github.com/triton-inference-server/client/blob/main/src/c++/perf_analyzer/README.md#input-data).
+[Perf Analyzer documentation](https://github.com/triton-inference-server/perf_analyzer/blob/main/docs/input_data.md).

 ### SSL Support:

@@ -810,7 +810,7 @@ profile_models:
 ```

 More information about this can be found in the
-[Perf Analyzer documentation](https://github.com/triton-inference-server/client/blob/main/src/c++/perf_analyzer/README.md#ssltls-support).
+[Perf Analyzer documentation](https://github.com/triton-inference-server/perf_analyzer/blob/main/docs/measurements_metrics.md#ssltls-support).

 #### **Important Notes**:
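
For readers following the `shape` requirement mentioned in the hunk above, here is a minimal sketch of a Model Analyzer YAML config that passes Perf Analyzer options through `perf_analyzer_flags`; the model name and tensor names/dimensions are hypothetical placeholders, so adapt them to your own model configuration:

```yaml
# Minimal sketch: pass Perf Analyzer CLI options via perf_analyzer_flags.
# "add_sub" and the INPUT tensor names/shapes are hypothetical placeholders.
profile_models:
  add_sub:
    perf_analyzer_flags:
      # Required when the model config declares variable-sized input dimensions
      shape:
        - INPUT0:1024
        - INPUT1:1024
```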

docs/config_search.md

Lines changed: 4 additions & 4 deletions
@@ -98,7 +98,7 @@ It has two modes:

 The parameters that are automatically searched are
 [model maximum batch size](https://github.com/triton-inference-server/server/blob/master/docs/user_guide/model_configuration.md#maximum-batch-size),
-[model instance groups](https://github.com/triton-inference-server/server/blob/master/docs/user_guide/model_configuration.md#instance-groups), and [request concurrencies](https://github.com/triton-inference-server/server/blob/master/docs/user_guide/perf_analyzer.md#request-concurrency).
+[model instance groups](https://github.com/triton-inference-server/server/blob/master/docs/user_guide/model_configuration.md#instance-groups), and [request concurrencies](https://github.com/triton-inference-server/perf_analyzer/blob/main/docs/cli.md#measurement-options).
 Additionally, [dynamic_batching](https://github.com/triton-inference-server/server/blob/master/docs/user_guide/model_configuration.md#dynamic-batcher) will be enabled if it is legal to do so.

 _An example model analyzer YAML config that performs an Automatic Brute Search:_
@@ -128,13 +128,13 @@ You can also modify the minimum/maximum values that the automatic search space w

 ---

-### [Request Concurrency Search Space](https://github.com/triton-inference-server/client/blob/main/src/c++/perf_analyzer/docs/inference_load_modes.md#concurrency-mode))
+### [Request Concurrency Search Space](https://github.com/triton-inference-server/perf_analyzer/blob/main/docs/inference_load_modes.md#concurrency-mode)

 - `Default:` 1 to 1024 concurrencies, sweeping over powers of 2 (i.e. 1, 2, 4, 8, ...)
 - `--run-config-search-min-concurrency: <val>`: Changes the request concurrency minimum automatic search space value
 - `--run-config-search-max-concurrency: <val>`: Changes the request concurrency maximum automatic search space value

-### [Request Rate Search Space](https://github.com/triton-inference-server/client/blob/main/src/c++/perf_analyzer/docs/inference_load_modes.md#request-rate-mode)
+### [Request Rate Search Space](https://github.com/triton-inference-server/perf_analyzer/blob/main/docs/inference_load_modes.md#request-rate-mode)

 - `Default:` 1 to 1024 concurrencies, sweeping over powers of 2 (i.e. 1, 2, 4, 8, ...)
 - `--run-config-search-min-request-rate: <val>`: Changes the request rate minimum automatic search space value
@@ -422,7 +422,7 @@ _This mode has the following limitations:_

 - Summary/Detailed reports do not include the new metrics

-In order to profile LLMs you must tell MA that the model type is LLM by setting `--model-type LLM` in the CLI/config file. You can specify CLI options to the GenAI-Perf tool using `genai_perf_flags`. See the [GenAI-Perf CLI](https://github.com/triton-inference-server/client/blob/main/src/c%2B%2B/perf_analyzer/genai-perf/README.md#cli) documentation for a list of the flags that can be specified.
+In order to profile LLMs you must tell MA that the model type is LLM by setting `--model-type LLM` in the CLI/config file. You can specify CLI options to the GenAI-Perf tool using `genai_perf_flags`. See the [GenAI-Perf CLI](https://github.com/triton-inference-server/perf_analyzer/blob/main/genai-perf/README.md#command-line-options) documentation for a list of the flags that can be specified.

 LLMs can be optimized using either Quick or Brute search mode.
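
As a usage illustration of the search-space bullets in the hunk above, a brief sketch of narrowing the concurrency sweep in a Model Analyzer YAML config; the bounds, path, and model name are arbitrary examples, and the keys assume Model Analyzer's usual CLI-flag-to-YAML-key mapping:

```yaml
# Sketch: narrow the automatic brute-search concurrency sweep.
# Path and model name are hypothetical; values are arbitrary examples.
model_repository: /path/to/model/repository
profile_models:
  - my_model
run_config_search_min_concurrency: 2
run_config_search_max_concurrency: 64
```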

docs/metrics.md

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 <!--
-Copyright (c) 2020-2021 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+Copyright (c) 2020-2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
@@ -24,7 +24,7 @@ tags, which are used in various places to configure Model Analyzer.

 These metrics come from the perf analyzer and are parsed and processed by the
 model analyzer. See the [perf analyzer
-docs](https://github.com/triton-inference-server/client/blob/main/src/c++/perf_analyzer/README.md)
+docs](https://github.com/triton-inference-server/perf_analyzer/blob/main/README.md)
 for more info on these

 * `perf_throughput`: The number of inferences per second measured by the perf
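
Since the hunk above notes that these metric tags configure Model Analyzer, a hedged sketch of one such use; the model name is a placeholder and the objectives/constraints schema is assumed from Model Analyzer's config documentation:

```yaml
# Sketch: using the perf_throughput metric tag as a profiling objective.
# "model_1" is a placeholder; schema assumed from Model Analyzer's config docs.
profile_models:
  model_1:
    objectives:
      - perf_throughput
    constraints:
      perf_latency_p99:
        max: 100   # milliseconds
```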

docs/model_types.md

Lines changed: 1 addition & 1 deletion
@@ -119,7 +119,7 @@ _Profiling this model type has the following limitations:_

 - Summary/Detailed reports do not include the new metrics

-In order to profile LLMs you must tell MA that the model type is LLM by setting `--model-type LLM` in the CLI/config file. You can specify CLI options to the GenAI-Perf tool using `genai_perf_flags`. See the [GenAI-Perf CLI](https://github.com/triton-inference-server/client/blob/main/src/c%2B%2B/perf_analyzer/genai-perf/README.md#cli) documentation for a list of the flags that can be specified.
+In order to profile LLMs you must tell MA that the model type is LLM by setting `--model-type LLM` in the CLI/config file. You can specify CLI options to the GenAI-Perf tool using `genai_perf_flags`. See the [GenAI-Perf CLI](https://github.com/triton-inference-server/perf_analyzer/blob/main/genai-perf/README.md) documentation for a list of the flags that can be specified.

 LLMs can be optimized using either Quick or Brute search mode.
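
To illustrate the LLM profiling setup described in the hunk above, a minimal sketch of a config; `model_type: LLM` mirrors the `--model-type` CLI flag, the path and model name are hypothetical, and the `genai_perf_flags` entry is illustrative, so check the GenAI-Perf CLI docs linked above for the supported flag names:

```yaml
# Sketch: profiling an LLM with Model Analyzer.
# Path and model name are hypothetical; genai_perf_flags entry is illustrative.
model_repository: /path/to/model/repository
model_type: LLM
profile_models:
  - my_llm
genai_perf_flags:
  streaming: true
```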
