
Commit 9ad856c

chore: Purge PA from Client Repo (#7488)
* PA Migration: Update server docs and tests
1 parent cca12f9

16 files changed: 53 additions & 42 deletions

README.md

Lines changed: 1 addition & 1 deletion
@@ -179,7 +179,7 @@ configuration](docs/user_guide/model_configuration.md) for the model.
 [Backend-Platform Support Matrix](https://github.com/triton-inference-server/backend/blob/main/docs/backend_platform_support_matrix.md)
 to learn which backends are supported on your target platform.
 - Learn how to [optimize performance](docs/user_guide/optimization.md) using the
-[Performance Analyzer](https://github.com/triton-inference-server/client/blob/main/src/c++/perf_analyzer/README.md)
+[Performance Analyzer](https://github.com/triton-inference-server/perf_analyzer/blob/main/README.md)
 and
 [Model Analyzer](https://github.com/triton-inference-server/model_analyzer)
 - Learn how to [manage loading and unloading models](docs/user_guide/model_management.md) in

deploy/gke-marketplace-app/README.md

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 <!--
-# Copyright (c) 2021-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# Copyright (c) 2021-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 #
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions
@@ -172,7 +172,7 @@ The client example push about ~650 QPS(Query per second) to Triton Server, and w
 ![Locust Client Chart](client.png)

 Alternatively, user can opt to use
-[Perf Analyzer](https://github.com/triton-inference-server/client/blob/main/src/c++/perf_analyzer/README.md)
+[Perf Analyzer](https://github.com/triton-inference-server/perf_analyzer/blob/main/README.md)
 to profile and study the performance of Triton Inference Server. Here we also
 provide a
 [client script](https://github.com/triton-inference-server/server/tree/master/deploy/gke-marketplace-app/client-sample/perf_analyzer_grpc.sh)

deploy/k8s-onprem/README.md

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 <!--
-# Copyright (c) 2018-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# Copyright (c) 2018-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 #
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions
@@ -295,7 +295,7 @@ Image 'images/mug.jpg':
 After you have confirmed that your Triton cluster is operational and can perform inference,
 you can test the load balancing and autoscaling features by sending a heavy load of requests.
 One option for doing this is using the
-[perf_analyzer](https://github.com/triton-inference-server/client/blob/main/src/c++/perf_analyzer/README.md)
+[perf_analyzer](https://github.com/triton-inference-server/perf_analyzer/blob/main/README.md)
 application.

 You can apply a progressively increasing load with a command like:

docs/README.md

Lines changed: 3 additions & 3 deletions
@@ -1,5 +1,5 @@
 <!--
-# Copyright 2018-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# Copyright 2018-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 #
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions
@@ -173,7 +173,7 @@ Understanding Inference performance is key to better resource utilization. Use T
 - [Performance Tuning Guide](user_guide/performance_tuning.md)
 - [Optimization](user_guide/optimization.md)
 - [Model Analyzer](user_guide/model_analyzer.md)
-- [Performance Analyzer](https://github.com/triton-inference-server/client/blob/main/src/c++/perf_analyzer/README.md)
+- [Performance Analyzer](https://github.com/triton-inference-server/perf_analyzer/blob/main/README.md)
 - [Inference Request Tracing](user_guide/trace.md)
 ### Jetson and JetPack
 Triton can be deployed on edge devices. Explore [resources](user_guide/jetson.md) and [examples](examples/jetson/README.md).
@@ -185,7 +185,7 @@ The following resources are recommended to explore the full suite of Triton Infe

 - **Configuring Deployment**: Triton comes with three tools which can be used to configure deployment setting, measure performance and recommend optimizations.
 - [Model Analyzer](https://github.com/triton-inference-server/model_analyzer) Model Analyzer is CLI tool built to recommend deployment configurations for Triton Inference Server based on user's Quality of Service Requirements. It also generates detailed reports about model performance to summarize the benefits and trade offs of different configurations.
-- [Perf Analyzer](https://github.com/triton-inference-server/client/blob/main/src/c++/perf_analyzer/README.md):
+- [Perf Analyzer](https://github.com/triton-inference-server/perf_analyzer/blob/main/README.md):
 Perf Analyzer is a CLI application built to generate inference requests and
 measures the latency of those requests and throughput of the model being
 served.

docs/contents.md

Lines changed: 19 additions & 12 deletions
@@ -1,5 +1,5 @@
 <!--
-# Copyright 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# Copyright 2022-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 #
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions
@@ -119,17 +119,24 @@ client/src/grpc_generated/java/README
 :maxdepth: 1
 :caption: Performance Analyzer

-client/src/c++/perf_analyzer/README
-client/src/c++/perf_analyzer/docs/README
-client/src/c++/perf_analyzer/docs/install
-client/src/c++/perf_analyzer/docs/quick_start
-client/src/c++/perf_analyzer/docs/cli
-client/src/c++/perf_analyzer/docs/inference_load_modes
-client/src/c++/perf_analyzer/docs/input_data
-client/src/c++/perf_analyzer/docs/measurements_metrics
-client/src/c++/perf_analyzer/docs/benchmarking
-client/src/c++/perf_analyzer/genai-perf/README
-client/src/c++/perf_analyzer/genai-perf/examples/tutorial
+perf_analyzer/README
+perf_analyzer/docs/README
+perf_analyzer/docs/install
+perf_analyzer/docs/quick_start
+perf_analyzer/docs/cli
+perf_analyzer/docs/inference_load_modes
+perf_analyzer/docs/input_data
+perf_analyzer/docs/measurements_metrics
+perf_analyzer/docs/benchmarking
+perf_analyzer/genai-perf/README
+perf_analyzer/genai-perf/docs/compare
+perf_analyzer/genai-perf/docs/embeddings
+perf_analyzer/genai-perf/docs/files
+perf_analyzer/genai-perf/docs/lora
+perf_analyzer/genai-perf/docs/multi_modal
+perf_analyzer/genai-perf/docs/rankings
+perf_analyzer/genai-perf/docs/tutorial
+perf_analyzer/genai-perf/examples/tutorial
 ```

 ```{toctree}

docs/examples/jetson/README.md

Lines changed: 3 additions & 3 deletions
@@ -1,5 +1,5 @@
 <!--
-# Copyright (c) 2021-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# Copyright (c) 2021-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 #
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions
@@ -53,7 +53,7 @@ Inference Server as a shared library.
 ## Part 2. Analyzing model performance with perf_analyzer

 To analyze model performance on Jetson,
-[perf_analyzer](https://github.com/triton-inference-server/client/blob/main/src/c++/perf_analyzer/README.md)
+[perf_analyzer](https://github.com/triton-inference-server/perf_analyzer/blob/main/README.md)
 tool is used. The `perf_analyzer` is included in the release tar file or can be
 compiled from source.

@@ -65,4 +65,4 @@ From this directory of the repository, execute the following to evaluate model p

 In the example above we saved the results as a `.csv` file. To visualize these
 results, follow the steps described
-[here](https://github.com/triton-inference-server/client/blob/main/src/c%2B%2B/perf_analyzer/README.md).
+[here](https://github.com/triton-inference-server/perf_analyzer/blob/main/README.md).

docs/generate_docs.py

Lines changed: 4 additions & 0 deletions
@@ -388,6 +388,10 @@ def main():
     if "client" in repo_tags:
         clone_from_github("client", repo_tags["client"], github_org)

+    # Usage generate_docs.py --repo-tag=perf_analyzer:main
+    if "perf_analyzer" in repo_tags:
+        clone_from_github("perf_analyzer", repo_tags["perf_analyzer"], github_org)
+
     # Usage generate_docs.py --repo-tag=python_backend:main
     if "python_backend" in repo_tags:
         clone_from_github("python_backend", repo_tags["python_backend"], github_org)

docs/user_guide/debugging_guide.md

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 <!--
-# Copyright 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# Copyright 2023-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 #
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions
@@ -59,7 +59,7 @@ Before proceeding, please see if the model configuration documentation [here](./
 - [Custom_models](https://github.com/triton-inference-server/server/tree/main/qa/custom_models), [ensemble_models](https://github.com/triton-inference-server/server/tree/main/qa/ensemble_models), and [python_models](https://github.com/triton-inference-server/server/tree/main/qa/python_models) include examples of configs for their respective use cases.
 - [L0_model_config](https://github.com/triton-inference-server/server/tree/main/qa/L0_model_config) tests many types of incomplete model configs.

-Note that if you are running into an issue with [perf_analyzer](https://github.com/triton-inference-server/client/blob/main/src/c%2B%2B/perf_analyzer/README.md) or [Model Analyzer](https://github.com/triton-inference-server/model_analyzer), try loading the model onto Triton directly. This checks if the configuration is incorrect or the perf_analyzer or Model Analyzer options need to be updated.
+Note that if you are running into an issue with [perf_analyzer](https://github.com/triton-inference-server/perf_analyzer/blob/main/README.md) or [Model Analyzer](https://github.com/triton-inference-server/model_analyzer), try loading the model onto Triton directly. This checks if the configuration is incorrect or the perf_analyzer or Model Analyzer options need to be updated.

 ## Model Issues
 **Step 1. Run Models Outside of Triton**

docs/user_guide/faq.md

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 <!--
-# Copyright 2019-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# Copyright 2019-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 #
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions
@@ -99,7 +99,7 @@ available through the [HTTP/REST, GRPC, and C
 APIs](../customization_guide/inference_protocols.md).

 A client application,
-[perf_analyzer](https://github.com/triton-inference-server/client/blob/main/src/c++/perf_analyzer/README.md),
+[perf_analyzer](https://github.com/triton-inference-server/perf_analyzer/blob/main/README.md),
 allows you to measure the performance of an individual model using a synthetic
 load. The perf_analyzer application is designed to show you the tradeoff of
 latency vs. throughput.
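
To make the latency/throughput tradeoff mentioned in this hunk concrete, here is a minimal sketch of driving such a sweep from Python. The model name is a hypothetical placeholder, and it assumes `perf_analyzer` is on `PATH` with a server already running; `--concurrency-range` is perf_analyzer's documented start:end:step sweep flag, but this invocation is not taken from the commit:

```python
# Illustrative only: sweep request concurrency so perf_analyzer reports
# latency and throughput at each level, exposing the tradeoff between them.
import subprocess

subprocess.run(
    [
        "perf_analyzer",
        "-m", "densenet_onnx",           # hypothetical model name
        "--concurrency-range", "1:8:2",  # concurrency 1, 3, 5, 7
    ],
    check=True,
)
```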

docs/user_guide/jetson.md

Lines changed: 1 addition & 1 deletion
@@ -201,7 +201,7 @@ tritonserver --model-repository=/path/to/model_repo --backend-directory=/path/to
 ```

 **Note**:
-[perf_analyzer](https://github.com/triton-inference-server/client/blob/main/src/c++/perf_analyzer/README.md)
+[perf_analyzer](https://github.com/triton-inference-server/perf_analyzer/blob/main/README.md)
 is supported on Jetson, while the [model_analyzer](model_analyzer.md) is
 currently not available for Jetson. To execute `perf_analyzer` for C API, use
 the CLI flag `--service-kind=triton_c_api`:
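
The hunk ends at the colon, so jetson.md's actual example command is not shown in this diff. For orientation only, a C API invocation could look roughly like the sketch below; the model name and server directory are assumptions, and the flags reflect perf_analyzer's documented C API mode rather than anything in this commit:

```python
# Illustrative sketch of perf_analyzer's in-process C API mode, per the
# `--service-kind=triton_c_api` note above. All paths and names are assumed.
import subprocess

subprocess.run(
    [
        "perf_analyzer",
        "-m", "resnet50",                               # hypothetical model
        "--service-kind=triton_c_api",
        "--triton-server-directory=/opt/tritonserver",  # assumed install dir
        "--model-repository=/path/to/model_repo",
    ],
    check=True,
)
```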
