Hi. I built and ran a model-analyzer image following the "Specific Version with Local Launch Mode" approach (described here), as follows:
git clone https://github.com/triton-inference-server/model_analyzer.git -b r24.11
cd ./model_analyzer
docker build --pull -t model-analyzer .
docker run -it --gpus all -v /var/run/docker.sock:/var/run/docker.sock -v /home/qrf/workarea/elsayed/model_repository:/workspace/model_repo -v /home/qrf/workarea/elsayed/triton_outputs:/workspace/triton_outputs --net=host model-analyzer
The build completed without any errors. Then I tried to run model-analyzer inside the container with the following command:
model-analyzer profile -f config.yaml
where config.yaml is:
# Path to the Triton Model Repository
model_repository: /workspace/model_repo

# List of the model names to be profiled <comma-delimited-string-list>
profile_models:
  fingerprinting:
    model_config_parameters:
      dynamic_batching:
        max_queue_delay_microseconds: [100, 200, 300]
    parameters:
      concurrency:
        start: 2
        stop: 10
        step: 4
      batch_sizes:
        start: 1
        stop: 10
        step: 2

# ALL THE FOLLOWING FIELDS ARE OPTIONAL

# The directory to which the model analyzer will save model config variants
output_model_repository_path: ./results

# Allow model analyzer to overwrite contents of the output model repository
override_output_model_repository: true

# Export path to be used
export_path: ./profile_results

# Specifies the maximum number of retries for any retry attempt (default: 50)
client_max_retries: 5

# The protocol used to communicate with the Triton Inference Server. Only 'http' and 'grpc' are allowed for the values (default: grpc)
client_protocol: grpc

triton_launch_mode: docker
However, I get the following error:
root@qrf-general:/workspace# model-analyzer profile -f config.yaml
[Model Analyzer] Initializing GPUDevice handles
[Model Analyzer] Using GPU 0 Tesla P4 with UUID GPU-811ab55d-aa6f-ec34-3bda-25a949fc9bf4
[Model Analyzer] WARNING: Overriding the output model repo path "./results"
[Model Analyzer] Starting a Triton Server using docker
[Model Analyzer] No checkpoint file found, starting a fresh run.
[Model Analyzer] Profiling server only metrics...
Traceback (most recent call last):
File "/usr/local/lib/python3.12/dist-packages/model_analyzer/triton/client/client.py", line 60, in wait_for_server_ready
if self._client.is_server_ready():
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tritonclient/grpc/_client.py", line 344, in is_server_ready
raise_error_grpc(rpc_error)
File "/usr/local/lib/python3.12/dist-packages/tritonclient/grpc/_utils.py", line 77, in raise_error_grpc
raise get_error_grpc(rpc_error) from None
tritonclient.utils.InferenceServerException: [StatusCode.UNAVAILABLE] failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:8001: Failed to connect to remote host: connect: Connection refused (111)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/model-analyzer", line 8, in <module>
sys.exit(main())
^^^^^^
File "/usr/local/lib/python3.12/dist-packages/model_analyzer/entrypoint.py", line 278, in main
analyzer.profile(
File "/usr/local/lib/python3.12/dist-packages/model_analyzer/analyzer.py", line 130, in profile
self._get_server_only_metrics(client, gpus)
File "/usr/local/lib/python3.12/dist-packages/model_analyzer/analyzer.py", line 229, in _get_server_only_metrics
client.wait_for_server_ready(
File "/usr/local/lib/python3.12/dist-packages/model_analyzer/triton/client/client.py", line 72, in wait_for_server_ready
raise TritonModelAnalyzerException(e)
model_analyzer.model_analyzer_exceptions.TritonModelAnalyzerException: [StatusCode.UNAVAILABLE] failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:8001: Failed to connect to remote host: connect: Connection refused (111)
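For context, the failing call in the traceback is just the gRPC readiness probe that model-analyzer issues against the Triton server it launched. A minimal standalone sketch of that same check (assuming Triton's default gRPC endpoint 127.0.0.1:8001, the address shown in the error) would be:

# Minimal readiness probe, assuming the default Triton gRPC endpoint 127.0.0.1:8001
# (the address reported in the error above). This mirrors wait_for_server_ready.
import tritonclient.grpc as grpcclient
from tritonclient.utils import InferenceServerException

client = grpcclient.InferenceServerClient(url="127.0.0.1:8001")
try:
    print("server ready:", client.is_server_ready())
except InferenceServerException as e:
    # If the Triton container is not reachable, this raises StatusCode.UNAVAILABLE,
    # matching the "Connection refused" failure in the traceback.
    print("cannot reach Triton:", e)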