Hi,

I run into the following error when running model_analyzer:

```
[Model Analyzer] Model add_sub_config_default load failed: [StatusCode.INTERNAL] failed to load 'add_sub', failed to poll from model repository
```

It fails in both cases:

- `--model-repository /workspace/examples/quick-start`
- `--model-repository /workspace/examples/quick-start/add_sub`
I followed this tutorial: https://github.com/triton-inference-server/model_analyzer/blob/r24.12/docs/quick_start.md
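For reference, the quick-start repository follows the standard Triton layout (a `config.pbtxt` plus a numbered version directory per model), which I believe is what the tutorial assumes. It can be checked from inside the SDK container:

```bash
# Standard Triton layout assumed by the tutorial (paths from the sparse checkout):
find /workspace/examples/quick-start -maxdepth 3
# expected, roughly:
#   /workspace/examples/quick-start/add_sub
#   /workspace/examples/quick-start/add_sub/config.pbtxt
#   /workspace/examples/quick-start/add_sub/1
#   /workspace/examples/quick-start/add_sub/1/model.py
```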
Steps to reproduce the issue:
```bash
mkdir model_analyzer_test && cd model_analyzer_test
git init && git remote add -f origin https://github.com/triton-inference-server/model_analyzer.git
git config core.sparseCheckout true && \
  echo 'examples' >> .git/info/sparse-checkout && \
  git pull origin main

docker pull nvcr.io/nvidia/tritonserver:24.12-py3-sdk
docker run -it --gpus all \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v $(pwd)/examples/quick-start:/workspace/examples/quick-start \
  --net=host nvcr.io/nvidia/tritonserver:24.12-py3-sdk
```
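One detail that may matter here: with `--triton-launch-mode=docker`, Model Analyzer talks to the mounted `docker.sock`, so (if I understand correctly) the Triton server starts as a sibling container and any bind-mount source paths are resolved against the host filesystem, not against the SDK container. In that case `/workspace/examples/quick-start` would also have to exist on the host. Two quick checks (the container id is whatever `docker ps` shows for the launched Triton container):

```bash
# On the host: does the absolute path the sibling container needs exist here?
ls /workspace/examples/quick-start

# Inspect the mounts of the Triton container that Model Analyzer started:
docker inspect --format '{{json .Mounts}}' <triton_container_id>
```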
```bash
# we are now inside the SDK container
mkdir ./output
model-analyzer profile \
  --model-repository /workspace/examples/quick-start \
  --profile-models add_sub --triton-launch-mode=docker \
  --output-model-repository-path output/add_sub_output \
  --export-path profile_results
```
Console output:

```
model-analyzer profile --model-repository /workspace/examples/quick-start --profile-models add_sub --triton-launch-mode=docker --output-model-repository-path output/add_sub_output --export-path profile_results
[Model Analyzer] Call to cuInit results in CUDA_ERROR_NO_DEVICE
[Model Analyzer] Starting a Triton Server using docker
[Model Analyzer] Loaded checkpoint from file /workspace/checkpoints/0.ckpt
[Model Analyzer] GPU devices match checkpoint - skipping server metric acquisition
[Model Analyzer]
[Model Analyzer] Starting automatic brute search
[Model Analyzer]
[Model Analyzer] Creating model config: add_sub_config_default
[Model Analyzer]
[Model Analyzer] Model add_sub_config_default load failed: [StatusCode.INTERNAL] failed to load 'add_sub', failed to poll from model repository
[Model Analyzer] Saved checkpoint to /workspace/checkpoints/1.ckpt
[Model Analyzer] Creating model config: add_sub_config_0
[Model Analyzer] Setting instance_group to [{'count': 1, 'kind': 'KIND_CPU'}]
[Model Analyzer] Setting max_batch_size to 1
[Model Analyzer] Enabling dynamic_batching
[Model Analyzer]
[Model Analyzer] Model add_sub_config_0 load failed: [StatusCode.INTERNAL] failed to load 'add_sub', failed to poll from model repository
[Model Analyzer] No changes made to analyzer data, no checkpoint saved.
Traceback (most recent call last):
  File "/usr/local/bin/model-analyzer", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/model_analyzer/entrypoint.py", line 278, in main
    analyzer.profile(
  File "/usr/local/lib/python3.12/dist-packages/model_analyzer/analyzer.py", line 131, in profile
    self._profile_models()
  File "/usr/local/lib/python3.12/dist-packages/model_analyzer/analyzer.py", line 251, in _profile_models
    self._model_manager.run_models(models=[model])
  File "/usr/local/lib/python3.12/dist-packages/model_analyzer/model_manager.py", line 157, in run_models
    self._stop_ma_if_no_valid_measurement_threshold_reached()
  File "/usr/local/lib/python3.12/dist-packages/model_analyzer/model_manager.py", line 251, in _stop_ma_if_no_valid_measurement_threshold_reached
    raise TritonModelAnalyzerException(
model_analyzer.model_analyzer_exceptions.TritonModelAnalyzerException: The first 2 attempts to acquire measurements have failed. Please examine the Tritonserver/PA error logs to determine what has gone wrong.
```
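To see what the launched Triton server can actually find, the repository can also be queried directly while the server is still up. These are standard Triton HTTP endpoints (the server runs with explicit model control, as the server log below shows):

```bash
# List what the server sees in its model repository (explicit control mode):
curl -X POST localhost:8000/v2/repository/index

# Trigger the load by hand; this should reproduce the poll error outside of
# Model Analyzer:
curl -X POST localhost:8000/v2/repository/models/add_sub/load
```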
Docker logs of the Triton Inference Server (`docker logs ac3cbc35ddec`):

```
=============================
== Triton Inference Server ==
=============================
NVIDIA Release 24.11 (build 124543091)
Triton Server Version 2.52.0
Copyright (c) 2018-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available.
Use the NVIDIA Container Toolkit to start this container with GPU support; see
https://docs.nvidia.com/datacenter/cloud-native/ .
Warning: '--strict-model-config' has been deprecated! Please use '--disable-auto-complete-config' instead.
W0416 04:18:12.745911 7 pinned_memory_manager.cc:273] "Unable to allocate pinned system memory, pinned memory pool will not be available: CUDA driver version is insufficient for CUDA runtime version"
I0416 04:18:12.745946 7 cuda_memory_manager.cc:117] "CUDA memory pool disabled"
E0416 04:18:12.746011 7 server.cc:241] "CudaDriverHelper has not been initialized."
I0416 04:18:12.746193 7 server.cc:604]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+
I0416 04:18:12.746213 7 server.cc:631]
+---------+------+--------+
| Backend | Path | Config |
+---------+------+--------+
+---------+------+--------+
I0416 04:18:12.746222 7 server.cc:674]
+-------+---------+--------+
| Model | Version | Status |
+-------+---------+--------+
+-------+---------+--------+
I0416 04:18:12.746391 7 metrics.cc:783] "Collecting CPU metrics"
I0416 04:18:12.746484 7 tritonserver.cc:2598]
+----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                                           |
+----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                                          |
| server_version                   | 2.52.0                                                                                                                                                                                                          |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data parameters statistics trace logging |
| model_repository_path[0]         | /workspace/examples/quick-start                                                                                                                                                                                 |
| model_control_mode               | MODE_EXPLICIT                                                                                                                                                                                                   |
| strict_model_config              | 0                                                                                                                                                                                                               |
| model_config_name                |                                                                                                                                                                                                                 |
| rate_limit                       | OFF                                                                                                                                                                                                             |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                                                       |
| min_supported_compute_capability | 6.0                                                                                                                                                                                                             |
| strict_readiness                 | 1                                                                                                                                                                                                               |
| exit_timeout                     | 30                                                                                                                                                                                                              |
| cache_enabled                    | 0                                                                                                                                                                                                               |
+----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0416 04:18:12.748541 7 grpc_server.cc:2558] "Started GRPCInferenceService at 0.0.0.0:8001"
I0416 04:18:12.748725 7 http_server.cc:4729] "Started HTTPService at 0.0.0.0:8000"
I0416 04:18:12.789778 7 http_server.cc:362] "Started Metrics Service at 0.0.0.0:8002"
E0416 04:18:18.666092 7 model_repository_manager.cc:1415] "failed to poll model 'add_sub': model not found in any model repository."
Signal (15) received.
I0416 04:18:18.675493 7 server.cc:305] "Waiting for in-flight requests to complete."
I0416 04:18:18.675512 7 server.cc:321] "Timeout 30: Found 0 model versions that have in-flight inferences"
I0416 04:18:18.675524 7 server.cc:336] "All models are stopped, unloading models"
I0416 04:18:18.675531 7 server.cc:345] "Timeout 30: Found 0 live models and 0 in-flight non-inference requests"
```
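A possible workaround, untested on my side: if I read the docker launch-mode notes correctly, the paths passed to Model Analyzer need to exist at the same absolute location on the host as well, so the sibling Triton container can resolve them. A sketch of what that might look like (`REPO` is my own helper variable, and placing the output repository under the shared path is an assumption):

```bash
# Untested sketch: mount the repository at the same absolute path on the host
# and in the SDK container. REPO is a hypothetical helper variable; it is
# expanded on the host and forwarded into the container via -e.
REPO="$(pwd)/examples/quick-start"
docker run -it --gpus all \
  -e REPO="$REPO" \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v "$REPO":"$REPO" \
  --net=host nvcr.io/nvidia/tritonserver:24.12-py3-sdk

# inside the container, use the identical absolute path:
model-analyzer profile \
  --model-repository "$REPO" \
  --profile-models add_sub --triton-launch-mode=docker \
  --output-model-repository-path "$REPO/output/add_sub_output" \
  --export-path profile_results
```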