Description
Describe the bug
I am following the guide for deploying a managed online endpoint here: https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-online-endpoints?view=azureml-api-2&tabs=cli
When creating the deployment locally via "az ml online-deployment create --local ...", the deployment is created but ends up in a failed state. Running "az ml online-deployment get-logs ..." shows the following error:
"FileNotFoundError: [Errno 2] No such file or directory: '/var/azureml-app/workspace/online_ep_scoring.py'"
"workspace" is the name of the folder that is the working directory during the CLI command calls, and the location of my scoring script, online_ep_scoring.py. My deployment yaml definition has the following content in code_configuration:
code_configuration:
  code: ./
  scoring_script: online_ep_scoring.py
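To make the relationship between the local layout and the failing path explicit, here is a minimal sketch of how the path in the error message appears to be composed. This is only a hypothesis inferred from the error text and the Dockerfile WORKDIR in the debug output, not confirmed CLI behavior; the function name and the `/some/path` prefix are made up for illustration.

```python
from pathlib import PurePosixPath

# Assumption: the app root inside the container, taken from the error
# message and from "WORKDIR /var/azureml-app/" in the build output.
APP_ROOT = PurePosixPath("/var/azureml-app")


def expected_container_script_path(local_code_dir: str, scoring_script: str) -> PurePosixPath:
    """Guess where the inference server looks for the scoring script.

    Hypothesis only: the last component of the local code directory is
    reused as a sub-folder of the app root inside the container.
    """
    code_dir_name = PurePosixPath(local_code_dir).name
    return APP_ROOT / code_dir_name / scoring_script


# "/some/path/workspace" is a hypothetical absolute path to the working directory.
print(expected_container_script_path("/some/path/workspace", "online_ep_scoring.py"))
```

With a working directory named "workspace", this yields the same path that the FileNotFoundError reports, which suggests the path itself is built as intended and the file is simply absent inside the container.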
Related command
az ml online-deployment create --local -n deployment1 --endpoint <ep_name> -f deployment.yaml --resource-group --workspace-name
Errors
File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/azureml_inference_server_http/server/user_script.py", line 77, in load_script
main_module_spec.loader.exec_module(user_module)
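File "<frozen importlib._bootstrap_external>", line 846, in exec_module
File "<frozen importlib._bootstrap_external>", line 982, in get_code
File "<frozen importlib._bootstrap_external>", line 1039, in get_data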
FileNotFoundError: [Errno 2] No such file or directory: '/var/azureml-app/workspace/online_ep_scoring.py'
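One way to narrow this down is to look at what the local container actually has mounted. The sketch below wraps `docker inspect` for that purpose; the container name `tag840-test2.deployment14` comes from the container-create call in the debug output, while the helper names are hypothetical. Whether the code directory reaches the container through a mount at all is an assumption here, so an empty result is also informative.

```python
import json
import subprocess


def inspect_mounts_command(container_name: str) -> list[str]:
    """Build the `docker inspect` command that prints a container's mounts as JSON."""
    return ["docker", "inspect", "--format", "{{json .Mounts}}", container_name]


def container_mounts(container_name: str) -> list[dict]:
    """Run `docker inspect` and return the list of mount entries (possibly empty)."""
    result = subprocess.run(
        inspect_mounts_command(container_name),
        capture_output=True,
        text=True,
        check=True,  # raise if docker is unavailable or the container does not exist
    )
    # Docker prints "null" when there are no mounts; normalize that to [].
    return json.loads(result.stdout) or []


# Example usage (requires Docker and the local deployment's container):
# for mount in container_mounts("tag840-test2.deployment14"):
#     print(mount["Source"], "->", mount["Destination"])
```

If a mount with a destination under /var/azureml-app/ is listed, comparing its source path against the directory that really holds online_ep_scoring.py would show whether the container sees the same files as the CLI.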
Issue script & Debug output
Building Docker image from DockerfileDEBUG: docker.api.build: Looking for auth config
DEBUG: docker.api.build: Sending auth config ()
......DEBUG: urllib3.connectionpool: http://localhost:None "POST /v1.47/build?t=tag840-test2%3Adeployment14&q=False&nocache=False&rm=False&forcerm=False&pull=True&dockerfile=Dockerfile HTTP/1.1" 200 None
Step 1/6 : FROM mcr.microsoft.com/azureml/curated/python-sdk-v2:latest
---> 625be0a05955
Step 2/6 : RUN mkdir -p /var/azureml-app/
---> Using cache
---> 854146e42b06
Step 3/6 : WORKDIR /var/azureml-app/
---> Using cache
---> 04e52853036d
Step 4/6 : COPY conda.yml /var/azureml-app/
---> Using cache
---> 80fc903c19a7
Step 5/6 : RUN conda env create -n inf-conda-env --file conda.yml
---> Using cache
---> e6e31f22e206
Step 6/6 : CMD ["conda", "run", "--no-capture-output", "-n", "inf-conda-env", "runsvdir", "/var/runit"]
---> Using cache
---> 3e4ee858db3c
Successfully built 3e4ee858db3c
Successfully tagged tag840-test2:deployment14
Starting up endpointDEBUG: urllib3.connectionpool: http://localhost:None "GET /v1.47/containers/json?limit=-1&all=1&size=0&trunc_cmd=0&filters=%7B%22label%22%3A+%5B%22azureml-local-endpoint%22%2C+%22endpoint%3Dtag840-test2%22%5D%7D HTTP/1.1" 200 None
DEBUG: urllib3.connectionpool: http://localhost:None "GET /v1.47/containers/ec61f14041a060385b9aa09cb3371048bcea1bcf291b4f6d9b93465c6d9ecd36/json HTTP/1.1" 200 None
DEBUG: urllib3.connectionpool: http://localhost:None "POST /v1.47/containers/ec61f14041a060385b9aa09cb3371048bcea1bcf291b4f6d9b93465c6d9ecd36/stop HTTP/1.1" 304 0
DEBUG: urllib3.connectionpool: http://localhost:None "DELETE /v1.47/containers/ec61f14041a060385b9aa09cb3371048bcea1bcf291b4f6d9b93465c6d9ecd36?v=False&link=False&force=False HTTP/1.1" 204 0
DEBUG: urllib3.connectionpool: http://localhost:None "POST /v1.47/containers/create?name=tag840-test2.deployment14 HTTP/1.1" 201 None
DEBUG: urllib3.connectionpool: http://localhost:None "GET /v1.47/containers/7fe0811341057e3087a21d7f76269aa8e3e9d07795a452f27d4e99a443c6df30/json HTTP/1.1" 200 None
DEBUG: urllib3.connectionpool: http://localhost:None "POST /v1.47/containers/7fe0811341057e3087a21d7f76269aa8e3e9d07795a452f27d4e99a443c6df30/start HTTP/1.1" 204 0
...DEBUG: urllib3.connectionpool: http://localhost:None "GET /v1.47/containers/7fe0811341057e3087a21d7f76269aa8e3e9d07795a452f27d4e99a443c6df30/json HTTP/1.1" 200 None
Done (6m 45s)
Traceback (most recent call last):
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/custom/online_deployment.py", line 118, in ml_online_deployment_create
deployment = ml_client.begin_create_or_update(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/_ml_client.py", line 1292, in begin_create_or_update
return _begin_create_or_update(entity, self._operation_container.all_operations, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/az/lib/python3.12/functools.py", line 909, in wrapper
return dispatch(args[0].__class__)(*args, **kw)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/_ml_client.py", line 1404, in _
return operations[AzureMLResourceType.ONLINE_DEPLOYMENT].begin_create_or_update(entity, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/az/lib/python3.12/site-packages/azure/core/tracing/decorator.py", line 94, in wrapper_use_tracer
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/_telemetry/activity.py", line 288, in wrapper
return f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/operations/_online_deployment_operations.py", line 218, in begin_create_or_update
raise ex
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/operations/_online_deployment_operations.py", line 144, in begin_create_or_update
return self._local_deployment_helper.create_or_update(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/operations/_local_deployment_helper.py", line 105, in create_or_update
raise ex
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/operations/_local_deployment_helper.py", line 90, in create_or_update
local_endpoint_polling_wrapper(
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/_utils/_endpoint_utils.py", line 103, in local_endpoint_polling_wrapper
return event.result()
^^^^^^^^^^^^^^
File "/opt/az/lib/python3.12/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/opt/az/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/opt/az/lib/python3.12/concurrent/futures/thread.py", line 59, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/operations/_local_deployment_helper.py", line 300, in _create_deployment
self._docker_client.create_deployment(
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/_local_endpoints/docker_client.py", line 230, in create_deployment
_validate_container_state(
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/_local_endpoints/docker_client.py", line 568, in _validate_container_state
raise LocalEndpointInFailedStateError(endpoint_name=endpoint_name, deployment_name=deployment_name)
azext_mlv2.manual.vendored_curated_sdk.azure.ai.ml.exceptions.LocalEndpointInFailedStateError: Local deployment (tag840-test2 / deployment14) is in failed state. Try getting logs to debug scoring script.
ERROR: cli: None
DEBUG: cli.azure.cli.core.azclierror: Traceback (most recent call last):
File "/opt/az/lib/python3.12/site-packages/knack/cli.py", line 233, in invoke
cmd_result = self.invocation.execute(args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/az/lib/python3.12/site-packages/azure/cli/core/commands/__init__.py", line 666, in execute
raise ex
File "/opt/az/lib/python3.12/site-packages/azure/cli/core/commands/__init__.py", line 734, in _run_jobs_serially
results.append(self._run_job(expanded_arg, cmd_copy))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/az/lib/python3.12/site-packages/azure/cli/core/commands/__init__.py", line 703, in _run_job
result = cmd_copy(params)
^^^^^^^^^^^^^^^^
File "/opt/az/lib/python3.12/site-packages/azure/cli/core/commands/__init__.py", line 336, in __call__
return self.handler(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/az/lib/python3.12/site-packages/azure/cli/core/commands/command_operation.py", line 120, in handler
return op(**command_args)
^^^^^^^^^^^^^^^^^^
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/custom/online_deployment.py", line 132, in ml_online_deployment_create
log_and_raise_error(err, debug, yaml_operation=yaml_operation)
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/custom/raise_error.py", line 185, in log_and_raise_error
raise cli_error
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/custom/online_deployment.py", line 118, in ml_online_deployment_create
deployment = ml_client.begin_create_or_update(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/_ml_client.py", line 1292, in begin_create_or_update
return _begin_create_or_update(entity, self._operation_container.all_operations, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/az/lib/python3.12/functools.py", line 909, in wrapper
return dispatch(args[0].__class__)(*args, **kw)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/_ml_client.py", line 1404, in _
return operations[AzureMLResourceType.ONLINE_DEPLOYMENT].begin_create_or_update(entity, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/az/lib/python3.12/site-packages/azure/core/tracing/decorator.py", line 94, in wrapper_use_tracer
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/_telemetry/activity.py", line 288, in wrapper
return f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/operations/_online_deployment_operations.py", line 218, in begin_create_or_update
raise ex
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/operations/_online_deployment_operations.py", line 144, in begin_create_or_update
return self._local_deployment_helper.create_or_update(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/operations/_local_deployment_helper.py", line 105, in create_or_update
raise ex
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/operations/_local_deployment_helper.py", line 90, in create_or_update
local_endpoint_polling_wrapper(
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/_utils/_endpoint_utils.py", line 103, in local_endpoint_polling_wrapper
return event.result()
^^^^^^^^^^^^^^
File "/opt/az/lib/python3.12/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/opt/az/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/opt/az/lib/python3.12/concurrent/futures/thread.py", line 59, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/operations/_local_deployment_helper.py", line 300, in _create_deployment
self._docker_client.create_deployment(
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/_local_endpoints/docker_client.py", line 230, in create_deployment
_validate_container_state(
File "/home/vscode/.azure/cliextensions/ml/azext_mlv2/manual/vendored_curated_sdk/azure/ai/ml/_local_endpoints/docker_client.py", line 568, in _validate_container_state
raise LocalEndpointInFailedStateError(endpoint_name=endpoint_name, deployment_name=deployment_name)
azext_mlv2.manual.vendored_curated_sdk.azure.ai.ml.exceptions.LocalEndpointInFailedStateError: Local deployment (tag840-test2 / deployment14) is in failed state. Try getting logs to debug scoring script.
ERROR: cli.azure.cli.core.azclierror: Local deployment (tag840-test2 / deployment14) is in failed state. Try getting logs to debug scoring script.
ERROR: az_command_data_logger: Local deployment (tag840-test2 / deployment14) is in failed state. Try getting logs to debug scoring script.
DEBUG: cli.knack.cli: Event: Cli.PostExecute [<function AzCliLogging.deinit_cmd_metadata_logging at 0x7f81b1e57060>]
INFO: az_command_data_logger: exit code: 1
INFO: cli.main: Command ran in 407.240 seconds (init: 0.100, invoke: 407.140)
INFO: telemetry.main: Begin splitting cli events and extra events, total events: 1
INFO: telemetry.client: Accumulated 0 events. Flush the clients.
INFO: telemetry.main: Finish splitting cli events and extra events, cli events: 1
INFO: telemetry.save: Save telemetry record of length 4567 in cache file under /home/vscode/.azure/telemetry/20250228174800094
INFO: telemetry.main: Begin creating telemetry upload process.
INFO: telemetry.process: Creating upload process: "/opt/az/bin/python3 /opt/az/lib/python3.12/site-packages/azure/cli/telemetry/__init__.py /home/vscode/.azure /home/vscode/.azure/telemetry/20250228174800094"
INFO: telemetry.process: Return from creating process 45065
INFO: telemetry.main: Finish creating telemetry upload process.
Expected behavior
The deployment is created successfully without error.
Environment Summary
azure-cli 2.69.0
core 2.69.0
telemetry 1.1.0
Extensions:
ml 2.35.0
Dependencies:
msal 1.31.2b1
azure-mgmt-resource 23.1.1
Python location '/opt/az/bin/python3'
Config directory '/home/vscode/.azure'
Extensions directory '/home/vscode/.azure/cliextensions'
Python (Linux) 3.12.8 (main, Feb 5 2025, 06:39:06) [GCC 10.2.1 20210110]
Additional context
This is my deployment YAML definition:
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: deployment14
endpoint_name: tag840-test2
model: azureml:MODEL:1
code_configuration:
  code: ./
  scoring_script: online_ep_scoring.py
environment:
  conda_file: ./conda.yaml
  image: mcr.microsoft.com/azureml/curated/python-sdk-v2:latest
instance_count: 1
instance_type: Standard_DS3_v2
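For anyone reproducing this, a quick local sanity check can rule out a plain path mistake before the Docker build starts. The sketch below assumes that relative paths in the YAML are resolved against the directory containing the YAML file; the function name is hypothetical, and the values in the usage line are copied from the definition above.

```python
from pathlib import Path


def missing_deployment_files(yaml_dir: str, code: str, scoring_script: str, conda_file: str) -> list[str]:
    """Return the names of YAML fields whose referenced files do not exist locally.

    Assumption: `code` and `conda_file` are relative to the YAML file's
    directory, and `scoring_script` is relative to `code`.
    """
    base = Path(yaml_dir)
    candidates = {
        "scoring_script": base / code / scoring_script,
        "conda_file": base / conda_file,
    }
    return [name for name, path in candidates.items() if not path.is_file()]


# Run from the directory that contains deployment.yaml.
print(missing_deployment_files(".", "./", "online_ep_scoring.py", "./conda.yaml"))
```

An empty list means both files are where the YAML says they are on the local side, which would point the investigation at how the files reach the container instead.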