Commit b369722

Merge pull request #94139 from snowei/patch-9
add azureml-fe tsg for kubernetes online endpoint
2 parents 060a7ca + 891f24a commit b369722

1 file changed: +9 -0 lines changed

articles/machine-learning/how-to-troubleshoot-online-endpoints.md

Lines changed: 9 additions & 0 deletions
@@ -200,6 +200,7 @@ Below is a list of reasons you might run into this error:
 * [Startup task failed due to incorrect role assignments on resource](#authorization-error)
 * [Unable to download user container image](#unable-to-download-user-container-image)
 * [Unable to download user model or code artifacts](#unable-to-download-user-model-or-code-artifacts)
+* [azureml-fe for Kubernetes online endpoint is not ready](#azureml-fe-not-ready)

 #### Resource requests greater than limits

@@ -245,6 +246,14 @@ Make sure model and code artifacts are registered to the same workspace as the d

 `az storage blob exists --account-name foobar --container-name 210212154504-1517266419 --name WebUpload/210212154504-1517266419/GaussianNB.pkl --subscription <sub-name>`

+#### azureml-fe not ready
+The front-end component (azureml-fe), which routes incoming inference requests to deployed services, scales automatically as needed. It's installed as part of your k8s-extension installation.
+
+This component should be healthy on the cluster, with at least one healthy replica. You get this error message if it isn't available when you trigger a Kubernetes online endpoint or deployment creation or update request.
+
+To fix this issue, check the pod status and logs. You can also try updating the k8s-extension installed on the cluster.
+
+
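The pod status and log checks mentioned in the azureml-fe section can be sketched with `kubectl`. This is only a rough sketch, not a procedure taken from the article: it assumes the extension's pods run in a namespace named `azureml` and that the front-end pod names contain `azureml-fe`, and the function names are hypothetical helpers.

```shell
#!/usr/bin/env bash
# Sketch only. The namespace is an assumption, not taken from the article;
# override AZUREML_NAMESPACE if the extension was installed elsewhere.
AZUREML_NAMESPACE="${AZUREML_NAMESPACE:-azureml}"

# List pods whose name contains "azureml-fe", with READY and STATUS columns.
list_fe_pods() {
  kubectl get pods --namespace "$AZUREML_NAMESPACE" --output wide | grep 'azureml-fe'
}

# Print events and recent logs for one pod; pass a pod name from list_fe_pods.
inspect_fe_pod() {
  local pod="$1"
  kubectl describe pod "$pod" --namespace "$AZUREML_NAMESPACE"
  kubectl logs "$pod" --namespace "$AZUREML_NAMESPACE" --all-containers --tail=100
}

# Succeeds only if at least one azureml-fe pod reports the Running status.
fe_has_running_pod() {
  kubectl get pods --namespace "$AZUREML_NAMESPACE" --no-headers \
    | grep 'azureml-fe' | grep -q 'Running'
}
```

For example, run `list_fe_pods` first, then `inspect_fe_pod <pod-name>` with a name from its output. `fe_has_running_pod` is a coarse check: a pod can be `Running` without being ready, so the READY column is still worth reading.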
 ### ERROR: ResourceNotReady

 To run the `score.py` provided as part of the deployment, Azure creates a container that includes all the resources that the `score.py` needs, and runs the scoring script on that container. The error in this scenario is that this container is crashing when running, which means scoring can't happen. This error happens when:
