Commit f87d967

committed
Proofreading
1 parent a2ae12e commit f87d967

1 file changed

+40
-41
lines changed

pages/platform/ai/deploy_guide_07_troubleshooting/guide.en-gb.md

Lines changed: 40 additions & 41 deletions
@@ -1,7 +1,7 @@
11
---
22
title: AI Deploy - Troubleshooting
33
slug: deploy/debug-apps
4-
excerpt: Most popular questions and answer to troubleshoot your issues
4+
excerpt: Find here all the most popular questions and answers to troubleshoot your issues
55
section: AI Deploy - Guides
66
order: 05
77
updated: 2023-03-30
@@ -16,26 +16,26 @@ This page gives you a few hints on how to debug your apps if you encounter some
1616
## Requirements
1717

1818
- Access to the [OVHcloud Control Panel](https://www.ovh.com/auth/?action=gotomanager&from=https://www.ovh.co.uk/&ovhSubsidiary=GB)
19-
- A **Public Cloud** project
19+
- A [**Public Cloud** project](https://docs.ovh.com/gb/en/public-cloud/create_a_public_cloud_project/)
2020

2121
## Building your app
2222

2323
### Best practices and mandatory guidelines to build your app
2424

25-
When you are deploying your own applications and models, some guidelines must be followed. We detail them on the guide [AI Deploy - Build & use custom Docker image](https://docs.ovh.com/gb/en/publiccloud/ai/deploy/build-use-custom-image/).
26-
Especially, be cautious about image requirements such as OVHcloud user and Docker architecture used. Otherwise, your deployment will end in `FAILED` status.
25+
When you are deploying your own applications and models, some guidelines must be followed. We detail them in the guide [AI Deploy - Build & use custom Docker image](https://docs.ovh.com/gb/en/publiccloud/ai/deploy/build-use-custom-image/).
26+
Be particularly cautious about image requirements such as OVHcloud user and Docker architecture used. Otherwise, your deployment will end in `FAILED` status.
2727

2828
### Apps examples to follow
2929

30-
If you need some official examples, please follow this guide, where we share the source code: [AI Deploy - Apps portfolio](https://docs.ovh.com/gb/en/publiccloud/ai/deploy/apps-portfolio/).
30+
If you need some official examples, please follow this guide where we share the source code: [AI Deploy - Apps portfolio](https://docs.ovh.com/gb/en/publiccloud/ai/deploy/apps-portfolio/).
3131

3232
### Test your app locally and in the cloud
3333

3434
Before paying for cloud resources, feel free to test your Docker image locally. To do so, simply install Docker on your local environment.
3535

3636
For the building step, as explained in the mandatory guidelines linked in the previous section, your Docker image has to support at least the `linux/amd64` platform to be deployed correctly. Otherwise, the deployment will fail.
3737

38-
Then perform a `docker run` as follow:
38+
Then perform a `docker run` as follows:
3939

4040
```
4141
# Build your Docker image for at least linux/amd64 architecture
@@ -45,53 +45,52 @@ docker buildx build --platform linux/amd64,linux/arm64 ...
4545
docker run --rm -it --user=42420:42420 <image-identifier>
4646
```
4747

48-
This way, we will imitate the OVHcloud user. Once validated locally, you can deploy your app first with CPUs, who are cheaper compared to GPUs.
49-
48+
This way, you will imitate the OVHcloud user. Once validated locally, you can deploy your app first with CPUs, which are cheaper than GPUs.
5049

5150
## Deployments
5251

5352
### My deployment has failed
5453

55-
An AI Deploy app has a workflow in multiple steps, and the `FAILED` status is one of them. This state happens when OVHcloud is unable to deploy your app, meaning the infrastructure side (backend) is working fine but something is broken on the image side. You can find more details about AI Deploy workflow in [AI Deploy - Billing and lifecycle](https://docs.ovh.com/gb/en/publiccloud/ai/deploy/billing/)
54+
An AI Deploy app has a workflow in multiple steps, the `FAILED` status being one of them. This state happens when OVHcloud is unable to deploy your app, meaning the infrastructure side (backend) is working fine but something is broken on the image side. You can find more details about AI Deploy workflow on the [AI Deploy - Billing and lifecycle](https://docs.ovh.com/gb/en/publiccloud/ai/deploy/billing/) page.
5655

5756
Main items to troubleshoot:
5857

59-
- Typography in your repository name, image or version name. Test to deploy your image locally first.
60-
- Your Docker image is not following mandatory guidelines, such as OVhcloud user. See [AI Deploy - Build & use custom Docker image](https://docs.ovh.com/gb/en/publiccloud/ai/deploy/build-use-custom-image/)
61-
- Your Docker image is in a private registry, and you did not authorize OVHcloud to access it.
62-
- Your have reached your quotas in terms of CPUs or GPUs. You can check them via the control panel (Project Management / Quotas) or via the `ovhai CLI` command `ovhai me`.
58+
- Typos in your repository name, image or version name. Test deploying your image locally first.
59+
- Your Docker image is not following mandatory guidelines, such as OVHcloud user. See [AI Deploy - Build & use custom Docker image](https://docs.ovh.com/gb/en/publiccloud/ai/deploy/build-use-custom-image/).
60+
- Your Docker image is in a private registry and you did not authorize OVHcloud to access it.
61+
- You have reached your quotas in terms of CPUs or GPUs. You can check them via the OVHcloud Control Panel (Project Management / Quotas) or via the `ovhai CLI` command `ovhai me`.
6362

64-
If you are using `ovhai CLI`, you can learn more information with the `ovhai debug` command, which will give you more details about your command, and `ovhai app logs <app_ID>` to download logs history.
63+
If you are using `ovhai CLI`, you can get more details about your command with the `ovhai debug` command, and use `ovhai app logs <app_ID>` to download the logs history.
6564

6665
### My deployment is in error
6766

6867
While a deployment in `FAILED` state is due to a problem on the image, repository, etc., an app in `ERROR` state can occur when AI Deploy is encountering an issue.
6968

70-
Try to redeploy your app, and modify the targeted datacenter for example.
71-
As the previous question, when using our CLI, you can learn more information with the `ovhai debug` command, which will give you more details about your command, and `ovhai app logs <app_ID>` to download logs history.
69+
Try redeploying your app, and modify the targeted datacenter for example.
70+
As in the previous answer, when using our CLI you can get more details about your command with the `ovhai debug` command, and use `ovhai app logs <app_ID>` to download the logs history.
7271

73-
If the issue persists, please contact our support.
72+
If the issue persists, please contact our support teams.
7473

7574
### My Deployment seems very long
7675

77-
When AI Deploy initialize your app, the Docker image is pulled (downloaded) in our infrastructure and replicated over the replicas, if any.
78-
The larger the Docker image is, the longer it will take to be deployed on AI Deploy side.
76+
When AI Deploy initializes your app, the Docker image is pulled (downloaded) in our infrastructure and replicated over the replicas, if any.
77+
The larger the Docker image is, the longer it will take to be deployed on AI Deploy side.
7978

80-
Also, since we pull the data from a registry of your choice, if this particular registry is experiencing some issue or is restricted in terms of bandwidth or throughput, it may cause some slowness.
79+
Also, since we pull the data from a registry of your choice, if this particular registry is experiencing some issues or is restricted in terms of bandwidth or throughput, it may cause some slowness.
8180

82-
In an ideal situation, for a Docker image sized approximately 1GB, without external data linked, it should take less than 10 minutes.
81+
In an ideal situation, for a Docker image of approximately 1GB, without external data linked, it should take less than 10 minutes.
8382

8483
### My deployed app does not scale
8584

8685
AI Deploy provides manual scaling and autoscaling, allowing you to scale up or down based on triggers such as CPU or RAM usage.
87-
More information on the official documentation about [scaling strategies](https://docs.ovh.com/gb/en/publiccloud/ai/deploy/apps-deployments/).
86+
Find more information on the official documentation about [scaling strategies](https://docs.ovh.com/gb/en/publiccloud/ai/deploy/apps-deployments/).
8887

8988
If your app does not scale:
9089

9190
- Check if you deployed your app with manual or autoscaling.
9291
- Verify triggers (CPU or RAM usage) and their value. By default the value is at 75%.
9392
- Open the Monitoring dashboard of your app (Grafana dashboard is provided for each app) and check if the threshold has been reached.
94-
- For load-testing tutorial and dashboard example to follow your scaling, you can refer to this tutorial: [AI Deploy - How to load test your application with Locust](https://docs.ovh.com/gb/en/publiccloud/ai/deploy/load-test-app/).
93+
- Refer to the following load-testing tutorial which also provides a dashboard example to follow your scaling: [AI Deploy - How to load test your application with Locust](https://docs.ovh.com/gb/en/publiccloud/ai/deploy/load-test-app/).
9594

9695

9796
### My deployed app is very slow
@@ -100,29 +99,29 @@ Slowness may find its roots in multiple reasons. Indeed, each deployed app is th
10099

101100
If you are experiencing slowness, here are some actions to investigate:
102101

103-
- Open the Monitoring dashboard for your app (Grafana dashboard is provided for each app) and check if you are reaching some resources to 90/100%, such as RAM, CPU, GPU or network. You can also check the overall latency.
104-
- If nothing is visible, it can be an issue between the client (where the query comes) and the deployed app. As an example, if you are contacting your apps from a geographically distant point, it will add latency. Try to reduce the distances in your architecture.
105-
- Your Docker image itself may be the root cause. Try to run your Docker image locally, and query your app locally. Some apps might be heavy to run or not well optimized.
102+
- Open the Monitoring dashboard for your app (Grafana dashboard is provided for each app) and check if some resources are reaching 90/100%, such as RAM, CPU, GPU or network. You can also check the overall latency.
103+
- If nothing is visible, it can be an issue between the client (where the query comes from) and the deployed app. As an example, if you are contacting your apps from a geographically distant point, it will add latency. Try reducing the distances in your architecture.
104+
- Your Docker image itself may be the root cause. Try running your Docker image locally, and query your app locally. Some apps might be heavy to run or not well optimized.
106105

107106
### My deployment has crashed
108107

109-
Like any cloud product, AI Deploy might experience hardware or software failures over time. To mitigate the risk on your side, please deploy you app on at least two replicas, allowing us to provide high availability. At this time, all replicas are in the same region, but it will prevent from a physical server failure.
108+
Like any cloud product, AI Deploy might experience hardware or software failures over time. To mitigate the risk on your side, please deploy your app on at least two replicas, allowing us to provide high availability. At this time, all replicas are in the same region, but this still protects your app against a physical server failure.
110109

111-
Another root cause may be your own Docker image, for example by writing uncontrolled amount of data into your working directory.
110+
Another root cause may be your own Docker image, for example by writing an uncontrolled amount of data into your working directory.
112111

113-
Also, we recommend to orchestrate your workflow with third party tools such as Airflow, Prefect, Dagster or Kestra, allowing you to relaunch an app once it has crashed.
112+
We also recommend orchestrating your workflow with third party tools such as Airflow, Prefect, Dagster or Kestra, allowing you to relaunch an app once it has crashed.
114113

115-
If your app crashed and you are using `ovhai CLI`, you can learn more information with `ovhai app logs <app_ID>` to download logs history.
114+
If your app crashed and you are using `ovhai CLI`, you can get more information with `ovhai app logs <app_ID>` to download logs history.
116115

117116
### My data is not synchronized back
118117

119-
AI Deploy does not synchronize back your remote data. Please follow [official guideline to build & use custom Docker image](https://docs.ovh.com/gb/en/publiccloud/ai/deploy/build-use-custom-image/).
118+
AI Deploy does not synchronize back your remote data. Please follow [official guidelines to build & use custom Docker image](https://docs.ovh.com/gb/en/publiccloud/ai/deploy/build-use-custom-image/).
120119

121120
## Connectivity
122121

123122
### I don't understand how I can connect to my app
124123

125-
AI Deploy provides an HTTP endpoint for each deployed app. You can find your endpoint via OVHcloud control panel (*Public Cloud / AI Deploy / My app / Access URL*), API or CLI.
124+
AI Deploy provides an HTTP endpoint for each deployed app. You can find your endpoint via the OVHcloud control panel (*Public Cloud / AI Deploy / My app / Access URL*), API or CLI.
126125

127126
An HTTP endpoint will look like this: `https://<unique_id>.app.gra.ai.cloud.ovh.net`
128127

@@ -132,9 +131,9 @@ Depending on what you deployed, you then just have an API endpoint or a web inte
132131

133132
### I'm unable to connect (unauthorized)
134133

135-
When you deploy an app, you can opt for unrestricted access (open to the internet) or secured access.
134+
When you deploy an app, you can opt for unrestricted access (open to the internet) or secured access.
136135

137-
While unrestricted access means that everyone is authorized, a secured access will require credentials. Two ways are available:
136+
While unrestricted access means that everyone is authorized, a secured access will require credentials. Two options are available:
138137

139138
- An AI user. It can be seen as a user and password restriction. Quite simple but not a lot of granularity.
140139
- An AI token (preferred solution). A token is very effective since you can link them with labels. For example, a token for a specific app ID, for a team, ...
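
As an illustration only, a call to a secured app with an AI token could be sketched as below. This assumes the token is sent with the HTTP bearer scheme, which this page does not spell out, so check the access guide before relying on it. `auth_header`, `AI_TOKEN` and `APP_URL` are hypothetical names.

```shell
# Hypothetical helper: build the Authorization header for an AI token.
# Assumption: the bearer scheme is used; this guide does not state the header format.
auth_header() {
    printf 'Authorization: Bearer %s' "$1"
}

# Example call (placeholders, not executed here):
# curl -H "$(auth_header "$AI_TOKEN")" "$APP_URL"
```

If the call still returns an unauthorized error, the token's labels may not match the app you are querying.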
@@ -143,22 +142,22 @@ If you selected a restricted access, don't forget to [generate an applicative to
143142

144143
### I need more than one port to be exposed
145144

146-
By design, AI Deploy links your app to one HTTP endpoint and one port (default is 8080). If you need more than one port, best practice is to split you deployment in multiple apps.
147-
If you cannot afford it, you can tweak your HTTP endpoint as follow: `https://<unique_id>-<specific_port>.app.<region>.ai.cloud.ovh.net`.
145+
By design, AI Deploy links your app to one HTTP endpoint and one port (default is 8080). If you need more than one port, best practice is to split your deployment in multiple apps.
146+
If you cannot afford it, you can tweak your HTTP endpoint as follows: `https://<unique_id>-<specific_port>.app.<region>.ai.cloud.ovh.net`.
148147

149148
For example, just add `-8000` after your unique ID and you will be routed to this specific port.
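
The URL pattern above can be sketched as a small shell helper. The function name `app_url` and the sample ID `abc123` are hypothetical; only the URL template comes from this guide.

```shell
# Build an AI Deploy endpoint URL that targets a specific port.
# app_url is a hypothetical helper; the URL template is the one given in this guide.
app_url() {
    unique_id="$1"   # app identifier (placeholder value in the example below)
    port="$2"        # port exposed by your container, e.g. 8000
    region="$3"      # region code, e.g. gra
    printf 'https://%s-%s.app.%s.ai.cloud.ovh.net\n' "$unique_id" "$port" "$region"
}

app_url abc123 8000 gra
```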
150149

151150
## Billing
152151

153-
### I don't understand how it will cost to deploy an app
152+
### I don't understand how much it will cost to deploy an app
154153

155-
AI Deploy pricing model is quite simple compared to competitors. You pay for the compute resources (CPUs/GPUs) during the lap of time you will use them.
154+
The AI Deploy pricing model is quite simple compared to competitors. You pay for the compute resources (CPUs/GPUs) for as long as you use them.
156155

157-
Basic example : If you deploy one app with 2 x GPU at 1 euro each for 6 hours, you will pay 12 euros at the end. (2 x 1€ x 6h), whatever the amount of calls or users received.
156+
- Basic example: if you deploy one app with 2 x GPU at 1 euro per GPU per hour for 6 hours, you will pay 12 euros at the end (2 x 1€ x 6h), whatever the number of calls or users received.
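
The same arithmetic can be written as a minimal shell sketch. The helper name `estimate_cost_cents` is hypothetical and the price is an example value, not an actual OVHcloud rate.

```shell
# Estimate the cost of a deployment: resources x hourly price x hours.
# Prices are expressed in euro cents so that integer arithmetic is enough.
estimate_cost_cents() {
    resources="$1"     # number of CPUs or GPUs
    price_cents="$2"   # hourly price of one resource, in cents (example value)
    hours="$3"         # duration of the deployment, in hours
    echo $(( resources * price_cents * hours ))
}

# 2 GPUs at 1 euro (100 cents) per hour for 6 hours
estimate_cost_cents 2 100 6   # prints 1200, i.e. 12 euros
```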
158157

159-
Prices are shown statically in our [official website](http://www.ovhcloud.com), inside our Public Cloud section. For dynamic estimation, use our control panel. An estimation will be available before launching a deployment.
158+
Prices are shown statically on our [official website](http://www.ovhcloud.com), inside our Public Cloud section. For a dynamic estimation, use the OVHcloud Control Panel. An estimation will be available before launching a deployment.
160159

161-
Also, for more detailed information, please refer to [AI Deploy - Billing and lifecycle](https://docs.ovh.com/gb/en/publiccloud/ai/deploy/billing/).
160+
Also, for more detailed information, please refer to our [AI Deploy - Billing and lifecycle](https://docs.ovh.com/gb/en/publiccloud/ai/deploy/billing/) page.
162161

163162
### I'm unable to get a "pay per call" deployment
164163
