pages/managed-inference/how-to/create-deployment.mdx (7 additions, 3 deletions)
@@ -28,12 +28,16 @@ dates:
     </Message>
     - Choose the geographical **region** for the deployment.
     - Specify the GPU Instance type to be used with your deployment.
-4. Enter a **name** for the deployment, and optional tags.
-5. Configure the **network connectivity** settings for the deployment:
+4. Choose the number of nodes for your deployment.
+    <Message type="note">
+    High availability is only guaranteed with two or more nodes.
+    </Message>
+5. Enter a **name** for the deployment, and optional tags.
+6. Configure the **network connectivity** settings for the deployment:
     - Attach to a **Private Network** for secure communication and restricted availability. Choose an existing Private Network from the drop-down list, or create a new one.
     - Set up **Public connectivity** to access resources via the public internet. Authentication by API key is enabled by default.
     <Message type="important">
     - Enabling both private and public connectivity will result in two distinct endpoints (public and private) for your deployment.
     - Deployments must have at least one endpoint, either public or private.
     </Message>
-6. Click **Deploy model** to launch the deployment process. Once the model is ready, it will be listed among your deployments.
+7. Click **Deploy model** to launch the deployment process. Once the model is ready, it will be listed among your deployments.
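The console steps in this hunk (node count, name, connectivity) can be mirrored programmatically. The sketch below only builds a request body and validates the documented rules; the function name and all field names (`node_type_name`, `min_size`, `endpoints`, and so on) are illustrative assumptions, not the documented Scaleway API schema.

```python
from typing import Optional


def build_create_deployment_request(
    name: str,
    node_type: str,
    node_count: int,
    private_network_id: Optional[str] = None,
    public_endpoint: bool = True,
) -> dict:
    """Mirror the console steps: node count, name, and connectivity.

    All field names here are hypothetical, chosen for illustration only.
    """
    if not public_endpoint and private_network_id is None:
        # Per the docs: deployments must have at least one endpoint,
        # either public or private.
        raise ValueError("at least one endpoint (public or private) is required")
    endpoints = []
    if public_endpoint:
        # API-key authentication is enabled by default on public endpoints.
        endpoints.append({"public": {}})
    if private_network_id is not None:
        endpoints.append(
            {"private_network": {"private_network_id": private_network_id}}
        )
    return {
        "name": name,
        "node_type_name": node_type,   # hypothetical field name
        "min_size": node_count,        # two or more nodes for high availability
        "max_size": node_count,
        "endpoints": endpoints,
    }


# Example: a two-node deployment with only a public endpoint.
body = build_create_deployment_request("my-deployment", "H100", node_count=2)
```

Enabling both connectivity options would simply yield two entries in `endpoints`, matching the "two distinct endpoints" note above.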
+description: This page explains how to scale Managed Inference deployments in size
+content:
+  h1: How to scale Managed Inference deployments
+  paragraph: This page explains how to scale Managed Inference deployments in size
+tags: managed-inference ai-data ip-address
+dates:
+  validation: 2025-06-03
+  posted: 2025-06-03
+categories:
+  - ai-data
+---
+
+You can scale your Managed Inference deployment up or down to match its incoming load.
+
+<Macro id="requirements" />
+
+- A Scaleway account logged into the [console](https://console.scaleway.com)
+- A [Managed Inference deployment](/managed-inference/quickstart/)
+- [Owner](/iam/concepts/#owner) status or [IAM permissions](/iam/concepts/#permission) allowing you to perform actions in the intended Organization
+
+## How to scale a Managed Inference deployment in size
+
+1. Click **Managed Inference** in the **AI** section of the [Scaleway console](https://console.scaleway.com) side menu. A list of your deployments displays.
+2. Click a deployment name or <Icon name="more" /> > **More info** to access the deployment dashboard.
+3. Click the **Settings** tab and navigate to the **Scaling** section.
+4. Click **Change node number** and adjust the number of nodes in your deployment.
+    <Message type="note">
+    High availability is only guaranteed with two or more nodes.
+    </Message>
+5. Click **Update node type** to change the node type of your deployment.
+    <Message type="note">
+    Note that your deployment will be unavailable for 15-30 minutes while the node update is in progress.
+    </Message>
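The two scaling rules stated above (two or more nodes for high availability, and 15-30 minutes of unavailability when the node type changes) can be sketched as a small planning helper. The function name and return shape are illustrative only and are not part of any Scaleway tooling.

```python
from typing import Optional


def plan_scaling_update(
    node_count: Optional[int] = None,
    node_type: Optional[str] = None,
) -> dict:
    """Summarize a scaling change and its documented consequences.

    A purely illustrative helper: it encodes only the rules stated in
    the how-to above, not any real API behavior.
    """
    plan: dict = {}
    if node_count is not None:
        if node_count < 1:
            raise ValueError("a deployment needs at least one node")
        plan["node_count"] = node_count
        # High availability is only guaranteed with two or more nodes.
        plan["high_availability"] = node_count >= 2
    if node_type is not None:
        plan["node_type"] = node_type
        # A node-type update makes the deployment unavailable for 15-30 min.
        plan["expected_downtime_minutes"] = (15, 30)
    return plan


# Example: scaling down to a single node loses the HA guarantee.
plan = plan_scaling_update(node_count=1)
```

Checking the returned `high_availability` flag before applying a change is one way to avoid accidentally scaling a production deployment below the two-node threshold.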