Merged
Changes from 3 commits
4 changes: 4 additions & 0 deletions menu/navigation.json
Original file line number Diff line number Diff line change
@@ -900,6 +900,10 @@
"label": "Monitor a deployment",
"slug": "monitor-deployment"
},
{
"label": "Scale a deployment",
"slug": "scale-deployment"
},
{
"label": "Manage allowed IP addresses",
"slug": "manage-allowed-ips"
10 changes: 7 additions & 3 deletions pages/managed-inference/how-to/create-deployment.mdx
@@ -28,12 +28,16 @@ dates:
</Message>
- Choose the geographical **region** for the deployment.
- Specify the GPU Instance type to be used with your deployment.
4. Enter a **name** for the deployment, and optional tags.
5. Configure the **network connectivity** settings for the deployment:
4. Choose the number of nodes for your deployment.
<Message type="note">
High availability is only guaranteed with two or more nodes.
</Message>
5. Enter a **name** for the deployment, and optional tags.
6. Configure the **network connectivity** settings for the deployment:
- Attach to a **Private Network** for secure communication and restricted availability. Choose an existing Private Network from the drop-down list, or create a new one.
- Set up **Public connectivity** to access resources via the public internet. Authentication by API key is enabled by default.
<Message type="important">
- Enabling both private and public connectivity will result in two distinct endpoints (public and private) for your deployment.
- Deployments must have at least one endpoint, either public or private.
</Message>
6. Click **Deploy model** to launch the deployment process. Once the model is ready, it will be listed among your deployments.
7. Click **Deploy model** to launch the deployment process. Once the model is ready, it will be listed among your deployments.
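The console steps above can also be scripted. The sketch below assembles a deployment-creation command for the Scaleway CLI; the `scw inference deployment create` subcommand and its argument names are assumptions based on the usual CLI layout, not verified, so check `scw inference --help` before use.

```shell
# Hypothetical sketch: creating a Managed Inference deployment from the CLI.
# Subcommand and argument names are assumptions; verify with `scw inference --help`.
NODE_COUNT=2  # high availability needs two or more nodes

CMD="scw inference deployment create name=my-deployment node-count=${NODE_COUNT} region=fr-par"

# Print the command instead of running it; drop the echo to execute for real.
echo "$CMD"
```

Printing the command first is a cheap dry run: you can eyeball the node count and region before anything is actually created.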
37 changes: 37 additions & 0 deletions pages/managed-inference/how-to/scale-deployment.mdx
@@ -0,0 +1,37 @@
---
meta:
title: How to scale Managed Inference deployments
description: This page explains how to scale Managed Inference deployments up or down
content:
h1: How to scale Managed Inference deployments
paragraph: This page explains how to scale Managed Inference deployments up or down
tags: managed-inference ai-data ip-address
dates:
validation: 2025-06-03
posted: 2025-06-03
categories:
- ai-data
---

You can scale your Managed Inference deployment up or down to match the incoming load.


<Macro id="requirements" />

- A Scaleway account logged into the [console](https://console.scaleway.com)
- A [Managed Inference deployment](/managed-inference/quickstart/)
- [Owner](/iam/concepts/#owner) status or [IAM permissions](/iam/concepts/#permission) allowing you to perform actions in the intended Organization

## How to scale a Managed Inference deployment

1. Click **Managed Inference** in the **AI** section of the [Scaleway console](https://console.scaleway.com) side menu. A list of your deployments displays.
2. Click a deployment name or <Icon name="more" /> > **More info** to access the deployment dashboard.
3. Click the **Settings** tab and navigate to the **Scaling** section.
4. Click **Update node count** and adjust the number of nodes in your deployment.
<Message type="note">
High availability is only guaranteed with two or more nodes.
</Message>
5. Click **Update node count** to confirm the new number of nodes for your deployment.
<Message type="note">
Your deployment will be unavailable for 15 to 30 minutes while the node update is in progress.
</Message>
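The same scaling operation can be sketched as an HTTP request to the Scaleway API. The endpoint path and the `node_count` field name below are assumptions, not verified against the Managed Inference API reference, and the deployment ID is a placeholder; treat this as an outline of the call shape rather than a working command.

```shell
# Hypothetical sketch: updating the node count through the Scaleway API.
# Endpoint path and JSON field name are assumptions; check the API reference.
REGION="fr-par"
DEPLOYMENT_ID="11111111-2222-3333-4444-555555555555"  # replace with your deployment ID
NODE_COUNT=3  # two or more nodes for high availability

URL="https://api.scaleway.com/inference/v1/regions/${REGION}/deployments/${DEPLOYMENT_ID}"

# Print the request instead of sending it; drop the echo to execute for real.
echo curl -X PATCH "$URL" \
  -H "X-Auth-Token: ${SCW_SECRET_KEY}" \
  -H "Content-Type: application/json" \
  -d "{\"node_count\": ${NODE_COUNT}}"
```

As in the console flow, plan for the deployment to be unavailable while the node update is applied.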