## What is Managed Inference?

Managed Inference provides optimized infrastructure, customizable deployment options, and security features.

## Where are the models hosted?
All models are currently hosted in a secure data center located in Paris, France, operated by [OPCORE](https://www.opcore.com/). This ensures low latency for European users and compliance with European data privacy regulations.
## What is the difference between Managed Inference and Generative APIs?
- **Managed Inference**: Allows deployment of curated or custom models with chosen quantization and instances, offering predictable throughput and enhanced security features like private network isolation and access control. Managed Inference is billed by hourly usage, whether provisioned capacity is receiving traffic or not.
- **Generative APIs**: A serverless service providing access to pre-configured AI models via API, billed per token usage.
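To make the billing trade-off concrete, here is a back-of-the-envelope break-even calculation. The prices in it are hypothetical placeholders, not Scaleway's actual rates:

```python
# All prices are hypothetical placeholders -- substitute current rates from
# the Scaleway pricing page before drawing conclusions.
deployment_price_per_hour = 1.00   # EUR/hour for a dedicated deployment (assumed)
genapi_price_per_mtoken = 0.20     # EUR per million tokens on Generative APIs (assumed)

# Above this sustained throughput, hourly billing beats per-token billing;
# below it, the serverless per-token model is cheaper.
breakeven = deployment_price_per_hour / genapi_price_per_mtoken
print(f"Break-even: {breakeven:.1f} million tokens per hour")
```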
## Where can I find information regarding the data, privacy and security policies applied to Scaleway's AI services?
## Is Managed Inference compatible with OpenAI APIs?

Managed Inference aims to achieve seamless compatibility with OpenAI APIs. You can therefore use existing OpenAI client libraries and tools with your deployments.
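As a minimal sketch of that compatibility, the standard `openai` Python client can point at a deployment's endpoint. The endpoint URL shape and model name below are placeholders, not confirmed values:

```python
from openai import OpenAI

# Placeholders throughout: substitute your deployment's endpoint URL,
# a valid Scaleway IAM API key, and the model your deployment serves.
client = OpenAI(
    base_url="https://<your-deployment-endpoint>/v1",  # assumed endpoint shape
    api_key="<SCW_SECRET_KEY>",
)

response = client.chat.completions.create(
    model="<served-model-name>",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```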
## What are the SLAs for Managed Inference?

We are currently working on defining our SLAs for Managed Inference. We will provide more information on this topic soon.
## What are the performance guarantees (vs. Generative APIs)?
Managed Inference provides dedicated resources, ensuring predictable performance and lower latency compared to Generative APIs, which are a shared, serverless offering optimized for infrequent traffic with moderate peak loads. Managed Inference is ideal for workloads that require consistent response times, high availability, or custom hardware configurations, or that generate extreme peak loads during a narrow time window.

Compared to Generative APIs, no usage quotas are applied to the number of tokens generated per second, since output is limited only by the size and number of GPU Instances in your Managed Inference deployment.
## What types of models can I deploy with Managed Inference?
You can deploy a variety of models, including:
* Large language models (LLMs)
* Image processing models
* Audio recognition models
* Custom AI models (currently through the API only)
Managed Inference supports both open-source models and proprietary models that you upload.
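As an illustration of deploying via the API, the sketch below uses Python's `requests`. The endpoint path and payload fields are assumptions based on Scaleway's general API conventions, so check the Managed Inference API reference for the authoritative schema:

```python
import requests

# Sketch under assumptions: the endpoint path and payload fields are
# illustrative guesses, not the authoritative schema. "X-Auth-Token" is
# Scaleway's standard API authentication header.
REGION = "fr-par"
resp = requests.post(
    f"https://api.scaleway.com/inference/v1/regions/{REGION}/deployments",  # assumed path
    headers={"X-Auth-Token": "<SCW_SECRET_KEY>"},
    json={
        "name": "my-custom-model",         # hypothetical payload fields
        "project_id": "<project-id>",
        "model_id": "<imported-model-id>",
        "node_type_name": "<gpu-node-type>",
    },
)
resp.raise_for_status()
print(resp.json())
```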
## How do I deploy a model using Managed Inference?

You can create a deployment by selecting a model from the Scaleway catalog and choosing its quantization and the GPU Instances to run it on; custom models can currently be deployed through the API only.
## Can I use Managed Inference for real-time applications?

Yes, Managed Inference is designed for low-latency, high-throughput applications, making it suitable for real-time use cases such as chatbots, recommendation systems, fraud detection, and live video processing.
## Can I use Managed Inference with other Scaleway services?
Absolutely. Managed Inference integrates seamlessly with other Scaleway services, such as [Object Storage](/object-storage/quickstart/) for model hosting, [Kubernetes](/kubernetes/quickstart/) for containerized applications, and [Scaleway IAM](/iam/quickstart/) for access management.
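For example, because Object Storage is S3-compatible, custom model weights can be uploaded with any S3 client. The bucket name, object key, and credentials below are placeholders:

```python
import boto3

# Scaleway Object Storage is S3-compatible, so any S3 client works.
# Bucket name, object key, and credentials here are placeholders.
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.fr-par.scw.cloud",  # fr-par regional endpoint
    region_name="fr-par",
    aws_access_key_id="<SCW_ACCESS_KEY>",
    aws_secret_access_key="<SCW_SECRET_KEY>",
)
s3.upload_file("model.safetensors", "<my-model-bucket>", "weights/model.safetensors")
```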
## Do model licenses apply when using Managed Inference?
Yes, model licenses must be complied with when using Managed Inference. Applicable licenses are listed for [each model in our documentation](https://www.scaleway.com/en/docs/managed-inference/reference-content/).
For models provided in the Scaleway catalog, you need to accept the applicable license (including any EULA) before creating a Managed Inference deployment.
For custom models you choose to import into Scaleway, you are responsible for complying with their licenses (as with any software you install on a GPU Instance, for example).