Commit bdb8fab

bene2k1 and RoRoJ authored
feat(infr): add quantization configuration for custom model deployment (#5066)
* feat(infr): add quantization configuration for custom model deployment
* Apply suggestions from code review

Co-authored-by: Rowena Jones <[email protected]>
1 parent 7197ee0 commit bdb8fab


pages/managed-inference/how-to/create-deployment.mdx

Lines changed: 4 additions & 0 deletions
```diff
@@ -28,6 +28,10 @@ dates:
     Some models may require acceptance of an end-user license agreement. If prompted, review the terms and conditions and accept the license accordingly.
   </Message>
 - Choose the geographical **region** for the deployment.
+- For custom models: Choose the model quantization.
+  <Message type="tip">
+    Each model comes with a default quantization. Select lower bits quantization to improve performance and enable the model to run on smaller GPU nodes, while potentially reducing precision.
+  </Message>
 - Specify the GPU Instance type to be used with your deployment.
 5. Choose the number of nodes for your deployment. Note that this feature is currently in [Public Beta](https://www.scaleway.com/betas/).
   <Message type="note">
```
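The tip added in this commit notes that lower-bit quantization lets a model fit on smaller GPU nodes at some cost in precision. As a rough illustration of that trade-off (not part of the commit or Scaleway's documentation; the parameter counts and overhead factor below are assumptions), here is a minimal Python sketch estimating weight memory at different bit widths:

```python
# Illustrative sketch only: approximate GPU memory needed to hold model
# weights at different quantization bit widths. Model sizes and the
# overhead factor are assumptions, not Scaleway specifications.

def weight_memory_gb(n_params: float, bits: int, overhead: float = 1.2) -> float:
    """Approximate VRAM in GB for the weights alone, with a rough
    multiplier for runtime buffers and activations."""
    bytes_per_param = bits / 8
    return n_params * bytes_per_param * overhead / 1e9

for name, n_params in [("7B", 7e9), ("70B", 70e9)]:
    for bits in (16, 8, 4):  # fp16 baseline vs. 8-bit and 4-bit quantization
        print(f"{name} model @ {bits}-bit: ~{weight_memory_gb(n_params, bits):.0f} GB")
```

Halving the bit width roughly halves the memory the weights occupy, which is why a 4-bit quantized model can run on a much smaller GPU node than its fp16 counterpart, at the potential cost of output precision.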
