Commit bdb8fab

bene2k1 and RoRoJ authored
feat(infr): add quantization configuration for custom model deployment (#5066)
* feat(infr): add quantization configuration for custom model deployment
* Apply suggestions from code review

Co-authored-by: Rowena Jones <[email protected]>
1 parent 7197ee0 commit bdb8fab


pages/managed-inference/how-to/create-deployment.mdx

Lines changed: 4 additions & 0 deletions
```diff
@@ -28,6 +28,10 @@ dates:
     Some models may require acceptance of an end-user license agreement. If prompted, review the terms and conditions and accept the license accordingly.
   </Message>
 - Choose the geographical **region** for the deployment.
+- For custom models: Choose the model quantization.
+  <Message type="tip">
+    Each model comes with a default quantization. Select lower bits quantization to improve performance and enable the model to run on smaller GPU nodes, while potentially reducing precision.
+  </Message>
 - Specify the GPU Instance type to be used with your deployment.
 5. Choose the number of nodes for your deployment. Note that this feature is currently in [Public Beta](https://www.scaleway.com/betas/).
   <Message type="note">
```
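The tip added in this commit notes that lower-bit quantization lets a model fit on smaller GPU nodes at some cost in precision. As a rough illustration of that trade-off (not part of the commit or Scaleway's documentation; the parameter counts and overhead factor below are assumptions), here is a minimal Python sketch estimating weight memory at different bit widths:

```python
# Illustrative sketch only: approximate GPU memory needed to hold model
# weights at different quantization bit widths. Model sizes and the
# overhead factor are assumptions, not Scaleway specifications.

def weight_memory_gb(n_params: float, bits: int, overhead: float = 1.2) -> float:
    """Approximate VRAM in GB for the weights alone, with a rough
    multiplier for runtime buffers and activations."""
    bytes_per_param = bits / 8
    return n_params * bytes_per_param * overhead / 1e9

for name, n_params in [("7B", 7e9), ("70B", 70e9)]:
    for bits in (16, 8, 4):  # fp16 baseline vs. 8-bit and 4-bit quantization
        print(f"{name} model @ {bits}-bit: ~{weight_memory_gb(n_params, bits):.0f} GB")
```

Halving the bit width roughly halves the memory the weights occupy, which is why a 4-bit quantized model can run on a much smaller GPU node than its fp16 counterpart, at the potential cost of output precision.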
