feat(inference): migration v1 and support BYOM #3048
Merged
35 commits
dae9700 feat(inference): migration deployment to v1 (Laure-di)
3359869 feat(inference): BYOM support (Laure-di)
e401f22 add sweeper (Laure-di)
8bc4617 manage id from regional or not (Laure-di)
533f8e8 remove error_message (Laure-di)
7512e43 fix linter (Laure-di)
4f129e4 last cassette (Laure-di)
9beb5b4 fix documentation (Laure-di)
40a9392 fix documentation lint (Laure-di)
c910960 remove comment (Laure-di)
c53fd23 Update docs/resources/inference_custom_model.md (Laure-di)
54e9077 change model_id format (Laure-di)
7233d94 use of dsf.locality (Laure-di)
dbb2b2e ResourceCustomModelDelete return right err and testAccCheckCustomMode… (Laure-di)
cadc3f2 fix(doc): add import part and fix typo (Laure-di)
91113f7 fix(doc): deployment required attribute (Laure-di)
b02faf7 fix(inference): use of existing function cast (Laure-di)
8c62809 Update docs/resources/inference_custom_model.md (Laure-di)
2c94ded skip tests until further notice (Laure-di)
17b9663 activate tests (Laure-di)
fae6676 fix(inference): rename resource from custom_model to model (Laure-di)
92f7cfd update sdk-go (Laure-di)
c95dc22 remove unecessary file (Laure-di)
8755556 fix(doc): put real URL and more context (Laure-di)
f890be1 add support model data-source (Laure-di)
d2c516c testing (Laure-di)
c6b5ec7 add test (Laure-di)
7ac52b7 update doc and tests (Laure-di)
1f93d2e fix linter (Laure-di)
fc80ab4 fix linter (Laure-di)
aa15cf5 remove custom reference (Laure-di)
f6fa71d update cassette (Laure-di)
715ed4e update cassette deployment with datasource (Laure-di)
cfaa353 Merge branch 'master' into migration-inference-v1 (remyleone)
2720cd0 Merge branch 'master' into migration-inference-v1 (remyleone)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,45 @@ | ||
| --- | ||
| subcategory: "Inference" | ||
| page_title: "Scaleway: scaleway_inference_model" | ||
| --- | ||
|
|
||
| # scaleway_inference_model | ||
|
|
||
| The `scaleway_inference_model` data source allows you to retrieve information about an inference model available in the Scaleway Inference API, either by providing the model's `name` or its `model_id`. | ||
|
|
||
| ## Example Usage | ||
|
|
||
| ### Basic | ||
|
|
||
| ```hcl | ||
| data "scaleway_inference_model" "my_model" { | ||
| name = "meta/llama-3.1-8b-instruct:fp8" | ||
| } | ||
| ``` | ||
|
|
||
| ## Argument Reference | ||
|
|
||
| You must provide either `name` or `model_id`, but not both. | ||
|
|
||
| - `name` (Optional, Conflicts with `model_id`) The fully qualified name of the model to look up (e.g., `meta/llama-3.1-8b-instruct:fp8`). The provider searches the selected region and project for a model whose name matches exactly. | ||
| - `model_id` (Optional, Conflicts with `name`) The ID of the model to retrieve. Must be a valid UUID with locality (i.e., Scaleway's regional ID format, `{region}/{id}`). | ||
| - `project_id` (Optional) The project ID to use when listing models. If not provided, the provider default project is used. | ||
| - `region` (Optional) The region where the model is hosted. If not set, the provider default region is used. | ||
|
|
||
| ## Attributes Reference | ||
|
|
||
| In addition to the input arguments above, the following attributes are exported: | ||
|
|
||
| - `id` - The unique identifier of the model. | ||
| - `tags` - Tags associated with the model. | ||
| - `status` - The current status of the model (e.g., `ready` or `error`). | ||
| - `description` - A textual description of the model (if available). | ||
| - `has_eula` - Whether the model requires end-user license agreement acceptance before use. | ||
| - `parameter_size_bits` - Size, in bits, of the model parameters. | ||
| - `size_bytes` - Total size, in bytes, of the model archive. | ||
| - `nodes_support` - List of supported node types and their quantization options. Each entry contains: | ||
|   - `node_type_name` - The type of node supported. | ||
|   - `quantization` - A list of supported quantization options, including: | ||
|     - `quantization_bits` - Number of bits used for quantization (e.g., 8, 16). | ||
|     - `allowed` - Whether this quantization is allowed. | ||
|     - `max_context_size` - Maximum context length supported by this quantization. |
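
The example above looks a model up by `name`. As a complementary sketch, the same data source can be queried by `model_id` and its exported attributes read back. The ID below is a placeholder rather than a real model, and the output names are chosen only for illustration; the attribute names are the ones listed in the reference above.

```hcl
# Look the model up by its regional ID instead of its name.
# The UUID is a placeholder; replace it with a real model ID.
data "scaleway_inference_model" "by_id" {
  model_id = "fr-par/11111111-1111-1111-1111-111111111111"
}

# Expose the model status (output name is illustrative).
output "model_status" {
  value = data.scaleway_inference_model.by_id.status
}

# Collect the node types that the model reports as supported.
output "supported_node_types" {
  value = [
    for n in data.scaleway_inference_model.by_id.nodes_support : n.node_type_name
  ]
}
```

Because `name` and `model_id` conflict, only one of them should be set in a given `data` block.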
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,76 @@ | ||
| --- | ||
| subcategory: "Inference" | ||
| page_title: "Scaleway: scaleway_inference_model" | ||
| --- | ||
|
|
||
| # Resource: scaleway_inference_model | ||
|
|
||
| The `scaleway_inference_model` resource allows you to upload and manage inference models in the Scaleway Inference ecosystem. Once registered, a model can be used in any `scaleway_inference_deployment` resource. | ||
|
|
||
| ## Example Usage | ||
|
|
||
| ### Basic | ||
|
|
||
| ```terraform | ||
| resource "scaleway_inference_model" "test" { | ||
| name = "my-awesome-model" | ||
| url = "https://huggingface.co/agentica-org/DeepCoder-14B-Preview" | ||
| secret = "my-secret-token" | ||
| } | ||
| ``` | ||
|
|
||
| ### Deploy your own model on your managed inference | ||
|
|
||
| ```terraform | ||
| resource "scaleway_inference_model" "my_model" { | ||
| name = "my-awesome-model" | ||
| url = "https://huggingface.co/agentica-org/DeepCoder-14B-Preview" | ||
| secret = "my-secret-token" | ||
| } | ||
|
|
||
| resource "scaleway_inference_deployment" "my_deployment" { | ||
| name = "test-inference-deployment-basic" | ||
| node_type = "H100" # replace with your node type | ||
| model_id = scaleway_inference_model.my_model.id | ||
|
|
||
| public_endpoint { | ||
| is_enabled = true | ||
| } | ||
|
|
||
| accept_eula = true | ||
| } | ||
| ``` | ||
|
|
||
| ## Argument Reference | ||
|
|
||
| - `name` - (Required) The name of the model. This must be unique within the project. | ||
| - `url` - (Required) The HTTPS source URL from which the model will be downloaded. This is typically a Hugging Face repository URL (e.g., https://huggingface.co/agentica-org/DeepCoder-14B-Preview). The URL must either be publicly accessible or be reachable with the credentials supplied in `secret`. | ||
| - `secret` - (Optional, Sensitive) Authentication token used to pull the model from a private or gated URL (e.g., a Hugging Face access token with read permission). | ||
| - `region` - (Defaults to [provider](../index.md#region) `region`) The [region](../guides/regions_and_zones.md#regions) in which the model is created. | ||
| - `project_id` - (Defaults to [provider](../index.md#project_id) `project_id`) The ID of the project the model is associated with. | ||
|
|
||
| ## Attributes Reference | ||
|
|
||
| In addition to all arguments above, the following attributes are exported: | ||
|
|
||
| - `id` - The unique identifier of the model. | ||
| - `tags` - Tags associated with the model. | ||
| - `status` - The current status of the model (e.g., `ready` or `error`). | ||
| - `description` - A textual description of the model (if available). | ||
| - `has_eula` - Whether the model requires end-user license agreement acceptance before use. | ||
| - `parameter_size_bits` - Size, in bits, of the model parameters. | ||
| - `size_bytes` - Total size, in bytes, of the model archive. | ||
| - `nodes_support` - List of supported node types and their quantization options. Each entry contains: | ||
|   - `node_type_name` - The type of node supported. | ||
|   - `quantization` - A list of supported quantization options, including: | ||
|     - `quantization_bits` - Number of bits used for quantization (e.g., 8, 16). | ||
|     - `allowed` - Whether this quantization is allowed. | ||
|     - `max_context_size` - Maximum context length supported by this quantization. | ||
|
|
||
| ## Import | ||
|
|
||
| Models can be imported using the `{region}/{id}` format, as shown below: | ||
|
|
||
| ```bash | ||
| terraform import scaleway_inference_model.my_model fr-par/11111111-1111-1111-1111-111111111111 | ||
| ``` |
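
As an alternative to the CLI command, Terraform 1.5 and later also accept a declarative `import` block. The sketch below shows that form with the same placeholder ID. It also shows one way to keep the `secret` token out of the configuration by passing it through a sensitive input variable; the variable name `hf_token` is an assumption made for this example and is not defined by the provider.

```hcl
# Declarative import (Terraform >= 1.5). The ID is the same placeholder
# used in the CLI example and follows the {region}/{id} format.
import {
  to = scaleway_inference_model.my_model
  id = "fr-par/11111111-1111-1111-1111-111111111111"
}

# Hypothetical input variable holding the access token, so that the
# token is not hard-coded in the configuration.
variable "hf_token" {
  type      = string
  sensitive = true
}

resource "scaleway_inference_model" "my_model" {
  name   = "my-awesome-model"
  url    = "https://huggingface.co/agentica-org/DeepCoder-14B-Preview"
  secret = var.hf_token
}
```

The variable can be supplied through the `TF_VAR_hf_token` environment variable or a `.tfvars` file kept out of version control.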