Merged
5 changes: 2 additions & 3 deletions pages/account/how-to/open-a-support-ticket.mdx
@@ -52,7 +52,7 @@ Providing a clear subject and description will help us resolve your issue faster
Example: “The issue occurs when attempting to start an Instance after applying a configuration update in the Scaleway console.”

- **Expected behavior:** explain what you expected to happen.
-Example: “The instance should start within 2 minutes without errors.”
+Example: “The Instance should start within 2 minutes without errors.”

- **Actual behavior:** describe what is happening instead.
Example: “The Instance remains in "Starting" status for over 10 minutes and then switches to "Error".
@@ -71,7 +71,6 @@ Examples:
- Screenshot of the network tab of your browser’s Developer Tools (right-click anywhere on the page and select **Inspect**. Go to the **Network tab** in the Developer Tools panel.)
- Logs

-
<Message type="important">
If you have lost access to the Scaleway console and want to create a ticket, you must first [follow this procedure](/account/how-to/use-2fa/#how-to-regain-access-to-your-account) to regain access to your account.
-</Message>
+</Message>
5 changes: 1 addition & 4 deletions pages/audit-trail/quickstart.mdx
@@ -39,7 +39,4 @@ Refer to the [dedicated documentation page](/audit-trail/how-to/configure-audit-

<Message type="tip">
If no events display after you use the filter, try switching the region from the **Region** drop-down, or adjusting your search. Find out how to troubleshoot event issues in our [dedicated documentation](/audit-trail/troubleshooting/cannot-see-events/).
-</Message>
-
-
-
+</Message>
@@ -52,5 +52,5 @@ This page shows you how to configure alerts for Scaleway resources in Grafana us
</Message>

<Message type="tip">
-Find out how to send Cockpit's alert notifications to Slack using a webkook URL in our [dedicated documentation](/tutorials/configure-slack-alerting/).
+Find out how to send Cockpit's alert notifications to Slack using a webhook URL in our [dedicated documentation](/tutorials/configure-slack-alerting/).
</Message>
2 changes: 1 addition & 1 deletion pages/cockpit/index.mdx
@@ -52,7 +52,7 @@ meta:
<Grid>

<DefaultCard
title="Sending Cockpit's alert notifications to Slack using a webkook URL"
title="Sending Cockpit's alert notifications to Slack using a webhook URL"
url="/tutorials/configure-slack-alerting/"
label="Read more"
/>
@@ -42,7 +42,7 @@ categories:
For a detailed description of how the water consumption is calculated, refer to the [Water Consumption section](/environmental-footprint/additional-content/environmental-footprint-calculator/#water-consumption) of the Environmental Footprint calculation breakdown documentation page.
</Message>
- **5.** The total water consumption and carbon footprint of each of your Projects.
-- **6.** The total water consumption and carbon footprint per geographical location (Region and Availability Zone)
+- **6.** The total water consumption and carbon footprint per geographical location (region and Availability Zone)
- **7.** The total water consumption and carbon footprint of each of your products.

For both the carbon emissions, and the water consumption, the power consumption of your active resources is used in the calculation. The way you use your resources has a direct impact on power consumption. Therefore, results may vary greatly from one month to another.
34 changes: 17 additions & 17 deletions pages/generative-apis/troubleshooting/fixing-common-issues.mdx
@@ -16,15 +16,15 @@ Below are common issues that you may encounter when using Generative APIs, their
## 400: Bad Request - You exceeded maximum context window for this model

### Cause
-- You provided an input exceeding the maximum context window (also known as context length) for the model you are using.
-- You provided a long input and requested a long input (in `max_completion_tokens` field), which added together, exceed the maximum context window of the model you are using.
+- You provided an input exceeding the maximum context window (also known as context length) for the model you are using.
+- You provided a long input and requested a long input (in `max_completion_tokens` field), which added together, exceeds the maximum context window of the model you are using.

### Solution
-- Reduce your input size below what is [supported by the model](/generative-apis/reference-content/supported-models/).
+- Reduce your input size below what is [supported by the model](/generative-apis/reference-content/supported-models/).
- Use a model supporting longer context window values.
- Use [Managed Inference](/managed-inference/), where the context window can be increased for [several configurations with additional GPU vRAM](/managed-inference/reference-content/supported-models/). For instance, `llama-3.3-70b-instruct` model in `fp8` quantization can be served with:
-  - `15k` tokens context window on `H100` instances
-  - `128k` tokens context window on `H100-2` instances.
+  - `15k` tokens context window on `H100` Instances
+  - `128k` tokens context window on `H100-2` Instances
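
As a rough illustration of the first solution above, here is a sketch that budgets tokens before sending a request. The 4-characters-per-token ratio and the window sizes are heuristic assumptions, not values taken from this page:
```python
# Hedged sketch: estimate whether an input fits the model's context window
# before calling the API. The char-to-token ratio is a crude heuristic.
MAX_CONTEXT_TOKENS = 131_000   # placeholder context window for the model
MAX_COMPLETION_TOKENS = 4_096  # output budget you plan to request
CHARS_PER_TOKEN = 4            # rough average for English text

def fits_context(prompt: str) -> bool:
    estimated_input_tokens = len(prompt) // CHARS_PER_TOKEN
    return estimated_input_tokens + MAX_COMPLETION_TOKENS <= MAX_CONTEXT_TOKENS

prompt = "..."  # your real input
if not fits_context(prompt):
    budget_chars = (MAX_CONTEXT_TOKENS - MAX_COMPLETION_TOKENS) * CHARS_PER_TOKEN
    prompt = prompt[:budget_chars]  # naive truncation; smarter chunking is better
```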

## 403: Forbidden - Insufficient permissions to access the resource

@@ -46,7 +46,7 @@ Below are common issues that you may encounter when using Generative APIs, their
- You provided a value for `max_completion_tokens` that is too high and not supported by the model you are using.

### Solution
-- Remove `max_completion_tokens` field from your request or client library, or reduce its value below what is [supported by the model](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/).
+- Remove `max_completion_tokens` field from your request or client library, or reduce its value below what is [supported by the model](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/).
- As an example, when using the [init_chat_model from Langchain](https://python.langchain.com/api_reference/_modules/langchain/chat_models/base.html#init_chat_model), you should edit the `max_tokens` value in the following configuration:
```python
llm = init_chat_model("llama-3.3-70b-instruct", max_tokens="8000", model_provider="openai", base_url="https://api.scaleway.ai/v1", temperature=0.7)
```
@@ -57,16 +57,16 @@ Below are common issues that you may encounter when using Generative APIs, their
## 416: Range Not Satisfiable - max_completion_tokens is limited for this model

### Cause
-- You provided `max_completion_tokens` value too high, that is not supported by the model you are using.
+- You provided `max_completion_tokens` value too high, which is not supported by the model you are using.

### Solution
-- Remove the `max_completion_tokens` field from your request or client library, or reduce its value below what is [supported by the model](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/).
+- Remove the `max_completion_tokens` field from your request or client library, or reduce its value below what is [supported by the model](https://www.scaleway.com/en/docs/generative-apis/reference-content/supported-models/).
- As an example, when using the [init_chat_model from Langchain](https://python.langchain.com/api_reference/_modules/langchain/chat_models/base.html#init_chat_model), you should edit the `max_tokens` value in the following configuration:
```python
llm = init_chat_model("llama-3.3-70b-instruct", max_tokens="8000", model_provider="openai", base_url="https://api.scaleway.ai/v1", temperature=0.7)
```
- Use a model supporting a higher `max_completion_tokens` value.
-- Use [Managed Inference](/managed-inference/), where these limits on completion tokens do not apply (your completion tokens amount will still be limited by the maximum context window supported by the model).
+- Use [Managed Inference](/managed-inference/), where these limits on completion tokens do not apply (your completion tokens amount will still be limited by the maximum context window supported by the model).

## 429: Too Many Requests - You exceeded your current quota of requests/tokens per minute

@@ -79,15 +79,15 @@ Below are common issues that you may encounter when using Generative APIs, their
- [Add a payment method](/billing/how-to/add-payment-method/#how-to-add-a-credit-card) and [validate your identity](/account/how-to/verify-identity/) to increase automatically your quotas [based on standard limits](/organizations-and-projects/additional-content/organization-quotas/#generative-apis).
- [Ask our support](https://console.scaleway.com/support/tickets/create) to raise your quota.
- Reduce the size of the input or output tokens processed by your API requests.
-- Use [Managed Inference](/managed-inference/), where these quota do not apply (your throughput will be only limited by the amount of Inference Deployment your provision)
+- Use [Managed Inference](/managed-inference/), where these quotas do not apply (your throughput will be only limited by the amount of Inference Deployment your provision)

## 429: Too Many Requests - You exceeded your current threshold of concurrent requests

### Cause
- You kept too many API requests opened at the same time (number of HTTP sessions opened in parallel)

### Solution
-- Smooth out your API requests rate by limiting the number of API requests you perform at the same time (eg. requests which did not receive a complete response and are still opened) so that you remain below your [organization quotas for Generative APIs](/organizations-and-projects/additional-content/organization-quotas/#generative-apis).
+- Smooth out your API requests rate by limiting the number of API requests you perform at the same time (eg. requests which did not receive a complete response and are still opened) so that you remain below your [Organization quotas for Generative APIs](/organizations-and-projects/additional-content/organization-quotas/#generative-apis).
- Use [Managed Inference](/managed-inference/), where concurrent request limit do not apply. Note that exceeding the number of concurrent requests your Inference Deployment can handle may impact performance metrics.
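
To illustrate the throttling suggested above, a minimal sketch using the OpenAI-compatible Python client; the semaphore size, model name, and API key are placeholder assumptions:
```python
# Hedged sketch: cap the number of requests open at the same time with a
# semaphore. Model name, semaphore size, and API key are placeholders.
import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI(base_url="https://api.scaleway.ai/v1", api_key="SCW_SECRET_KEY")
semaphore = asyncio.Semaphore(4)  # at most 4 requests in flight at once

async def ask(prompt: str) -> str:
    async with semaphore:  # slot is held until the response is complete
        response = await client.chat.completions.create(
            model="llama-3.3-70b-instruct",
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content

async def main() -> None:
    answers = await asyncio.gather(*(ask(f"Summarize topic {i}") for i in range(20)))
    print(len(answers), "responses received")

asyncio.run(main())
```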


@@ -162,15 +162,15 @@ Below are common issues that you may encounter when using Generative APIs, their
- Counter for **Tokens Processed** or **API Requests** should display a correct value (different from 0)
- Graph across time should be empty

-## Embeddings vectors cannot be stored in database or used with a third-party library
+## Embeddings vectors cannot be stored in a database or used with a third-party library

### Cause
The embedding model you are using generates vector representations with a fixed dimension number, which is too high for your database or third-party library.
- For example, the embedding model `bge-multilingual-gemma2` generates vector representations with `3584` dimensions. However, when storing vectors using PostgreSQL `pgvector` extensions, indexes (in `hnsw` or `ivvflat` formats) only support up to `2000` dimensions.

### Solution
-- Use a vector store supporting higher dimensions number, such as [Qdrant](https://www.scaleway.com/en/docs/tutorials/deploying-qdrant-vectordb-kubernetes/).
-- Do not use indexes for vectors or disable them from your third-party library. This may limit performance in vector similarity search for significant volumes.
+- Use a vector store supporting higher dimensions numbers, such as [Qdrant](https://www.scaleway.com/en/docs/tutorials/deploying-qdrant-vectordb-kubernetes/).
+- Do not use indexes for vectors or disable them from your third-party library. This may limit performance in vector similarity search for significant volumes.
- When using [Langchain PGVector method](https://python.langchain.com/docs/integrations/vectorstores/pgvector/), this method does not create an index by default and should not raise errors.
- When using the [Mastra](https://mastra.ai/) library with `vectorStoreName: "pgvector"`, specify indexConfig type as `flat` to avoid creating any index on vector dimensions.
```typescript
@@ -180,7 +180,7 @@ The embedding model you are using generates vector representations with a fixed
indexConfig: {"type":"flat"},
});
```
-- Use a model with a lower number of dimensions. Using [Managed Inference](https://console.scaleway.com/inference/deployments), you can deploy for instance the`sentence-t5-xxl` model, which represents vectors with `768` dimensions.
+- Use a model with a lower number of dimensions. Using [Managed Inference](https://console.scaleway.com/inference/deployments), you can deploy for instance the`sentence-t5-xxl` model, which represents vectors with `768` dimensions.
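
As an illustration of the no-index workaround, a hedged sketch against PostgreSQL with the `pgvector` extension; the connection string and table layout are assumptions:
```python
# Hedged sketch: store 3584-dimension vectors in PostgreSQL/pgvector WITHOUT an
# index, so similarity search uses an exact (flat) scan instead of hnsw/ivfflat,
# which are capped at 2000 dimensions. Connection string is a placeholder.
import psycopg2

query_vector = "[" + ",".join(["0.0"] * 3584) + "]"  # stand-in for a real embedding

conn = psycopg2.connect("dbname=vectors user=postgres")
with conn, conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
    cur.execute(
        "CREATE TABLE IF NOT EXISTS embeddings ("
        "id serial PRIMARY KEY, embedding vector(3584));"
    )
    # No CREATE INDEX here: leaving the column unindexed avoids the dimension limit.
    cur.execute(
        "SELECT id FROM embeddings ORDER BY embedding <-> %s::vector LIMIT 5;",
        (query_vector,),
    )
    print(cur.fetchall())
```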

## Previous messages are not taken into account by the model

@@ -219,7 +219,7 @@ response = client.chat.completions.create(
```python
print(response.choices[0].message.content)
```
This snippet will output the model response, which is `4`.
-- When exceeding maximum context window, you should receive a `400 - BadRequestError` detailing context length value you exceeded. In this case, you should reduce the size of the content you send to the API.
+- When exceeding the maximum context window, you should receive a `400 - BadRequestError` detailing the context length value you exceeded. In this case, you should reduce the size of the content you send to the API.
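
A minimal sketch of catching that error with the OpenAI Python client (v1+); the model name, key, and truncation length are placeholder assumptions:
```python
# Hedged sketch: retry once with a truncated prompt after a context-window 400.
import openai

client = openai.OpenAI(base_url="https://api.scaleway.ai/v1", api_key="SCW_SECRET_KEY")
messages = [{"role": "user", "content": "a very long document ..."}]

try:
    response = client.chat.completions.create(
        model="llama-3.3-70b-instruct", messages=messages
    )
except openai.BadRequestError as err:
    print(f"Request rejected ({err}); retrying with a shorter input")
    messages[-1]["content"] = messages[-1]["content"][:10_000]  # crude truncation
    response = client.chat.completions.create(
        model="llama-3.3-70b-instruct", messages=messages
    )

print(response.choices[0].message.content)
```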

## Best practices for optimizing model performance

@@ -234,4 +234,4 @@ This snippet will output the model response, which is `4`.
### Debugging silent errors
- For cases where no explicit error is returned:
- Verify all fields in the API request are correctly named and formatted.
-  - Test the request with smaller and simpler inputs to isolate potential issues.
+  - Test the request with smaller and simpler inputs to isolate potential issues.
6 changes: 3 additions & 3 deletions pages/gpu/how-to/use-nvidia-mig-technology.mdx
@@ -15,7 +15,7 @@ categories:

<Message type="note">
* Scaleway offers MIG-compatible GPU Instances such as H100 PCIe GPU Instances
-  * NVIDIA uses the term *GPU instance* to designate a MIG partition of a GPU (MIG= Multi-Instance GPU)
+  * NVIDIA uses the term *GPU instance* to designate an MIG partition of a GPU (MIG= Multi-Instance GPU)
* To avoid confusion, we will use the term GPU Instance in this document to designate the Scaleway GPU Instance, and *MIG partition* in the context of the MIG feature.
</Message>

@@ -151,10 +151,10 @@ Refer to the official documentation for more information about the supported [MI
* `-cgi 9,19,19,19`: this flag specifies the MIG partition configuration. The numbers following the flag represent the MIG partitions for each of the four MIG device slices. In this case, there are four slices with configurations 9, 19, 19, and 19 compute instances each. These numbers correspond to the profile IDs retrieved previously. Note that you can use either of the following:
* Profile ID (e.g. 9, 14, 5)
* Short name of the profile (e.g. `3g.40gb`)
-  * Full profile name of the instance (e.g. `MIG 3g.40gb`)
+  * Full profile name of the Instance (e.g. `MIG 3g.40gb`)
* `-C`: this flag automatically creates the corresponding compute instances for the MIG partitions.

-The command instructs the `nvidia-smi` tool to set up a MIG configuration where the GPU is divided into four slices, each containing different numbers of MIG partition configurations as specified: an MIG 3g.40gb (Profile ID 9) for the first slice, and an MIG 1g.10gb (Profile ID 19) for each of the remaining three slices.
+The command instructs the `nvidia-smi` tool to set up an MIG configuration where the GPU is divided into four slices, each containing different numbers of MIG partition configurations as specified: an MIG 3g.40gb (Profile ID 9) for the first slice, and an MIG 1g.10gb (Profile ID 19) for each of the remaining three slices.
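
For illustration, a hedged sketch that wraps the command above and lists the resulting partitions; it assumes MIG mode is already enabled and Python is available on the Instance:
```python
# Hedged sketch: apply the MIG layout described above, then list the resulting
# MIG partitions. Assumes MIG mode is already enabled on the GPU.
import subprocess

# One 3g.40gb slice (profile 9) plus three 1g.10gb slices (profile 19),
# with -C creating the matching compute instances.
subprocess.run(["sudo", "nvidia-smi", "mig", "-cgi", "9,19,19,19", "-C"], check=True)

# List the MIG partitions ("GPU instances" in NVIDIA terms) that were created.
listing = subprocess.run(
    ["nvidia-smi", "mig", "-lgi"], capture_output=True, text=True, check=True
)
print(listing.stdout)
```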

<Message type="note">
- Running CUDA workloads on the GPU requires the creation of MIG partitions along with their corresponding compute instances. Just enabling MIG mode on the GPU is not enough to achieve this.
@@ -13,9 +13,8 @@ categories:
- compute
---

-
Scaleway GPU Instances are designed to deliver **high-performance computing** for AI/ML workloads, rendering, scientific simulations, and visualization tasks.
-This guide provides a detailed overview of their **internet and Block Storage bandwidth capabilities** to help you choose the right instance for your GPU-powered workloads.
+This guide provides a detailed overview of their **internet and Block Storage bandwidth capabilities** to help you choose the right Instance for your GPU-powered workloads.

### Why bandwidth matters for GPU Instances

6 changes: 2 additions & 4 deletions pages/instances/api-cli/using-cloud-init.mdx
@@ -30,7 +30,7 @@ Cloud-config files are special scripts designed to be run by the cloud-init proc

You can give provisioning instructions to cloud-init using the `cloud-init` key of the `user_data` facility.

-For `user_data` to be effective, it has to be added prior to the creation of the instance since `cloud-init` gets activated early in the first phases of the boot process.
+For `user_data` to be effective, it has to be added prior to the creation of the Instance since `cloud-init` gets activated early in the first phases of the boot process.
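
To make this concrete, a hypothetical sketch that pushes a cloud-config document to the `cloud-init` key of the `user_data` facility; the endpoint path, zone, IDs, and token are assumptions, not values from this page:
```python
# Hypothetical sketch: set the `cloud-init` user_data key on an existing Instance
# before its first boot. Endpoint path, zone, server ID, and token are placeholders.
import requests

ZONE = "fr-par-1"
SERVER_ID = "11111111-1111-1111-1111-111111111111"
cloud_config = "#cloud-config\npackages:\n  - nginx\n"

resp = requests.patch(
    f"https://api.scaleway.com/instance/v1/zones/{ZONE}/servers/{SERVER_ID}/user_data/cloud-init",
    headers={"X-Auth-Token": "SCW_SECRET_KEY", "Content-Type": "text/plain"},
    data=cloud_config,
)
resp.raise_for_status()
```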

* **Server ID** refers to the unique identification string of your server. It will be displayed when you create your server. You can also recover it from the list of your servers, by typing `scw instance server list`.

@@ -88,6 +88,4 @@ Subcommands:

-For detailed information on cloud-init, refer to the official cloud-init [documentation](http://cloudinit.readthedocs.io/en/latest/index.html).
-
-
+For detailed information on cloud-init, refer to the official cloud-init [documentation](http://cloudinit.readthedocs.io/en/latest/index.html).
6 changes: 2 additions & 4 deletions pages/instances/api-cli/using-routed-ips.mdx
@@ -491,7 +491,7 @@ Then you can create a new Instance using those IPs through the `public_ips` fiel
```
❯ http post $API_URL/servers $HEADERS <payloads/server-data.json
```
<Message type="tip">
-  In order to create Instance you have to add `"routed_ip_enabled": true` to your payload.
+  To create an Instance, you must add `"routed_ip_enabled": true` to your payload.
</Message>
</TabsTab>
<TabsTab label="Response">
@@ -648,7 +648,7 @@ You can use a specific server action to move an existing (legacy network) Instan
```
❯ http post $API_URL/servers/$SERVER_ID/action $HEADERS action=enable_routed_ip
```
<Message type="note">
-  Your instance *will* reboot during this action.
+  Your Instance *will* reboot during this action.
</Message>
</TabsTab>
<TabsTab label="Response">
@@ -1002,5 +1002,3 @@ You can verify if your Instance is enabled for routed IPs through the `/servers`
</TabsTab>
</Tabs>
-
-
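
As a closing illustration of the `routed_ip_enabled` tip above, a hypothetical creation payload; every field except that flag is a placeholder assumption:
```python
# Hypothetical sketch: create an Instance with routed IPs enabled.
# All IDs, labels, and the commercial type are placeholders.
import requests

API_URL = "https://api.scaleway.com/instance/v1/zones/fr-par-1"
payload = {
    "name": "my-routed-instance",
    "commercial_type": "PRO2-XXS",
    "image": "11111111-1111-1111-1111-111111111111",  # placeholder image ID
    "project": "22222222-2222-2222-2222-222222222222",  # placeholder Project ID
    "routed_ip_enabled": True,  # the flag highlighted in the tip above
}

resp = requests.post(
    f"{API_URL}/servers",
    json=payload,
    headers={"X-Auth-Token": "SCW_SECRET_KEY"},
)
resp.raise_for_status()
print(resp.json()["server"]["id"])
```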