6 changes: 5 additions & 1 deletion pages/edge-services/concepts.mdx
@@ -23,13 +23,17 @@ The SSL/TLS certificate for your subdomain to enable Edge Services to serve cont

The CNAME record pointing your subdomain to the Edge Services endpoint, if you have customized your [Edge Services endpoint](#endpoint). This is necessary to ensure that traffic for your customized subdomain is correctly directed towards the Edge Services endpoint by DNS servers.

Refer to [CNAME records for Edge Services](/edge-services/reference-content/cname-record/) for more information.

## Edge Services

Edge Services is an additional feature for Scaleway Load Balancers and Object Storage buckets. It provides:
- A [caching service](/edge-services/how-to/configure-cache/) to improve performance by reducing load on your [origin](#origin)
- A [Web Application Firewall](/edge-services/how-to/configure-waf/) to protect your origin from threats and malicious activity
- A customizable and secure [endpoint](#endpoint) for accessing content via Edge Services, which can be set to a subdomain of your choice.

Read the [Edge Services Quickstart](/edge-services/quickstart/) to get started.

## Endpoint

The endpoint from which a given Edge Services pipeline can be accessed, e.g. `https://pipeline-id.svc.edge.scw.cloud`. When a client requests content from the Edge Services endpoint, it is served by Edge Services and its cache, rather than from the origin (Object Storage bucket or Load Balancer backend servers) directly. Edge Services automatically manages redirection from HTTP to HTTPS.
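
For illustration, the sketch below (using Python's `requests` library) fetches an object through a hypothetical Edge Services endpoint and checks the automatic HTTP-to-HTTPS redirection described above; the pipeline ID and object path are placeholders to replace with your own values.

```python
import requests

# Placeholder endpoint and object path — use the values shown for your own pipeline.
endpoint = "https://pipeline-id.svc.edge.scw.cloud"

# Content is served by Edge Services (and its cache), not fetched from the origin directly.
response = requests.get(f"{endpoint}/images/logo.png", timeout=10)
print(response.status_code, response.headers.get("Content-Type"))

# Requests sent over plain HTTP are redirected to HTTPS automatically.
redirected = requests.get(endpoint.replace("https://", "http://"), allow_redirects=True, timeout=10)
print(redirected.url)  # expected to start with https://
```
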
@@ -38,7 +42,7 @@ The endpoint can be customized with a user-defined subdomain, allowing you to re

## Exclusions

In the context of an Edge Services [Web Application Firewall](#web-application-firewall), exclusions let you define filters for requests that should not be evaluated by WAF, but rather pass straight to the Load Balancer origin. Learn more about [creating exclusions](/edge-services/how-to/configure-waf/#how-to-set-exclusions)
In the context of an Edge Services [Web Application Firewall](#web-application-firewall), exclusions let you define filters for requests that should not be evaluated by WAF, but rather pass straight to the Load Balancer origin. Learn more about [creating exclusions](/edge-services/how-to/configure-waf/#how-to-set-exclusions).

## Origin

10 changes: 10 additions & 0 deletions pages/generative-apis/concepts.mdx
@@ -10,6 +10,8 @@ dates:

API rate limits define the maximum number of requests a user can make to the Generative APIs within a specific time frame. Rate limiting helps to manage resource allocation, prevent abuse, and ensure fair access for all users. Understanding and adhering to these limits is essential for maintaining optimal application performance using these APIs.

Refer to the [Rate limits](/generative-apis/reference-content/rate-limits/) documentation for more information.
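
As a minimal sketch (not an official snippet), the example below retries a request with exponential backoff when the rate limit is hit, using the OpenAI-compatible Python client; the base URL, API key, and model name are assumptions to replace with the values from your own setup.

```python
import time
from openai import OpenAI, RateLimitError

# Placeholder base URL, key, and model name — adapt these to your configuration.
client = OpenAI(base_url="https://api.scaleway.ai/v1", api_key="SCW_SECRET_KEY")

def chat_with_backoff(messages, retries=5):
    """Retry a chat completion with exponential backoff when rate limited."""
    for attempt in range(retries):
        try:
            return client.chat.completions.create(
                model="llama-3.1-8b-instruct",  # placeholder model name
                messages=messages,
            )
        except RateLimitError:
            # Wait 1s, 2s, 4s, ... before retrying, to stay within the rate limit.
            time.sleep(2 ** attempt)
    raise RuntimeError("Still rate limited after all retries")
```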

## Context window

A context window is the maximum amount of prompt data considered by the model to generate a response. Using models with high context length, you can provide more information to generate relevant responses. The context is measured in tokens.
@@ -18,14 +20,20 @@ A context window is the maximum amount of prompt data considered by the model to

Function calling allows a large language model (LLM) to interact with external tools or APIs, executing specific tasks based on user requests. The LLM identifies the appropriate function, extracts the required parameters, and returns the results as structured data, typically in JSON format.

Refer to [How to use function calling](/generative-apis/how-to/use-function-calling/) for more information.
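
A minimal sketch of a function-calling request with the OpenAI-compatible Python client is shown below; the base URL, model name, and `get_weather` tool are illustrative assumptions, and the tool itself would be implemented in your own code.

```python
import json
from openai import OpenAI

client = OpenAI(base_url="https://api.scaleway.ai/v1", api_key="SCW_SECRET_KEY")  # placeholder values

# Describe a tool the model may decide to call.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="llama-3.1-8b-instruct",  # placeholder model name
    messages=[{"role": "user", "content": "What is the weather in Paris?"}],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:  # the model chose to call the tool
    call = message.tool_calls[0]
    # The extracted parameters are returned as structured JSON.
    print(call.function.name, json.loads(call.function.arguments))
```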

## Embeddings

Embeddings are numerical representations of text data that capture semantic information in a dense vector format. In Generative APIs, embeddings are essential for tasks such as similarity matching, clustering, and serving as inputs for downstream models. These vectors enable the model to understand and generate text based on the underlying meaning rather than just the surface-level words.

Refer to [How to query embedding models](/generative-apis/how-to/query-embedding-models/) for more information.
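
The sketch below shows a hypothetical embeddings request with the OpenAI-compatible Python client; the base URL and embedding model name are assumptions to replace with values from your console.

```python
from openai import OpenAI

client = OpenAI(base_url="https://api.scaleway.ai/v1", api_key="SCW_SECRET_KEY")  # placeholder values

# Each input string is mapped to a dense vector capturing its meaning.
result = client.embeddings.create(
    model="bge-multilingual-gemma2",  # placeholder embedding model name
    input=["How do I reset my password?", "Password recovery steps"],
)

vectors = [item.embedding for item in result.data]
print(len(vectors), "vectors of dimension", len(vectors[0]))
```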

## Error handling

Error handling refers to the strategies and mechanisms in place to manage and respond to errors during API requests. This includes handling network issues, invalid inputs, or server-side errors. Proper error handling ensures that applications using Generative APIs can gracefully recover from failures and provide meaningful feedback to users.

Refer to [Understanding errors](/generative-apis/api-cli/understanding-errors/) for more information.
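
As an illustration, the sketch below catches the main error classes exposed by the OpenAI-compatible Python client; the base URL and model name are placeholder assumptions.

```python
from openai import OpenAI, APIConnectionError, APIStatusError, RateLimitError

client = OpenAI(base_url="https://api.scaleway.ai/v1", api_key="SCW_SECRET_KEY")  # placeholder values

try:
    response = client.chat.completions.create(
        model="llama-3.1-8b-instruct",  # placeholder model name
        messages=[{"role": "user", "content": "Hello!"}],
    )
except RateLimitError:
    print("Rate limit reached — slow down or retry later.")
except APIConnectionError:
    print("Network problem — check connectivity and retry.")
except APIStatusError as err:
    # Any other non-2xx response: inspect the status code and error details.
    print("API returned an error:", err.status_code, err)
```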

## Parameters

Parameters are settings that control the behavior and performance of generative models. These include temperature, max tokens, and top-p sampling, among others. Adjusting parameters allows users to tweak the model's output, balancing factors like creativity, accuracy, and response length to suit specific use cases.
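
A minimal sketch of passing these parameters with the OpenAI-compatible Python client, assuming a placeholder base URL and model name:

```python
from openai import OpenAI

client = OpenAI(base_url="https://api.scaleway.ai/v1", api_key="SCW_SECRET_KEY")  # placeholder values

# Lower temperature and top_p make answers more deterministic;
# max_tokens caps the length of the generated response.
response = client.chat.completions.create(
    model="llama-3.1-8b-instruct",  # placeholder model name
    messages=[{"role": "user", "content": "Summarize Edge Services in one sentence."}],
    temperature=0.2,
    top_p=0.9,
    max_tokens=120,
)
print(response.choices[0].message.content)
```
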
@@ -62,6 +70,8 @@ Structured outputs enable you to format the model's responses to suit specific u
By customizing the structure, such as using lists, tables, or key-value pairs, you ensure that the data returned is in a form that is easy to extract and process.
By specifying the expected response format through the API, you can make the model consistently deliver the output your system requires.

Refer to [How to use structured outputs](/generative-apis/how-to/use-structured-outputs/) for more information.
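
The sketch below assumes a model that supports schema-constrained outputs through the OpenAI-compatible `response_format` field; the base URL, model name, and schema are illustrative, and `{"type": "json_object"}` with format instructions in the prompt is a simpler fallback if schema support is unavailable.

```python
import json
from openai import OpenAI

client = OpenAI(base_url="https://api.scaleway.ai/v1", api_key="SCW_SECRET_KEY")  # placeholder values

# Ask the model to answer as JSON matching a small schema (illustrative example).
schema = {
    "name": "product_review",
    "schema": {
        "type": "object",
        "properties": {
            "sentiment": {"type": "string"},
            "score": {"type": "integer"},
        },
        "required": ["sentiment", "score"],
    },
}

response = client.chat.completions.create(
    model="llama-3.1-8b-instruct",  # placeholder model name
    messages=[{"role": "user", "content": "Review: 'Great laptop, battery could be better.'"}],
    response_format={"type": "json_schema", "json_schema": schema},
)

# The response content is machine-readable JSON that matches the schema.
print(json.loads(response.choices[0].message.content))
```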

## Temperature

Temperature is a parameter that controls the randomness of the model's output during text generation. A higher temperature produces more creative and diverse outputs, while a lower temperature makes the model's responses more deterministic and focused. Adjusting the temperature allows users to balance creativity with coherence in the generated text.
8 changes: 6 additions & 2 deletions pages/interlink/concepts.mdx
@@ -69,13 +69,17 @@ When creating an InterLink, you must specify a [region](/vpc/concepts/#region-an

## Route propagation

Route propagation can be activated or deactivated at any given time on an InterLink. When activated, the Scaleway VPC and external infrastructure communicate via BGP sessions to dynamically exchange and update information about their routes. Route propagation must be activated to allow traffic to flow over the InterLink. When deactivated, all pre-learned/announced routes are removed from the VPC's route table, and traffic cannot flow. Note that even with route propagation activated, the default rule blocks all route announcements: you must attach a [routing policy](#routing-policy) to specify the route ranges to whitelist. [Learn more about routing across an InterLink](/interlink/reference-content/overview/#routing-across-an-interLink)
Route propagation can be activated or deactivated at any given time on an InterLink. When activated, the Scaleway VPC and external infrastructure communicate via BGP sessions to dynamically exchange and update information about their routes. Route propagation must be activated to allow traffic to flow over the InterLink. When deactivated, all pre-learned/announced routes are removed from the VPC's route table, and traffic cannot flow. Note that even with route propagation activated, the default rule blocks all route announcements: you must attach a [routing policy](#routing-policy) to specify the route ranges to whitelist.

[Learn more about routing across an InterLink](/interlink/reference-content/overview/#routing-across-an-interlink).

## Routing policy

The default rule blocks any and all routes from being propagated over InterLink. Attaching a routing policy allows you to define the ranges of routes that should be whitelisted. When creating a routing policy, you specify one or many IP ranges representing the outgoing routes to announce from the Scaleway VPC, and one or many IP ranges representing the incoming route announcements to accept from the external infrastructure.

IPv4 and IPv6 routes must be whitelisted in separate routing policies. An InterLink must therefore have a **minimum of one** and a **maximum of two** attached routing policies, one for each IP traffic type to be routed (IPv4 and/or IPv6). When [route propagation](#route-propagation) is activated, the route ranges defined in the attached routing policies are whitelisted, and traffic can flow across the InterLink along these routes. [Learn more about routing across an InterLink](/interlink/reference-content/overview/#routing-across-an-interLink).
IPv4 and IPv6 routes must be whitelisted in separate routing policies. An InterLink must therefore have a **minimum of one** and a **maximum of two** attached routing policies, one for each IP traffic type to be routed (IPv4 and/or IPv6). When [route propagation](#route-propagation) is activated, the route ranges defined in the attached routing policies are whitelisted, and traffic can flow across the InterLink along these routes.

[Learn more about routing across an InterLink](/interlink/reference-content/overview/#routing-across-an-interlink).
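
Purely as an illustration of the concepts above (not the actual Scaleway API schema), a pair of routing policies could be pictured as follows; all field names and IP ranges are hypothetical.

```python
# Illustrative only: field names and ranges are hypothetical, not the real API schema.
ipv4_policy = {
    "name": "ipv4-routes",
    "outgoing_ranges": ["10.0.0.0/16"],     # routes announced from the Scaleway VPC
    "incoming_ranges": ["192.168.0.0/24"],  # route announcements accepted from the external infrastructure
}

ipv6_policy = {
    "name": "ipv6-routes",
    "outgoing_ranges": ["fd00:1::/64"],
    "incoming_ranges": ["fd00:2::/64"],
}

# An InterLink holds at most one policy per traffic type:
interlink_policies = [ipv4_policy, ipv6_policy]  # minimum one, maximum two
```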

## Self-hosted InterLink

6 changes: 4 additions & 2 deletions pages/load-balancer/concepts.mdx
@@ -46,7 +46,7 @@ Load Balancers support three different modes of balancing load (requests) betwee
* **Least connections**: Each request is assigned to the server with the fewest active connections. This method works best when it is expected that sessions will be long, e.g. LDAP, SQL, TSE. It is less well suited to protocols with typically short sessions like HTTP.
* **First available**: Each request is directed towards the first backend server with available connection slots. Once a server reaches its limit of maximum simultaneous connections, requests are directed to the next server. This method uses the smallest number of servers at any given time, which can be useful if, for example, you sometimes want to power off extra servers during off-peak hours.

For more information about balancing rules, refer to our [blog post "What is a load balancer"](https://www.scaleway.com/en/blog/what-is-a-load-balancer/)
For more information about balancing rules, refer to our [blog post "What is a load balancer"](https://www.scaleway.com/en/blog/what-is-a-load-balancer/), or our [backend configuration](/load-balancer/reference-content/configuring-backends/#balancing-method) documentation.
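
As a conceptual sketch (not Load Balancer code), the snippet below shows how the least connections and first available methods described above pick a backend server; the server names and connection limits are illustrative.

```python
from dataclasses import dataclass

@dataclass
class Server:
    name: str
    active_connections: int = 0
    max_connections: int = 100  # illustrative per-server connection slot limit

servers = [Server("srv1", 12), Server("srv2", 3), Server("srv3", 100)]

def next_least_connections():
    # Least connections: pick the server currently handling the fewest sessions.
    return min(servers, key=lambda s: s.active_connections)

def next_first_available():
    # First available: first server that still has free connection slots.
    return next(s for s in servers if s.active_connections < s.max_connections)

print(next_least_connections().name)  # srv2
print(next_first_available().name)    # srv1
```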

## Certificate

@@ -82,6 +82,8 @@ Edge Services is an additional feature for Scaleway Load Balancers and Object St

<EdgeServicesLbBenefits />

Read the [Edge Services Quickstart](/edge-services/quickstart/) to get started.

## First available

See [Balancing methods](#balancing-methods).
@@ -166,7 +168,7 @@ Routes allow you to specify, for a given frontend, which of its backends it shou

## Object Storage failover

See [customized error page](#customized-error-page)
See [customized error page](#customized-error-page).

## Server Name Identification (SNI)

2 changes: 2 additions & 0 deletions pages/managed-inference/concepts.mdx
@@ -38,6 +38,8 @@ It demonstrates the model's ability to generalize from limited training data to

Function calling allows a large language model (LLM) to interact with external tools or APIs, executing specific tasks based on user requests. The LLM identifies the appropriate function, extracts the required parameters, and returns the results as structured data, typically in JSON format.

Refer to [Support for function calling in Scaleway Managed Inference](/managed-inference/reference-content/function-calling-support/) for more information.

## Hallucinations

Hallucinations in LLMs refer to instances where generative AI models generate responses that, while grammatically coherent, contain inaccuracies or nonsensical information. These inaccuracies are termed "hallucinations" because the models create false or misleading content. Hallucinations can occur because of constraints in the training data, biases embedded within the models, or the complex nature of language itself.