diff --git a/pages/edge-services/concepts.mdx b/pages/edge-services/concepts.mdx index 3d0522c96f..03ccdcfc39 100644 --- a/pages/edge-services/concepts.mdx +++ b/pages/edge-services/concepts.mdx @@ -23,6 +23,8 @@ The SSL/TLS certificate for your subdomain to enable Edge Services to serve cont The CNAME record pointing your subdomain to the Edge Services endpoint, if you have customized your [Edge Services endpoint](#endpoint). This is necessary to ensure that traffic for your customized subdomain is correctly directed towards the Edge Services endpoint by DNS servers. +Refer to [CNAME records for Edge Services](/edge-services/reference-content/cname-record/) for more information. + ## Edge Services Edge Services is an additional feature for Scaleway Load Balancers and Object Storage buckets. It provides: @@ -30,6 +32,8 @@ Edge Services is an additional feature for Scaleway Load Balancers and Object St - A [Web Application Firewall](/edge-services/how-to/configure-waf/) to protect your origin from threats and malicious activity - A customizable and secure [endpoint](#endpoint) for accessing content via Edge Services, which can be set to a subdomain of your choice. +Read the [Edge Services Quickstart](/edge-services/quickstart/) to get started. + ## Endpoint The endpoint from which a given Edge Services pipeline can be accessed, e.g. `https://pipeline-id.svc.edge.scw.cloud`. When a client requests content from the Edge Services endpoint, it is served by Edge Services and its cache, rather than from the origin (Object Storage bucket or Load Balancer backend servers) directly. Edge Services automatically manages redirection from HTTP to HTTPS. 
@@ -38,7 +42,7 @@ The endpoint can be customized with a user-defined subdomain, allowing you to re ## Exclusions -In the context of an Edge Services [Web Application Firewall](#web-application-firewall), exclusions let you define filters for requests that should not be evaluated by WAF, but rather pass straight to the Load Balancer origin. Learn more about [creating exclusions](/edge-services/how-to/configure-waf/#how-to-set-exclusions) +In the context of an Edge Services [Web Application Firewall](#web-application-firewall), exclusions let you define filters for requests that should not be evaluated by WAF, but rather pass straight to the Load Balancer origin. Learn more about [creating exclusions](/edge-services/how-to/configure-waf/#how-to-set-exclusions). ## Origin diff --git a/pages/generative-apis/concepts.mdx b/pages/generative-apis/concepts.mdx index 0ad0970726..e15498ad37 100644 --- a/pages/generative-apis/concepts.mdx +++ b/pages/generative-apis/concepts.mdx @@ -10,6 +10,8 @@ dates: API rate limits define the maximum number of requests a user can make to the Generative APIs within a specific time frame. Rate limiting helps to manage resource allocation, prevent abuse, and ensure fair access for all users. Understanding and adhering to these limits is essential for maintaining optimal application performance using these APIs. +Refer to the [Rate limits](/generative-apis/reference-content/rate-limits/) documentation for more information. + ## Context window A context window is the maximum amount of prompt data considered by the model to generate a response. Using models with high context length, you can provide more information to generate relevant responses. The context is measured in tokens. @@ -18,14 +20,20 @@ A context window is the maximum amount of prompt data considered by the model to Function calling allows a large language model (LLM) to interact with external tools or APIs, executing specific tasks based on user requests. 
The LLM identifies the appropriate function, extracts the required parameters, and returns the results as structured data, typically in JSON format. +Refer to [How to use function calling](/generative-apis/how-to/use-function-calling/) for more information. + ## Embeddings Embeddings are numerical representations of text data that capture semantic information in a dense vector format. In Generative APIs, embeddings are essential for tasks such as similarity matching, clustering, and serving as inputs for downstream models. These vectors enable the model to understand and generate text based on the underlying meaning rather than just the surface-level words. +Refer to [How to query embedding models](/generative-apis/how-to/query-embedding-models/) for more information. + ## Error handling Error handling refers to the strategies and mechanisms in place to manage and respond to errors during API requests. This includes handling network issues, invalid inputs, or server-side errors. Proper error handling ensures that applications using Generative APIs can gracefully recover from failures and provide meaningful feedback to users. +Refer to [Understanding errors](/generative-apis/api-cli/understanding-errors/) for more information. + ## Parameters Parameters are settings that control the behavior and performance of generative models. These include temperature, max tokens, and top-p sampling, among others. Adjusting parameters allows users to tweak the model's output, balancing factors like creativity, accuracy, and response length to suit specific use cases. @@ -62,6 +70,8 @@ Structured outputs enable you to format the model's responses to suit specific u By customizing the structure, such as using lists, tables, or key-value pairs, you ensure that the data returned is in a form that is easy to extract and process. By specifying the expected response format through the API, you can make the model consistently deliver the output your system requires. 
+Refer to [How to use structured outputs](/generative-apis/how-to/use-structured-outputs/) for more information.
+
 ## Temperature
 
 Temperature is a parameter that controls the randomness of the model's output during text generation. A higher temperature produces more creative and diverse outputs, while a lower temperature makes the model's responses more deterministic and focused. Adjusting the temperature allows users to balance creativity with coherence in the generated text.
diff --git a/pages/interlink/concepts.mdx b/pages/interlink/concepts.mdx
index 3b057b6eab..74d1d94358 100644
--- a/pages/interlink/concepts.mdx
+++ b/pages/interlink/concepts.mdx
@@ -69,13 +69,17 @@ When creating an InterLink, you must specify a [region](/vpc/concepts/#region-an
 
 ## Route propagation
 
-Route propagation can be activated or deactivated at any given time on an InterLink. When activated, the Scaleway VPC and external infrastructure communicate via BGP sessions to dynamically exchange and update information about their routes. Route propagation must be activated to allow traffic to flow over the InterLink. When deactivated, all pre-learned/announced routes are removed from the VPC's route table, and traffic cannot flow. Note that even with route propagation activated, the default rule blocks all route announcements: you must attach a [routing policy](#routing-policy) to specify the route ranges to whitelist. [Learn more about routing across an InterLink](/interlink/reference-content/overview/#routing-across-an-interLink)
+Route propagation can be activated or deactivated at any given time on an InterLink. When activated, the Scaleway VPC and external infrastructure communicate via BGP sessions to dynamically exchange and update information about their routes. Route propagation must be activated to allow traffic to flow over the InterLink. When deactivated, all pre-learned/announced routes are removed from the VPC's route table, and traffic cannot flow. 
Note that even with route propagation activated, the default rule blocks all route announcements: you must attach a [routing policy](#routing-policy) to specify the route ranges to whitelist. + +[Learn more about routing across an InterLink](/interlink/reference-content/overview/#routing-across-an-interlink). ## Routing policy The default rule blocks any and all routes from being propagated over InterLink. Attaching a routing policy allows you to define the ranges of routes that should be whitelisted. When creating a routing policy, you specify one or many IP ranges representing the outgoing routes to announce from the Scaleway VPC, and one or many IP ranges representing the incoming route announcements to accept from the external infrastructure. -IPv4 and IPv6 routes must be whitelisted in separate routing policies. An InterLink must therefore have a **minimum of one** and a **maximum of two** attached routing policies, one for each IP traffic type to be routed (IPv4 and/or IPv6). When [route propagation](#route-propagation) is activated, the route ranges defined in the attached routing policies are whitelisted, and traffic can flow across the InterLink along these routes. [Learn more about routing across an InterLink](/interlink/reference-content/overview/#routing-across-an-interLink). +IPv4 and IPv6 routes must be whitelisted in separate routing policies. An InterLink must therefore have a **minimum of one** and a **maximum of two** attached routing policies, one for each IP traffic type to be routed (IPv4 and/or IPv6). When [route propagation](#route-propagation) is activated, the route ranges defined in the attached routing policies are whitelisted, and traffic can flow across the InterLink along these routes. + +[Learn more about routing across an InterLink](/interlink/reference-content/overview/#routing-across-an-interlink). 
## Self-hosted InterLink diff --git a/pages/load-balancer/concepts.mdx b/pages/load-balancer/concepts.mdx index 4db8c80e01..3cfb9f5c59 100644 --- a/pages/load-balancer/concepts.mdx +++ b/pages/load-balancer/concepts.mdx @@ -46,7 +46,7 @@ Load Balancers support three different modes of balancing load (requests) betwee * **Least connections**: Each request is assigned to the server with the fewest active connections. This method works best when it is expected that sessions will be long, e.g. LDAP, SQL TSE. It is less well suited to protocols with typically short sessions like HTTP. * **First available**: Each request is directed towards the first backend server with available connection slots. Once a server reaches its limit of maximum simultaneous connections, requests are directed to the next server. This method uses the smallest number of servers at any given time, which can be useful if, for example, you sometimes want to power off extra servers during off-peak hours. -For more information about balancing rules, refer to our [blog post "What is a load balancer"](https://www.scaleway.com/en/blog/what-is-a-load-balancer/) +For more information about balancing rules, refer to our [blog post "What is a load balancer"](https://www.scaleway.com/en/blog/what-is-a-load-balancer/), or our [backend configuration](/load-balancer/reference-content/configuring-backends/#balancing-method) documentation. ## Certificate @@ -82,6 +82,8 @@ Edge Services is an additional feature for Scaleway Load Balancers and Object St +Read the [Edge Services Quickstart](/edge-services/quickstart/) to get started. + ## First available See [balancing-methods](#balancing-methods). @@ -166,7 +168,7 @@ Routes allow you to specify, for a given frontend, which of its backends it shou ## Object Storage failover -See [customized error page](#customized-error-page) +See [customized error page](#customized-error-page). 
## Server Name Identification (SNI) diff --git a/pages/managed-inference/concepts.mdx b/pages/managed-inference/concepts.mdx index f22cca9fd0..16346f1f42 100644 --- a/pages/managed-inference/concepts.mdx +++ b/pages/managed-inference/concepts.mdx @@ -38,6 +38,8 @@ It demonstrates the model's ability to generalize from limited training data to Function calling allows a large language model (LLM) to interact with external tools or APIs, executing specific tasks based on user requests. The LLM identifies the appropriate function, extracts the required parameters, and returns the results as structured data, typically in JSON format. +Refer to [Support for function calling in Scaleway Managed Inference](/managed-inference/reference-content/function-calling-support/) for more information. + ## Hallucinations Hallucinations in LLMs refer to instances where generative AI models generate responses that, while grammatically coherent, contain inaccuracies or nonsensical information. These inaccuracies are termed "hallucinations" because the models create false or misleading content. Hallucinations can occur because of constraints in the training data, biases embedded within the models, or the complex nature of language itself.