
Commit 13c0412

fix(k8s): update docs with feedback from devops (#5563)

Authored by bene2k1, nerda-codes, jcirinosclwy, and nox-404

* fix(k8s): update docs with feedback from devops
* docs(k8s): update tutorial
* fix(k8s): remove vcpu
* feat(k8s): add mutualized control plane to table
* Update pages/kubernetes/reference-content/introduction-to-kubernetes.mdx
* Apply suggestions from code review
* docs(k8s): update pod
* docs(k8s): update wording
* fix(k8s): wording
* Apply suggestions from code review

Co-authored-by: Néda <[email protected]>
Co-authored-by: Jessica <[email protected]>
Co-authored-by: Nox <[email protected]>
1 parent d43828e commit 13c0412

File tree

49 files changed (+209, -203 lines)


changelog/february2025/2025-02-25-kubernetes-added-data-plane-logs-in-cockpit.mdx

Lines changed: 1 addition & 1 deletion
@@ -9,6 +9,6 @@ category: containers
 product: kubernetes
 ---

-**Centralized monitoring is now available**, allowing you to send Kubernetes container logs to Cockpit for streamlined monitoring. Setup is easy with **one-click deployment** via Easy Deploy using Promtail. This feature captures **all container logs**, including pod stdout/stderr and systemd journal. Additionally, you can control ingestion costs with **customizable filtering options**.
+**Centralized monitoring is now available**, allowing you to send Kubernetes container logs to Cockpit for streamlined monitoring. Setup is easy with **one-click deployment** via Easy Deploy using Promtail. This feature captures **all container logs**, including Pod stdout/stderr and systemd journal. Additionally, you can control ingestion costs with **customizable filtering options**.

 Learn more in our dedicated documentation: [Monitor Data Plane with Cockpit](https://www.scaleway.com/en/docs/kubernetes/how-to/monitor-data-plane-with-cockpit/)
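A note on the "customizable filtering options" this changelog entry mentions: in Promtail, such filters are expressed as pipeline stages. The sketch below is illustrative only and not part of the commit; the `drop` stage is standard Promtail configuration, while the regex is a hypothetical example of discarding debug-level noise before it is ingested.

```yaml
# Minimal Promtail scrape config sketch: ship Pod logs, but drop lines
# matching a regex before ingestion to control costs.
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod          # discover Pods via the Kubernetes API
    pipeline_stages:
      - drop:
          expression: ".*level=debug.*"   # hypothetical filter: drop debug lines
```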

pages/cockpit/how-to/configure-alerts-for-scw-resources.mdx

Lines changed: 4 additions & 4 deletions
@@ -54,7 +54,7 @@ Data source managed alert rules allow you to configure alerts managed by the dat

 ## Define your metric and alert conditions

-Switch between the tabs below to create alerts for a Scaleway Instance, an Object Storage bucket, a Kubernetes cluster pod, or Cockpit logs.
+Switch between the tabs below to create alerts for a Scaleway Instance, an Object Storage bucket, a Kubernetes cluster Pod, or Cockpit logs.

 <Tabs id="install">
 <TabsTab label="Scaleway Instance">
@@ -105,15 +105,15 @@ Switch between the tabs below to create alerts for a Scaleway Instance, an Objec
 6. Click **Save rule and exit** in the top right corner of your screen to save and activate your alert.
 7. Optionally, check that your configuration works by temporarily lowering the threshold. This will trigger the alert and notify your [contacts](/cockpit/concepts/#contact-points).
 </TabsTab>
-<TabsTab label="Kubernetes pod">
-The steps below explain how to create the metric selection and configure an alert condition that triggers when **no new pod activity occurs, which could mean your cluster is stuck or unresponsive.**
+<TabsTab label="Kubernetes Pod">
+The steps below explain how to create the metric selection and configure an alert condition that triggers when **no new Pod activity occurs, which could mean your cluster is stuck or unresponsive.**

 1. In the query field next to the **Loading metrics... >** button, paste the following query. Make sure that the values for the labels you have selected (for example, `resource_name`) correspond to those of the target resource.
    ```bash
    rate(kubernetes_cluster_k8s_shoot_nodes_pods_usage_total{resource_name="k8s-par-quizzical-chatelet"}[15m]) == 0
    ```
    <Message type="tip">
-   The `kubernetes_cluster_k8s_shoot_nodes_pods_usage_total` metric represents the total number of pods currently running across all nodes in your Kubernetes cluster. It is helpful to monitor current pod consumption per node pool or cluster, and help track resource saturation or unexpected workload spikes.
+   The `kubernetes_cluster_k8s_shoot_nodes_pods_usage_total` metric represents the total number of Pods currently running across all nodes in your Kubernetes cluster. It is helpful to monitor current Pod consumption per node pool or cluster, and help track resource saturation or unexpected workload spikes.
    </Message>
 2. In the **Set alert evaluation behavior** field, specify how long the condition must be true before triggering the alert.
 3. Enter a name in the **Namespace** and **Group** fields to categorize and manage your alert rules. Rules that share the same group will use the same configuration, including the evaluation interval which determines how often the rule is evaluated (by default: every 1 minute). You can modify this interval later in the group settings.
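For readers who manage alerts as code rather than through the Grafana UI flow shown in this hunk, the same condition can be written as a Prometheus-style rule file. This sketch is not part of the commit: the group and rule names are hypothetical, and the `resource_name` value is the example cluster from the query above.

```yaml
# Hypothetical data-source-managed rule expressing the hunk's alert:
# fire when no new Pod activity has been observed for 15 minutes.
groups:
  - name: kubernetes-pod-activity   # hypothetical group name
    interval: 1m                    # matches the default evaluation interval noted above
    rules:
      - alert: NoNewPodActivity     # hypothetical rule name
        expr: rate(kubernetes_cluster_k8s_shoot_nodes_pods_usage_total{resource_name="k8s-par-quizzical-chatelet"}[15m]) == 0
        for: 15m                    # how long the condition must hold before firing
        labels:
          severity: warning
        annotations:
          summary: "No new Pod activity: the cluster may be stuck or unresponsive."
```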

pages/cockpit/how-to/send-logs-from-k8s-to-cockpit.mdx

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 ---
 title: How to send logs from your Kubernetes cluster to your Cockpit
-description: Learn how to send your pod logs to your Cockpit using Scaleway's comprehensive guide. This tutorial covers sending Kubernetes pods logs to Scaleway's Cockpit for centralized monitoring and analysis using Grafana, ensuring efficient monitoring and log analysis in your infrastructure.
+description: Learn how to send your Pod logs to your Cockpit using Scaleway's comprehensive guide. This tutorial covers sending Kubernetes Pods logs to Scaleway's Cockpit for centralized monitoring and analysis using Grafana, ensuring efficient monitoring and log analysis in your infrastructure.
 tags: kubernetes cockpit logs observability monitoring cluster
 dates:
   validation: 2025-08-20

@@ -93,7 +93,7 @@ Once you have configured your `values.yml` file, you can use Helm to deploy the
 <Message type="iam">
 The `-f` flag specifies the path to your `values.yml` file, which contains the configuration for the Helm chart. <br /><br />
 Helm installs the `k8s-monitoring` chart, which includes the Alloy DaemonSet configured to collect logs from your Kubernetes cluster. <br /><br />
-The DaemonSet ensures that a pod is running on each node in your cluster, which collects logs and forwards them to the specified Loki endpoint in your Cockpit.
+The DaemonSet ensures that a Pod is running on each node in your cluster, which collects logs and forwards them to the specified Loki endpoint in your Cockpit.
 </Message>
 3. Optionally, run the following command to check the status of the release and ensure it was installed:
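For context, the install-and-verify loop described by this file's hunks reduces to a few standard Helm commands. A minimal sketch, assuming the `k8s-monitoring` chart is pulled from Grafana's public chart repository and that `values.yml` already holds your Cockpit Loki endpoint and credentials; the `monitoring` namespace is illustrative.

```bash
# Add the Grafana chart repository (assumption: the chart source used here).
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

# Install or upgrade the release; -f points at the values.yml from the guide.
helm upgrade --install k8s-monitoring grafana/k8s-monitoring \
  -f values.yml \
  --namespace monitoring --create-namespace

# Optional checks: release status, and one Alloy Pod per node via the DaemonSet.
helm status k8s-monitoring --namespace monitoring
kubectl get daemonsets --namespace monitoring
```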

pages/cockpit/how-to/send-metrics-from-k8s-to-cockpit.mdx

Lines changed: 3 additions & 3 deletions
@@ -1,6 +1,6 @@
 ---
 title: How to send metrics from your Kubernetes cluster to your Cockpit
-description: Learn how to send your pod metrics to your Cockpit using Scaleway's comprehensive guide. This tutorial covers sending Kubernetes pods metrics to Scaleway's Cockpit for centralized monitoring and analysis using Grafana, ensuring efficient monitoring and metrics analysis in your infrastructure.
+description: Learn how to send your Pod metrics to your Cockpit using Scaleway's comprehensive guide. This tutorial covers sending Kubernetes Pods metrics to Scaleway's Cockpit for centralized monitoring and analysis using Grafana, ensuring efficient monitoring and metrics analysis in your infrastructure.
 tags: kubernetes cockpit metrics observability monitoring cluster
 dates:
   validation: 2025-08-20

@@ -70,7 +70,7 @@ alloy-singleton:

 ## Add annotations for auto-discovery

-Annotations in Kubernetes provide a way to attach metadata to your resources. For `k8s-monitoring`, these annotations signal which pods should be scraped for metrics, and what port to use. In this documentation we are adding annotations to specify we want `k8s-monitoring` to scrape the pods from our deployment. Make sure that you replace `$METRICS_PORT` with the port where your application exposes Prometheus metrics.
+Annotations in Kubernetes provide a way to attach metadata to your resources. For `k8s-monitoring`, these annotations signal which Pods should be scraped for metrics, and what port to use. In this documentation we are adding annotations to specify we want `k8s-monitoring` to scrape the Pods from our deployment. Make sure that you replace `$METRICS_PORT` with the port where your application exposes Prometheus metrics.

 ### Kubernetes deployment template

@@ -153,7 +153,7 @@ Once you have configured your `values.yml` file, you can use Helm to deploy the
 <Message type="iam">
 The `-f` flag specifies the path to your `values.yml` file, which contains the configuration for the Helm chart. <br /><br />
 Helm installs the `k8s-monitoring` chart, which includes the Alloy DaemonSet configured to collect metrics from your Kubernetes cluster. <br /><br />
-The DaemonSet ensures that a pod is running on each node in your cluster, which collects metrics and forwards them to the specified Prometheus endpoint in your Cockpit.
+The DaemonSet ensures that a Pod is running on each node in your cluster, which collects metrics and forwards them to the specified Prometheus endpoint in your Cockpit.
 </Message>
 3. Optionally, check the status of the release to ensure it was installed:
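To make the auto-discovery hunk concrete: scrape annotations belong under the Pod template's `metadata`, not the Deployment's own `metadata`, so every Pod the Deployment creates carries them. A minimal sketch; the `prometheus.io/*` keys are the common convention and an assumption here (the chart's exact keys are configurable), and the app name, image, and port are hypothetical.

```yaml
# Hypothetical Deployment showing where scrape annotations go.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        prometheus.io/scrape: "true"  # assumption: conventional key; chart-dependent
        prometheus.io/port: "8080"    # stands in for $METRICS_PORT from the guide
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:latest  # hypothetical image
          ports:
            - containerPort: 8080   # the port exposing Prometheus metrics
```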

pages/data-lab/concepts.mdx

Lines changed: 2 additions & 2 deletions
@@ -8,7 +8,7 @@ dates:

 ## Apache Spark cluster

-An Apache Spark cluster is an orchestrated set of machines over which distributed/Big data calculus is processed. In the case of Scaleway Data Lab, the Apache Spark cluster is a Kubernetes cluster, with Apache Spark installed in each pod. For more details, check out the [Apache Spark documentation](https://spark.apache.org/documentation.html).
+An Apache Spark cluster is an orchestrated set of machines over which distributed/Big data calculus is processed. In the case of Scaleway Data Lab, the Apache Spark cluster is a Kubernetes cluster, with Apache Spark installed in each Pod. For more details, check out the [Apache Spark documentation](https://spark.apache.org/documentation.html).

 ## Data Lab

@@ -40,7 +40,7 @@ A notebook for an Apache Spark cluster is an interactive, web-based tool that al

 ## Persistent volume

-A Persistent Volume (PV) is a cluster-wide storage resource that ensures data persistence beyond the lifecycle of individual pods. Persistent volumes abstract the underlying storage details, allowing administrators to use various storage solutions.
+A Persistent Volume (PV) is a cluster-wide storage resource that ensures data persistence beyond the lifecycle of individual Pods. Persistent volumes abstract the underlying storage details, allowing administrators to use various storage solutions.

 Apache Spark® executors require storage space for various operations, particularly to shuffle data during wide operations such as sorting, grouping, and aggregation. Wide operations are transformations that require data from different partitions to be combined, often resulting in data movement across the cluster. During the map phase, executors write data to shuffle storage, which is then read by reducers.
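Since the concepts page defines persistent volumes only in prose, a small sketch may help: in practice a workload claims PV-backed storage through a PersistentVolumeClaim, and the claim outlives any individual Pod that mounts it. All names and sizes below are hypothetical.

```yaml
# Hypothetical claim for 10 GiB of storage, e.g. for Spark shuffle data.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: spark-shuffle-data    # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce           # mounted read-write by a single node at a time
  resources:
    requests:
      storage: 10Gi
  # storageClassName is omitted on purpose: the cluster's default class
  # provisions the PV, which is how PVs abstract the underlying storage.
```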

pages/gpu/how-to/use-mig-with-kubernetes.mdx

Lines changed: 6 additions & 6 deletions
@@ -32,7 +32,7 @@ In this guide, we will explore the capabilities of NVIDIA MIG within a Kubernete

 ## Configure MIG partitions inside a Kubernetes cluster

-1. Find the name of the pods running the Nvidia Driver:
+1. Find the name of the Pods running the Nvidia Driver:
    ```
    % kubectl get pods -n kube-system
    NAME                                  READY   STATUS    RESTARTS   AGE

@@ -163,7 +163,7 @@ In this guide, we will explore the capabilities of NVIDIA MIG within a Kubernete

 ## Deploy containers that use NVIDIA MIG technology partitions

-1. Write a deployment file to deploy 8 pods executing NVIDIA SMI.
+1. Write a deployment file to deploy 8 Pods executing NVIDIA SMI.
    Open a text editor of your choice and create a deployment file `deploy-mig.yaml`, then paste the following content into the file, save it, and exit the editor:
    ```yaml
    apiVersion: v1

@@ -321,7 +321,7 @@ In this guide, we will explore the capabilities of NVIDIA MIG within a Kubernete
    nvidia.com/gpu.product : NVIDIA-H100-PCIe-MIG-1g.10gb
    ```

-2. Deploy the pods:
+2. Deploy the Pods:
    ```
    % kubectl create -f deploy-mig.yaml
    pod/test-1 created

@@ -334,7 +334,7 @@ In this guide, we will explore the capabilities of NVIDIA MIG within a Kubernete
    pod/test-8 created
    ```

-3. Display the logs of the pods. The pods print their UUID with the `nvidia-smi` command:
+3. Display the logs of the Pods. The Pods print their UUID with the `nvidia-smi` command:
    ```
    % kubectl get -f deploy-mig.yaml -o name | xargs -I{} kubectl logs {}
    GPU 0: NVIDIA H100 PCIe (UUID: GPU-717ef73c-2d43-4fdc-76d2-1cddef4863bb)

@@ -354,7 +354,7 @@ In this guide, we will explore the capabilities of NVIDIA MIG within a Kubernete
    GPU 0: NVIDIA H100 PCIe (UUID: GPU-717ef73c-2d43-4fdc-76d2-1cddef4863bb)
      MIG 1g.10gb Device 0: (UUID: MIG-fdfd2afa-5cbd-5d1d-b1ae-6f0e13cc0ff8)
    ```
-   As you can see, seven pods have been executed on different MIG partitions, while the eighth pod had to wait for one of the seven MIG partitions to become available to be executed.
+   As you can see, seven Pods have been executed on different MIG partitions, while the eighth Pod had to wait for one of the seven MIG partitions to become available to be executed.

 4. Clean the deployment:
    ```

@@ -377,7 +377,7 @@ In this guide, we will explore the capabilities of NVIDIA MIG within a Kubernete
    node/scw-k8s-jovial-dubinsky-pool-h100-93a072191d38 labeled
    ```

-2. Check the status of NVIDIA SMI in the driver pod:
+2. Check the status of NVIDIA SMI in the driver Pod:
    ```
    % kubectl exec nvidia-driver-daemonset-8t89m -t -n kube-system -- nvidia-smi -L
    GPU 0: NVIDIA H100 PCIe (UUID: GPU-717ef73c-2d43-4fdc-76d2-1cddef4863bb)
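As background to the deployment hunks above (the full `deploy-mig.yaml` is elided from this diff), a single Pod requesting one MIG slice can be sketched as follows. This assumes the device plugin is configured with the mixed MIG strategy, which exposes resources like `nvidia.com/mig-1g.10gb`; under the single strategy the request would be `nvidia.com/gpu` instead.

```yaml
# Hypothetical Pod requesting one 1g.10gb MIG partition of an H100.
apiVersion: v1
kind: Pod
metadata:
  name: mig-test              # hypothetical name
spec:
  restartPolicy: Never
  containers:
    - name: nvidia-smi
      image: nvidia/cuda:12.2.0-base-ubuntu22.04  # assumption: any CUDA base image
      command: ["nvidia-smi", "-L"]   # list visible devices, as in the logs above
      resources:
        limits:
          nvidia.com/mig-1g.10gb: 1   # assumption: mixed-strategy resource name
```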

pages/gpu/reference-content/choosing-gpu-instance-type.mdx

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ Below, you will find a guide to help you make an informed decision:
 * Up to 2 PCIe GPU with [H100 Instances](https://www.scaleway.com/en/h100-pcie-try-it-now/) or 8 PCIe GPU with [L4](https://www.scaleway.com/en/l4-gpu-instance/) or [L40S](https://www.scaleway.com/en/contact-l40s/) Instances.
 * Or better, an HGX-based server setup with up to 8x NVLink GPUs with [H100-SXM Instances](/gpu/reference-content/choosing-gpu-instance-type/)
 * A [supercomputer architecture](https://www.scaleway.com/en/ai-supercomputers/) for a larger setup for workload-intensive tasks
-* Another way to scale your workload is to use [Kubernetes and MIG](/gpu/how-to/use-nvidia-mig-technology/): You can divide a single H100 or H100-SXM GPU into as many as 7 MIG partitions. This means that instead of employing seven P100 GPUs to set up seven K8S pods, you could opt for a single H100 GPU with MIG to effectively deploy all seven K8S pods.
+* Another way to scale your workload is to use [Kubernetes and MIG](/gpu/how-to/use-nvidia-mig-technology/): You can divide a single H100 or H100-SXM GPU into as many as 7 MIG partitions. This means that instead of employing seven P100 GPUs to set up seven K8S Pods, you could opt for a single H100 GPU with MIG to effectively deploy all seven K8S Pods.
 * **Online resources:** Check for online resources, forums, and community discussions related to the specific GPU type you are considering. This can provide insights into common issues, best practices, and optimizations.

 Remember that there is no one-size-fits-all answer, and the right GPU Instance type will depend on your workload’s unique requirements and budget. It is important that you regularly reassess your choice as your workload evolves. Depending on which type best fits your evolving tasks, you can easily migrate from one GPU Instance type to another.
