diff --git a/menu/navigation.json b/menu/navigation.json index 60cb6051ad..5eb4edb6d1 100644 --- a/menu/navigation.json +++ b/menu/navigation.json @@ -3011,7 +3011,7 @@ "slug": "troubleshooting" } ], - "label": "Distributed Data Lab", + "label": " Data Lab for Apache Spark™", "slug": "data-lab" } ], diff --git a/pages/audit-trail/reference-content/resource-integration-with-adt.mdx b/pages/audit-trail/reference-content/resource-integration-with-adt.mdx index 45a35c8eec..9fb9eb2025 100644 --- a/pages/audit-trail/reference-content/resource-integration-with-adt.mdx +++ b/pages/audit-trail/reference-content/resource-integration-with-adt.mdx @@ -39,7 +39,7 @@ The following table provides details about the Scaleway products that will be in | Block Storage | **Not integrated yet** | | Cockpit | **Not integrated yet** | | Container Registry | **Not integrated yet** | -| Distributed Data Lab | **Not integrated yet** | +| Data Lab for Apache Spark™ | **Not integrated yet** | | Domains and DNS | **Not integrated yet** | | Edge Services | **Not integrated yet** | | Elastic Metal | **Not integrated yet** | diff --git a/pages/data-lab/concepts.mdx b/pages/data-lab/concepts.mdx index 5989845d95..5fd2b68d79 100644 --- a/pages/data-lab/concepts.mdx +++ b/pages/data-lab/concepts.mdx @@ -1,11 +1,11 @@ --- meta: - title: Distributed Data Lab - Concepts - description: Learn the fundamental concepts of Scaleway Distributed Data Lab. + title: Data Lab for Apache Spark™ - Concepts + description: Learn the fundamental concepts of Scaleway Data Lab for Apache Spark™. content: - h1: Distributed Data Lab - Concepts - paragraph: Learn the fundamental concepts of Scaleway Distributed Data Lab. -tags: distributed data lab apache spark notebook jupyter processing + h1: Data Lab for Apache Spark™ - Concepts + paragraph: Learn the fundamental concepts of Scaleway Data Lab for Apache Spark™. +tags: data lab for apache spark notebook jupyter processing dates: validation: 2025-02-24 categories: @@ -16,9 +16,9 @@ categories: A Data Lab is a project setup that combines a Notebook and an Apache Spark Cluster for data analysis and experimentation. it comes with the required infrastructure and tools to allow data scientists, analysts, and researchers to explore data, create models, and gain insights. -## Distributed Data Lab +## Data Lab for Apache Spark™ -A Distributed Data Lab is a data lab that is distributed across multiple worker nodes to accelerate the processing of large datasets to save time and gain access to actionable insights faster. +A Data Lab for Apache Spark™ is a data lab that is distributed across multiple worker nodes to accelerate the processing of large datasets to save time and gain access to actionable insights faster. ## Fixture diff --git a/pages/data-lab/faq.mdx b/pages/data-lab/faq.mdx index 509b66bb48..17a797187e 100644 --- a/pages/data-lab/faq.mdx +++ b/pages/data-lab/faq.mdx @@ -1,9 +1,9 @@ --- meta: - title: Distributed Data Lab FAQ - description: Discover Scaleway Distributed Data Lab powered by Apache Spark, and how to use it. + title: Data Lab for Apache Spark™ FAQ + description: Discover Scaleway Data Lab for Apache Spark™ powered by Apache Spark, and how to use it. content: - h1: Distributed Data Lab FAQ + h1: Data Lab for Apache Spark™ FAQ dates: validation: 2025-02-18 category: managed-services @@ -12,9 +12,9 @@ productIcon: DistributedDataLabProductIcon ## General -### What workloads is Distributed Data Lab suited for? +### What workloads is Data Lab for Apache Spark™ suited for? -Distributed Data Lab supports a range of workloads, including: +Data Lab for Apache Spark™ supports a range of workloads, including: - Complex analytics. - Machine learning tasks. @@ -30,21 +30,21 @@ Apache Spark is an open-source unified analytics engine designed for large-scale Apache Spark processes data in memory, which allows it to perform tasks up to 100 times faster than traditional disk-based processing frameworks like [Hadoop MapReduce](https://fr.wikipedia.org/wiki/MapReduce). It uses Resilient Distributed Datasets (RDDs) to store data across multiple nodes in a cluster and perform parallel operations on this data. -### How am I billed for Distributed Data Lab? +### How am I billed for Data Lab for Apache Spark™? -Distributed Data Lab is billed based on two factors: +Data Lab for Apache Spark™ is billed based on two factors: - the main node configuration selected - the worker node configuration selected, and the number of worker nodes in the cluster ## Clusters -### Can I upscale or downscale a Distributed Data Lab? +### Can I upscale or downscale a Data Lab for Apache Spark™? Yes, you can upscale a Data Lab cluster to distribute your workloads across more worker nodes for faster processing. You can also scale it down to zero to reduce costs, while retaining your configuration and context. You can still access the notebook of a Data Lab cluster with zero worker nodes, but you cannot perform any calculations. You can resume the activity of your cluster by provisioning at least one worker node. -### Can I run a Distributed Data Lab using GPUs? +### Can I run a Data Lab for Apache Spark™ using GPUs? Yes, you can run your cluster on either CPUs or GPUs. Scaleway leverages Nvidia's [RAPIDS Accelerator For Apache Spark](https://www.nvidia.com/en-gb/deep-learning-ai/software/rapids/), an open-source suite of software libraries and APIs to execute end-to-end data science and analytics pipelines entirely on GPUs. This technology allows for significant acceleration of data processing tasks compared to CPU-based processing. diff --git a/pages/data-lab/how-to/connect-to-data-lab.mdx b/pages/data-lab/how-to/connect-to-data-lab.mdx index 14fc428391..85628f7a03 100644 --- a/pages/data-lab/how-to/connect-to-data-lab.mdx +++ b/pages/data-lab/how-to/connect-to-data-lab.mdx @@ -1,11 +1,11 @@ --- meta: - title: How to connect to a Distributed Data Lab - description: Step-by-step guide to connecting to a Distributed Data Lab with the Scaleway console. + title: How to connect to a Data Lab for Apache Spark™ + description: Step-by-step guide to connecting to a Data Lab for Apache Spark™ with the Scaleway console. content: - h1: How to connect to a Distributed Data Lab - paragraph: Step-by-step guide to connecting to a Distributed Data Lab with the Scaleway console. -tags: distributed data lab apache spark create process + h1: How to connect to a Data Lab for Apache Spark™ + paragraph: Step-by-step guide to connecting to a Data Lab for Apache Spark™ with the Scaleway console. +tags: data lab for apache spark create process dates: validation: 2025-02-24 posted: 2024-07-31 @@ -18,11 +18,10 @@ categories: - A Scaleway account logged into the [console](https://console.scaleway.com) - [Owner](/iam/concepts/#owner) status or [IAM permissions](/iam/concepts/#permission) allowing you to perform actions in the intended Organization -- [Signed up to the private beta](https://www.scaleway.com/fr/betas/#distributed-data-lab) and received a confirmation email. -- Created a [Distributed Data Lab cluster](/data-lab/how-to/create-data-lab/) +- Created a [Data Lab for Apache Spark™ cluster](/data-lab/how-to/create-data-lab/) - A valid [API key](/iam/how-to/create-api-keys/) -1. Click **Data Lab** under **Managed Services** on the side menu. The Distributed Data Lab page displays. +1. Click **Data Lab** under **Managed Services** on the side menu. The Data Lab for Apache Spark™ page displays. 2. Click the name of the Data Lab cluster you want to connect to. The cluster **Overview** page displays. diff --git a/pages/data-lab/how-to/create-data-lab.mdx b/pages/data-lab/how-to/create-data-lab.mdx index ae00289c23..e30dfb3c67 100644 --- a/pages/data-lab/how-to/create-data-lab.mdx +++ b/pages/data-lab/how-to/create-data-lab.mdx @@ -1,11 +1,11 @@ --- meta: - title: How to create a Distributed Data Lab - description: Step-by-step guide to creating a Distributed Data Lab on Scaleway. + title: How to create a Data Lab for Apache Spark™ + description: Step-by-step guide to creating a Data Lab for Apache Spark™ on Scaleway. content: - h1: How to create a Distributed Data Lab - paragraph: Step-by-step guide to creating a Distributed Data Lab on Scaleway. -tags: distributed data lab apache spark create process + h1: How to create a Data Lab for Apache Spark™ + paragraph: Step-by-step guide to creating a Data Lab for Apache Spark™ on Scaleway. +tags: data lab apache spark create process dates: validation: 2025-02-24 posted: 2024-07-31 @@ -14,17 +14,16 @@ categories: - data-lab --- -Distributed Data Lab is a product designed to assist data scientists and data engineers in performing calculations on a remotely managed Apache Spark infrastructure. +Data Lab for Apache Spark™ is a product designed to assist data scientists and data engineers in performing calculations on a remotely managed Apache Spark infrastructure. - A Scaleway account logged into the [console](https://console.scaleway.com) - [Owner](/iam/concepts/#owner) status or [IAM permissions](/iam/concepts/#permission) allowing you to perform actions in the intended Organization -- [Signed up to the private beta](https://www.scaleway.com/fr/betas/#distributed-data-lab) and received a confirmation email. - Optionally, an [Object Storage bucket](/object-storage/how-to/create-a-bucket/) - A valid [API key](/iam/how-to/create-api-keys/) -1. Click **Data Lab** under **Managed Services** on the side menu. The Distributed Data Lab page displays. +1. Click **Data Lab** under **Managed Services** on the side menu. The Data Lab for Apache Spark™ page displays. 2. Click **Create Data Lab cluster**. The creation wizard displays. diff --git a/pages/data-lab/how-to/index.mdx b/pages/data-lab/how-to/index.mdx index 27e744a75c..df8f9d3f6a 100644 --- a/pages/data-lab/how-to/index.mdx +++ b/pages/data-lab/how-to/index.mdx @@ -1,8 +1,8 @@ --- meta: - title: Distributed Data Lab - How Tos - description: Practical guides for using Scaleway Distributed Data Lab. + title: Data Lab for Apache Spark™ - How Tos + description: Practical guides for using Scaleway Data Lab for Apache Spark™. content: - h1: Distributed Data Lab - How Tos - paragraph: Practical guides for using Scaleway Distributed Data Lab. + h1: Data Lab for Apache Spark™ - How Tos + paragraph: Practical guides for using Scaleway Data Lab for Apache Spark™. --- \ No newline at end of file diff --git a/pages/data-lab/how-to/manage-delete-data-lab.mdx b/pages/data-lab/how-to/manage-delete-data-lab.mdx index f6990354d6..9eea46fc1c 100644 --- a/pages/data-lab/how-to/manage-delete-data-lab.mdx +++ b/pages/data-lab/how-to/manage-delete-data-lab.mdx @@ -1,11 +1,11 @@ --- meta: - title: How to manage and delete a Distributed Data Lab - description: Step-by-step guide to managing and deleting a Distributed Data Lab with the Scaleway console. + title: How to manage and delete a Data Lab for Apache Spark™ + description: Step-by-step guide to managing and deleting a Data Lab for Apache Spark™ with the Scaleway console. content: - h1: How to manage and delete a Distributed Data Lab - paragraph: Step-by-step guide to managing and deleting a Distributed Data Lab with the Scaleway console. -tags: distributed data lab apache spark delete remove suppress + h1: How to manage and delete a Data Lab for Apache Spark™ + paragraph: Step-by-step guide to managing and deleting a Data Lab for Apache Spark™ with the Scaleway console. +tags: data lab apache spark delete remove suppress dates: validation: 2025-02-24 posted: 2024-07-31 @@ -14,18 +14,17 @@ categories: - data-lab --- -This page explains how to manage and delete your Distributed Data Lab. +This page explains how to manage and delete your Data Lab for Apache Spark™. - A Scaleway account logged into the [console](https://console.scaleway.com) - [Owner](/iam/concepts/#owner) status or [IAM permissions](/iam/concepts/#permission) allowing you to perform actions in the intended Organization -- [Signed up to the private beta](https://www.scaleway.com/fr/betas/#distributed-data-lab) and received a confirmation email. -- Created a [Distributed Data Lab cluster](/data-lab/how-to/create-data-lab/) +- Created a [Data Lab for Apache Spark™ cluster](/data-lab/how-to/create-data-lab/) -## How to manage a Distributed Data Lab +## How to manage a Data Lab for Apache Spark™ -1. Click **Data Lab** under **Managed Services** on the side menu. The Distributed Data Lab page displays. +1. Click **Data Lab** under **Managed Services** on the side menu. The Data Lab for Apache Spark™ page displays. 2. Click the name of the Data Lab cluster you want to manage. The overview tab of the cluster displays. From this view, you can see the configuration of your cluster. @@ -38,19 +37,19 @@ This page explains how to manage and delete your Distributed Data Lab. Provisioning zero worker nodes lets you retain and access you cluster and notebook configurations, but will not allow you to run calculations. - - [Delete your Data Lab cluster](/data-lab/how-to/manage-delete-data-lab/#how-to-delete-a-distributed-data-lab). + - [Delete your Data Lab cluster](/data-lab/how-to/manage-delete-data-lab/#how-to-delete-a-data-lab-for-apache-sparktm). -Once you have created a Data Lab cluster, you cannot edit certain parameters, such as the node type, or its computing resources. You must [create a new Data Lab cluster](/data-lab/how-to/create-data-lab/) instead. +Once you have created a Data Lab cluster, you cannot edit the node type. You must [create a new Data Lab cluster](/data-lab/how-to/create-data-lab/) instead. -## How to delete a Distributed Data Lab +## How to delete a Data Lab for Apache Spark™ - This action is irreversible and will permanently delete this Data Lab cluster and all its associated data. + This action is irreversible and will permanently delete this Data Lab cluster and its configuration. The data source will not be deleted. -1. Click **Data Lab** under **Managed Services** on the side menu. The Distributed Data Lab page displays. +1. Click **Data Lab** under **Managed Services** on the side menu. The Data Lab for Apache Spark™ page displays. 2. Click the name of the Data Lab cluster you want to delete. The **Overview** tab of the cluster displays. diff --git a/pages/data-lab/index.mdx b/pages/data-lab/index.mdx index d8b7925686..54b540dfd1 100644 --- a/pages/data-lab/index.mdx +++ b/pages/data-lab/index.mdx @@ -1,15 +1,15 @@ --- meta: - title: Distributed Data Lab Documentation - description: Dive into Scaleway Distributed Data Lab with our quickstart guides, how-tos, tutorials and more. + title: Data Lab for Apache Spark™ Documentation + description: Dive into Scaleway Data Lab for Apache Spark™ with our quickstart guides, how-tos, tutorials and more. --- ## Getting Started @@ -18,21 +18,21 @@ meta: diff --git a/pages/data-lab/quickstart.mdx b/pages/data-lab/quickstart.mdx index 0079aca014..a06166aec6 100644 --- a/pages/data-lab/quickstart.mdx +++ b/pages/data-lab/quickstart.mdx @@ -1,11 +1,11 @@ --- meta: - title: Distributed Data Lab - Quickstart - description: Get started with Scaleway Distributed Data Lab quickly and efficiently. + title: Data Lab for Apache Spark™ - Quickstart + description: Get started with Scaleway Data Lab for Apache Spark™ quickly and efficiently. content: - h1: Distributed Data Lab - Quickstart - paragraph: Get started with Scaleway Distributed Data Lab quickly and efficiently. -tags: distributed data lab apache spark notebook jupyter processing + h1: Data Lab for Apache Spark™ - Quickstart + paragraph: Get started with Scaleway Data Lab for Apache Spark™ quickly and efficiently. +tags: data lab apache spark notebook jupyter processing dates: validation: 2025-02-24 posted: 2024-07-10 @@ -14,7 +14,7 @@ categories: - data-lab --- -Distributed Data Lab is a product designed to assist data scientists and data engineers in performing calculations on a remotely managed Apache Spark infrastructure. +Data Lab for Apache Spark™ is a product designed to assist data scientists and data engineers in performing calculations on a remotely managed Apache Spark infrastructure. It is composed of the following: @@ -30,10 +30,9 @@ The notebook, although capable of performing some local computations, primarily - A Scaleway account logged into the [console](https://console.scaleway.com) - [Owner](/iam/concepts/#owner) status or [IAM permissions](/iam/concepts/#permission) allowing you to perform actions in the intended Organization -- [Signed up to the private beta](https://www.scaleway.com/fr/betas/#distributed-data-lab) and received a confirmation email. - Optionally, an [Object Storage bucket](/object-storage/how-to/create-a-bucket/) -## How to create a Distributed Data Lab cluster +## How to create a Data Lab for Apache Spark™ cluster 1. Click **Data Lab** under **Managed Services** on the side menu. @@ -54,7 +53,7 @@ The notebook, although capable of performing some local computations, primarily ## How to connect to your Data Lab -1. Click **Data Lab** under **Managed Services** on the side menu. The Distributed Data Lab page displays. +1. Click **Data Lab** under **Managed Services** on the side menu. The Data Lab for Apache Spark™ page displays. 2. Click the name of the Data Lab cluster you want to connect to. The cluster **Overview** page displays. @@ -64,7 +63,7 @@ The notebook, although capable of performing some local computations, primarily ## How to run the demo file -Each Distributed Data Lab comes with a default `DatalabDemo.ipynb` demonstration file for testing purposes. This file contains a preconfigured notebook environment that requires no modification to run. +Each Data Lab for Apache Spark™ comes with a default `DatalabDemo.ipynb` demonstration file for testing purposes. This file contains a preconfigured notebook environment that requires no modification to run. Execute the cells in order to perform pre-determined operations on a dummy data set. @@ -104,9 +103,9 @@ Execute the cells in order to perform pre-determined operations on a dummy data Once initialized, the information of the Spark session displays. -You can now execute commands that will run on the resources defined when creating the Distributed Data Lab. +You can now execute commands that will run on the resources defined when creating the Data Lab for Apache Spark™. -## How to delete a Distributed Data Lab +## How to delete a Data Lab for Apache Spark™ This action is irreversible and will permanently delete this Data Lab cluster and all its associated data. diff --git a/pages/data-lab/troubleshooting/cannot-run-data-lab.mdx b/pages/data-lab/troubleshooting/cannot-run-data-lab.mdx index 1ab8cc39c7..089256abec 100644 --- a/pages/data-lab/troubleshooting/cannot-run-data-lab.mdx +++ b/pages/data-lab/troubleshooting/cannot-run-data-lab.mdx @@ -1,11 +1,11 @@ --- meta: - title: Troubleshooting Distributed Data Lab execution issues - description: This page helps you troubleshoot problems when you cannot execute calculations with your Distributed Data Lab cluster + title: Troubleshooting Data Lab for Apache Spark™ execution issues + description: This page helps you troubleshoot problems when you cannot execute calculations with your Data Lab for Apache Spark™ cluster content: - h1: Troubleshooting Distributed Data Lab execution issues - paragraph: This page helps you troubleshoot problems when you cannot execute calculations with your Distributed Data Lab cluster -tags: execution run distributed data-lab worker-node error cannot process issue troubleshooting solution + h1: Troubleshooting Data Lab for Apache Spark™ execution issues + paragraph: This page helps you troubleshoot problems when you cannot execute calculations with your Data Lab for Apache Spark™ cluster +tags: execution run data lab worker node error cannot process issue troubleshooting solution dates: validation: 2024-10-08 posted: 2024-10-08 @@ -17,7 +17,7 @@ categories: - A Scaleway account logged into the [console](https://console.scaleway.com) - [Owner](/iam/concepts/#owner) status or [IAM permissions](/iam/concepts/#permission) allowing you to perform actions in the intended Organization -- A [Distributed Data Lab](/data-lab/how-to/create-data-lab/) +- A [Data Lab for Apache Spark™](/data-lab/how-to/create-data-lab/) ## Timeout errors @@ -27,7 +27,7 @@ Executing calculations within the notebook returns a timeout error after a few m ### Cause -The Distributed Data Lab cluster has zero worker nodes provisioned and cannot raise any resource to perform the required operations. +The Data Lab for Apache Spark™ cluster has zero worker nodes provisioned and cannot raise any resource to perform the required operations. ### Solution diff --git a/pages/data-lab/troubleshooting/index.mdx b/pages/data-lab/troubleshooting/index.mdx index 8ff8590487..a545c898f1 100644 --- a/pages/data-lab/troubleshooting/index.mdx +++ b/pages/data-lab/troubleshooting/index.mdx @@ -1,32 +1,32 @@ --- meta: - title: Distributed Data Lab Troubleshooting - description: Troubleshoot common issues with Scaleway Distributed Data Lab. + title: Data Lab for Apache Spark™ Troubleshooting + description: Troubleshoot common issues with Scaleway Data Lab for Apache Spark™. content: - h1: Distributed Data Lab Troubleshooting - paragraph: Troubleshoot common issues with Scaleway Distributed Data Lab. + h1: Data Lab for Apache Spark™ Troubleshooting + paragraph: Troubleshoot common issues with Scaleway Data Lab for Apache Spark™. dates: posted: 2025-03-13 validation: 2025-03-13 --- ## Featured Pages -## Distributed Data Lab troubleshooting pages +## Data Lab for Apache Spark™ troubleshooting pages -- [Troubleshooting Distributed Data Lab execution issues](/data-lab/troubleshooting/cannot-run-data-lab/) +- [Troubleshooting Data Lab for Apache Spark™ execution issues](/data-lab/troubleshooting/cannot-run-data-lab/) diff --git a/pages/faq/index.mdx b/pages/faq/index.mdx index 7b754c9c05..d6f3f51ce3 100644 --- a/pages/faq/index.mdx +++ b/pages/faq/index.mdx @@ -79,7 +79,7 @@ content: