Commit 9aa535b

docs(dwc): update concept
1 parent 1f14d7f commit 9aa535b

1 file changed: +4 -0 lines changed
pages/data-lab/concepts.mdx

Lines changed: 4 additions & 0 deletions
@@ -48,6 +48,10 @@ A notebook for an Apache Spark cluster is an interactive, web-based tool that al

 A Persistent Volume (PV) is a cluster-wide storage resource that ensures data persistence beyond the lifecycle of individual pods. Persistent volumes abstract the underlying storage details, allowing administrators to use various storage solutions.

+Apache Spark® executors require storage space for various operations, particularly to shuffle data during wide operations such as sorting, grouping, and aggregation. Wide operations are transformations that require data from different partitions to be combined, often resulting in data movement across the cluster. During the map phase, executors write data to shuffle storage, which is then read by reducers.
+
+A PV sized properly ensures a smooth execution of your workload.
+
 ## SparkMagic

 SparkMagic is a set of tools that allows you to interact with Apache Spark clusters through Jupyter notebooks. It provides magic commands for running Spark jobs, querying data, and managing Spark sessions directly within the notebook interface, facilitating seamless integration and execution of Spark tasks. For more details, check out the [SparkMagic repository](https://github.com/jupyter-incubator/sparkmagic).
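
The added paragraph describes how a wide operation moves data: executors write shuffle data during the map phase, and reducers read it back. A minimal pure-Python sketch of that flow is shown here for illustration. It is a toy model, not Spark's implementation: the names `map_phase`, `reduce_phase`, and `shuffle_storage` are hypothetical, an in-memory list stands in for the shuffle files that would occupy space on a volume, and hash partitioning by key is assumed as the way records are routed to reducers.

```python
from collections import defaultdict


def map_phase(partitions, num_reducers):
    """Simulate executors writing shuffle data (hypothetical helper).

    Each input partition is split into one bucket per reducer, chosen by
    hashing the key. The returned list stands in for shuffle storage.
    """
    shuffle_storage = []
    for records in partitions:
        buckets = defaultdict(list)
        for key, value in records:
            buckets[hash(key) % num_reducers].append((key, value))
        shuffle_storage.append(buckets)
    return shuffle_storage


def reduce_phase(shuffle_storage, num_reducers):
    """Simulate reducers reading their bucket from every map output."""
    results = {}
    for reducer_id in range(num_reducers):
        totals = defaultdict(int)
        for buckets in shuffle_storage:
            for key, value in buckets.get(reducer_id, []):
                totals[key] += value
        # Each key is routed to exactly one reducer, so keys never collide here.
        results.update(totals)
    return results


# Three input partitions; the same key appears in several of them, so a
# per-key sum (a wide operation) needs data from different partitions.
partitions = [
    [("a", 1), ("b", 2)],
    [("a", 3), ("c", 4)],
    [("b", 5), ("c", 6)],
]

shuffle_storage = map_phase(partitions, num_reducers=2)
totals = reduce_phase(shuffle_storage, num_reducers=2)

# Every input record was written to shuffle storage once.
shuffled_records = sum(
    len(bucket) for buckets in shuffle_storage for bucket in buckets.values()
)
print(totals)
print(shuffled_records)
```

In this toy model every input record passes through shuffle storage once, which suggests why the space a wide operation needs tends to grow with the amount of data being shuffled.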
