
Commit 0e17670

authored
Merge pull request #2246 from madeline-underwood/apache_spark
Apache spark_PV to sign off
2 parents b1080b7 + bf81f77 commit 0e17670


6 files changed

+122
-100
lines changed

Lines changed: 24 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -1,23 +1,19 @@
11
---
22
title: Deploy Apache Spark on Google Axion processors
3-
4-
draft: true
5-
cascade:
6-
draft: true
7-
3+
84
minutes_to_complete: 60
95

10-
who_is_this_for: This is an introductory topic for the software developers interested in migrating their Apache Spark workloads from x86_64 platforms to Arm-based platforms, or on Google Axion based C4A virtual machines specifically.
6+
who_is_this_for: This introductory topic is for software developers interested in migrating their Apache Spark workloads from x86_64 platforms to Arm-based platforms, specifically on Google Axion-based C4A virtual machines.
117

128
learning_objectives:
13-
- Start an Arm virtual machine on the Google Cloud Platform using the C4A Google Axion instance family with RHEL 9 as the base image.
14-
- Learn how to install and configure Apache Spark on Arm-based GCP C4A instances.
15-
- Validate the functionality of spark through baseline testing.
16-
- Benchmark Apache Spark’s performance on Arm.
9+
- Start an Arm virtual machine on Google Cloud Platform (GCP) using the C4A Google Axion instance family with RHEL 9 as the base image
10+
- Install and configure Apache Spark on Arm-based GCP C4A instances
11+
- Validate Spark functionality through baseline testing
12+
- Benchmark Apache Spark performance on Arm
1713

1814
prerequisites:
19-
- A [Google Cloud Platform (GCP)](https://cloud.google.com/free?utm_source=google&hl=en) account with billing enabled.
20-
- Familiarity with distributed computing concepts and the [Apache Spark architecture](https://spark.apache.org/docs/latest/).
15+
- A [Google Cloud Platform (GCP)](https://cloud.google.com/free?utm_source=google&hl=en) account with billing enabled
16+
- Familiarity with distributed computing concepts and the [Apache Spark architecture](https://spark.apache.org/docs/latest/)
2117

2218
author: Pareena Verma
2319

@@ -27,36 +23,37 @@ subjects: Performance and Architecture
2723
cloud_service_providers: Google Cloud
2824

2925
armips:
30-
- Neoverse
26+
- Neoverse
3127

3228
tools_software_languages:
3329
- Apache Spark
3430
- Python
3531

3632
operatingsystems:
37-
- Linux
33+
- Linux
3834

3935
# ================================================================================
4036
# FIXED, DO NOT MODIFY
4137
# ================================================================================
4238
further_reading:
43-
- resource:
44-
title: Google Cloud official website and documentation
45-
link: https://cloud.google.com/docs
46-
type: documentation
47-
48-
- resource:
49-
title: Spark official website and documentation
50-
link: https://spark.apache.org/
51-
type: documentation
39+
- resource:
40+
title: Google Cloud official documentation
41+
link: https://cloud.google.com/docs
42+
type: documentation
5243

53-
- resource:
54-
title: The Scala programming language official website
55-
link: https://scala-lang.org
56-
type: website
44+
- resource:
45+
title: Apache Spark documentation
46+
link: https://spark.apache.org/
47+
type: documentation
5748

49+
- resource:
50+
title: Scala programming language official website
51+
link: https://scala-lang.org
52+
type: website
5853

5954
weight: 1 # _index.md always has weight of 1 to order correctly
6055
layout: "learningpathall" # All files under learning paths have this same wrapper
6156
learning_path_main_page: "yes" # Indicates this should be surfaced when looking for related content. Only set for _index.md of learning path content.
6257
---
58+
59+

content/learning-paths/servers-and-cloud-computing/spark-on-gcp/background.md

Lines changed: 4 additions & 4 deletions
@@ -1,20 +1,20 @@
11
---
2-
title: "Google Axion C4A and Apache Spark"
2+
title: Getting started with Apache Spark on Google Axion C4A (Arm Neoverse-V2)
33

44
weight: 2
55

66
layout: "learningpathall"
77
---
88

9-
## Google Axion C4A instances
9+
## Google Axion C4A Arm instances in Google Cloud
1010

11-
Google Axion C4A is a family of Arm-based virtual machines built on Google’s custom Axion CPU, which is based on Arm Neoverse-V2 cores. Designed for high-performance and energy-efficient computing, these virtual machine offer strong performance ideal for modern cloud workloads such as CI/CD pipelines, microservices, media processing, and general-purpose applications.
11+
Google Axion C4A is a family of Arm-based virtual machines built on Google’s custom Axion CPU, which is based on Arm Neoverse-V2 cores. Designed for high-performance and energy-efficient computing, these virtual machines offer strong performance for modern cloud workloads such as CI/CD pipelines, microservices, media processing, and general-purpose applications.
1212

1313
The C4A series provides a cost-effective alternative to x86 virtual machines while leveraging the scalability and performance benefits of the Arm architecture in Google Cloud.
1414

1515
To learn more about Google Axion, refer to the [Introducing Google Axion Processors, our new Arm-based CPUs](https://cloud.google.com/blog/products/compute/introducing-googles-new-arm-based-cpu) blog.
1616

17-
## Apache Spark
17+
## Apache Spark for big data processing on Arm
1818

1919
Apache Spark is an open-source, distributed computing system designed for fast and general-purpose big data processing.
2020

Lines changed: 39 additions & 25 deletions
@@ -1,50 +1,64 @@
11
---
2-
title: Baseline Testing
2+
title: Apache Spark baseline testing on Google Axion C4A Arm VM
33
weight: 5
44

55
### FIXED, DO NOT MODIFY
66
layout: learningpathall
77
---
8+
## Validate Apache Spark installation with a baseline test
89

10+
With Apache Spark installed successfully on your GCP C4A Arm-based virtual machine, you can now perform simple baseline testing to validate that Spark runs correctly and produces the expected output.
911

10-
With Apache Spark installed successfully on your GCP C4A Arm-based virtual machine, you can now perform simple baseline testing to validate that Spark runs correctly and gives expected output.
12+
## Run a baseline test for Apache Spark on Arm
1113

12-
## Spark Baseline Test
14+
Use a text editor of your choice to create a simple Spark job file:
1315

14-
Using a file editor of your choice, create a simple Spark job file:
1516
```console
1617
nano ~/spark_baseline_test.scala
1718
```
18-
Copy the content below into `spark_baseline_test.scala`:
1919

20-
```console
21-
val data = Seq(1, 2, 3, 4, 5)
22-
val distData = spark.sparkContext.parallelize(data)
23-
24-
// Basic transformation and action
25-
val squared = distData.map(x => x * x).collect()
26-
27-
println("Squared values: " + squared.mkString(", "))
20+
Add the following code to `spark_baseline_test.scala`:
21+
22+
```scala
23+
val data = Seq(1, 2, 3, 4, 5)
24+
val distData = spark.sparkContext.parallelize(data)
25+
26+
// Basic transformation and action
27+
val squared = distData.map(x => x * x).collect()
28+
29+
println("Squared values: " + squared.mkString(", "))
2830
```
29-
This is a basic Apache Spark example in Scala, demonstrating how to create an RDD (Resilient Distributed Dataset), perform a transformation, and collect results.
3031

31-
Lets look into the code, step by step:
32+
This Scala example shows how to create an RDD (Resilient Distributed Dataset), apply a transformation, and collect results.
3233

33-
- **val data = Seq(1, 2, 3, 4, 5)** : Creates a local Scala sequence of integers.
34-
- **val distData = spark.sparkContext.parallelize(data)** : Uses parallelize to convert the local sequence into a distributed RDD (so Spark can operate on it in parallel across cluster nodes or CPU cores).
35-
- **val squared = distData.map(x => x * x).collect()** : `map(x => x * x)` squares each element in the list, `.collect()` brings all the transformed data back to the driver program as a regular Scala collection.
36-
- **println("Squared values: " + squared.mkString(", "))** : Prints the squared values, joined by commas.
34+
Here’s a step-by-step breakdown of the code:
3735

36+
- **`val data = Seq(1, 2, 3, 4, 5)`**: Creates a local Scala sequence of integers
37+
- **`val distData = spark.sparkContext.parallelize(data)`**: Converts the local sequence into a distributed RDD, so Spark can process it in parallel across CPU cores or cluster nodes
38+
- **`val squared = distData.map(x => x * x).collect()`**: Squares each element using `map`, then gathers results back to the driver program with `collect`
39+
- **`println("Squared values: " + squared.mkString(", "))`**: Prints the squared values as a comma-separated list
3840

39-
### Run the Test in Spark Shell
41+
## Run the Apache Spark baseline test in Spark shell
42+
43+
Run the test file in the interactive Spark shell:
4044

41-
Run the test you created in the interactive shell:
4245
```console
43-
spark-shell < ~/spark_baseline_test.scala
46+
spark-shell < ~/spark_baseline_test.scala
4447
```
45-
The output should look similar to:
48+
49+
Alternatively, you can start the Spark shell and load the file from within it:
50+
51+
```console
52+
spark-shell
53+
```
54+
```scala
55+
:load spark_baseline_test.scala
56+
```
57+
58+
You should see output similar to:
59+
4660
```output
4761
Squared values: 1, 4, 9, 16, 25
4862
```
49-
This confirms that Spark is working correctly with its driver, executor, and cluster manager in local mode.
50-
63+
64+
This confirms that Spark is running correctly in local mode with its driver, executor, and cluster manager.

content/learning-paths/servers-and-cloud-computing/spark-on-gcp/benchmarking.md

Lines changed: 14 additions & 10 deletions
@@ -1,12 +1,11 @@
11
---
2-
title: Run Spark Benchmarks
2+
title: Apache Spark performance benchmarks on Arm64 and x86_64 in Google Cloud
33
weight: 6
44

5-
### FIXED, DO NOT MODIFY
65
layout: learningpathall
76
---
87

9-
## Apache Spark Benchmarking
8+
## How to run Apache Spark benchmarks on Arm64 in GCP
109
Apache Spark includes internal micro-benchmarks to evaluate the performance of core components like SQL execution, aggregation, joins, and data source reads. These benchmarks are helpful for comparing performance on x86_64 vs Arm64 platforms.
1110

1211
Follow the steps outlined to run Spark’s built-in SQL benchmarks using the SBT-based framework.
@@ -33,9 +32,11 @@ This compiles Spark and its dependencies, enabling the benchmarks build profile
3332
```console
3433
./build/sbt -Pbenchmarks "sql/test:runMain org.apache.spark.sql.execution.benchmark.AggregateBenchmark"
3534
```
36-
This executes the `AggregateBenchmark`, which compares performance of SQL aggregation operations (e.g., SUM, STDDEV) with and without `WholeStageCodegen`. `WholeStageCodegen` is an optimization technique used by Spark SQL to improve the performance of query execution by generating Java bytecode for entire query stages (aka whole stages) instead of interpreting them step-by-step.
35+
This executes the `AggregateBenchmark`, which compares performance of SQL aggregation operations (e.g., SUM, STDDEV) with and without `WholeStageCodegen`. `WholeStageCodegen` is an optimization technique used by Spark SQL to improve the performance of query execution by generating Java bytecode for entire query stages instead of interpreting them step-by-step.
3736

37+
## Example Apache Spark benchmark output (Arm64)
3838
You should see output similar to:
39+
3940
```output
4041
[info] Running benchmark: agg w/o group
4142
[info] Running case: agg w/o group wholestage off
@@ -235,7 +236,7 @@ You should see output similar to:
235236
[success] Total time: 669 s (11:09), completed Jul 24, 2025, 5:41:24 AM
236237
237238
```
238-
### Benchmark Results Table Explained:
239+
## Understanding Apache Spark benchmark metrics and results
239240

240241
- **Best Time (ms):** Fastest execution time observed (in milliseconds).
241242
- **Avg Time (ms):** Average time across all iterations.
@@ -244,7 +245,7 @@ You should see output similar to:
244245
- **Per Row (ns):** Average time taken per row (in nanoseconds).
245246
- **Relative Speed comparison:** baseline (1.0X) is the slower version.
246247
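The **Rate (M/s)** and **Per Row (ns)** columns are reciprocals of each other: a rate of R million rows per second corresponds to 1000/R nanoseconds per row. As a quick sanity check, you can verify this with the 913.0 M/s rate reported for the Aggregate HashMap case in the Arm64 results:

```shell
# Per Row (ns) = 1000 / Rate (M/s): 1e9 ns per second divided by R * 1e6 rows per second
RATE_MPS=913.0   # Rate from the Aggregate HashMap row in the Arm64 results
awk -v r="$RATE_MPS" 'BEGIN { printf "Per row: %.1f ns\n", 1000 / r }'
```

The result, 1.1 ns, matches the Per Row column, which is a useful consistency check when reading any of these tables.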

247-
### Benchmark summary on `x86_64`:
248+
## Apache Spark performance benchmark results on x86_64
248249
The following benchmark results were collected by running the same benchmark on a `c3-standard-4` (4 vCPU, 2 core, 16 GB Memory) x86_64 virtual machine in GCP, running RHEL 9.
249250

250251
| **Benchmark Case** | **Sub-Case / Config** | **Best Time (ms)** | **Avg Time (ms)** | **Stdev (ms)** | **Rate (M/s)** | **Per Row (ns)** | **Relative** |
@@ -293,9 +294,10 @@ The following benchmark results were collected by running the same benchmark on
293294
| BytesToBytesMap | BytesToBytesMap (on Heap) | 624 | 627 | 3 | 33.6 | 29.8 | 0.3X |
294295
| BytesToBytesMap | Aggregate HashMap | 31 | 31 | 0 | 680.7 | 1.5 | 6.6X |
295296

297+
---
296298

297-
### Benchmark summary on Arm64:
298-
For easier comparison, the benchmark results collected from the earlier run on the `c4a-standard-4` (4 vCPU, 16 GB Memory) virtual machine, running RHEL 9 is summarized below:
299+
## Apache Spark performance benchmark results on Arm64
300+
Results from the earlier run on the `c4a-standard-4` (4 vCPU, 16 GB memory) Arm64 VM in GCP (RHEL 9):
299301

300302
| Benchmark Case | Sub-Case / Config | Best Time (ms) | Avg Time (ms) | Stdev (ms) | Rate (M/s) | Per Row (ns) | Relative |
301303
|----------------------------|--------------------------|----------------|----------------|------------|-------------|----------------|-----------|
@@ -331,13 +333,15 @@ For easier comparison, the benchmark results collected from the earlier run on t
331333
| BytesToBytesMap | fast hash | 42 | 42 | 0 | 499.2 | 2.0 | 3.3X |
332334
| BytesToBytesMap |Aggregate HashMap | 23 | 23 | 0 | 913.0 | 1.1 | 5.9X |
333335

334-
### Benchmarking comparison summary
336+
---
337+
338+
## Apache Spark performance benchmarking comparison on Arm64 and x86_64
335339
When you compare the benchmarking results, you will notice that on the Google Axion C4A Arm-based instances:
336340

337341
- **Whole-stage code generation significantly boosts performance**, improving execution by over 3× (e.g., `agg w/o group` from 2728 ms to 856 ms).
338342
- **Aggregation with keys** delivers ~1.7–5.4× speedups across row-based and non-hashmap variants.
339-
For simple codegen+vectorized hashmap, x86 and Arm-based instances show similar performance.
340343
- **Arm-based Spark shows strong hash performance**: `murmur3` and `UnsafeRow` hash on Arm-based instances are ~3×–5× faster, with the aggregate hashmap ~6× faster; the `fast hash` path is roughly on par.
341344

342345
Overall, when whole-stage codegen and vectorized hashmap paths are used, you should see multi-fold speedups on the Google Axion C4A Arm-based instances.
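As a quick arithmetic check on the numbers quoted above, the whole-stage codegen improvement for `agg w/o group` works out to roughly 3.2×:

```shell
# Speedup = time without codegen / time with codegen (agg w/o group, Arm64)
awk 'BEGIN { printf "Speedup: %.1fx\n", 2728 / 856 }'
```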
343346

347+
Lines changed: 17 additions & 14 deletions
@@ -1,27 +1,30 @@
11
---
2-
title: Create a Google Axion C4A Arm virtual machine
2+
title: How to create a Google Axion C4A Arm virtual machine on GCP
33
weight: 3
44

55
### FIXED, DO NOT MODIFY
66
layout: learningpathall
77
---
88

9-
## Introduction
9+
## How to create a Google Axion C4A Arm VM on Google Cloud
1010

11-
In this section you will learn how to provision a **Google Axion C4A Arm virtual machine** on GCP with the **c4a-standard-4 (4 vCPUs, 16 GB Memory)** machine type, using the **Google Cloud Console**.
11+
In this section, you learn how to provision a **Google Axion C4A Arm virtual machine** on Google Cloud Platform (GCP) using the **c4a-standard-4 (4 vCPUs, 16 GB memory)** machine type in the **Google Cloud Console**.
1212

13-
For more details, kindly follow the Learning Path on [Getting Started with Google Cloud Platform](https://learn.arm.com/learning-paths/servers-and-cloud-computing/csp/google/).
13+
For background on GCP setup, see the Learning Path [Getting started with Google Cloud Platform](https://learn.arm.com/learning-paths/servers-and-cloud-computing/csp/google/).
1414

15-
### Create an Arm-based Virtual Machine (C4A)
15+
### Create a Google Axion C4A Arm VM in Google Cloud Console
1616

1717
To create a virtual machine based on the C4A Arm architecture:
1818
1. Navigate to the [Google Cloud Console](https://console.cloud.google.com/).
19-
2. Go to **Compute Engine > VM Instances** and click on **Create Instance**.
20-
3. Under the **Machine Configuration**:
21-
- Fill in basic details like **Instance Name**, **Region**, and **Zone**.
22-
- Choose the **Series** as `C4A`.
23-
- Select a machine type such as `c4a-standard-4`.
24-
![Instance Screenshot](./image1.png)
25-
4. Under the **OS and Storage**, click on **Change**, and select Arm64 based OS Image of your choice. For this Learning Path, choose **Red Hat Enterprise Linux** as the Operating System with **Red Hat Enterprise Linux 9** as the Version. Make sure you pick the version of image for Arm. Click on the **Select**.
26-
5. Under **Networking**, enable **Allow HTTP traffic** to allow HTTP communication.
27-
6. Click on **Create**, and the instance will launch.
19+
2. Go to **Compute Engine > VM Instances** and select **Create Instance**.
20+
3. Under **Machine configuration**:
21+
- Enter details such as **Instance name**, **Region**, and **Zone**.
22+
- Set **Series** to `C4A`.
23+
- Select a machine type such as `c4a-standard-4`.
24+
25+
![Create a Google Axion C4A Arm virtual machine in the Google Cloud Console with c4a-standard-4 selected alt-text#center](./image1.png "Google Cloud Console – creating a Google Axion C4A Arm virtual machine")
26+
27+
4. Under **OS and Storage**, select **Change**, then choose an Arm64-based OS image.
28+
For this Learning Path, use **Red Hat Enterprise Linux 9**. Ensure you select the **Arm image** variant. Click **Select**.
29+
5. Under **Networking**, enable **Allow HTTP traffic**.
30+
6. Click **Create** to launch the instance.
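As an alternative to the console steps above, you can provision an equivalent VM with the Google Cloud CLI. The following is a sketch, not part of the original steps: the instance name and zone are placeholders to replace with your own, and it assumes the `rhel-9-arm64` image family in the `rhel-cloud` image project:

```shell
# Placeholder values - replace with your own instance name and zone
INSTANCE_NAME="spark-c4a-vm"
ZONE="us-central1-a"
MACHINE_TYPE="c4a-standard-4"

# Create the Arm-based VM (requires an authenticated Google Cloud CLI)
if command -v gcloud >/dev/null 2>&1; then
  gcloud compute instances create "$INSTANCE_NAME" \
    --zone="$ZONE" \
    --machine-type="$MACHINE_TYPE" \
    --image-family=rhel-9-arm64 \
    --image-project=rhel-cloud \
    --tags=http-server
else
  echo "gcloud CLI not found; install the Google Cloud SDK first"
fi
```

The `--tags=http-server` flag corresponds to enabling **Allow HTTP traffic** in the console, assuming the default `http-server` firewall rule exists in your project.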
