
Commit e41047f

Update benchmarking.md
1 parent 9fdd784 commit e41047f

content/learning-paths/servers-and-cloud-computing/spark-on-gcp/benchmarking.md

Lines changed: 6 additions & 3 deletions
```diff
@@ -334,7 +334,10 @@ For easier comparison, the benchmark results collected from the earlier run on t
 ### Benchmarking comparison summary
 When you compare the benchmarking results you will notice that on the Google Axion C4A Arm-based instances:
 
-- **Whole-stage code generation significantly boosts performance**, improving execution by up to **38×** (e.g., `agg w/o group` from 33.4s to 0.86s).
-- **Vectorized and row-based hash maps** consistently outperform non-codegen and traditional hashmap approaches, especially for aggregation with keys and complex data types (e.g., decimal keys: **6.8× faster** with vectorized hashmap).
-- **Arm-based Spark shows strong hash performance**, with `fast hash` and `murmur3` achieving up to **3.3× better throughput** than `UnsafeRowhash`.
+- **Whole-stage code generation significantly boosts performance**, improving execution by up to **** (e.g., `agg w/o group` from 2728 ms to 856 ms).
+- **Aggregation with keys**: row-based and non-hashmap variants deliver ~1.7–5.4× speedups.
+For simple codegen+vectorized hashmap, x86 and Arm-based instances show similar performance.
+- **Arm-based Spark shows strong hash performance**: `murmur3` and `UnsafeRowhash` on Arm-based instances are ~3×–5× faster, with the aggregate hashmap ~6× faster; the `fast hash` path is roughly on par.
+
+Overall, when whole-stage codegen and vectorized hashmap paths are used, you should see multi-fold speedups on the Google Axion C4A Arm-based instances.
 
```
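The changed bullets compare Spark's whole-stage codegen and aggregate-hashmap code paths. As a hedged sketch of how those paths are typically toggled when reproducing such a comparison: `spark.sql.codegen.wholeStage` is a documented Spark SQL configuration key, the two-level (vectorized) aggregate hashmap flag shown is internal to Spark and its name may vary by version, and `spark-benchmark.jar` is a hypothetical placeholder for your own benchmark application.

```shell
# Baseline run: whole-stage codegen disabled
spark-submit --conf spark.sql.codegen.wholeStage=false spark-benchmark.jar

# Comparison run: codegen enabled with the two-level (vectorized) aggregate hashmap
# (the second key is internal to Spark and version-dependent)
spark-submit \
  --conf spark.sql.codegen.wholeStage=true \
  --conf spark.sql.codegen.aggregate.map.twolevel.enabled=true \
  spark-benchmark.jar
```

The same toggles can be set per-session via `spark.conf.set(...)` instead of at submit time, which is convenient when re-running individual benchmark queries interactively.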
