
Commit cb97fc2

Update apache-spark-performance.md
1 parent 2e28ef8 commit cb97fc2

File tree: 1 file changed (+4, -1 lines)

articles/synapse-analytics/spark/apache-spark-performance.md

Lines changed: 4 additions & 1 deletion
@@ -57,7 +57,10 @@ Spark provides its own native caching mechanisms, which can be used through diff
 Spark operates by placing data in memory, so managing memory resources is a key aspect of optimizing the execution of Spark jobs. There are several techniques you can apply to use your cluster's memory efficiently.
 
 * Prefer smaller data partitions and account for data size, types, and distribution in your partitioning strategy.
-* Consider the newer, more efficient [Kryo data serialization](https://github.com/EsotericSoftware/kryo), rather than the default Java serialization.
+* In Synapse Spark (Runtime 3.1 or higher), [Kryo data serialization](https://github.com/EsotericSoftware/kryo) is enabled by default.
+* You can customize the Kryo serializer buffer size based on your requirements:
+
+  `spark.conf.set("spark.kryoserializer.buffer.max", "256m")`
 * Monitor and tune Spark configuration settings.
 
 For your reference, the Spark memory structure and some key executor memory parameters are shown in the next image.
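The properties touched by this change can also be supplied when the session or job is launched, rather than via `spark.conf.set`. Below is a minimal sketch of that idea in plain Python: the `spark.serializer` and `spark.kryoserializer.buffer.max` property names are real Spark settings, but the `as_submit_flags` helper is hypothetical, included only to show how the same key/value pairs map onto `spark-submit --conf` arguments.

```python
# Kryo-related Spark properties from the doc change, collected as a dict that
# could be passed to SparkSession.builder.config(...) or spark-submit --conf.
kryo_conf = {
    # Explicitly selects Kryo; on Synapse Runtime 3.1+ this is the default,
    # so it is shown here mainly for older runtimes or vanilla Spark.
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    # Raise the maximum Kryo serialization buffer; "256m" means 256 MiB.
    "spark.kryoserializer.buffer.max": "256m",
}


def as_submit_flags(conf):
    """Hypothetical helper: render properties as spark-submit --conf flags."""
    return " ".join(f"--conf {k}={v}" for k, v in sorted(conf.items()))


print(as_submit_flags(kryo_conf))
# prints:
# --conf spark.kryoserializer.buffer.max=256m --conf spark.serializer=org.apache.spark.serializer.KryoSerializer
```

Passing the properties at launch time avoids relying on runtime reconfiguration, since some serializer settings are only read when the JVM executors start.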
