Commit 2b03a3c

Merge pull request #217884 from sreekzz/patch-87
Changed HDI to HDInsight
2 parents 234048e + 1963f78 commit 2b03a3c

1 file changed, +2 -2 lines changed
articles/hdinsight/spark/apache-spark-improve-performance-iocache.md

Lines changed: 2 additions & 2 deletions
@@ -9,8 +9,8 @@ ms.date: 11/09/2022
 # Improve performance of Apache Spark workloads using Azure HDInsight IO Cache
 
 > [!NOTE]
-> * IO Cache is only available for Spark 2.4(HDI 4.0).
-> * Spark 3.1.2 (HDI 5.0) doesn’t support IO Cache.
+> * IO Cache is only available for Spark 2.4(HDInsight 4.0).
+> * Spark 3.1.2 (HDInsight 5.0) doesn’t support IO Cache.
 
 IO Cache is a data caching service for Azure HDInsight that improves the performance of Apache Spark jobs. IO Cache also works with [Apache TEZ](https://tez.apache.org/) and [Apache Hive](https://hive.apache.org/) workloads, which can be run on [Apache Spark](https://spark.apache.org/) clusters. IO Cache uses an open-source caching component called RubiX. RubiX is a local disk cache for use with big data analytics engines that access data from cloud storage systems. RubiX is unique among caching systems, because it uses Solid-State Drives (SSDs) rather than reserve operating memory for caching purposes. The IO Cache service launches and manages RubiX Metadata Servers on each worker node of the cluster. It also configures all services of the cluster for transparent use of RubiX cache.
 
