-| HDInsight 3.6 Spark 2.1 to HDInsight 4.0 Spark 2.4 | Recreate clusters with HDInsight 4.0 Spark 2.4 | Review the following articles: <br> [Apache Spark: Upgrading From Spark SQL 2.3 to 2.4](https://spark.apache.org/docs/latest/sql-migration-guide-upgrade.html#upgrading-from-spark-sql-23-to-24) <br><br> [Apache Spark: Upgrading From Spark SQL 2.2 to 2.3](https://spark.apache.org/docs/latest/sql-migration-guide-upgrade.html#upgrading-from-spark-sql-22-to-23) <br><br> [Apache Spark: Upgrading From Spark SQL 2.1 to 2.2](https://spark.apache.org/docs/latest/sql-migration-guide-upgrade.html#upgrading-from-spark-sql-21-to-22) | Spark and Hive integration has changed in HDInsight 4.0. <br><br> In HDInsight 4.0, Spark and Hive use independent catalogs for accessing SparkSQL or Hive tables. A table created by Spark lives in the Spark catalog. A table created by Hive lives in the Hive catalog. This behavior differs from HDInsight 3.6, where Hive and Spark shared a common catalog. Hive and Spark integration in HDInsight 4.0 relies on Hive Warehouse Connector (HWC), which works as a bridge between Spark and Hive. Learn about Hive Warehouse Connector. <br><br> In HDInsight 4.0, if you would like to share the metastore between Hive and Spark, you can do so by changing the property `metastore.catalog.default` to `hive` in your Spark cluster. You can find this property in Ambari under Advanced spark2-hive-site-override. Sharing the metastore works only for external Hive tables; it does not work for internal/managed Hive tables or ACID tables. <br><br> Read [Migrate Azure HDInsight 3.6 Hive workloads to HDInsight 4.0](../interactive-query/apache-hive-migrate-workloads) for more information. |