Support partition management for HiveTable #7329
yabola wants to merge 8 commits into apache:master from
Conversation
Codecov Report

❌ Patch coverage is

Additional details and impacted files:

@@            Coverage Diff             @@
##           master    #7329      +/-   ##
==========================================
  Coverage    0.00%    0.00%
==========================================
  Files         698      699       +1
  Lines       43656    43724      +68
  Branches     5896     5897       +1
==========================================
- Misses      43656    43724      +68
Pull request overview
This pull request adds support for partition management operations to the Hive V2 Catalog by implementing the SupportsPartitionManagement interface from Spark's DataSource V2 API. This enables partition-related SQL commands like ALTER TABLE ... ADD PARTITION, ALTER TABLE ... DROP PARTITION, and SHOW PARTITIONS to work correctly with the HiveTable connector.
Changes:
- Implemented the SupportsPartitionManagement interface in HiveTable with methods for creating, dropping, loading metadata, and listing partitions
- Added a HiveTableProperties object to define partition property constants
- Added a castExpression utility method in HiveConnectorUtils for cross-version Spark compatibility
- Added a comprehensive test suite, PartitionManagementSuite, with V1 and V2 variants
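The partition operations above ultimately map partition identifiers onto Hive's directory layout, where each partition column becomes one `col=value` path segment. A minimal sketch of that convention (illustrative only; `partitionPath` is not a function from this PR):

```scala
// Illustrative sketch of Hive's partition directory convention: each
// partition column/value pair becomes one "col=value" path segment.
// This is the layout that HiveTable's partition methods operate against.
def partitionPath(spec: Seq[(String, String)]): String =
  spec.map { case (col, value) => s"$col=$value" }.mkString("/")
```

For example, `partitionPath(Seq("year" -> "2024", "month" -> "01"))` yields `year=2024/month=01`, the relative directory Hive would use for that partition.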
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| PartitionManagementSuite.scala | New test suite covering partition creation, dropping, and listing operations for both V1 and V2 catalog implementations |
| HiveTableProperties.scala | New object defining the LOCATION constant for partition properties |
| HiveTable.scala | Implements SupportsPartitionManagement interface with partition management methods and helper functions |
| HiveConnectorUtils.scala | Adds castExpression helper method for type casting with cross-Spark-version compatibility |
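The cross-version trick behind castExpression is to resolve the target signature at runtime rather than compile time, since the relevant Spark API differs between releases. A simplified sketch of that reflection pattern (the PR's code uses a DynMethods-style builder; the object and method names here are illustrative, not the actual Kyuubi API):

```scala
// Simplified sketch of runtime method resolution, the general technique
// used to stay compatible with multiple Spark versions. Illustrative
// only -- this is not the actual HiveConnectorUtils code.
object CrossVersionCall {
  def invoke(target: AnyRef, name: String, args: AnyRef*): AnyRef = {
    // Pick the first public method whose name and arity match; a real
    // implementation would also check parameter types.
    val method = target.getClass.getMethods
      .find(m => m.getName == name && m.getParameterCount == args.length)
      .getOrElse(throw new NoSuchMethodException(name))
    method.invoke(target, args: _*)
  }
}
```

The same call site then works against whichever signature the running Spark version provides, at the cost of moving signature errors from compile time to runtime.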
...i-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveTable.scala
Outdated
please test
test("create partition with location") {
  withNamespaceAndTable("ns", "tbl") { t =>
    sql(s"CREATE TABLE $t (id string, year string, month string) PARTITIONED BY (year, month)")
    val loc = "file:///tmp/kyuubi/hive_catalog_part_loc"
create a temp dir instead of relying on global /tmp
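A sketch of what the reviewer suggests: allocate an isolated temp directory per test instead of hard-coding a path under /tmp. `withTempPartitionLocation` is a hypothetical helper, not part of the PR; the commented `sql(...)` call assumes the suite's existing test harness:

```scala
import java.nio.file.{Files, Path}

// Hedged sketch of the reviewer's suggestion: create a fresh temp
// directory for the partition location rather than relying on the
// fixed "file:///tmp/kyuubi/..." path. Helper name is hypothetical.
def withTempPartitionLocation[T](f: String => T): T = {
  val dir: Path = Files.createTempDirectory("hive_catalog_part_loc")
  try {
    // Pass a file: URI, which is what a LOCATION clause expects, e.g.
    // sql(s"ALTER TABLE $t ADD PARTITION (year='2024') LOCATION '$loc'")
    f(dir.toUri.toString)
  } finally {
    Files.deleteIfExists(dir) // best-effort cleanup; dir is empty here
  }
}
```

This keeps tests hermetic: parallel runs cannot collide on a shared path, and nothing is left behind in the global /tmp.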
The UT does not cover data types other than STRING; could you expand that?
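One way to broaden the coverage the reviewer asks for: exercise INT and DATE partition columns in addition to STRING. The table names below are illustrative, and each statement would be run through the suite's existing sql(...) helper:

```scala
// Illustrative DDL for non-STRING partition column types; table names
// are hypothetical. Each statement would be executed via the test
// suite's sql(...) helper against the Hive catalog under test.
val nonStringPartitionDDL = Seq(
  "CREATE TABLE t_int (id STRING, year INT) PARTITIONED BY (year)",
  "ALTER TABLE t_int ADD PARTITION (year=2024)",
  "CREATE TABLE t_date (id STRING, dt DATE) PARTITIONED BY (dt)",
  "ALTER TABLE t_date ADD PARTITION (dt='2024-01-01')"
)
```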
override def replacePartitionMetadata(
    ident: InternalRow,
    properties: util.Map[String, String]): Unit = {
  throw new UnsupportedOperationException("Replace partition is not supported")
Suggested change:
- throw new UnsupportedOperationException("Replace partition is not supported")
+ throw new UnsupportedOperationException("Replace partition metadata is not supported")
    classOf[Option[_]])
  .build[Expression]()

// SPARK-40054, ensuring cross-version compatibility.
Please add the fixed version alongside the JIRA ticket id.
Why are the changes needed?

The Hive V2 Catalog did not implement the SupportsPartitionManagement interface from Spark's DataSource V2 API. As a result, common partition-related SQL commands would fail (like ALTER TABLE ... ADD PARTITION, SHOW PARTITIONS, etc.).

How was this patch tested?

Added new unit tests.