Skip to content

Conversation

@Kimahriman
Copy link
Contributor

@Kimahriman Kimahriman commented Jan 4, 2026

Which issue does this PR close?

Related to #174, not full support so probably should keep that open (or open new tickets specifically for column mapping and deletion vectors)

Rationale for this change

Delta scans are simply built-in Parquet scans when column mapping and deletion vectors are not used, so the existing Comet parquet scans work fine for them. This adds support for these cases in the scan rule and adds some unit tests that could be used for future integration of delta-rs or delta-kernel-rs for full scan support.

What changes are included in this PR?

Updates the CometScanRule to use Comet parquet scans for Delta V1 scans of tables that don't use column mapping or deletion vectors. Adds a new DeltaReflection class for accessing the Delta properties to determine the enabled features.

How are these changes tested?

Adds unit tests for different Delta scans and adds Delta as a test dependency.

reqwest = { version = "0.12", default-features = false, features = ["rustls-tls-native-roots", "http2"] }
object_store_opendal = {version = "0.55.0", optional = true}
hdfs-sys = {version = "0.3", optional = true, features = ["hdfs_3_3"]}
hdfs-sys = {version = "0.3", optional = true, features = ["hdfs_3_3", "vendored"]}
Copy link
Contributor Author

@Kimahriman Kimahriman Jan 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With opendal enabled by default in #2929, I was getting failures finding libhdfs.so running unit tests on Linux. The hdrs dependency below for macos was purely activating the vendored feature of hdfs-sys, so I just enabled that globally here so you don't need libhdfs available for developing on Linux either

Comment on lines +115 to +120
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>failureaccess</artifactId>
<version>1.0.3</version>
<scope>test</scope>
</dependency>
Copy link
Contributor Author

@Kimahriman Kimahriman Jan 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this I'm getting an error creating the Delta tables:

Cause: java.lang.ClassNotFoundException: com.google.common.util.concurrent.internal.InternalFutureFailureAccess
  at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
  at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
  at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525)
  at java.base/java.lang.ClassLoader.defineClass1(Native Method)
  at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1017)
  at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:150)
  at java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:862)
  at java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:760)
  at java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:681)
  at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:639)
  ...

Not sure why, could use some help figuring out what is missing with Guava from where

@codecov-commenter
Copy link

codecov-commenter commented Jan 4, 2026

Codecov Report

❌ Patch coverage is 65.78947% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.60%. Comparing base (f09f8af) to head (aa51b22).
⚠️ Report is 832 commits behind head on main.

Files with missing lines Patch % Lines
...scala/org/apache/comet/delta/DeltaReflection.scala 66.66% 9 Missing ⚠️
...ala/org/apache/spark/sql/comet/CometScanExec.scala 63.63% 1 Missing and 3 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3035      +/-   ##
============================================
+ Coverage     56.12%   59.60%   +3.47%     
- Complexity      976     1379     +403     
============================================
  Files           119      168      +49     
  Lines         11743    15534    +3791     
  Branches       2251     2579     +328     
============================================
+ Hits           6591     9259    +2668     
- Misses         4012     4975     +963     
- Partials       1140     1300     +160     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@andygrove
Copy link
Member

Thanks, @Kimahriman. Please also add content to the documentation (either the user guide or the contributor guide) explaining this new feature.

@Kimahriman
Copy link
Contributor Author

Added a snippet to the user guide. There's not much for the contributor guide since this doesn't start the process of trying to use the delta rust libraries. Still trying to figure out if/how to integrate those

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants