Skip to content

Conversation

mbutrovich
Copy link
Contributor

@mbutrovich mbutrovich commented Aug 19, 2025

Which issue does this PR close?

Closes #2195, #1348.

Rationale for this change

We previously fell back in a scenario where the native reader produced different results. We've now picked up appropriate DF and arrow-rs dependencies to have apache/arrow#7055.

What changes are included in this PR?

Remove the fallback.

How are these changes tested?

Existing tests.

@mbutrovich
Copy link
Contributor Author

mbutrovich commented Aug 19, 2025

Running Spark SQL tests locally with iceberg_compat. If that passes, I'll likely want to undo #1376 and try again.

@codecov-commenter
Copy link

codecov-commenter commented Aug 19, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 57.71%. Comparing base (f09f8af) to head (92f4c47).
⚠️ Report is 395 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2197      +/-   ##
============================================
+ Coverage     56.12%   57.71%   +1.58%     
- Complexity      976     1238     +262     
============================================
  Files           119      143      +24     
  Lines         11743    13191    +1448     
  Branches       2251     2357     +106     
============================================
+ Hits           6591     7613    +1022     
- Misses         4012     4362     +350     
- Partials       1140     1216      +76     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@parthchandra
Copy link
Contributor

You may want to undo parts of #1376 to really test the change. The changes to CometTestBase.getPrimitiveTypesParquetSchema were so that the schema would not contain unsigned types if the reader type was one of the datafusion based readers. Undoing that change will be a primary test.

@mbutrovich
Copy link
Contributor Author

So instead of getting null on the incorrectly written values we now get different values. So garbage in is still garbage out, but it's new garbage!

@parthchandra
Copy link
Contributor

I seem to recall that I tested a badly written file with apache/arrow#7055 and did not get garbage. I'll take a look at it again.
Thanks for kicking off this effort @mbutrovich!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

follow up - remove incompatibility limitation on unsigned int types in native_iceberg_compat and native_datafusion readers
3 participants