@robtandy robtandy commented Mar 3, 2025

Working CI tpch validation tests

Should fix #11, and may eliminate the need for #33.

Now, when the main branch is pushed to, or a PR is opened against main, CI runs and validates TPCH results at scale factor 1.

This test lets us ensure that we do not stray from correct query execution as we iterate. It also runs over a matrix of Python versions (3.10, 3.11, 3.12) and Ray versions (2.40, 2.41, 2.42.1, 2.43) to ensure that each combination passes.
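To make the size of that matrix concrete, here is a small sketch that enumerates the combinations. The version lists are taken from the description above; the function and variable names are illustrative only, and the actual workflow file may define the matrix differently.

```python
from itertools import product

# Versions listed in the PR description.
PYTHON_VERSIONS = ["3.10", "3.11", "3.12"]
RAY_VERSIONS = ["2.40", "2.41", "2.42.1", "2.43"]


def build_matrix(python_versions, ray_versions):
    """Return every (python, ray) pairing as a list of dicts.

    Hypothetical helper: it mirrors how a CI matrix expands into one job
    per combination, but it is not part of the repository.
    """
    return [
        {"python": py, "ray": ray}
        for py, ray in product(python_versions, ray_versions)
    ]


matrix = build_matrix(PYTHON_VERSIONS, RAY_VERSIONS)
print(len(matrix))  # 3 Python versions x 4 Ray versions = 12 combinations
```

Each entry corresponds to one test job, so adding a Python or Ray version grows the job count multiplicatively.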

A status indicator of whether the main branch passes tests is shown in the Readme.

Example execution of these tests can be seen in the fork: https://github.com/robtandy/datafusion-ray/actions/runs/13638548828

uv in addition to pip

  • Using uv for CI makes sense as it's so much faster. It turns out it's good for local dev too. I updated docs/contributing.md to document the workflow, borrowing much from the datafusion-python readme and its workflow.

Other small changes

  • rename --partitions-per-worker to --partitions-per-processor in tpcbench.py
  • return a non-zero exit code from tpcbench.py if one or more queries fail validation
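As a rough illustration of the exit-code change, the sketch below maps per-query validation results to a process exit code so that CI can fail the job. The `exit_code_for` helper and the shape of `results` are assumptions made for this example; they are not the actual code in tpcbench.py.

```python
import sys


def exit_code_for(results):
    """Map per-query validation results to a process exit code.

    `results` is a hypothetical dict of query name -> bool, where True
    means the query output matched the expected answer.
    """
    failed = sorted(name for name, ok in results.items() if not ok)
    for name in failed:
        # Report each failure on stderr so it shows up in CI logs.
        print(f"query {name} failed validation", file=sys.stderr)
    return 1 if failed else 0


# Hypothetical results; a real run would collect these while executing queries.
results = {"q1": True, "q2": True, "q3": False}
code = exit_code_for(results)
# A command-line entry point would end with: sys.exit(code)
```

Returning the code from a function and calling `sys.exit` only at the entry point keeps the validation logic easy to test on its own.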

@andygrove andygrove left a comment

Thanks @robtandy!

@andygrove andygrove merged commit 0d76298 into apache:main Mar 3, 2025
12 checks passed
Development

Successfully merging this pull request may close these issues.

DataFusion query results tests in CI
