Commit bad4e3d

add tpc-ds tests and property-based testing utilities
This change introduces a new `property_based.rs` test utility which lets us evaluate correctness using properties. Properties are useful when we do not know the expected output of a test (e.g. if we were to fuzz the database with randomized data or randomized queries, we could only verify the output using properties).

This change does not introduce fuzzing, but it does introduce a TPC-DS test. The test randomly generates data using the duckdb CLI and runs 99 queries on a distributed cluster. The query outputs are validated against single-node datafusion using test utils in `metamorphic.rs`. The test also randomizes the test cluster parameters - there's no harm in doing so.

Next steps:
- Add ordering oracle to validate ORDER BY correctness
  - Idea: Inspect the ordering properties in the logical plan and assert this property on the `RecordBatch`es
- Add fuzzing
  - Now that we have property-based testing utils, we can properly fuzz the project using SQLancer
  - SQLancer produces INSERT and SELECT statements which we could point at a datafusion distributed cluster and verify against single-node datafusion
  - Although it doesn't support nested select statements, 70% of the queries were valid datafusion queries, meaning these are good test cases for us
- Add metrics oracle to validate output_rows metric / metrics propagation
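To illustrate the kind of property described above, here is a minimal sketch of an order-insensitive comparison between a single-node result and a distributed result. It is not taken from `property_based.rs` or `metamorphic.rs`; the names `Row`, `row`, `row_counts` and `same_rows_ignoring_order` are hypothetical, and rows are modelled as plain strings rather than Arrow `RecordBatch`es to keep the example self-contained.

```rust
use std::collections::HashMap;

/// A result row rendered as strings. Hypothetical stand-in for real
/// query output, which would be Arrow `RecordBatch`es.
type Row = Vec<String>;

/// Convenience constructor used only by this example.
fn row(cells: &[&str]) -> Row {
    cells.iter().map(|c| c.to_string()).collect()
}

/// Count how many times each row occurs, so that the comparison ignores
/// row order but still notices duplicated or missing rows.
fn row_counts(rows: &[Row]) -> HashMap<&Row, usize> {
    let mut counts = HashMap::new();
    for r in rows {
        *counts.entry(r).or_insert(0) += 1;
    }
    counts
}

/// Property: both result sets contain the same rows with the same
/// multiplicities, regardless of the order in which they were produced.
fn same_rows_ignoring_order(expected: &[Row], actual: &[Row]) -> bool {
    expected.len() == actual.len() && row_counts(expected) == row_counts(actual)
}

fn main() {
    // Assumed scenario: the same query run on one node and on a cluster.
    let single_node = vec![row(&["1", "a"]), row(&["2", "b"])];
    let distributed = vec![row(&["2", "b"]), row(&["1", "a"])];
    assert!(same_rows_ignoring_order(&single_node, &distributed));

    // A dropped row violates the property.
    let missing_row = vec![row(&["1", "a"])];
    assert!(!same_rows_ignoring_order(&single_node, &missing_row));
}
```

A check like this needs no hard-coded expected output, which is what makes it usable with randomly generated data. It deliberately says nothing about ORDER BY, which is why a separate ordering oracle appears under the next steps.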
1 parent 83efd9e commit bad4e3d

109 files changed: +6850 -530 lines changed

.gitignore

Lines changed: 2 additions & 1 deletion
@@ -1,4 +1,5 @@
 /.idea
 /target
 /benchmarks/data/
-testdata/tpch/data/
+testdata/tpch/data/
+testdata/tpcds/data/
