Commit 717c626

Replace sqlparser-rs with DuckDB native parser via C++ FFI
- Add C++ FFI layer using DuckDB's Parser class directly
- Remove sqlparser-rs dependency from Cargo.toml
- Add detection and rejection of non-decomposable aggregates (COUNT DISTINCT, MEDIAN, PERCENTILE) with clear error messages
- Expand aggregate function support to all DuckDB aggregates (STDDEV, MEDIAN, STRING_AGG, etc.)
- Add tests for DuckDB-specific functions (date functions, COALESCE, array functions, cast syntax)
- Update README to remove "experimental" label
- Document non-decomposable aggregate limitation
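
To illustrate why the commit message singles out COUNT DISTINCT and MEDIAN as non-decomposable, here is a minimal Python sketch. It is not code from this commit: the `orders` rows and every name in it are made up for illustration. It compares re-aggregating per-region results against aggregating all rows at once.

```python
from statistics import median

# Hypothetical order rows: (region, customer_id, amount). Illustrative data only.
orders = [
    ("east", "alice", 10),
    ("east", "bob", 20),
    ("west", "alice", 30),
    ("west", "carol", 40),
    ("west", "dave", 50),
]

def by_region(rows):
    """Group rows by their first field (region)."""
    groups = {}
    for row in rows:
        groups.setdefault(row[0], []).append(row)
    return groups

groups = by_region(orders)

# SUM is decomposable: the sum of per-region sums equals the global sum.
sum_per_region = {r: sum(a for _, _, a in rows) for r, rows in groups.items()}
global_sum = sum(a for _, _, a in orders)
sum_reaggregated = sum(sum_per_region.values())

# COUNT(DISTINCT) is not: alice buys in both regions, so adding the
# per-region distinct counts counts her twice.
distinct_per_region = {r: len({c for _, c, _ in rows}) for r, rows in groups.items()}
global_distinct = len({c for _, c, _ in orders})
distinct_reaggregated = sum(distinct_per_region.values())

# MEDIAN is not either: the median of per-region medians is not the
# median of all rows.
median_per_region = {r: median(a for _, _, a in rows) for r, rows in groups.items()}
global_median = median(a for _, _, a in orders)
median_reaggregated = median(median_per_region.values())

print(sum_reaggregated, global_sum)            # both values agree
print(distinct_reaggregated, global_distinct)  # values differ
print(median_reaggregated, global_median)      # values differ
```

Only the SUM pair matches, which is the property a measure needs before its partial results can be combined again at a coarser grain.
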
1 parent a683af1 commit 717c626

13 files changed: +3231 -487 lines changed
.claude/settings.local.json

Lines changed: 5 additions & 1 deletion
@@ -1,7 +1,11 @@
 {
   "permissions": {
     "allow": [
-      "WebFetch(domain:arxiv.org)"
+      "WebFetch(domain:arxiv.org)",
+      "WebSearch",
+      "WebFetch(domain:docs.rs)",
+      "WebFetch(domain:github.com)",
+      "WebFetch(domain:pganalyze.com)"
     ],
     "deny": [],
     "ask": []

CMakeLists.txt

Lines changed: 5 additions & 1 deletion
@@ -11,6 +11,7 @@ set(LOADABLE_EXTENSION_NAME ${TARGET_NAME}_loadable_extension)
 
 project(${TARGET_NAME})
 include_directories(src/include)
+include_directories(include)
 
 # Detect Rust target based on platform
 execute_process(
@@ -82,7 +83,10 @@ corrosion_import_crate(MANIFEST_PATH "${CMAKE_CURRENT_SOURCE_DIR}/yardstick-rs/C
 # Include Rust library headers
 include_directories(${CMAKE_CURRENT_SOURCE_DIR}/yardstick-rs/include)
 
-set(EXTENSION_SOURCES src/yardstick_extension.cpp)
+set(EXTENSION_SOURCES
+    src/yardstick_extension.cpp
+    src/yardstick_parser_ffi.cpp
+)
 
 build_static_extension(${TARGET_NAME} ${EXTENSION_SOURCES})
 build_loadable_extension(${TARGET_NAME} " " ${EXTENSION_SOURCES})

LIMITATIONS.md

Lines changed: 19 additions & 2 deletions
@@ -15,13 +15,30 @@ Implementation of Julian Hyde's "Measures in SQL" paper (arXiv:2406.00251).
 - `AT (VISIBLE)` - uses visible WHERE clause filters
 - Multiple measures in same view
 - Arithmetic with AGGREGATE results (ratios, percentages, differences)
-- SUM, COUNT, MIN, MAX, AVG aggregations
+- All DuckDB aggregate functions (SUM, COUNT, AVG, MIN, MAX, STDDEV, MEDIAN, etc.)
 - Derived measures: `revenue - cost AS MEASURE profit` expands to `SUM(revenue) - SUM(cost)`
 - Multi-fact JOINs: measures from different views can be queried together in a single JOIN
 
 ## Known Limitations
 
-### 1. No Window Function Measures
+### 1. Non-Decomposable Aggregates
+
+```sql
+-- NOT SUPPORTED with AGGREGATE(): COUNT(DISTINCT), MEDIAN, PERCENTILE, MODE
+CREATE VIEW v AS
+SELECT region, COUNT(DISTINCT customer_id) AS MEASURE unique_customers
+FROM orders;
+
+-- Direct query works fine:
+SELECT region, unique_customers FROM v;
+
+-- But AGGREGATE() fails (cannot re-aggregate):
+SEMANTIC SELECT region, AGGREGATE(unique_customers) FROM v; -- ERROR
+```
+
+Non-decomposable aggregates like COUNT(DISTINCT), MEDIAN, and PERCENTILE cannot be re-aggregated. Query these views directly without AGGREGATE().
+
+### 2. No Window Function Measures
 
 ```sql
 -- NOT SUPPORTED: Window functions in measure definitions

README.md

Lines changed: 3 additions & 2 deletions
@@ -1,6 +1,6 @@
 # Yardstick
 
-An experimental DuckDB extension implementing Julian Hyde's "Measures in SQL" paper ([arXiv:2406.00251](https://arxiv.org/abs/2406.00251)).
+A DuckDB extension implementing Julian Hyde's "Measures in SQL" paper ([arXiv:2406.00251](https://arxiv.org/abs/2406.00251)).
 
 ## What is this?
 
@@ -107,7 +107,7 @@ SELECT
 FROM table;
 ```
 
-Yardstick automatically handles the grouping. Supported aggregations: `SUM`, `COUNT`, `AVG`, `MIN`, `MAX`
+Yardstick automatically handles the grouping. All DuckDB aggregate functions are supported.
 
 ### Querying Measures
 
@@ -152,6 +152,7 @@ The extension will be at `build/release/extension/yardstick/yardstick.duckdb_ext
 See [LIMITATIONS.md](LIMITATIONS.md) for known issues and workarounds.
 
 Key limitations:
+- Non-decomposable aggregates (COUNT DISTINCT, MEDIAN, PERCENTILE) cannot use AGGREGATE()
 - Window function measures not supported
 
 ## References
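
The rejection rule described in the README and LIMITATIONS.md changes could be sketched roughly as below. This is a hypothetical Python illustration, not the extension's implementation: the commit performs detection through DuckDB's native parser over C++ FFI, whereas this sketch substitutes simple string matching. The function name, exception type, and error wording are all assumptions.

```python
import re

# Function names taken from the limitation text above; the real extension
# may recognize a different or larger set.
NON_DECOMPOSABLE = {"MEDIAN", "PERCENTILE", "MODE"}

class NonDecomposableAggregateError(ValueError):
    """Raised when AGGREGATE() is applied to a measure that cannot be re-aggregated."""

def check_reaggregatable(measure_name, definition):
    """Raise if the measure's defining expression starts with a non-decomposable aggregate.

    Hypothetical helper: inspects only the outermost function call by string
    matching, which a real parser-based check would not need to do.
    """
    text = definition.strip()
    if re.match(r"COUNT\s*\(\s*DISTINCT\b", text, re.IGNORECASE):
        func = "COUNT(DISTINCT)"
    else:
        match = re.match(r"([A-Za-z_][A-Za-z0-9_]*)\s*\(", text)
        func = match.group(1).upper() if match else None
        if func not in NON_DECOMPOSABLE:
            return None  # decomposable (or unrecognized): allow AGGREGATE()
    raise NonDecomposableAggregateError(
        f"measure '{measure_name}' uses {func}, which cannot be re-aggregated; "
        "query the view directly without AGGREGATE()"
    )

# Usage: SUM passes silently, COUNT(DISTINCT ...) is rejected with a message.
check_reaggregatable("revenue", "SUM(amount)")
try:
    check_reaggregatable("unique_customers", "COUNT(DISTINCT customer_id)")
except NonDecomposableAggregateError as err:
    print(err)
```
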
