Commit a405d3f
Support nested field access in get_field with multiple path arguments (#19389)
## Summary
This PR extends `get_field` to accept multiple field name arguments for
nested struct/map access, enabling `get_field(col, 'a', 'b', 'c')` as
equivalent to `col['a']['b']['c']`.
**The primary motivation is to make it easier for downstream
optimizations to match on and optimize struct/map field access
patterns.** By representing `col['a']['b']['c']` as a single
`get_field(col, 'a', 'b', 'c')` call rather than nested
`get_field(get_field(get_field(col, 'a'), 'b'), 'c')` calls,
optimization rules can more easily identify and transform field access
patterns.
This is related to, and possibly prep work for, #19387, but I think it
is a good improvement in its own right.
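To make the motivation concrete, here is a minimal sketch of collapsing nested calls into one variadic call. The `Expr` enum and the `flatten` and `get_field` helpers are hypothetical stand-ins invented for illustration. They are not DataFusion's actual types or the code in this PR.

```rust
/// Hypothetical, simplified expression type (NOT DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// A column reference such as `col`.
    Column(String),
    /// A string literal such as `'a'`.
    Literal(String),
    /// `get_field(base, name_1, ..., name_n)`; `args[0]` is the base.
    GetField(Vec<Expr>),
}

/// Collapse `get_field(get_field(base, 'a'), 'b')` into
/// `get_field(base, 'a', 'b')`. Field-name arguments are left as-is.
fn flatten(expr: Expr) -> Expr {
    match expr {
        Expr::GetField(args) => {
            let mut iter = args.into_iter();
            // Flatten the base first so chains of any depth collapse.
            let base = match iter.next() {
                Some(b) => flatten(b),
                None => return Expr::GetField(Vec::new()),
            };
            let names: Vec<Expr> = iter.collect();
            match base {
                // The base is itself a get_field: append our names to it.
                Expr::GetField(mut inner) => {
                    inner.extend(names);
                    Expr::GetField(inner)
                }
                // Any other base: keep a single-level call.
                other => {
                    let mut out = vec![other];
                    out.extend(names);
                    Expr::GetField(out)
                }
            }
        }
        other => other,
    }
}

/// Hypothetical helper: build the two-argument `get_field(base, name)`.
fn get_field(base: Expr, name: &str) -> Expr {
    Expr::GetField(vec![base, Expr::Literal(name.to_string())])
}
```

With the flattened form, a rule that wants to recognize the path `a.b.c` on `col` can match a single node and read the path from its argument list, instead of walking three nested calls.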
## Changes
- **Variadic signature**: `get_field` now accepts 2+ arguments (base +
one or more field names)
- **Type validation at planning time**: Accessing a field on a
non-struct/map type (e.g., `get_field({a: 1}, 'a', 'b')`) fails during
planning with a clear error message indicating which argument position
caused the failure
- **Bracket syntax optimization**: The `FieldAccessPlanner` now merges
consecutive bracket accesses into a single `get_field` call (e.g.,
`s['a']['b']` → `get_field(s, 'a', 'b')`)
- **Mixed access handling**: Array index access correctly breaks the
batching (e.g., `s['a'][0]['b']` → `get_field(array_element(get_field(s,
'a'), 0), 'b')`)
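
The bracket-merging and mixed-access behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the real `FieldAccessPlanner`: the `Access` and `Expr` types and the `plan_accesses`, `flush`, and `render` functions are hypothetical names made up for this example.

```rust
/// Hypothetical access step parsed from bracket syntax (illustration only).
#[derive(Debug, Clone, PartialEq)]
enum Access {
    /// `['name']`
    Field(String),
    /// `[0]`
    Index(i64),
}

/// Hypothetical, simplified plan expression (NOT DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    /// `get_field(base, 'n1', ..., 'nk')`
    GetField(Box<Expr>, Vec<String>),
    /// `array_element(base, index)`
    ArrayElement(Box<Expr>, i64),
}

/// Batch consecutive field accesses into one `get_field`; an array index
/// flushes the pending batch and starts a new one afterwards.
fn plan_accesses(base: Expr, accesses: &[Access]) -> Expr {
    let mut current = base;
    let mut pending: Vec<String> = Vec::new();
    for access in accesses {
        match access {
            Access::Field(name) => pending.push(name.clone()),
            Access::Index(i) => {
                current = flush(current, &mut pending);
                current = Expr::ArrayElement(Box::new(current), *i);
            }
        }
    }
    flush(current, &mut pending)
}

/// Wrap `base` in a `get_field` over the pending names, if there are any.
fn flush(base: Expr, pending: &mut Vec<String>) -> Expr {
    if pending.is_empty() {
        base
    } else {
        Expr::GetField(Box::new(base), std::mem::take(pending))
    }
}

/// Render an expression in function-call notation for inspection.
fn render(expr: &Expr) -> String {
    match expr {
        Expr::Column(name) => name.clone(),
        Expr::GetField(base, names) => {
            let quoted: Vec<String> = names.iter().map(|n| format!("'{}'", n)).collect();
            format!("get_field({}, {})", render(base), quoted.join(", "))
        }
        Expr::ArrayElement(base, i) => format!("array_element({}, {})", render(base), i),
    }
}
```

In this sketch, the steps `['a']['b']` on column `s` render as `get_field(s, 'a', 'b')`, and `['a'][0]['b']` renders as `get_field(array_element(get_field(s, 'a'), 0), 'b')`, matching the two shapes listed in the bullets.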
## Example
```sql
-- Direct function call with nested access
SELECT get_field(my_struct, 'outer', 'inner', 'value');
-- Equivalent bracket syntax (now optimized to single get_field)
SELECT my_struct['outer']['inner']['value'];
-- EXPLAIN shows single get_field call
EXPLAIN SELECT s['a']['b'] FROM t;
-- Projection: get_field(t.s, Utf8("a"), Utf8("b"))
```
## Backwards Compatibility
- The original 2-argument form `get_field(struct, 'field')` continues to
work unchanged
- Existing queries using bracket syntax will automatically benefit from
the optimization
## Test plan
- [x] Backwards compatibility test for 2-argument form
- [x] Multi-level get_field with 2, 3, and 5 levels of nesting
- [x] Type validation error tests at argument positions 2, 3, 4
- [x] Non-existent field error tests
- [x] Null handling (null at base, null in middle of chain)
- [x] Mixed array/struct access (verifies array index breaks batching)
- [x] Nullable parent propagation
- [x] EXPLAIN test verifying single get_field call for bracket syntax
- [x] Minimum argument validation (0 and 1 argument cases)
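
For readers unfamiliar with what the type validation and minimum-argument tests exercise, the following is a rough sketch of resolving a field path against a type while tracking the argument position. The `DataType` enum and `resolve_path` function are hypothetical and much simpler than Arrow's types (maps are omitted). The error wording and the convention of counting the base as argument 1 are assumptions of this sketch, not the messages DataFusion emits.

```rust
/// Hypothetical, simplified type model (NOT Arrow's `DataType`).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Struct(Vec<(String, DataType)>),
}

/// Resolve the result type of `get_field(base, names...)`.
/// Errors name the 1-based argument position at fault, where the base is
/// argument 1 and the first field name is argument 2 (sketch convention).
fn resolve_path(base: &DataType, names: &[&str]) -> Result<DataType, String> {
    if names.is_empty() {
        return Err("get_field requires at least 2 arguments".to_string());
    }
    let mut current = base;
    for (i, name) in names.iter().enumerate() {
        let position = i + 2;
        match current {
            DataType::Struct(fields) => {
                match fields.iter().find(|(n, _)| n.as_str() == *name) {
                    // Step into the child type and continue with the next name.
                    Some((_, ty)) => current = ty,
                    None => {
                        return Err(format!(
                            "argument {}: no field named '{}'",
                            position, name
                        ))
                    }
                }
            }
            // The type reached so far has no fields to access.
            other => {
                return Err(format!(
                    "argument {}: cannot access field '{}' on non-struct type {:?}",
                    position, name, other
                ))
            }
        }
    }
    Ok(current.clone())
}
```

Under these assumptions, resolving the path `'a', 'b'` against a struct whose field `a` is an integer fails at argument 3, because `'b'` is the first name applied to a non-struct type.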
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.5 <[email protected]>

1 parent 47ddd50, commit a405d3f, PR #19389
File tree: 6 files changed, +665 -264 lines
- datafusion/functions-nested/src
- datafusion/functions/src/core
- datafusion/sqllogictest/test_files
- datafusion/sql/src/unparser
- docs/source/user-guide/sql