Update README with comprehensive column parsing documentation

- Add column parsing to main features list with key capabilities:
  alias chain tracking, nested struct field access, input/output distinction
- Document new column context types (select, where, function_arg, etc.)
- Add comprehensive parse_columns() function documentation with:
  * Complete parameter and return value descriptions
  * Basic column reference examples
  * Alias chain parsing example showing dependency tracking
  * Nested struct field access example
  * Multi-table JOIN examples
- Update overview and limitations to include column parsing
- Add column_parser_examples.sql for demonstration

Column parsing provides complete SQL dependency analysis alongside
existing table and function parsing capabilities.

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
README.md
+78 -2 (78 additions & 2 deletions)
@@ -10,15 +10,19 @@ An experimental DuckDB extension that exposes functionality from DuckDB's native

 - **Extract table references** from a SQL query with context information (e.g. `FROM`, `JOIN`, etc.)
 - **Extract function calls** from a SQL query with context information (e.g. `SELECT`, `WHERE`, `HAVING`, etc.)
+- **Extract column references** from a SQL query with comprehensive dependency tracking
 - **Parse WHERE clauses** to extract conditions and operators
 - Support for **window functions**, **nested functions**, and **CTEs**
+- **Alias chain tracking** for complex column dependencies
+- **Nested struct field access** parsing (e.g., `table.column.field.subfield`)
+- **Input vs output column distinction** for complete dependency analysis
 - Includes **schema**, **name**, and **context** information for all extractions
 - Built on DuckDB's native SQL parser
 - Simple SQL interface — no external tooling required


 ## Known Limitations
-- Only `SELECT` statements are supported for table and function parsing
+- Only `SELECT` statements are supported for table, function, and column parsing
 - WHERE clause parsing supports additional statement types
 - Full parse tree is not exposed (only specific structural elements)

@@ -92,9 +96,17 @@ Context helps identify where elements are used in the query.
 - `group_by`: function in a `GROUP BY` clause
 - `nested`: function call nested within another function

+### Column Context
+- `select`: column in a `SELECT` clause
+- `where`: column in a `WHERE` clause
+- `having`: column in a `HAVING` clause
+- `order_by`: column in an `ORDER BY` clause
+- `group_by`: column in a `GROUP BY` clause
+- `function_arg`: column used as a function argument
+
 ## Functions

-This extension provides parsing functions for tables, functions, and WHERE clauses. Each category includes both table functions (for detailed results) and scalar functions (for programmatic use).
+This extension provides parsing functions for tables, functions, columns, and WHERE clauses. Each category includes both table functions (for detailed results) and scalar functions (for programmatic use).

 In general, errors (e.g. Parse Exception) will not be exposed to the user, but instead will result in an empty result. This simplifies batch processing. When validity is needed, [is_parsable](#is_parsablesql_query--scalar-function) can be used.

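The error-handling note in the hunk above could be checked along the following lines. This is only a sketch and not part of the commit: it assumes the extension is loaded, that `is_parsable` takes the query text as its single argument (as its heading anchor suggests), and that `parse_columns` follows the general empty-result rule. The misspelled query and the result aliases are made up for illustration.

```sql
-- Validity check: the second input contains a deliberate typo (SELEC).
SELECT is_parsable('SELECT name FROM users') AS valid_query,
       is_parsable('SELEC name FROM users')  AS broken_query;

-- Per the README text, a parse failure should produce no rows instead of an error
-- (assumed here to hold for parse_columns as well).
SELECT count(*) AS column_refs
FROM parse_columns('SELEC name FROM users');
```
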
@@ -190,6 +202,70 @@ SELECT list_filter(parse_functions('SELECT upper(name) FROM users WHERE lower(em

 ---

+### Column Parsing Functions
+
+These functions extract column references from SQL queries, providing comprehensive dependency tracking including alias chains, nested struct field access, and input/output column distinction.
+
+#### `parse_columns(sql_query)` – Table Function
+
+Parses a SQL `SELECT` query and returns all column references along with their context, schema qualification, and dependency information.
+
+##### Usage
+```sql
+SELECT * FROM parse_columns('SELECT u.name, o.total FROM users u JOIN orders o ON u.id = o.user_id;');
+```
+
+##### Returns
+A table with:
+- `expression_identifiers`: JSON array of identifier paths (e.g., `[["u","name"]]` or `[["schema","table","column","field"]]`)
+- `table_schema`: schema name for table columns (NULL for aliases/expressions)
+- `table_name`: table name for table columns (NULL for aliases/expressions)
+- `column_name`: column name for simple references (NULL for complex expressions)
+- `context`: where the column appears in the query (select, where, function_arg, etc.)
+- `expression`: full expression text as it appears in the SQL
+- `selected_name`: output column name for SELECT items (NULL for input columns)
+
+##### Basic Example
+```sql
+SELECT * FROM parse_columns('SELECT name, age FROM users;');
+SELECT * FROM parse_columns('SELECT u.name, o.total, u.age + o.total AS score FROM users u JOIN orders o ON u.id = o.user_id WHERE u.status = "active";');
+```
+
+Shows columns from multiple tables with different contexts (select, function_arg, join conditions).
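
As a rough illustration of how the `parse_columns` output documented in this diff might be queried (not part of the commit itself): assuming the extension is loaded and the result columns behave as the Returns list describes, `selected_name` can be used to separate output columns from input columns. The query text and the alias `n` are illustrative only.

```sql
-- Output columns: rows where selected_name is populated
-- (the Returns list states it is NULL for input columns).
SELECT expression, selected_name
FROM parse_columns('SELECT u.name, u.age + o.total AS score FROM users u JOIN orders o ON u.id = o.user_id')
WHERE selected_name IS NOT NULL;

-- Input columns, counted per context value (select, function_arg, ...).
SELECT context, count(*) AS n
FROM parse_columns('SELECT u.name, u.age + o.total AS score FROM users u JOIN orders o ON u.id = o.user_id')
WHERE selected_name IS NULL
GROUP BY context
ORDER BY context;
```

Filtering on `selected_name` rather than on `context` keeps the sketch tied to the one distinction the documentation states explicitly; which `context` values actually appear for a given query is left for the extension to report.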