Commit c7ffc53

improve readme
1 parent 3955637 commit c7ffc53

1 file changed

+102
-14
lines changed

README.md

Lines changed: 102 additions & 14 deletions
@@ -4,7 +4,7 @@ An experimental DuckDB extension that exposes functionality from DuckDB's native
44

55
## Overview
66

7-
`parser_tools` is a DuckDB extension designed to provide SQL parsing capabilities within the database. It allows you to analyze SQL queries and extract structural information directly in SQL. Currently, it includes a single table function: `parse_tables`, which extracts table references from a given SQL query. Future versions may expose additional aspects of the parsed query structure.
7+
`parser_tools` is a DuckDB extension designed to provide SQL parsing capabilities within the database. It allows you to analyze SQL queries and extract structural information directly in SQL. It provides one table function and two scalar functions for extracting the tables referenced by a query: `parse_tables` (available as both a table function and a scalar function) and `parse_table_names` (see [Functions](#functions) below). Future versions may expose additional aspects of the parsed query structure.
88

99
## Features
1010

@@ -14,6 +14,11 @@ An experimental DuckDB extension that exposes functionality from DuckDB's native
1414
- Built on DuckDB's native SQL parser
1515
- Simple SQL interface — no external tooling required
1616

17+
18+
## Known Limitations
19+
- Only `SELECT` statements are supported
20+
- Only returns table references (the full parse tree is not exposed)
21+
1722
## Installation
1823

1924
```sql
@@ -64,23 +69,106 @@ This tells us a few things:
6469
* The `Users` table was referenced in a from clause.
6570
* `EarlyAdopters` was referenced in a from clause (but it's a CTE, not a table).
6671

67-
## Function Reference
72+
## Context
73+
The `context` value describes where in the query the table was used:
74+
- `from`: table in the main `FROM` clause
75+
- `join_left`: left side of a `JOIN`
76+
- `join_right`: right side of a `JOIN`
77+
- `cte`: a Common Table Expression being defined
78+
- `from_cte`: usage of a CTE as if it were a table
79+
- `subquery`: table reference inside a subquery
80+
81+
## Functions
82+
83+
This extension provides one table function and two scalar functions for parsing SQL and extracting referenced tables.
84+
85+
### `parse_tables(sql_query)` – Table Function
86+
87+
Parses a SQL `SELECT` query and returns all referenced tables along with their context of use (e.g. `from`, `join_left`, `cte`, etc.).
88+
89+
#### Usage
90+
```sql
91+
SELECT * FROM parse_tables('SELECT * FROM my_table JOIN other_table USING (id)');
92+
```
93+
94+
#### Returns
95+
A table with:
96+
- `schema`: schema name (default `"main"` if unspecified)
97+
- `table`: table name
98+
- `context`: where the table appears in the query
99+
One of: `from`, `join_left`, `join_right`, `from_cte`, `cte`, `subquery`
100+
101+
#### Example
102+
```sql
103+
SELECT * FROM parse_tables($$
104+
WITH cte1 AS (SELECT * FROM x)
105+
SELECT * FROM cte1 JOIN y ON cte1.id = y.id
106+
$$);
107+
```
108+
109+
| schema | table | context |
110+
|--------|--------|------------|
111+
| | cte1 | cte |
112+
| main | x | from |
113+
| main | y | join_right |
114+
| | cte1 | from_cte |
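
Because the result is an ordinary table, it can be filtered like any other. The following is a minimal sketch, assuming the extension is loaded, that keeps only references to real tables by dropping the two CTE-related contexts. Given the output shown above, it should leave the rows for `x` and `y`.

```sql
-- Drop CTE definitions ('cte') and CTE usages ('from_cte'),
-- leaving only references to actual tables.
-- "schema" and "table" are double-quoted defensively, since TABLE is a reserved word.
SELECT "schema", "table", context
FROM parse_tables($$
    WITH cte1 AS (SELECT * FROM x)
    SELECT * FROM cte1 JOIN y ON cte1.id = y.id
$$)
WHERE context NOT IN ('cte', 'from_cte');
```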
115+
116+
---
117+
118+
### `parse_table_names(sql_query [, exclude_cte=true])` – Scalar Function
119+
120+
Returns a list of table names (strings) referenced in the SQL query. By default, CTE-related references are excluded; pass `false` as the second argument to include them.
68121

69-
### `parse_tables(query TEXT) → TABLE(schema TEXT, table TEXT, context TEXT)`
122+
#### Usage
123+
```sql
124+
SELECT parse_table_names('SELECT * FROM my_table');
125+
----
126+
[my_table]
127+
```
128+
129+
#### Optional Parameter
130+
```sql
131+
SELECT parse_table_names('with cte_test as(select 1) select * from MyTable, cte_test', false); -- include CTEs
132+
----
133+
[cte_test, MyTable, cte_test]
134+
```
135+
136+
#### Returns
137+
A list of strings, each being a table name.
138+
139+
#### Example
140+
```sql
141+
SELECT parse_table_names('SELECT * FROM a JOIN b USING (id)');
142+
----
143+
[a, b]
144+
```
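
To see the effect of `exclude_cte`, a sketch like the following calls the function both ways on the same query. It assumes the extension is loaded and that the scalar function accepts a column value as its argument; the subquery, the column `q`, and the output aliases are illustrative names only.

```sql
-- Compare the default behaviour (exclude_cte = true) with exclude_cte = false
-- on the same input query.
SELECT
    parse_table_names(q)        AS without_ctes,
    parse_table_names(q, false) AS with_ctes
FROM (
    SELECT 'with cte_test as (select 1) select * from MyTable, cte_test' AS q
) AS t;
```

With `false`, the list can contain the same name more than once (the CTE definition and its later use), so wrapping the call in DuckDB's built-in `list_distinct` may be useful when only unique names matter.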
70145

71-
Parses the given SQL query and returns a list of all referenced tables along with:
146+
---
72147

73-
- `schema`: The schema name (e.g., `main`)
74-
- `table`: The table name
75-
- `context`: Where in the query the table is used. Possible values include:
76-
* from: The table appears in the FROM clause
77-
* joinleft: The table is on the left side of a JOIN
78-
* joinright: The table is on the right side of a JOIN
79-
* fromcte: The table appears in the FROM clause, but is a reference to a Common Table Expression (CTE)
80-
* `with US_Sales()
81-
* cte: The table is defined as a CTE
82-
* subquery: The table is used inside a subquery
148+
### `parse_tables(sql_query)` – Scalar Function (Structured)
83149

150+
Similar to the table function, but returns a **list of structs** instead of a result table. Each struct contains:
151+
152+
- `schema` (VARCHAR)
153+
- `table` (VARCHAR)
154+
- `context` (VARCHAR)
155+
156+
#### Usage
157+
```sql
158+
SELECT parse_tables('select * from MyTable');
159+
----
160+
[{'schema': main, 'table': MyTable, 'context': from}]
161+
```
162+
163+
#### Returns
164+
A list of STRUCTs with schema, table name, and context.
165+
166+
#### Example
167+
```sql
168+
SELECT parse_tables('select * from MyTable t inner join Other o on o.id = t.id');
169+
----
170+
[{'schema': main, 'table': MyTable, 'context': from}, {'schema': main, 'table': Other, 'context': join_right}]
171+
```
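
The list of structs can be turned back into rows when needed. The sketch below relies on DuckDB's built-in `unnest` with `recursive := true`, which is not part of this extension, to expand each struct into `schema`, `table`, and `context` columns. The alias `refs` is illustrative, and the extension is assumed to be loaded.

```sql
-- Expand the list of structs into one row per table reference,
-- then keep only the tables that appear on the right side of a JOIN.
SELECT *
FROM (
    SELECT unnest(
        parse_tables('select * from MyTable t inner join Other o on o.id = t.id'),
        recursive := true
    )
) AS refs
WHERE context = 'join_right';
```

This form is mainly useful when the query text comes from a column, where the scalar function can be applied per row and the table function may be less convenient.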
84172

85173
## Development
86174
