Commit c15a216

committed
update readme
1 parent 9c6def5 commit c15a216

1 file changed

+111
-53
lines changed

README.md

Lines changed: 111 additions & 53 deletions
@@ -1,86 +1,144 @@
11
# Parser Tools
22

3-
This repository is based on https://github.com/duckdb/extension-template, check it out if you want to build and ship your own DuckDB extension.
3+
An experimental DuckDB extension that exposes functionality from DuckDB's native SQL parser.
44

5-
---
5+
## Overview
66

7-
This extension, ParseTables, allow you to ... <extension_goal>.
7+
`parser_tools` is a DuckDB extension designed to provide SQL parsing capabilities within the database. It allows you to analyze SQL queries and extract structural information directly in SQL. Currently, it includes a single table function: `parse_tables`, which extracts table references from a given SQL query. Future versions may expose additional aspects of the parsed query structure.
88

9+
## Features
910

10-
## Building
11-
### Managing dependencies
12-
DuckDB extensions uses VCPKG for dependency management. Enabling VCPKG is very simple: follow the [installation instructions](https://vcpkg.io/en/getting-started) or just run the following:
13-
```shell
14-
git clone https://github.com/Microsoft/vcpkg.git
15-
./vcpkg/bootstrap-vcpkg.sh
16-
export VCPKG_TOOLCHAIN_PATH=`pwd`/vcpkg/scripts/buildsystems/vcpkg.cmake
11+
- Extract table references from a SQL query
12+
- See the **context** in which each table is used (e.g. `FROM`, `JOIN`, etc.)
13+
- Includes **schema**, **table**, and **context** information
14+
- Built on DuckDB's native SQL parser
15+
- Simple SQL interface — no external tooling required
16+
17+
## Installation
18+
19+
```sql
20+
INSTALL 'parser_tools';
21+
LOAD 'parser_tools';
22+
```
23+
24+
## Usage
25+
26+
### Parse table references from a query
27+
#### Simple example
28+
29+
```sql
30+
SELECT * FROM parse_tables('SELECT * FROM MyTable');
31+
```
32+
33+
##### Output
34+
35+
```
36+
┌─────────┬─────────┬─────────┐
37+
│ schema  │  table  │ context │
38+
│ varchar │ varchar │ varchar │
39+
├─────────┼─────────┼─────────┤
40+
│ main    │ MyTable │ from    │
41+
└─────────┴─────────┴─────────┘
42+
```
43+
44+
This tells you that `MyTable` in the `main` schema was used in the `FROM` clause of the query.
45+
46+
#### CTE Example
47+
```sql
48+
select * from parse_tables('with EarlyAdopters as (select * from Users where id < 10) select * from EarlyAdopters;');
49+
```
50+
51+
##### Output
52+
```
53+
┌─────────┬───────────────┬──────────┐
54+
│ schema  │     table     │ context  │
55+
│ varchar │    varchar    │ varchar  │
56+
├─────────┼───────────────┼──────────┤
57+
│         │ EarlyAdopters │ cte      │
58+
│ main    │ Users         │ from     │
59+
│ main    │ EarlyAdopters │ from_cte │
60+
└─────────┴───────────────┴──────────┘
1761
```
18-
Note: VCPKG is only required for extensions that want to rely on it for dependency management. If you want to develop an extension without dependencies, or want to do your own dependency management, just skip this step. Note that the example extension uses VCPKG to build with a dependency for instructive purposes, so when skipping this step the build may not work without removing the dependency.
62+
This tells us a few things:
63+
* `EarlyAdopters` was defined as a CTE.
64+
* The `Users` table was referenced in a `FROM` clause.
65+
* `EarlyAdopters` was referenced in a `FROM` clause (but it is a CTE, not a table).
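
The distinction above matters when only physical tables are of interest. As a minimal sketch (not part of the extension), the rows from the CTE example could be post-processed in client code as follows. The function name `physical_tables` and the set `CTE_CONTEXTS` are hypothetical, and the rows are copied by hand from the output above rather than fetched from DuckDB.

```python
# Rows copied from the CTE example output above: (schema, table, context).
# The blank schema of the CTE row is written as an empty string here;
# a real client may return NULL/None instead (assumption).
rows = [
    ("", "EarlyAdopters", "cte"),
    ("main", "Users", "from"),
    ("main", "EarlyAdopters", "from_cte"),
]

# Contexts that mark a CTE definition or a reference to one, rather than
# a physical table (assumption based on the example output above).
CTE_CONTEXTS = {"cte", "from_cte"}


def physical_tables(rows):
    """Return sorted, de-duplicated (schema, table) pairs that are not CTEs."""
    return sorted(
        {(schema, table) for schema, table, context in rows
         if context not in CTE_CONTEXTS}
    )


print(physical_tables(rows))  # [('main', 'Users')]
```

The same filter could equally be written in SQL with a `WHERE` clause on the `context` column; the Python form is used here only to keep the sketch self-contained.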
66+
67+
## Function Reference
68+
69+
### `parse_tables(query TEXT) → TABLE(schema TEXT, table TEXT, context TEXT)`
70+
71+
Parses the given SQL query and returns a list of all referenced tables along with:
72+
73+
- `schema`: The schema name (e.g., `main`)
74+
- `table`: The table name
75+
- `context`: Where in the query the table is used. Possible values include:
76+
* from: The table appears in the FROM clause
77+
* joinleft: The table is on the left side of a JOIN
78+
* joinright: The table is on the right side of a JOIN
79+
* from_cte: The table appears in the FROM clause, but is a reference to a Common Table Expression (CTE)
80+
* `with US_Sales()
81+
* cte: The table is defined as a CTE
82+
* subquery: The table is used inside a subquery
83+
84+
85+
## Development
1986

2087
### Build steps
21-
Now to build the extension, run:
88+
To build the extension, run:
2289
```sh
23-
make
90+
GEN=ninja make
2491
```
2592
The main binaries that will be built are:
2693
```sh
2794
./build/release/duckdb
2895
./build/release/test/unittest
29-
./build/release/extension/parser/parser.duckdb_extension
96+
./build/release/extension/parser_tools/parser_tools.duckdb_extension
3097
```
3198
- `duckdb` is the binary for the duckdb shell with the extension code automatically loaded.
3299
- `unittest` is the test runner of duckdb. Again, the extension is already linked into the binary.
33-
- `parser.duckdb_extension` is the loadable binary as it would be distributed.
100+
- `parser_tools.duckdb_extension` is the loadable binary as it would be distributed.
34101

35102
## Running the extension
36-
To run the extension code, simply start the shell with `./build/release/duckdb`.
103+
To run the extension code, start the shell with `./build/release/duckdb`, which has the `parser_tools` extension built in.
37104

38-
Now we can use the features from the extension directly in DuckDB. The template contains a single scalar function `parse_tables()` that takes a string arguments and returns a string:
105+
Now we can use the features from the extension directly in DuckDB:
39106
```
40-
D select parse_tables('Jane') as result;
41-
┌───────────────┐
42-
result
43-
varchar
44-
├───────────────┤
45-
ParseTables Jane 🐥
46-
└───────────────┘
107+
D select * from parse_tables('select * from MyTable');
108+
┌─────────┬─────────┬─────────┐
109+
│ schema  │  table  │ context │
110+
│ varchar │ varchar │ varchar │
111+
├─────────┼─────────┼─────────┤
112+
│ main    │ MyTable │ from    │
113+
└─────────┴─────────┴─────────┘
47114
```
48115

49-
## Running the tests
50-
Different tests can be created for DuckDB extensions. The primary way of testing DuckDB extensions should be the SQL tests in `./test/sql`. These SQL tests can be run using:
51-
```sh
52-
make test
116+
## Running the extension from a duckdb distribution
117+
To run the extension dev build from an existing distribution of DuckDB (e.g. the CLI):
53118
```
119+
$ duckdb -unsigned
54120
55-
### Installing the deployed binaries
56-
To install your extension binaries from S3, you will need to do two things. Firstly, DuckDB should be launched with the
57-
`allow_unsigned_extensions` option set to true. How to set this will depend on the client you're using. Some examples:
121+
D install parser_tools from './build/release/repository/v1.2.1/osx_amd64/parser_tools.duckdb_extension';
122+
D load parser_tools;
58123
59-
CLI:
60-
```shell
61-
duckdb -unsigned
124+
D select * from parse_tables('select * from MyTable');
125+
┌─────────┬─────────┬─────────┐
126+
│ schema  │  table  │ context │
127+
│ varchar │ varchar │ varchar │
128+
├─────────┼─────────┼─────────┤
129+
│ main    │ MyTable │ from    │
130+
└─────────┴─────────┴─────────┘
62131
```
63132

64-
Python:
65-
```python
66-
con = duckdb.connect(':memory:', config={'allow_unsigned_extensions' : 'true'})
67-
```
68-
69-
NodeJS:
70-
```js
71-
db = new duckdb.Database(':memory:', {"allow_unsigned_extensions": "true"});
72-
```
133+
## Running the tests
134+
See [Writing Tests](https://duckdb.org/docs/stable/dev/sqllogictest/writing_tests.html) to learn more about DuckDB's testing philosophy. Following that approach, this extension's tests are written in SQL and live in [test/sql](test/sql/).
73135

74-
Secondly, you will need to set the repository endpoint in DuckDB to the HTTP url of your bucket + version of the extension
75-
you want to install. To do this run the following SQL query in DuckDB:
76-
```sql
77-
SET custom_extension_repository='bucket.s3.eu-west-1.amazonaws.com/<your_extension_name>/latest';
136+
The tests can be run with:
137+
```sh
138+
make test
78139
```
79-
Note that the `/latest` path will allow you to install the latest extension version available for your current version of
80-
DuckDB. To specify a specific version, you can pass the version instead.
81140

82-
After running these steps, you can install and load your extension using the regular INSTALL/LOAD commands in DuckDB:
83-
```sql
84-
INSTALL parse_tables
85-
LOAD parse_tables
141+
and easily rerun as changes are made with:
142+
```sh
143+
GEN=ninja make && make test
86144
```
