 PostgreSQL Query Parser library provides strongly-typed AST (Abstract Syntax Tree) parsing for PostgreSQL SQL queries using the [libpg_query](https://github.com/pganalyze/libpg_query) library through a PHP extension.
 
-This library wraps the low-level extension functions and provides:
-- Strongly-typed AST nodes generated from protobuf definitions
-- A `Parser` class for object-oriented access
-- DSL helper functions for convenient usage
-
 ## Requirements
 
 This library requires the `pg_query` PHP extension. See [pg-query-ext documentation](/documentation/components/extensions/pg-query-ext.md) for installation instructions.
@@ -19,178 +14,189 @@ This library requires the `pg_query` PHP extension. See [pg-query-ext documentat
 composer require flow-php/pg-query:~--FLOW_PHP_VERSION--
 ```
 
-## Usage
-
-### Using the Parser Class
+## Quick Start
 
 ```php
 <?php
 
-use Flow\PgQuery\Parser;
+use function Flow\PgQuery\DSL\pg_parse;
 
-$parser = new Parser();
+$query = pg_parse('SELECT u.id, u.name FROM users u JOIN orders o ON u.id = o.user_id');
 
-// Parse SQL into AST
-$result = $parser->parse('SELECT id, name FROM users WHERE active = true');
+// Get all tables
+foreach ($query->tables() as $table) {
+    echo $table->name();  // 'users', 'orders'
+    echo $table->alias(); // 'u', 'o'
+}
 
-// Access the AST
-foreach ($result->getStmts() as $stmt) {
-    $node = $stmt->getStmt();
-    $selectStmt = $node->getSelectStmt();
-    // Work with strongly-typed AST nodes...
+// Get all columns
+foreach ($query->columns() as $column) {
+    echo $column->name();  // 'id', 'name', 'id', 'user_id'
+    echo $column->table(); // 'u', 'u', 'u', 'o'
+}
+
+// Get columns for specific table
+$userColumns = $query->columns('u');
+
+// Get all function calls
+foreach ($query->functions() as $func) {
+    echo $func->name();   // function name
+    echo $func->schema(); // schema if qualified (e.g., 'pg_catalog')
 }
 ```
 
-### Using DSL Functions
+## Parser Class
 
 ```php
 <?php
 
-use function Flow\PgQuery\DSL\pg_parse;
-use function Flow\PgQuery\DSL\pg_parser;
-use function Flow\PgQuery\DSL\pg_fingerprint;
-use function Flow\PgQuery\DSL\pg_normalize;
-use function Flow\PgQuery\DSL\pg_split;
+use Flow\PgQuery\Parser;
 
-// Parse SQL
-$result = pg_parse('SELECT * FROM users');
+$parser = new Parser();
 
-// Get a reusable parser instance
-$parser = pg_parser();
+// Parse SQL into ParsedQuery
+$query = $parser->parse('SELECT * FROM users WHERE id = 1');
 
-// Generate fingerprint
-$fingerprint = pg_fingerprint('SELECT id FROM users WHERE id = 1');
+// Generate fingerprint (same for structurally equivalent queries)
+$fingerprint = $parser->fingerprint('SELECT * FROM users WHERE id = 1');
 
-// Normalize query
-$normalized = pg_normalize('SELECT * FROM users WHERE id = 1');
+// Normalize query (replace literals with positional parameters)
+$normalized = $parser->normalize("SELECT * FROM users WHERE name = 'John'");
+// Returns: SELECT * FROM users WHERE name = $1
+
+// Normalize also handles Doctrine-style named parameters
+$normalized = $parser->normalize('SELECT * FROM users WHERE id = :id');
+// Returns: SELECT * FROM users WHERE id = $1
 
 // Split multiple statements
+$statements = $parser->split('SELECT 1; SELECT 2;');
+// Returns: ['SELECT 1', ' SELECT 2']
+```
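
Note on the split example above: the documented return value keeps the leading space of the second statement (`' SELECT 2'`). A minimal sketch of tidying such a result, assuming trimmed, non-empty statements are what you want. The array literal is copied from the documented return value, so the block runs without the extension; nothing here is part of the library itself.

```php
<?php

declare(strict_types=1);

// Copied from the documented return value of split('SELECT 1; SELECT 2;').
$statements = ['SELECT 1', ' SELECT 2'];

// Trim surrounding whitespace, then drop anything that ends up empty.
$clean = array_values(array_filter(
    array_map('trim', $statements),
    static fn (string $statement): bool => $statement !== '',
));

var_export($clean);
```
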
+
+## DSL Functions
+
+```php
+<?php
+
+use function Flow\PgQuery\DSL\{pg_parse, pg_parser, pg_fingerprint, pg_normalize, pg_split};
+
+$query = pg_parse('SELECT * FROM users');
+$parser = pg_parser();
+$fingerprint = pg_fingerprint('SELECT * FROM users WHERE id = 1');
+$normalized = pg_normalize('SELECT * FROM users WHERE id = 1');
 $statements = pg_split('SELECT 1; SELECT 2;');
 ```
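
Because fingerprints are described as identical for structurally equivalent queries, one plausible use is grouping a query log by fingerprint. The following is a sketch under assumptions: `group_by_fingerprint` is a hypothetical helper, not part of the library, and `$standIn` is a crude regex-based substitute used only so the block runs without the extension. With the library installed you would presumably pass `pg_fingerprint(...)` instead, assuming it still returns `?string` as the removed API table stated.

```php
<?php

declare(strict_types=1);

/**
 * Groups SQL strings by the key returned from $fingerprint.
 * Hypothetical helper for illustration, not part of flow-php/pg-query.
 *
 * @param iterable<string> $queries
 * @param callable(string): ?string $fingerprint
 * @return array<string, list<string>>
 */
function group_by_fingerprint(iterable $queries, callable $fingerprint): array
{
    $groups = [];

    foreach ($queries as $sql) {
        $key = $fingerprint($sql);

        if ($key === null) {
            continue; // no fingerprint available, skip this query
        }

        $groups[$key][] = $sql;
    }

    return $groups;
}

// Stand-in fingerprinter (assumption): replaces integer literals with "?"
// and hashes the result. It only mimics the idea of a structural fingerprint.
$standIn = static fn (string $sql): ?string
    => md5((string) preg_replace('/\b\d+\b/', '?', $sql));

$groups = group_by_fingerprint([
    'SELECT * FROM users WHERE id = 1',
    'SELECT * FROM users WHERE id = 999',
    'SELECT * FROM orders WHERE id = 1',
], $standIn);

echo count($groups), PHP_EOL; // 2
```

Taking the fingerprinter as a callable keeps the grouping logic independent of the extension, which also makes it easy to exercise in tests.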

-## Features
+## ParsedQuery Methods
+
+| Method | Description | Returns |
+|--------|-------------|---------|
+| `tables()` | Get all tables referenced in the query | `array<Table>` |
+| `columns(?string $tableName)` | Get columns, optionally filtered by table/alias | `array<Column>` |
+| `functions()` | Get all function calls | `array<FunctionCall>` |
+| `traverse(NodeVisitor ...$visitors)` | Traverse AST with custom visitors | `void` |
+| `raw()` | Access underlying protobuf ParseResult | `ParseResult` |
 
-### Query Parsing
+## Custom AST Traversal
 
-Parse PostgreSQL SQL into a strongly-typed AST:
+For advanced use cases, you can traverse the AST with custom visitors:
 
 ```php
 <?php
 
-use Flow\PgQuery\Parser;
+use Flow\PgQuery\AST\NodeVisitor;
+use Flow\PgQuery\Protobuf\AST\ColumnRef;
 
-$parser = new Parser();
-$result = $parser->parse('SELECT id, name FROM users WHERE active = true ORDER BY name');
+use function Flow\PgQuery\DSL\pg_parse;
 
-foreach ($result->getStmts() as $stmt) {
-    $selectStmt = $stmt->getStmt()->getSelectStmt();
+class ColumnCounter implements NodeVisitor
+{
+    public int $count = 0;
 
-    // Access FROM clause
-    foreach ($selectStmt->getFromClause() as $fromItem) {
-        $rangeVar = $fromItem->getRangeVar();
-        echo "Table: " . $rangeVar->getRelname() . "\n";
+    public static function nodeClass(): string
+    {
+        return ColumnRef::class;
+    }
+
+    public function enter(object $node): ?int
+    {
+        $this->count++;
+        return null;
     }
 
-    // Access target list (SELECT columns)
-    foreach ($selectStmt->getTargetList() as $target) {
-        $columnRef = $target->getResTarget()->getVal()->getColumnRef();
-        // Process column references...
+    public function leave(object $node): ?int
+    {
+        return null;
     }
 }
-```
 
-### Query Fingerprinting
+$query = pg_parse('SELECT id, name, email FROM users');
 
-Generate unique fingerprints for structurally equivalent queries. This is useful for grouping similar queries regardless of their literal values:
+$counter = new ColumnCounter();
+$query->traverse($counter);
 
-```php
-<?php
+echo $counter->count; // 3
+```
 
-use Flow\PgQuery\Parser;
+### NodeVisitor Interface
 
-$parser = new Parser();
+```php
+interface NodeVisitor
+{
+    public const DONT_TRAVERSE_CHILDREN = 1;
+    public const STOP_TRAVERSAL = 2;
 
-// These queries produce the same fingerprint
-$fp1 = $parser->fingerprint('SELECT * FROM users WHERE id = 1');
-$fp2 = $parser->fingerprint('SELECT * FROM users WHERE id = 999');
+    /** @return class-string */
+    public static function nodeClass(): string;
 
-var_dump($fp1 === $fp2); // true
+    public function enter(object $node): ?int;
+    public function leave(object $node): ?int;
+}
 ```
 
-### Query Normalization
-
-Replace literal values with parameter placeholders:
-
-```php
-<?php
-
-use Flow\PgQuery\Parser;
+Visitors declare which node type they handle via `nodeClass()`. Return values:
+- `null` - continue traversal
+- `DONT_TRAVERSE_CHILDREN` - skip children (from `enter()` only)
+- `STOP_TRAVERSAL` - stop entire traversal
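
To make the three return values above concrete, here is a self-contained toy walker that honors them. Everything in it is a hypothetical stand-in: `ToyNode`, `toy_traverse`, and the top-level constants are not the library's code, they only reuse the documented values 1 and 2. Whether the real traverser still calls `leave()` on a node whose children were skipped is an assumption made here, not something the diff confirms.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-ins mirroring the documented constant values.
const DONT_TRAVERSE_CHILDREN = 1;
const STOP_TRAVERSAL = 2;

final class ToyNode
{
    /** @param list<ToyNode> $children */
    public function __construct(public string $label, public array $children = [])
    {
    }
}

/**
 * Depth-first walk: $enter runs before a node's children, $leave after.
 * Returns true when the whole walk should stop.
 */
function toy_traverse(ToyNode $node, callable $enter, callable $leave): bool
{
    $signal = $enter($node);

    if ($signal === STOP_TRAVERSAL) {
        return true;
    }

    if ($signal !== DONT_TRAVERSE_CHILDREN) {
        foreach ($node->children as $child) {
            if (toy_traverse($child, $enter, $leave)) {
                return true;
            }
        }
    }

    // Assumption: leave() is still called when children were skipped.
    return $leave($node) === STOP_TRAVERSAL;
}

$tree = new ToyNode('select', [
    new ToyNode('subquery', [new ToyNode('inner_column')]),
    new ToyNode('column_a'),
    new ToyNode('column_b'),
]);

$visited = [];

toy_traverse(
    $tree,
    function (ToyNode $node) use (&$visited): ?int {
        $visited[] = $node->label;

        return match ($node->label) {
            'subquery' => DONT_TRAVERSE_CHILDREN, // skips 'inner_column'
            'column_a' => STOP_TRAVERSAL,         // 'column_b' is never reached
            default => null,
        };
    },
    static fn (ToyNode $node): ?int => null,
);

echo implode(', ', $visited), PHP_EOL; // select, subquery, column_a
```
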

-$parser = new Parser();
+### Built-in Visitors
 
-$normalized = $parser->normalize("SELECT * FROM users WHERE name = 'John' AND age = 25");
-// Returns: SELECT * FROM users WHERE name = $1 AND age = $2
-```
+- `ColumnRefCollector` - collects all `ColumnRef` nodes
+- `FuncCallCollector` - collects all `FuncCall` nodes
+- `RangeVarCollector` - collects all `RangeVar` nodes
 
-### Statement Splitting
+## Raw AST Access
 
-Split a string containing multiple SQL statements:
+For full control, access the protobuf AST directly:
 
 ```php
 <?php
 
-use Flow\PgQuery\Parser;
+use function Flow\PgQuery\DSL\pg_parse;
 
-$parser = new Parser();
+$query = pg_parse('SELECT id FROM users WHERE active = true');
 
-$statements = $parser->split('SELECT 1; SELECT 2; SELECT 3');
-// Returns: ['SELECT 1', ' SELECT 2', ' SELECT 3']
-```
+foreach ($query->raw()->getStmts() as $stmt) {
+    $select = $stmt->getStmt()->getSelectStmt();
 
-## API Reference
-
-### Parser Class
+    // Access FROM clause
+    foreach ($select->getFromClause() as $from) {
+        echo $from->getRangeVar()->getRelname();
+    }
 
-| Method | Description | Returns |
-|--------|-------------|---------|
-| `parse(string $sql)` | Parse SQL into AST | `ParseResult` |
-| `fingerprint(string $sql)` | Generate query fingerprint | `?string` |
-| `normalize(string $sql)` | Normalize query with placeholders | `?string` |
-| `split(string $sql)` | Split multiple statements | `array<string>` |
-
-### DSL Functions
-
-| Function | Description | Returns |
-|----------|-------------|---------|
-| `pg_parser()` | Create a new Parser instance | `Parser` |
-| `pg_parse(string $sql)` | Parse SQL into AST | `ParseResult` |
-| `pg_fingerprint(string $sql)` | Generate query fingerprint | `?string` |
-| `pg_normalize(string $sql)` | Normalize query | `?string` |
-| `pg_split(string $sql)` | Split statements | `array<string>` |
-
-## AST Node Types
-
-The library includes 343 strongly-typed AST node classes generated from PostgreSQL's protobuf definitions. All classes are in the `Flow\PgQuery\Protobuf\AST` namespace.
-
-Common node types include:
-- `SelectStmt` - SELECT statement
-- `InsertStmt` - INSERT statement
-- `UpdateStmt` - UPDATE statement
-- `DeleteStmt` - DELETE statement
-- `ColumnRef` - Column reference
-- `A_Expr` - Expression node
-- `FuncCall` - Function call
-- `JoinExpr` - JOIN expression
-- `RangeVar` - Table/view reference
+    // Access WHERE clause
+    $where = $select->getWhereClause();
+    // ...
+}
+```
 
 ## Exception Handling
 
 ```php
 <?php
 
 use Flow\PgQuery\Parser;
-use Flow\PgQuery\Exception\ParserException;
-use Flow\PgQuery\Exception\ExtensionNotLoadedException;
+use Flow\PgQuery\Exception\{ParserException, ExtensionNotLoadedException};
 
 try {
     $parser = new Parser();
@@ -199,7 +205,7 @@ try {
 }
 
 try {
-    $result = $parser->parse('INVALID SQL SYNTAX HERE');
+    $parser->parse('INVALID SQL');
 } catch (ParserException $e) {
     echo "Parse error: " . $e->getMessage();
 }
@@ -213,4 +219,4 @@ For optimal protobuf parsing performance, install the `ext-protobuf` PHP extensi
 pecl install protobuf
 ```
 
-The library will work without it using the pure PHP implementation from `google/protobuf`, but the native extension provides significantly better performance for AST deserialization.
+The library works without it using the pure PHP implementation from `google/protobuf`, but the native extension provides significantly better performance.