Commit 890e457

feature: create wrapper over PgQuery Parser parse method result
1 parent 37a87c6 commit 890e457

23 files changed, +2638 -131 lines changed

documentation/components/libs/pg-query.md

Lines changed: 121 additions & 115 deletions
@@ -4,11 +4,6 @@
 
 PostgreSQL Query Parser library provides strongly-typed AST (Abstract Syntax Tree) parsing for PostgreSQL SQL queries using the [libpg_query](https://github.com/pganalyze/libpg_query) library through a PHP extension.
 
-This library wraps the low-level extension functions and provides:
-- Strongly-typed AST nodes generated from protobuf definitions
-- A `Parser` class for object-oriented access
-- DSL helper functions for convenient usage
-
 ## Requirements
 
 This library requires the `pg_query` PHP extension. See [pg-query-ext documentation](/documentation/components/extensions/pg-query-ext.md) for installation instructions.
@@ -19,178 +14,189 @@ This library requires the `pg_query` PHP extension. See [pg-query-ext documentat
 composer require flow-php/pg-query:~--FLOW_PHP_VERSION--
 ```
 
-## Usage
-
-### Using the Parser Class
+## Quick Start
 
 ```php
 <?php
 
-use Flow\PgQuery\Parser;
+use function Flow\PgQuery\DSL\pg_parse;
 
-$parser = new Parser();
+$query = pg_parse('SELECT u.id, u.name FROM users u JOIN orders o ON u.id = o.user_id');
 
-// Parse SQL into AST
-$result = $parser->parse('SELECT id, name FROM users WHERE active = true');
+// Get all tables
+foreach ($query->tables() as $table) {
+    echo $table->name(); // 'users', 'orders'
+    echo $table->alias(); // 'u', 'o'
+}
 
-// Access the AST
-foreach ($result->getStmts() as $stmt) {
-    $node = $stmt->getStmt();
-    $selectStmt = $node->getSelectStmt();
-    // Work with strongly-typed AST nodes...
+// Get all columns
+foreach ($query->columns() as $column) {
+    echo $column->name(); // 'id', 'name', 'id', 'user_id'
+    echo $column->table(); // 'u', 'u', 'u', 'o'
+}
+
+// Get columns for specific table
+$userColumns = $query->columns('u');
+
+// Get all function calls
+foreach ($query->functions() as $func) {
+    echo $func->name(); // function name
+    echo $func->schema(); // schema if qualified (e.g., 'pg_catalog')
 }
 ```
 
-### Using DSL Functions
+## Parser Class
 
 ```php
 <?php
 
-use function Flow\PgQuery\DSL\pg_parse;
-use function Flow\PgQuery\DSL\pg_parser;
-use function Flow\PgQuery\DSL\pg_fingerprint;
-use function Flow\PgQuery\DSL\pg_normalize;
-use function Flow\PgQuery\DSL\pg_split;
+use Flow\PgQuery\Parser;
 
-// Parse SQL
-$result = pg_parse('SELECT * FROM users');
+$parser = new Parser();
 
-// Get a reusable parser instance
-$parser = pg_parser();
+// Parse SQL into ParsedQuery
+$query = $parser->parse('SELECT * FROM users WHERE id = 1');
 
-// Generate fingerprint
-$fingerprint = pg_fingerprint('SELECT id FROM users WHERE id = 1');
+// Generate fingerprint (same for structurally equivalent queries)
+$fingerprint = $parser->fingerprint('SELECT * FROM users WHERE id = 1');
 
-// Normalize query
-$normalized = pg_normalize('SELECT * FROM users WHERE id = 1');
+// Normalize query (replace literals with positional parameters)
+$normalized = $parser->normalize("SELECT * FROM users WHERE name = 'John'");
+// Returns: SELECT * FROM users WHERE name = $1
+
+// Normalize also handles Doctrine-style named parameters
+$normalized = $parser->normalize('SELECT * FROM users WHERE id = :id');
+// Returns: SELECT * FROM users WHERE id = $1
 
 // Split multiple statements
+$statements = $parser->split('SELECT 1; SELECT 2;');
+// Returns: ['SELECT 1', ' SELECT 2']
+```
+
+## DSL Functions
+
+```php
+<?php
+
+use function Flow\PgQuery\DSL\{pg_parse, pg_parser, pg_fingerprint, pg_normalize, pg_split};
+
+$query = pg_parse('SELECT * FROM users');
+$parser = pg_parser();
+$fingerprint = pg_fingerprint('SELECT * FROM users WHERE id = 1');
+$normalized = pg_normalize('SELECT * FROM users WHERE id = 1');
 $statements = pg_split('SELECT 1; SELECT 2;');
 ```
 
-## Features
+## ParsedQuery Methods
+
+| Method | Description | Returns |
+|--------|-------------|---------|
+| `tables()` | Get all tables referenced in the query | `array<Table>` |
+| `columns(?string $tableName)` | Get columns, optionally filtered by table/alias | `array<Column>` |
+| `functions()` | Get all function calls | `array<FunctionCall>` |
+| `traverse(NodeVisitor ...$visitors)` | Traverse AST with custom visitors | `void` |
+| `raw()` | Access underlying protobuf ParseResult | `ParseResult` |
 
-### Query Parsing
+## Custom AST Traversal
 
-Parse PostgreSQL SQL into a strongly-typed AST:
+For advanced use cases, you can traverse the AST with custom visitors:
 
 ```php
 <?php
 
-use Flow\PgQuery\Parser;
+use Flow\PgQuery\AST\NodeVisitor;
+use Flow\PgQuery\Protobuf\AST\ColumnRef;
 
-$parser = new Parser();
-$result = $parser->parse('SELECT id, name FROM users WHERE active = true ORDER BY name');
+use function Flow\PgQuery\DSL\pg_parse;
 
-foreach ($result->getStmts() as $stmt) {
-    $selectStmt = $stmt->getStmt()->getSelectStmt();
+class ColumnCounter implements NodeVisitor
+{
+    public int $count = 0;
 
-    // Access FROM clause
-    foreach ($selectStmt->getFromClause() as $fromItem) {
-        $rangeVar = $fromItem->getRangeVar();
-        echo "Table: " . $rangeVar->getRelname() . "\n";
+    public static function nodeClass(): string
+    {
+        return ColumnRef::class;
+    }
+
+    public function enter(object $node): ?int
+    {
+        $this->count++;
+        return null;
     }
 
-    // Access target list (SELECT columns)
-    foreach ($selectStmt->getTargetList() as $target) {
-        $columnRef = $target->getResTarget()->getVal()->getColumnRef();
-        // Process column references...
+    public function leave(object $node): ?int
+    {
+        return null;
     }
 }
-```
 
-### Query Fingerprinting
+$query = pg_parse('SELECT id, name, email FROM users');
 
-Generate unique fingerprints for structurally equivalent queries. This is useful for grouping similar queries regardless of their literal values:
+$counter = new ColumnCounter();
+$query->traverse($counter);
 
-```php
-<?php
+echo $counter->count; // 3
+```
 
-use Flow\PgQuery\Parser;
+### NodeVisitor Interface
 
-$parser = new Parser();
+```php
+interface NodeVisitor
+{
+    public const DONT_TRAVERSE_CHILDREN = 1;
+    public const STOP_TRAVERSAL = 2;
 
-// These queries produce the same fingerprint
-$fp1 = $parser->fingerprint('SELECT * FROM users WHERE id = 1');
-$fp2 = $parser->fingerprint('SELECT * FROM users WHERE id = 999');
+    /** @return class-string */
+    public static function nodeClass(): string;
 
-var_dump($fp1 === $fp2); // true
+    public function enter(object $node): ?int;
+    public function leave(object $node): ?int;
+}
 ```
 
-### Query Normalization
-
-Replace literal values with parameter placeholders:
-
-```php
-<?php
-
-use Flow\PgQuery\Parser;
+Visitors declare which node type they handle via `nodeClass()`. Return values:
+- `null` - continue traversal
+- `DONT_TRAVERSE_CHILDREN` - skip children (from `enter()` only)
+- `STOP_TRAVERSAL` - stop entire traversal
 
-$parser = new Parser();
+### Built-in Visitors
 
-$normalized = $parser->normalize("SELECT * FROM users WHERE name = 'John' AND age = 25");
-// Returns: SELECT * FROM users WHERE name = $1 AND age = $2
-```
+- `ColumnRefCollector` - collects all `ColumnRef` nodes
+- `FuncCallCollector` - collects all `FuncCall` nodes
+- `RangeVarCollector` - collects all `RangeVar` nodes
 
-### Statement Splitting
+## Raw AST Access
 
-Split a string containing multiple SQL statements:
+For full control, access the protobuf AST directly:
 
 ```php
 <?php
 
-use Flow\PgQuery\Parser;
+use function Flow\PgQuery\DSL\pg_parse;
 
-$parser = new Parser();
+$query = pg_parse('SELECT id FROM users WHERE active = true');
 
-$statements = $parser->split('SELECT 1; SELECT 2; SELECT 3');
-// Returns: ['SELECT 1', ' SELECT 2', ' SELECT 3']
-```
+foreach ($query->raw()->getStmts() as $stmt) {
+    $select = $stmt->getStmt()->getSelectStmt();
 
-## API Reference
-
-### Parser Class
+    // Access FROM clause
+    foreach ($select->getFromClause() as $from) {
+        echo $from->getRangeVar()->getRelname();
+    }
 
-| Method | Description | Returns |
-|--------|-------------|---------|
-| `parse(string $sql)` | Parse SQL into AST | `ParseResult` |
-| `fingerprint(string $sql)` | Generate query fingerprint | `?string` |
-| `normalize(string $sql)` | Normalize query with placeholders | `?string` |
-| `split(string $sql)` | Split multiple statements | `array<string>` |
-
-### DSL Functions
-
-| Function | Description | Returns |
-|----------|-------------|---------|
-| `pg_parser()` | Create a new Parser instance | `Parser` |
-| `pg_parse(string $sql)` | Parse SQL into AST | `ParseResult` |
-| `pg_fingerprint(string $sql)` | Generate query fingerprint | `?string` |
-| `pg_normalize(string $sql)` | Normalize query | `?string` |
-| `pg_split(string $sql)` | Split statements | `array<string>` |
-
-## AST Node Types
-
-The library includes 343 strongly-typed AST node classes generated from PostgreSQL's protobuf definitions. All classes are in the `Flow\PgQuery\Protobuf\AST` namespace.
-
-Common node types include:
-- `SelectStmt` - SELECT statement
-- `InsertStmt` - INSERT statement
-- `UpdateStmt` - UPDATE statement
-- `DeleteStmt` - DELETE statement
-- `ColumnRef` - Column reference
-- `A_Expr` - Expression node
-- `FuncCall` - Function call
-- `JoinExpr` - JOIN expression
-- `RangeVar` - Table/view reference
+    // Access WHERE clause
+    $where = $select->getWhereClause();
+    // ...
+}
+```
 
 ## Exception Handling
 
 ```php
 <?php
 
 use Flow\PgQuery\Parser;
-use Flow\PgQuery\Exception\ParserException;
-use Flow\PgQuery\Exception\ExtensionNotLoadedException;
+use Flow\PgQuery\Exception\{ParserException, ExtensionNotLoadedException};
 
 try {
     $parser = new Parser();
@@ -199,7 +205,7 @@ try {
 }
 
 try {
-    $result = $parser->parse('INVALID SQL SYNTAX HERE');
+    $parser->parse('INVALID SQL');
 } catch (ParserException $e) {
     echo "Parse error: " . $e->getMessage();
 }
@@ -213,4 +219,4 @@ For optimal protobuf parsing performance, install the `ext-protobuf` PHP extensi
 pecl install protobuf
 ```
 
-The library will work without it using the pure PHP implementation from `google/protobuf`, but the native extension provides significantly better performance for AST deserialization.
+The library works without it using the pure PHP implementation from `google/protobuf`, but the native extension provides significantly better performance.
Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
+<?php
+
+declare(strict_types=1);
+
+namespace Flow\PgQuery\AST;
+
+/**
+ * Interface for AST node visitors.
+ *
+ * Visitors are registered for specific node types and only receive nodes of that type.
+ * Use the static nodeClass() method to declare which node type this visitor handles.
+ */
+interface NodeVisitor
+{
+    /**
+     * Don't traverse children of the current node.
+     */
+    public const DONT_TRAVERSE_CHILDREN = 1;
+
+    /**
+     * Remove the node from its parent array.
+     */
+    public const REMOVE_NODE = 3;
+
+    /**
+     * Stop the entire traversal.
+     */
+    public const STOP_TRAVERSAL = 2;
+
+    /**
+     * Returns the fully qualified class name of the node type this visitor handles.
+     *
+     * @return class-string The node class this visitor is registered for
+     */
+    public static function nodeClass() : string;
+
+    /**
+     * Called when entering a node of the registered type.
+     *
+     * @param object $node The node instance (type depends on nodeClass())
+     *
+     * @return null|int Return value determines traversal behavior:
+     *                  - null: Continue traversal
+     *                  - DONT_TRAVERSE_CHILDREN: Don't traverse children
+     *                  - STOP_TRAVERSAL: Stop entire traversal
+     */
+    public function enter(object $node) : ?int;
+
+    /**
+     * Called when leaving a node of the registered type.
+     *
+     * @param object $node The node instance (type depends on nodeClass())
+     *
+     * @return null|int Return value determines traversal behavior:
+     *                  - null: Continue traversal
+     *                  - REMOVE_NODE: Remove node from parent
+     *                  - STOP_TRAVERSAL: Stop entire traversal
+     */
+    public function leave(object $node) : ?int;
+}
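
The return-value contract documented on `NodeVisitor` (dispatch by `nodeClass()`, `DONT_TRAVERSE_CHILDREN` from `enter()`, `REMOVE_NODE` from `leave()`, `STOP_TRAVERSAL` from either) could be honored by a traverser along the following lines. This is only a minimal sketch, not the traverser shipped in this commit: the `Leaf` and `Branch` node types, the `traverse()` function, and the `LeafCounter` visitor are all hypothetical stand-ins, and the interface is redeclared locally so the example runs without the library or the `pg_query` extension.

```php
<?php

declare(strict_types=1);

// Local copy of the interface from this commit, so the sketch is self-contained.
interface NodeVisitor
{
    public const DONT_TRAVERSE_CHILDREN = 1;
    public const STOP_TRAVERSAL = 2;
    public const REMOVE_NODE = 3;

    /** @return class-string */
    public static function nodeClass() : string;

    public function enter(object $node) : ?int;

    public function leave(object $node) : ?int;
}

// Hypothetical node types; the real library walks generated protobuf classes.
final class Leaf
{
    public function __construct(public string $name) {}
}

final class Branch
{
    /** @param array<int, object> $children */
    public function __construct(public array $children = []) {}
}

/**
 * Hypothetical traverser (assumption, not the library's implementation).
 * Returns STOP_TRAVERSAL, REMOVE_NODE, or null so the caller can react.
 */
function traverse(object $node, NodeVisitor ...$visitors) : ?int
{
    // Only visitors registered for this node's class receive it.
    $matching = array_filter(
        $visitors,
        static fn (NodeVisitor $v) : bool => is_a($node, $v::nodeClass())
    );

    $skipChildren = false;
    foreach ($matching as $visitor) {
        $result = $visitor->enter($node);
        if ($result === NodeVisitor::STOP_TRAVERSAL) {
            return NodeVisitor::STOP_TRAVERSAL;
        }
        if ($result === NodeVisitor::DONT_TRAVERSE_CHILDREN) {
            $skipChildren = true;
        }
    }

    if (!$skipChildren && $node instanceof Branch) {
        foreach ($node->children as $index => $child) {
            $status = traverse($child, ...$visitors);
            if ($status === NodeVisitor::STOP_TRAVERSAL) {
                return NodeVisitor::STOP_TRAVERSAL;
            }
            if ($status === NodeVisitor::REMOVE_NODE) {
                unset($node->children[$index]); // drop the child from its parent array
            }
        }
        $node->children = array_values($node->children);
    }

    $status = null;
    foreach ($matching as $visitor) {
        $result = $visitor->leave($node);
        if ($result === NodeVisitor::STOP_TRAVERSAL) {
            return NodeVisitor::STOP_TRAVERSAL;
        }
        if ($result === NodeVisitor::REMOVE_NODE) {
            $status = NodeVisitor::REMOVE_NODE;
        }
    }

    return $status;
}

// Hypothetical visitor: counts every Leaf and removes leaves named 'tmp'.
final class LeafCounter implements NodeVisitor
{
    public int $count = 0;

    public static function nodeClass() : string
    {
        return Leaf::class;
    }

    public function enter(object $node) : ?int
    {
        $this->count++;

        return null;
    }

    public function leave(object $node) : ?int
    {
        return $node instanceof Leaf && $node->name === 'tmp' ? self::REMOVE_NODE : null;
    }
}

$inner = new Branch([new Leaf('b'), new Leaf('tmp')]);
$tree = new Branch([new Leaf('a'), $inner, new Leaf('c')]);

$counter = new LeafCounter();
traverse($tree, $counter);

echo $counter->count, "\n";         // 4
echo count($inner->children), "\n"; // 1
```

Returning a status from `traverse()` lets the parent decide how to remove a child, which keeps visitors unaware of where a node is stored; the library's own traverser may make a different choice.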
