Commit 734ef85
feat(postgresql): implement automated keyword management and grammar improvements (#40)
* feat(postgresql): implement automated keyword management system
Implemented a keyword generator that fetches PostgreSQL keywords from the
official kwlist.h file and automatically generates ANTLR grammar rules.
Key features:
- Fetches keywords from PostgreSQL REL_18_STABLE kwlist.h
- Generates parser rules (reserved, unreserved, col_name, type_func_name)
- Inserts lexer keyword rules directly into PostgreSQLLexer.g4
- Handles ANTLR reserved name conflicts (e.g., SKIP → SKIP_P)
- Matches PostgreSQL's official gram.y structure
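The pipeline listed above can be sketched roughly as follows. This is a minimal illustration, not the actual keyword-generator code: the type `Keyword`, the functions `ParseKwlist` and `ParserRule`, the `renames` table, and the parser rule names in `ruleNames` are all hypothetical, and the sample input is written in the `PG_KEYWORD(...)` format used by kwlist.h rather than fetched from the PostgreSQL repository.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// Keyword is one PG_KEYWORD entry (hypothetical type for this sketch).
type Keyword struct {
	Name     string // keyword text, e.g. "abort"
	Token    string // token name, possibly renamed, e.g. "ABORT_P"
	Category string // e.g. "UNRESERVED_KEYWORD"
}

// pgKeywordRe captures the first three arguments of a PG_KEYWORD line.
var pgKeywordRe = regexp.MustCompile(`^PG_KEYWORD\("([^"]+)",\s*(\w+),\s*(\w+)`)

// renames resolves clashes with names ANTLR reserves. Only the SKIP case
// is mentioned in the commit message; the table itself is an assumption.
var renames = map[string]string{"SKIP": "SKIP_P"}

// ruleNames maps kwlist.h categories to parser rule names (assumed names).
var ruleNames = map[string]string{
	"RESERVED_KEYWORD":       "reserved_keyword",
	"UNRESERVED_KEYWORD":     "unreserved_keyword",
	"COL_NAME_KEYWORD":       "col_name_keyword",
	"TYPE_FUNC_NAME_KEYWORD": "type_func_name_keyword",
}

// ParseKwlist extracts keywords from kwlist.h-style text, skipping any
// line that is not a PG_KEYWORD entry (comments, blank lines, and so on).
func ParseKwlist(src string) []Keyword {
	var out []Keyword
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		m := pgKeywordRe.FindStringSubmatch(strings.TrimSpace(sc.Text()))
		if m == nil {
			continue
		}
		token := m[2]
		if renamed, ok := renames[token]; ok {
			token = renamed
		}
		out = append(out, Keyword{Name: m[1], Token: token, Category: m[3]})
	}
	return out
}

// ParserRule renders one ANTLR parser rule listing every token that
// belongs to the given category, in input order.
func ParserRule(category string, kws []Keyword) string {
	var tokens []string
	for _, k := range kws {
		if k.Category == category {
			tokens = append(tokens, k.Token)
		}
	}
	return fmt.Sprintf("%s\n    : %s\n    ;\n",
		ruleNames[category], strings.Join(tokens, "\n    | "))
}

func main() {
	sample := `/* sample lines in kwlist.h format */
PG_KEYWORD("abort", ABORT_P, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("between", BETWEEN, COL_NAME_KEYWORD, BARE_LABEL)
PG_KEYWORD("skip", SKIP, UNRESERVED_KEYWORD, BARE_LABEL)`
	kws := ParseKwlist(sample)
	fmt.Print(ParserRule("UNRESERVED_KEYWORD", kws))
}
```

Grouping by the category argument is what lets the generated rules follow the keyword classes in gram.y, and applying the rename while parsing keeps the lexer and parser output on the same token name.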
Changes:
- Add keyword-generator/ tool (Go program)
- Add PostgreSQLKeywords.g4 (auto-generated parser rules)
- Update PostgreSQLLexer.g4 with auto-generated keyword section
- Update PostgreSQLParser.g4 to use kwlist.h token names
- Update Makefile with generate-keywords target
- Fix token references in postgresql_lexer_base.go
The generator ensures keywords are placed before the Identifier rule in
the lexer for correct matching precedence. Keywords are managed between
marker comments for easy regeneration.
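Marker-based regeneration of this kind might look like the sketch below. The marker strings, the function name `ReplaceBetweenMarkers`, and the sample grammar are assumptions made for illustration; the commit does not show the actual markers used in PostgreSQLLexer.g4.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Hypothetical marker comments; the real ones in the lexer may differ.
const (
	beginMarker = "// BEGIN GENERATED KEYWORDS"
	endMarker   = "// END GENERATED KEYWORDS"
)

// ReplaceBetweenMarkers swaps everything between the two marker comments
// for the freshly generated rules and leaves the rest of the grammar
// untouched. Running it again with the same input gives the same output.
func ReplaceBetweenMarkers(grammar, generated string) (string, error) {
	start := strings.Index(grammar, beginMarker)
	if start < 0 {
		return "", errors.New("begin marker not found")
	}
	end := strings.Index(grammar, endMarker)
	if end < start {
		return "", errors.New("end marker missing or before begin marker")
	}
	head := grammar[:start+len(beginMarker)]
	tail := grammar[end:]
	return head + "\n" + strings.TrimRight(generated, "\n") + "\n" + tail, nil
}

func main() {
	// The markers sit before Identifier, so regenerated keyword rules keep
	// their precedence over the catch-all identifier rule.
	grammar := "lexer grammar Demo;\n" +
		beginMarker + "\nOLD: 'OLD';\n" + endMarker + "\n" +
		"Identifier: [a-z]+;\n"
	out, err := ReplaceBetweenMarkers(grammar, "ABORT_P: 'ABORT';\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

ANTLR resolves equal-length lexer matches in favor of the rule defined first, which is why the generated block has to stay above `Identifier`.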
Test results: 190/214 tests passing (88.8%)
🤖 Generated with Claude Code (https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
* refactor(postgresql): remove non-standard builtin_function_name tokens
Align with PostgreSQL's official grammar and ANTLR community standard by
removing built-in function name tokens from the lexer. These function names
(exp, div, floor, mod, power, sqrt, log, etc.) are now properly treated as
regular identifiers, matching PostgreSQL's behavior.
Changes:
- Remove ~475 built-in function tokens from PostgreSQLLexer.g4
  (ABS, CBRT, CEIL, EXP, DIV, FLOOR, MOD, POWER, SQRT, etc.)
- Remove builtin_function_name parser rule from PostgreSQLParser.g4
- Update parser rules: param_name, func_type, generictype, func_name
- Add LOG to plsql_unreserved_keyword (was missing, caused test failures)
- Add .vscode/ to .gitignore
Benefits:
- Matches PostgreSQL's official grammar behavior
- Aligns with ANTLR grammars-v4 community standard
- Simpler, more maintainable grammar
- Function names can naturally be used as identifiers
Test Results:
- All target tests pass: exp, div, floor, mod, power, sqrt, log
- No new regressions in full test suite
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
* fix(postgresql): add jsontype rule and fix REVERSE keyword
Fix test failures by adding missing JsonType support and making REVERSE
available as a function name.
Changes:
1. Add jsontype rule matching PostgreSQL's gram.y:
   - jsontype: JSON
   - Added to simpletypename (for general type references)
   - Added to consttypename (for constant type references)
2. Add REVERSE to plsql_unreserved_keyword:
   - REVERSE is a PL/pgSQL token but should be usable as a function name
   - Now matches type_function_name via plsql_unreserved_keyword path
Root Cause Analysis:
- PostgreSQL's grammar has explicit JsonType rule for JSON keyword
- Our grammar was missing this, causing ::json typecast to fail
- JSON is a col_name_keyword, not in type_function_name categories
- Without jsontype rule, parser tried qualified_name%TYPE_P path
- REVERSE was in old builtin_function_name rule (now removed)
- After removal, REVERSE token couldn't be used as function name
- Solution: Add to plsql_unreserved_keyword to allow SQL usage
Test Results:
✅ All previously failing tests now pass:
- json_encoding.sql
- json.sql
- jsonb.sql
- join_hash.sql
- builtin_functions_string.sql
- text.sql
✅ Full test suite passes with no regressions
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
* chore: update keyword generation timestamp
* chore: remove one-time cleanup script and update README
Remove cleanup_duplicates.sh as it was a one-time migration script no longer needed.
Update README to reflect the current implementation without setup steps.
* chore: update
---------
Co-authored-by: Claude <[email protected]>
1 parent 635e973 commit 734ef85
15 files changed: +28250 -30088 lines changed
File tree:
- postgresql
- keyword-generator