Commit 734ef85

h3n4l and claude authored
feat(postgresql): implement automated keyword management and grammar improvements (#40)
* feat(postgresql): implement automated keyword management system

  Implemented a keyword generator that fetches PostgreSQL keywords from the official kwlist.h file and automatically generates ANTLR grammar rules.

  Key features:
  - Fetches keywords from PostgreSQL REL_18_STABLE kwlist.h
  - Generates parser rules (reserved, unreserved, col_name, type_func_name)
  - Inserts lexer keyword rules directly into PostgreSQLLexer.g4
  - Handles ANTLR reserved name conflicts (e.g., SKIP → SKIP_P)
  - Matches PostgreSQL's official gram.y structure

  Changes:
  - Add keyword-generator/ tool (Go program)
  - Add PostgreSQLKeywords.g4 (auto-generated parser rules)
  - Update PostgreSQLLexer.g4 with auto-generated keyword section
  - Update PostgreSQLParser.g4 to use kwlist.h token names
  - Update Makefile with generate-keywords target
  - Fix token references in postgresql_lexer_base.go

  The generator ensures keywords are placed before the Identifier rule in the lexer for correct matching precedence. Keywords are managed between marker comments for easy regeneration.

  Test results: 190/214 tests passing (88.8%)

  🤖 Generated with Claude Code (https://claude.com/claude-code)

  Co-Authored-By: Claude <[email protected]>

* refactor(postgresql): remove non-standard builtin_function_name tokens

  Align with PostgreSQL's official grammar and ANTLR community standard by removing built-in function name tokens from the lexer. These function names (exp, div, floor, mod, power, sqrt, log, etc.) are now properly treated as regular identifiers, matching PostgreSQL's behavior.

  Changes:
  - Remove ~475 built-in function tokens from PostgreSQLLexer.g4 (ABS, CBRT, CEIL, EXP, DIV, FLOOR, MOD, POWER, SQRT, etc.)
  - Remove builtin_function_name parser rule from PostgreSQLParser.g4
  - Update parser rules: param_name, func_type, generictype, func_name
  - Add LOG to plsql_unreserved_keyword (was missing, caused test failures)
  - Add .vscode/ to .gitignore

  Benefits:
  - Matches PostgreSQL's official grammar behavior
  - Aligns with ANTLR grammars-v4 community standard
  - Simpler, more maintainable grammar
  - Function names can naturally be used as identifiers

  Test Results:
  - All target tests pass: exp, div, floor, mod, power, sqrt, log
  - No new regressions in full test suite

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <[email protected]>

* fix(postgresql): add jsontype rule and fix REVERSE keyword

  Fix test failures by adding missing JsonType support and making REVERSE available as a function name.

  Changes:
  1. Add jsontype rule matching PostgreSQL's gram.y:
     - jsontype: JSON
     - Added to simpletypename (for general type references)
     - Added to consttypename (for constant type references)
  2. Add REVERSE to plsql_unreserved_keyword:
     - REVERSE is a PL/pgSQL token but should be usable as a function name
     - Now matches type_function_name via plsql_unreserved_keyword path

  Root Cause Analysis:
  - PostgreSQL's grammar has explicit JsonType rule for JSON keyword
  - Our grammar was missing this, causing ::json typecast to fail
  - JSON is a col_name_keyword, not in type_function_name categories
  - Without jsontype rule, parser tried qualified_name%TYPE_P path
  - REVERSE was in old builtin_function_name rule (now removed)
  - After removal, REVERSE token couldn't be used as function name
  - Solution: Add to plsql_unreserved_keyword to allow SQL usage

  Test Results:
  ✅ All previously failing tests now pass:
  - json_encoding.sql
  - json.sql
  - jsonb.sql
  - join_hash.sql
  - builtin_functions_string.sql
  - text.sql
  ✅ Full test suite passes with no regressions

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <[email protected]>

* chore: update keyword generation timestamp

* chore: remove one-time cleanup script and update README

  Remove cleanup_duplicates.sh as it was a one-time migration script no longer needed. Update README to reflect the current implementation without setup steps.

* chore: update

---------

Co-authored-by: Claude <[email protected]>
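The keyword generator itself is not shown on this page, so the following is only a minimal sketch of the idea described in the first commit: read `PG_KEYWORD(...)` entries in the kwlist.h format, rename tokens that clash with ANTLR, and emit one parser rule per keyword category. The names `Keyword`, `ParseKwlist`, `ParserRule`, and `antlrConflicts` are hypothetical, and the conflict list contains only the one case the commit message mentions (SKIP → SKIP_P).

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// Keyword is one PG_KEYWORD entry (hypothetical type, not from the commit).
type Keyword struct {
	Text     string // SQL spelling, e.g. "skip"
	Token    string // token name used in the grammar, e.g. "SKIP_P"
	Category string // e.g. "UNRESERVED_KEYWORD"
}

// kwLine matches entries such as:
//   PG_KEYWORD("abort", ABORT_P, UNRESERVED_KEYWORD, BARE_LABEL)
var kwLine = regexp.MustCompile(`^PG_KEYWORD\("([^"]+)",\s*(\w+),\s*(\w+)`)

// antlrConflicts is an assumed list of token names that clash with ANTLR.
// Only SKIP is confirmed by the commit message.
var antlrConflicts = map[string]bool{"SKIP": true}

// ParseKwlist extracts keyword entries from kwlist.h-style text,
// skipping every line that is not a PG_KEYWORD entry.
func ParseKwlist(src string) []Keyword {
	var out []Keyword
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		m := kwLine.FindStringSubmatch(strings.TrimSpace(sc.Text()))
		if m == nil {
			continue
		}
		tok := m[2]
		if antlrConflicts[tok] {
			tok += "_P" // e.g. SKIP becomes SKIP_P
		}
		out = append(out, Keyword{Text: m[1], Token: tok, Category: m[3]})
	}
	return out
}

// ParserRule renders an ANTLR parser rule listing every token in one category.
func ParserRule(rule, category string, kws []Keyword) string {
	var toks []string
	for _, k := range kws {
		if k.Category == category {
			toks = append(toks, k.Token)
		}
	}
	return rule + "\n    : " + strings.Join(toks, "\n    | ") + "\n    ;\n"
}

func main() {
	// Hand-written sample input in the kwlist.h entry format.
	src := `PG_KEYWORD("abort", ABORT_P, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("select", SELECT, RESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("skip", SKIP, UNRESERVED_KEYWORD, BARE_LABEL)`
	fmt.Print(ParserRule("unreserved_keyword", "UNRESERVED_KEYWORD", ParseKwlist(src)))
}
```

A real generator would also need to fetch the file from the REL_18_STABLE branch and emit the matching lexer rules; both are left out here because the page does not show how the tool does either.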
1 parent 635e973 commit 734ef85

15 files changed: +28250 -30088 lines changed

.gitignore

Lines changed: 2 additions & 1 deletion
@@ -47,4 +47,5 @@ go.work.sum
 **/*.class
 
 # No binary files
-**/bin/**
+**/bin/**
+.vscode/

postgresql/Makefile

Lines changed: 13 additions & 3 deletions
@@ -1,7 +1,17 @@
 all: build test
 
-build:
+# Generate keyword definitions from PostgreSQL official source
+generate-keywords:
+	@echo "Generating PostgreSQL keyword definitions from REL_18_STABLE..."
+	cd keyword-generator && go run main.go
+	@echo ""
+
+# Build the parser (depends on keyword generation)
+build: generate-keywords
+	@echo "Building PostgreSQL parser..."
 	antlr -Dlanguage=Go -package postgresql -visitor -o . PostgreSQLLexer.g4 PostgreSQLParser.g4
 
-test:
-	go test -v -run TestPostgreSQLParser
+test:
+	go test -v -run TestPostgreSQLParser
+
+.PHONY: all build test generate-keywords
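
Because `build` now depends on `generate-keywords`, the generator runs on every build, so its output has to be safe to regenerate repeatedly. The commit message says keywords are managed between marker comments in the lexer; a rough sketch of that kind of in-place replacement might look like the following. The marker strings and the function name `ReplaceGenerated` are assumptions made for illustration, not taken from the repository.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Hypothetical marker comments; the real grammar may use different text.
const (
	beginMarker = "// BEGIN GENERATED KEYWORDS"
	endMarker   = "// END GENERATED KEYWORDS"
)

// ReplaceGenerated swaps everything between the two markers for body,
// keeping the markers themselves so the next run can find them again.
func ReplaceGenerated(grammar, body string) (string, error) {
	start := strings.Index(grammar, beginMarker)
	if start < 0 {
		return "", errors.New("begin marker not found")
	}
	rel := strings.Index(grammar[start:], endMarker)
	if rel < 0 {
		return "", errors.New("end marker not found")
	}
	head := grammar[:start+len(beginMarker)]
	tail := grammar[start+rel:]
	return head + "\n" + strings.TrimRight(body, "\n") + "\n" + tail, nil
}

func main() {
	grammar := "lexer grammar Demo;\n" +
		beginMarker + "\nOLD_RULE: 'OLD';\n" + endMarker + "\n" +
		"Identifier: [a-z]+;\n"
	out, err := ReplaceGenerated(grammar, "ABORT_P: 'ABORT';\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

Keeping the markers ahead of the `Identifier` rule preserves the ordering the commit message relies on for matching precedence, and running the replacement twice with the same body yields the same file.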
