@h3n4l h3n4l commented Aug 25, 2025

Summary

Adds a Go-based ANTLR v4 grammar parser that can parse all .g4 grammar files in the repository. This serves as the foundation for future grammar-based fuzzing tools and grammar validation.

Changes

Core Implementation

  • tools/grammar/ - New Go ANTLR v4 grammar parser
    • ANTLRv4Lexer.g4, ANTLRv4Parser.g4 - ANTLR v4 grammars from upstream
    • lexer_adaptor.go - Custom lexer adaptor with context-sensitive tokenization
    • parser_test.go - Test suite that parses every .g4 file in the repository
    • Makefile - Build automation that generates the parser and applies fixes to the generated code
    • README.md - Documentation

CI Integration

  • .github/workflows/tests.yml - Added antlr-grammar-tests job
    • Triggers on .g4 file changes or tools/grammar/ changes
    • Uses same test pattern as existing go-tests job
    • Integrated into all-tests-passed requirement

Key Features

100% Success Rate - Parses all 12 grammar files in the repository:

  • PostgreSQL (2 files)
  • CQL (2 files)
  • Redshift (2 files)
  • ANTLR v4 (6 files across variants)

Context-Sensitive Parsing - Handles complex ANTLR constructs:

  • Converts ID tokens to TOKEN_REF/RULE_REF based on case
  • Distinguishes [charset] vs [actions] based on context
  • Tracks lexer/parser rule states
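
The case-based conversion above can be sketched in a few lines. This is a minimal illustration, not the code in lexer_adaptor.go: the names RefKind, RuleRef, TokenRef, and classifyID are hypothetical. It relies only on the ANTLR v4 convention that lexer rule names start with an uppercase letter and parser rule names start with a lowercase letter.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// RefKind is a hypothetical stand-in for the TOKEN_REF / RULE_REF token types.
type RefKind int

const (
	RuleRef  RefKind = iota // identifier names a parser rule
	TokenRef                // identifier names a lexer rule (token)
)

// String returns the ANTLR-style name of the kind.
func (k RefKind) String() string {
	if k == TokenRef {
		return "TOKEN_REF"
	}
	return "RULE_REF"
}

// classifyID decides how a generic ID token should be re-typed.
// An uppercase first rune means a lexer rule; anything else is
// treated as a parser rule.
func classifyID(text string) RefKind {
	r, _ := utf8.DecodeRuneInString(text)
	if unicode.IsUpper(r) {
		return TokenRef
	}
	return RuleRef
}

func main() {
	for _, id := range []string{"SELECT", "selectStmt", "Identifier"} {
		fmt.Printf("%s -> %s\n", id, classifyID(id))
	}
}
```

Keeping the decision in one small pure function makes it easy to unit test separately from the lexer state that handles [charset] vs [actions].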

Automated Workflow - Smart CI integration:

  • Only runs when grammar files change
  • Same performance reporting as other tests
  • Blocks PRs if grammar parsing fails

Technical Details

Go ANTLR Challenges Solved

  1. Immutable tokens - Used TokenTypeWrapper to override token types
  2. No auto-emit - Overrode NextToken() instead of Emit()
  3. Constructor integration - Automated a sed fix for the generated code
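
The first two points can be illustrated with a self-contained sketch. It assumes a simplified, read-only Token interface defined locally rather than the real Go ANTLR runtime types, and every name in it (Token, TokenSource, typeOverride, adaptor, the tok* constants) is hypothetical. The idea is that when a token's type cannot be set, a wrapper can embed the token and shadow only the type getter, and the re-typing can happen in NextToken().

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// Hypothetical token type numbers, for illustration only.
const (
	tokEOF = iota
	tokID
	tokTokenRef
	tokRuleRef
)

// Token is a simplified, read-only stand-in for a runtime token (assumption).
type Token interface {
	GetTokenType() int
	GetText() string
}

// TokenSource is a simplified stand-in for a lexer.
type TokenSource interface {
	NextToken() Token
}

type baseToken struct {
	typ  int
	text string
}

func (t baseToken) GetTokenType() int { return t.typ }
func (t baseToken) GetText() string   { return t.text }

// typeOverride embeds the original token and shadows only its type,
// so every other method is forwarded unchanged.
type typeOverride struct {
	Token
	typ int
}

func (w typeOverride) GetTokenType() int { return w.typ }

// sliceSource replays a fixed list of tokens, then returns EOF forever.
type sliceSource struct {
	toks []Token
	pos  int
}

func (s *sliceSource) NextToken() Token {
	if s.pos >= len(s.toks) {
		return baseToken{tokEOF, "<EOF>"}
	}
	t := s.toks[s.pos]
	s.pos++
	return t
}

// adaptor re-types ID tokens in NextToken, since there is no Emit hook.
type adaptor struct {
	inner TokenSource
}

func (a *adaptor) NextToken() Token {
	tok := a.inner.NextToken()
	if tok.GetTokenType() != tokID {
		return tok
	}
	r, _ := utf8.DecodeRuneInString(tok.GetText())
	if unicode.IsUpper(r) {
		return typeOverride{Token: tok, typ: tokTokenRef}
	}
	return typeOverride{Token: tok, typ: tokRuleRef}
}

func main() {
	src := &sliceSource{toks: []Token{baseToken{tokID, "SELECT"}, baseToken{tokID, "stmt"}}}
	lex := &adaptor{inner: src}
	for t := lex.NextToken(); t.GetTokenType() != tokEOF; t = lex.NextToken() {
		fmt.Println(t.GetText(), t.GetTokenType())
	}
}
```

Embedding the interface keeps the wrapper small: only the overridden getter needs to be written, and the original token stays untouched.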

Source Attribution

Usage

cd tools/grammar
make build  # Generate parser and apply fixes
make test   # Test all .g4 files (100% success)
make all    # Build and test

Future Use Cases

This parser enables:

  • Grammar-based SQL fuzzing
  • Grammar validation in CI
  • Grammar analysis and tooling
  • Parser generation for new dialects

@h3n4l h3n4l merged commit 0e7bac9 into main Aug 25, 2025
5 checks passed
@h3n4l h3n4l deleted the h-branch-2 branch August 25, 2025 09:54
