PCFG-based uncertainty quantification for programs. Extracts probabilistic context-free grammars from source code and computes entropy metrics.
pip install -e .Or with Docker:
docker compose buildfrom grammar_uncertainty import GrammarAnalyzer
result = GrammarAnalyzer(grammar="smtlib").analyze("/path/to/programs")
print(result.uncertainty.grammar_entropy, result.uncertainty.perplexity)# Setup grammar (downloads ANTLR grammar and generates parser)
grammar-uncertainty setup smtlib
# Analyze a folder of programs
grammar-uncertainty analyze ./programs --grammar smtlib --output results.json
# Start the API server
grammar-uncertainty serve --port 8000# Run CLI commands
docker compose run --rm grammar-uncertainty grammar-uncertainty analyze /data/samplesmtfolder --grammar smtlib
# Start API server
docker compose upRun uncertainty metric evaluation on SMT datasets:
# List available datasets
docker compose run --rm grammar-uncertainty python -m experiments list-datasets smtlogs
# Run analysis on a dataset
docker compose run --rm grammar-uncertainty python -m experiments run smtlogs/ProofWriter/03-mini/200-100/o3-mini-200-100See experiments/README.md for details.
| Grammar | Extensions | Setup |
|---|---|---|
| smtlib | .smt2, .smt | grammar-uncertainty setup smtlib |
| prolog | .pl, .prolog | grammar-uncertainty setup prolog |
Extensible to any ANTLR v4 grammar. Implement GrammarHandler in src/grammar_uncertainty/grammars/.
- Grammar Entropy: Shannon entropy of the PCFG
- Perplexity: Exponential of cross-entropy
- NSUI: Normalized Syntactic Uncertainty Index
- KL Divergence: Distance from uniform distribution
- Rényi Entropy: Generalized entropy (α-parameterized)
curl -X POST localhost:8000/analyze \
-H "Content-Type: application/json" \
-d '{"folder_path": "/data/programs", "grammar": "smtlib"}'| Endpoint | Method | Description |
|---|---|---|
| /analyze | POST | Analyze programs |
| /grammars | GET | List grammars |
| /setup | POST | Setup grammar |
| /health | GET | Health check |
@inproceedings{
ganguly2025grammars,
title={Grammars of Formal Uncertainty: When to Trust {LLM}s in Automated Reasoning Tasks},
author={Debargha Ganguly and Vikash Singh and Sreehari Sankar and Biyao Zhang and Xuecen Zhang and Srinivasan Iyengar and Xiaotian Han and Amit Sharma and Shivkumar Kalyanaraman and Vipin Chaudhary},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
year={2025},
url={https://openreview.net/forum?id=QfKpJ00t2L}
}MIT