Forward: HCL2 Text → [PostLexer] → Lark Parse Tree → LarkElement Tree → Python Dict/JSON
Reverse: Python Dict/JSON → LarkElement Tree → Lark Tree → HCL2 Text
Direct: HCL2 Text → [PostLexer] → Lark Parse Tree → LarkElement Tree → Lark Tree → HCL2 Text
The Direct pipeline (parse_to_tree → transform → to_lark → reconstruct) skips serialization to dict, so all IR nodes (including NewLineOrCommentRule nodes for whitespace/comments) directly influence the reconstructed output. Any information discarded before the IR is lost in this pipeline.
| Module | Role |
|---|---|
hcl2/hcl2.lark |
Lark grammar definition |
hcl2/api.py |
Public API (load/loads/dump/dumps + intermediate stages) |
hcl2/postlexer.py |
Token stream transforms between lexer and parser |
hcl2/parser.py |
Lark parser factory with caching |
hcl2/transformer.py |
Lark parse tree → LarkElement tree |
hcl2/deserializer.py |
Python dict → LarkElement tree |
hcl2/formatter.py |
Whitespace alignment and spacing on LarkElement trees |
hcl2/reconstructor.py |
LarkElement tree → HCL2 text via Lark |
hcl2/builder.py |
Programmatic HCL document construction |
hcl2/utils.py |
SerializationOptions, SerializationContext, string helpers |
hcl2/const.py |
Constants: IS_BLOCK, COMMENTS_KEY, INLINE_COMMENTS_KEY |
cli/helpers.py |
File/directory/stdin conversion helpers |
cli/hcl_to_json.py |
hcl2tojson entry point |
cli/json_to_hcl.py |
jsontohcl2 entry point |
hcl2/__main__.py is a thin wrapper that imports cli.hcl_to_json:main.
| File | Domain |
|---|---|
rules/abstract.py |
LarkElement, LarkRule, LarkToken base classes |
rules/tokens.py |
StringToken (cached factory), StaticStringToken, punctuation constants |
rules/base.py |
StartRule, BodyRule, BlockRule, AttributeRule |
rules/containers.py |
TupleRule, ObjectRule, ObjectElemRule, ObjectElemKeyRule |
rules/expressions.py |
ExprTermRule, BinaryOpRule, UnaryOpRule, ConditionalRule |
rules/literal_rules.py |
IntLitRule, FloatLitRule, IdentifierRule, KeywordRule |
rules/strings.py |
StringRule, InterpolationRule, HeredocTemplateRule |
rules/functions.py |
FunctionCallRule, ArgumentsRule |
rules/indexing.py |
GetAttrRule, SqbIndexRule, splat rules |
rules/for_expressions.py |
ForTupleExprRule, ForObjectExprRule, ForIntroRule, ForCondRule |
rules/whitespace.py |
NewLineOrCommentRule, InlineCommentMixIn |
Follows the json module convention. All option parameters are keyword-only.
load/loads— HCL2 text → Python dictdump/dumps— Python dict → HCL2 text- Intermediate stages:
parse/parses,parse_to_tree/parses_to_tree,transform,serialize,from_dict,from_json,reconstruct
SerializationOptions (LarkElement → dict):
with_comments, with_meta, wrap_objects, wrap_tuples, explicit_blocks, preserve_heredocs, force_operation_parentheses, preserve_scientific_notation
DeserializerOptions (dict → LarkElement):
heredocs_to_strings, strings_to_heredocs, object_elements_colon, object_elements_trailing_comma
FormatterOptions (whitespace/alignment):
indent_length, open_empty_blocks, open_empty_objects, open_empty_tuples, vertically_align_attributes, vertically_align_object_elements
Console scripts defined in pyproject.toml. Each uses argparse flags that map directly to the option dataclass fields above.
hcl2tojson --json-indent 2 --with-meta file.tf
jsontohcl2 --indent 4 --no-align file.json
Add new options as parser.add_argument() calls in the relevant entry point module.
Lark's postlex parameter accepts a single object with a process(stream) method that transforms the token stream between the lexer and LALR parser. The PostLexer class is designed for extensibility: each transformation is a private method that accepts and yields tokens, and process() chains them together.
Current passes:
_merge_newlines_into_operators
To add a new pass: create a private method with the same (self, stream) -> generator signature, and add a yield from call in process().
These are project-specific constraints that must not be violated:
- Always use the LarkElement IR. Never transform directly from Lark parse tree to Python dict or vice versa.
- Block vs object distinction. Use
__is_block__markers (const.IS_BLOCK) to preserve semantic intent during round-trips. The deserializer must distinguish blocks from regular objects. - Bidirectional completeness. Every serialization path must have a corresponding deserialization path. Test round-trip integrity: Parse → Serialize → Deserialize → Serialize produces identical results.
- One grammar rule = one
LarkRuleclass. Each class implementslark_name(), typed property accessors,serialize(), and declares_children_layout: Tuple[...](annotation only, no assignment) to document child structure. - Token caching. Use the
StringTokenfactory inrules/tokens.py— never create token instances directly. - Interpolation context.
${...}generation depends on nesting depth — always pass and respectSerializationContext. - Update both directions. When adding language features, update transformer.py, deserializer.py, formatter.py and reconstructor.py.
- Add grammar rules to
hcl2.lark - If the new construct creates LALR ambiguities with
NL_OR_COMMENT, add a postlexer pass inpostlexer.py - Create rule class(es) in the appropriate
rules/file - Add transformer method(s) in
transformer.py - Implement
serialize()in the rule class - Update
deserializer.py,formatter.pyandreconstructor.pyfor round-trip support
Framework: unittest.TestCase (not pytest).
python -m unittest discover -s test -p "test_*.py" -v
Unit tests (test/unit/): instantiate rule objects directly (no parsing).
rules/— one file per rules modulecli/— one file per CLI moduletest_*.py— tests for corresponding files fromhcl2/directory
Use concrete stubs when testing ABCs (e.g., StubExpression(ExpressionRule)).
Integration tests (test/integration/): full-pipeline tests with golden files.
test_round_trip.py— iterates over all suites inhcl2_original/, tests HCL→JSON, JSON→JSON, JSON→HCL, and full round-triptest_specialized.py— feature-specific tests with golden files inspecialized/
Always run round-trip full test suite after any modification.
Hooks are defined in .pre-commit-config.yaml (includes black, mypy, pylint, and others). All changed files must pass these checks before committing. When writing or modifying code:
- Format Python with black (Python 3.8 target).
- Ensure mypy and pylint pass. Pylint config is in
pylintrc, scoped tohcl2/andtest/. - End files with a newline; strip trailing whitespace (except under
test/integration/(hcl2_reconstructed|specialized)/).
Update this file when architecture, modules, API surface, or testing conventions change. Also update README.md and docs/usage.md when changes affect the public API, CLI flags, or option fields.