Streamline long term maintenance of lexer/parser code generators #264

@alecthomas

I have some concerns about the long-term maintenance of both the lexer and parser code generators, which we should solve (or at least not make worse) before getting these changes into master.

Problem 1: divergence

My first concern is that as the runtime code evolves, it diverges from the code generators. This has already occurred with the lexer, where the generated code does not handle lookahead. We will need tests that detect this kind of drift. For the parser, enumerating the node types and checking that the generator handles every one of them might work.
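A lexer conformance test could run the runtime lexer and the generated lexer over the same corpus and report any input where the token streams differ. The following is only a sketch: `Token`, `Tokenizer`, `runtimeLex`, `generatedLex` and `Diverges` are hypothetical names, not participle's API, and the "generated" lexer here is a stand-in with one deliberately planted divergence (it does not peek ahead to recognise `=>`).

```go
package main

import (
	"fmt"
	"unicode"
)

// Token is a simplified, hypothetical token (not participle's lexer.Token).
type Token struct {
	Type  string
	Value string
}

// Tokenizer is a hypothetical common shape for both implementations.
type Tokenizer func(input string) []Token

// runtimeLex stands in for the runtime lexer, i.e. the reference behaviour.
func runtimeLex(input string) []Token {
	var out []Token
	rs := []rune(input)
	for i := 0; i < len(rs); {
		switch r := rs[i]; {
		case unicode.IsSpace(r):
			i++
		case unicode.IsLetter(r) || unicode.IsDigit(r):
			j := i
			for j < len(rs) && (unicode.IsLetter(rs[j]) || unicode.IsDigit(rs[j])) {
				j++
			}
			out = append(out, Token{"Word", string(rs[i:j])})
			i = j
		case r == '=' && i+1 < len(rs) && rs[i+1] == '>':
			// One rune of lookahead decides between "=" and "=>".
			out = append(out, Token{"Arrow", "=>"})
			i += 2
		default:
			out = append(out, Token{"Punct", string(r)})
			i++
		}
	}
	return out
}

// generatedLex stands in for generated code that has drifted: it never
// produces Arrow and emits "=" and ">" as separate tokens instead.
func generatedLex(input string) []Token {
	var out []Token
	for _, t := range runtimeLex(input) {
		if t.Type == "Arrow" {
			out = append(out, Token{"Punct", "="}, Token{"Punct", ">"})
			continue
		}
		out = append(out, t)
	}
	return out
}

// equalTokens compares two token streams element by element.
func equalTokens(a, b []Token) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// Diverges returns every corpus input on which gen disagrees with ref.
func Diverges(ref, gen Tokenizer, corpus []string) []string {
	var bad []string
	for _, in := range corpus {
		if !equalTokens(ref(in), gen(in)) {
			bad = append(bad, in)
		}
	}
	return bad
}

func main() {
	corpus := []string{"a = 1", "x => y", ""}
	fmt.Printf("%q\n", Diverges(runtimeLex, generatedLex, corpus)) // prints ["x => y"]
}
```

In a real test the corpus would need to exercise every lexer feature, otherwise a missing feature such as lookahead would go unnoticed.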

  • Conformance tests for lexer
  • Conformance tests for parser
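For the parser, the "enumerate the nodes" idea might look roughly like the check below. The node types, the `allNodes` registry and `generatorHandles` are all hypothetical stand-ins; the real node set lives in the runtime parser and will differ. `lookahead` is deliberately left out of the generator's type switch so the check has something to report.

```go
package main

import "fmt"

// node and the concrete types below are hypothetical stand-ins for the
// runtime parser's grammar nodes.
type node interface{ isNode() }

type (
	sequence    struct{}
	disjunction struct{}
	capture     struct{}
	lookahead   struct{}
)

func (sequence) isNode()    {}
func (disjunction) isNode() {}
func (capture) isNode()     {}
func (lookahead) isNode()   {}

// allNodes is an assumed registry listing one value of every node type.
var allNodes = []node{sequence{}, disjunction{}, capture{}, lookahead{}}

// generatorHandles mirrors the type switch a code generator would contain.
// lookahead is missing on purpose.
func generatorHandles(n node) bool {
	switch n.(type) {
	case sequence, disjunction, capture:
		return true
	default:
		return false
	}
}

// Unhandled returns the type names of all nodes the generator rejects.
func Unhandled(nodes []node, handles func(node) bool) []string {
	var missing []string
	for _, n := range nodes {
		if !handles(n) {
			missing = append(missing, fmt.Sprintf("%T", n))
		}
	}
	return missing
}

func main() {
	fmt.Println(Unhandled(allNodes, generatorHandles))
}
```

This only moves the problem to keeping the registry complete, so the registry would itself need to be derived from the runtime code or checked against it.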

Problem 2: ergonomics

There are also some ergonomic issues with generating the code, specifically having to keep an implementation of the runtime code lying around as the "source of truth".

Proposal: serialisable form for both lexers and parse trees

  • Serialisable lexer
  • Create participle gen lexer
  • Serialisable parser
  • Create participle gen parser

One solution is to decouple the code generators completely from the code. The lexer and parser would be extended to be JSON-marshallable, and the code generators would become standalone binaries that ingest this serialised form and output code. This might be non-trivial for the parser because it is tightly coupled to reflection (TBD).
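As a rough illustration of the lexer half of this proposal, a generator could decode a JSON lexer definition and emit Go source from it, without importing the user's code at all. The JSON schema (`rules`, `name`, `pattern`), the `Rule` and `Definition` types and the `Generate` function are assumptions made up for this sketch, and the emitted code is just a rule table rather than a real generated lexer.

```go
package main

import (
	"encoding/json"
	"fmt"
	"go/format"
	"strings"
)

// Rule and Definition are a hypothetical serialised lexer form.
type Rule struct {
	Name    string `json:"name"`
	Pattern string `json:"pattern"`
}

type Definition struct {
	Rules []Rule `json:"rules"`
}

// Generate decodes a serialised definition and emits gofmt-ed Go source
// for package pkg. It needs only the JSON, not the runtime lexer.
func Generate(pkg string, serialised []byte) (string, error) {
	var def Definition
	if err := json.Unmarshal(serialised, &def); err != nil {
		return "", fmt.Errorf("decode lexer definition: %w", err)
	}
	var b strings.Builder
	b.WriteString("// Code generated from a serialised lexer definition. DO NOT EDIT.\n\n")
	fmt.Fprintf(&b, "package %s\n\nimport \"regexp\"\n\n", pkg)
	b.WriteString("var rules = []struct {\n\tName string\n\tRe *regexp.Regexp\n}{\n")
	for _, r := range def.Rules {
		// %q renders both values as valid Go string literals.
		fmt.Fprintf(&b, "\t{%q, regexp.MustCompile(%q)},\n", r.Name, "^(?:"+r.Pattern+")")
	}
	b.WriteString("}\n")
	src, err := format.Source([]byte(b.String()))
	if err != nil {
		return "", fmt.Errorf("format generated code: %w", err)
	}
	return string(src), nil
}

func main() {
	// In the proposal this JSON would be produced by marshalling the lexer.
	input := []byte(`{"rules":[{"name":"Ident","pattern":"[a-zA-Z_]\\w*"},{"name":"Int","pattern":"\\d+"}]}`)
	src, err := Generate("lexer", input)
	if err != nil {
		panic(err)
	}
	fmt.Print(src)
}
```

The same split would apply to the parser, but only if its reflection-derived structure can be serialised, which is the open question noted above.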

Another option would be to make standalone binaries that parse the Go code directly, making the Go AST and compile-time type system the intermediate form. This would be much more complicated, though.
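To give a feel for the first step of this alternative, the sketch below uses the standard `go/parser` and `go/ast` packages to pull struct-tag grammar fragments out of Go source without compiling it. `FieldGrammar`, `ExtractGrammar` and the example `Property` struct are hypothetical, and this covers only tag extraction; resolving field types across packages is where most of the extra complexity would come from.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
	"strconv"
)

// FieldGrammar is a hypothetical record of one tagged struct field.
type FieldGrammar struct {
	Struct, Field, Type, Tag string
}

// exampleSrc is an assumed grammar file with grammar fragments in struct tags.
const exampleSrc = "package grammar\n" +
	"type Property struct {\n" +
	"\tKey   string `@Ident \"=\"`\n" +
	"\tValue string `@String`\n" +
	"}\n"

// ExtractGrammar parses Go source and collects every tagged struct field.
func ExtractGrammar(src string) ([]FieldGrammar, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "grammar.go", src, 0)
	if err != nil {
		return nil, err
	}
	var out []FieldGrammar
	ast.Inspect(file, func(n ast.Node) bool {
		ts, ok := n.(*ast.TypeSpec)
		if !ok {
			return true
		}
		st, ok := ts.Type.(*ast.StructType)
		if !ok {
			return true
		}
		for _, f := range st.Fields.List {
			if f.Tag == nil {
				continue
			}
			// The AST keeps the tag with its quotes; strip them.
			tag, err := strconv.Unquote(f.Tag.Value)
			if err != nil {
				continue
			}
			for _, name := range f.Names {
				out = append(out, FieldGrammar{
					Struct: ts.Name.Name,
					Field:  name.Name,
					Type:   types.ExprString(f.Type),
					Tag:    tag,
				})
			}
		}
		return true
	})
	return out, nil
}

func main() {
	fields, err := ExtractGrammar(exampleSrc)
	if err != nil {
		panic(err)
	}
	for _, f := range fields {
		fmt.Printf("%s.%s (%s): %s\n", f.Struct, f.Field, f.Type, f.Tag)
	}
}
```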
