Skip to content

A source of inefficiency in LR(1) parsers. #11

@modulovalue

Description

@modulovalue

Hello @osa1!

There's one observation that I made about a source of inefficiencies in traditional LR(1) parsers here.

I'd love to hear your thoughts on it. I hope to find an algorithm that allows me to split states that would not introduce this redundancy, that is, one that doesn't need a post processing step and only splits states that are necessary.

(below is a copy of the linked comment:)


This grammar demonstrates a source of inefficiency in the automaton of the standard LR(1) construction.

Looking at the LR(1) automaton, I think that state 13 & 18 and state 17 & 12 could be merged and the resulting automaton would still resolve the ambiguity that LR(1) was meant to resolve. only 6 & 9 and 14 & 19 are required to be distinct new states.

This grammar is given as an example in some slides that introduce langcc. However, I'm not sure if langcc eliminates this source of inefficiency.

I don't know what this observation is called in the literature, but I would be interested in finding that out.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions