@Multimodcrafter
Collaborator

Not to be merged; this is simply to trigger CI and check that everything is OK before we open the PR to the main repo.

Multimodcrafter and others added 30 commits March 5, 2025 12:42
This is the first step to supporting captureless lookbehind assertions.
Not recursing into the inner expression of a lookaround is correct under the current assumption that lookarounds cannot contain capture groups. Once that restriction is lifted, however, the missing recursion would be a very subtle bug to track down. Instead, we can do the filtering now and accept that it is a no-op for the time being.
This makes it consistent with the parser's ErrorKind::UnsupportedLookAround.
We require two VM instructions, 'CheckLookaround' and 'WriteLookaround',
to be able to track the state of lookaround expressions at the
current position in the haystack. Both instructions access
a new 'lookaround' vector of booleans, which contains one entry
per lookaround expression in the regex.
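
To illustrate how the two instructions could interact with the 'lookaround' vector, here is a minimal sketch. It is not the code from this PR: the `Inst` enum, its field names, and `run_at_position` are hypothetical simplifications, and the sketch assumes the vector is reset by the caller for each haystack position.

```rust
/// Hypothetical, simplified instruction set (an assumption for illustration,
/// not the actual VM types touched by these commits).
#[derive(Debug, Clone, Copy)]
enum Inst {
    /// Record that lookaround `index` holds at the current haystack position.
    WriteLookaround { index: usize },
    /// Continue only if lookaround `index` holds at the current position.
    CheckLookaround { index: usize },
    /// Report a match.
    Match,
}

/// Runs a straight-line program at a single haystack position.
/// `lookaround` holds one boolean per lookaround expression in the regex.
fn run_at_position(prog: &[Inst], lookaround: &mut [bool]) -> bool {
    for inst in prog {
        match *inst {
            Inst::WriteLookaround { index } => lookaround[index] = true,
            Inst::CheckLookaround { index } => {
                if !lookaround[index] {
                    // The assertion does not hold here, so this thread dies.
                    return false;
                }
            }
            Inst::Match => return true,
        }
    }
    false
}
```

In this sketch a `CheckLookaround` only succeeds if the matching `WriteLookaround` already ran at the same position, which is why the order in which states are visited matters.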
These changes implement the compilation of lookaround assertions from
HIR to NFA. Subexpressions of lookaround assertions are patched to
a top-level reverse union. This is necessary so that the NFA will
explore the innermost subexpression first and thereby make sure that
all subexpression results are available when they need to be checked.
I.e. any `WriteLookaround` state must be visited before any
`CheckLookaround` state with the same index.
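
The ordering requirement stated above can be written down as a small checker. This is only a sketch under assumptions: `Visit` and `writes_precede_checks` are hypothetical names, and the visit order is modeled as a flat list for a single haystack position rather than taken from the real NFA.

```rust
/// Hypothetical stand-in for the two lookaround-related NFA states.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Visit {
    Write(usize),
    Check(usize),
}

/// Returns true if no `Write(i)` is visited after a `Check(i)` with the
/// same index, i.e. every result is available before it is checked.
fn writes_precede_checks(order: &[Visit], lookaround_count: usize) -> bool {
    let mut checked = vec![false; lookaround_count];
    for visit in order {
        match *visit {
            Visit::Check(i) => checked[i] = true,
            Visit::Write(i) => {
                if checked[i] {
                    // A check already consumed a stale value for index `i`.
                    return false;
                }
            }
        }
    }
    true
}
```

For a nested lookbehind where the outer expression (index 0) contains the inner one (index 1), exploring the innermost subexpression first gives the order `Write(1), Check(1), Write(0), Check(0)`, which satisfies the checker. Exploring the outer subexpression first would reach `Check(1)` before `Write(1)` and fail it.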
The machinery necessary to perform the parallel lookbehind checking
should only be compiled in when there is actually a lookbehind expression
in the regex. This restores compilation to the expected outputs for
regexes without lookbehind expressions.
3 participants