<regex>: Loops with bounded number of repetitions and context-dependent empty alternative are mishandled #5797

@muellerj2

Describe the bug

If an ECMAScript regex contains a loop with a bounded number of repetitions and one of the loop's alternatives can match the empty string only in some contexts (here, because of a lookahead assertion), the matcher can reject inputs that the regex should match.

Test case

#include <iostream>
#include <regex>
#include <string>

using namespace std;

int main() {
    regex re("(?:b|c|(?=bc)){3}");
    cout << "regex '(?:b|c|(?=bc)){3}' matches 'bc': " << regex_match("bc", re) << "\n";
    return 0;
}

This program prints "regex '(?:b|c|(?=bc)){3}' matches 'bc': 0", that is, regex_match reports no match.

Expected behavior

The regex should match:

  • The first repetition matches the empty string, since the lookahead assertion succeeds.
  • The second repetition matches 'b'.
  • The third repetition matches 'c'.
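
The derivation above can be checked without <regex> by a small hand-written backtracking matcher. This is only a sketch for this one pattern, not how the STL's matcher works: match_reps and the "<empty>" trace marker are hypothetical names introduced for illustration, and the alternatives are assumed to be tried in written order (b, then c, then the lookahead).

```cpp
#include <cstddef>
#include <initializer_list>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (not part of the STL): tries to match exactly
// `remaining` repetitions of (?:b|c|(?=bc)) starting at `pos`. The last
// repetition must end at the end of `input`, mirroring regex_match.
// On success, `trace` holds what each repetition matched.
bool match_reps(std::string_view input, std::size_t pos, int remaining,
                std::vector<std::string>& trace) {
    if (remaining == 0) {
        return pos == input.size();
    }

    // Alternatives 1 and 2: consume a single 'b' or a single 'c'.
    for (char ch : {'b', 'c'}) {
        if (pos < input.size() && input[pos] == ch) {
            trace.push_back(std::string(1, ch));
            if (match_reps(input, pos + 1, remaining - 1, trace)) {
                return true;
            }
            trace.pop_back();  // backtrack and try the next alternative
        }
    }

    // Alternative 3: the lookahead (?=bc) succeeds only if "bc" follows,
    // and it consumes nothing, so this repetition matches the empty string.
    if (input.substr(pos, 2) == "bc") {
        trace.push_back("<empty>");
        if (match_reps(input, pos, remaining - 1, trace)) {
            return true;
        }
        trace.pop_back();
    }

    return false;
}
```

Calling match_reps("bc", 0, 3, trace) with an empty trace returns true and leaves trace as {"<empty>", "b", "c"}. The sketch first tries b, then c, runs out of input for the third repetition, and backtracks until the first repetition takes the empty lookahead alternative. An empty repetition is acceptable here because all three repetitions are mandatory.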

STL version

Probably every version from the first release that included <regex> up to the current head (although briefly concealed by #5792).

Labels: bug, regex
