re.findall bug parsing "\n" literals??? #125157

@agowa

Bug report

Bug description:

Hi, why does the 2nd command match correctly but the 1st one doesn't?

>>> import re
>>> re.findall(r"(?:^|\n).*\tHEAD\n", 'foo\tHEAD\noo\tHEAD\n')
['foo\tHEAD\n']
>>> re.findall(r"(?:^|\n).*\tHEAD\n", 'foo\tHEAD\n\noo\tHEAD\n')
['foo\tHEAD\n', '\noo\tHEAD\n']

Visualization of what the regex should do (yes, the match-any-character section in the middle is technically not exactly what I want, but I kept it for simplicity since it is unrelated to this bug):
[image: regex visualization]
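For reference, here is a minimal sketch of how the two calls above consume the input, using re.finditer to show each match's span. The helper name `spans` and the variable names are illustrative only. The lookahead variant at the end is just one possible workaround, assuming the goal is to match every line that ends in \tHEAD; it is not necessarily what the original pattern was meant to do.

```python
import re

pattern = r"(?:^|\n).*\tHEAD\n"
one_newline = 'foo\tHEAD\noo\tHEAD\n'
two_newlines = 'foo\tHEAD\n\noo\tHEAD\n'


def spans(pat, text):
    """Return the (start, end) span of each non-overlapping match."""
    return [m.span() for m in re.finditer(pat, text)]


# The first match ends at index 9 and has already consumed the '\n' at
# index 8, so the second line has no leading '\n' left to anchor on.
print(spans(pattern, one_newline))   # [(0, 9)]

# With an extra '\n' in between, the second match can start on it.
print(spans(pattern, two_newlines))  # [(0, 9), (9, 18)]

# Hypothetical workaround: assert the trailing newline with a lookahead
# instead of consuming it, so the next match can still start on it.
lookahead = r"(?:^|\n).*\tHEAD(?=\n)"
print(re.findall(lookahead, one_newline))  # ['foo\tHEAD', '\noo\tHEAD']
```

Note that the lookahead variant no longer includes the trailing newline in each match, so the returned strings differ slightly from the original output.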

Edit: The cause is that findall returns non-overlapping matches. However, this is still strange: shouldn't at least one of these not match anything?

>>> re.findall(r"(?:^|\n).*\tHEAD\n", 'foo\tHEAD\n\noo\tHEAD\n')
['foo\tHEAD\n', '\noo\tHEAD\n']
>>> re.findall("(?:^|\n).*\tHEAD\n", 'foo\tHEAD\n\noo\tHEAD\n')
['foo\tHEAD\n', '\noo\tHEAD\n']
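A likely explanation for why both calls agree: the two pattern strings are different, but they describe the same regex. In the raw string, the re module itself interprets the two-character escapes \n and \t. In the non-raw string, Python has already turned them into literal newline and tab characters, which the regex engine matches literally. The sketch below illustrates this (variable names are illustrative), and also shows \b as an escape where the two forms do diverge.

```python
import re

raw_pat = r"(?:^|\n).*\tHEAD\n"    # contains backslash + 'n', backslash + 't'
plain_pat = "(?:^|\n).*\tHEAD\n"   # contains real newline and tab characters
text = 'foo\tHEAD\n\noo\tHEAD\n'

# The pattern strings themselves are not equal...
print(raw_pat == plain_pat)  # False

# ...but both compile to a regex that matches the same characters.
same = re.findall(raw_pat, text) == re.findall(plain_pat, text)
print(same)  # True

# Where raw vs. non-raw does matter: r"\b" is a word boundary for re,
# while "\b" in a normal string is a backspace character.
print(re.findall(r"\bHEAD\b", text))  # ['HEAD', 'HEAD']
print(re.findall("\bHEAD\b", text))   # []
```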

CPython versions tested on:

3.12

Operating systems tested on:

Linux

Labels: type-bug (An unexpected behavior, bug, or error)
