Problem matching word boundaries (\b) at the end of a string #385

@nsmmrs

Describe the bug
When using regex matching, \b doesn't seem to match word boundaries at the end of a matched string.

Reproducing
See example 5 here: https://bit.ly/4hpMYXb

Example 6 demonstrates that word-boundaries at the beginning of a string match just fine.

Removing the second word-boundary pattern fixes example 5, but breaks example 6:
https://bit.ly/41OGBqG

Expected behavior
I've tried a few other regex engines, and they all treat the end of a string as a word boundary when the last character is a word character. For example, Ruby:

"abc 123 xyz".scan(/\b\w+\b/) #=> ["abc", "123", "xyz"]
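To isolate the end-of-string case, here is a minimal Ruby sketch of the same check. The variable names are illustrative only and do not come from this project. It confirms that `\b` matches both before the first word character and after the last one, and records where each match begins and ends:

```ruby
# Illustrative check only; `text` and the other names are hypothetical.
text = "abc 123 xyz"

# \b right after the first word, which is anchored at the start of the string.
starts_at_boundary = text.match?(/\Aabc\b/)

# \b right after the last word character, immediately before end of string.
ends_at_boundary = text.match?(/\bxyz\b\z/)

# Character offsets [start, end] of every \b\w+\b match.
offsets = []
text.scan(/\b\w+\b/) { offsets << Regexp.last_match.offset(0) }

puts starts_at_boundary
puts ends_at_boundary
p offsets
```

In Ruby the final match ends at the string's length, so the trailing `\b` is satisfied by the end of the string itself, with no following character.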

Also, not sure if this is analogous, but given a file with no trailing newline:

IO.read("regex-test.txt") #=> "abc 123 xyz"

rg gives this output:

rg '\b\w+\b' regex-test.txt -o
1:abc
1:123
1:xyz
