Actually document the `regex` dialect and semantics

While many regex dialects and implementations use similar symbols, they don't necessarily ascribe the same semantics to them. For example, `\d`, `\w`, `\s` and their negations (`\D`, `\W`, `\S`) may be ASCII-only, partially Unicode-aware, or fully Unicode-aware. The fully Unicode-aware interpretation is a lot more expensive than the ASCII-only one, possibly unnecessarily.
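
To illustrate how much the answer can vary even within one engine, here is a minimal sketch using Python's standard `re` module, where `str` patterns are Unicode-aware by default and `re.ASCII` restricts the shorthand classes to ASCII. The helper name `matches` and the sample characters are only illustrative; nothing here says which interpretation this project intends.

```python
import re

# Same pattern text, two different semantics depending on a flag.
unicode_digit = re.compile(r"\d")            # Unicode decimal digits
ascii_digit = re.compile(r"\d", re.ASCII)    # only [0-9]
unicode_word = re.compile(r"\w")             # Unicode word characters
ascii_word = re.compile(r"\w", re.ASCII)     # only [a-zA-Z0-9_]


def matches(pattern: re.Pattern, text: str) -> bool:
    """Return True if the whole of `text` matches `pattern`."""
    return pattern.fullmatch(text) is not None


ARABIC_INDIC_THREE = "\u0663"  # a decimal digit outside ASCII
E_ACUTE = "\u00e9"             # a letter outside ASCII

print(matches(unicode_digit, ARABIC_INDIC_THREE))  # True
print(matches(ascii_digit, ARABIC_INDIC_THREE))    # False
print(matches(unicode_word, E_ACUTE))              # True
print(matches(ascii_word, E_ACUTE))                # False
```

An implementer porting the regexes to another engine has to pick one of these behaviours, which is why the intended one needs to be written down.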

Furthermore, from a performance and memory standpoint, 6e65445448586361f6c5b3389ef619249d56e070 modified the regexes to limit ReDoS risk. However, it did so inconsistently, so it's not entirely clear whether, and under which rules, non-backtracking engines that are *not* sensitive to catastrophic backtracking (e.g. [re2](https://github.com/google/re2), [regex](https://docs.rs/regex/latest/regex/), [regexp](https://pkg.go.dev/regexp), ...) may convert the regexes back to unbounded repetition, since bounded repetitions are also used in semantically relevant contexts. Having a *well defined and consistent* substitute for `*` and `+` (and maybe some rules ensuring new ones don't get added improperly) would allow engines to recognize and substitute them on the fly, which can reduce their memory use and runtime as they no longer need to track the number of iterations.
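
As a rough sketch of what such a rule would enable, suppose the project reserved a single sentinel upper bound for "this used to be `*` or `+`". The value `100`, the constant `SENTINEL_MAX`, and the function `unbound` below are all hypothetical and chosen only for illustration; they are not taken from the commit above.

```python
import re

# Hypothetical: the one upper bound reserved as a stand-in for "unbounded".
SENTINEL_MAX = 100

# Matches quantifiers of the form {0,N} or {1,N}.
# Naive on purpose: it does not account for escaped braces or for braces
# inside character classes.
_BOUNDED = re.compile(r"\{([01]),(\d+)\}")


def unbound(pattern: str, sentinel: int = SENTINEL_MAX) -> str:
    """Rewrite {0,sentinel} to * and {1,sentinel} to +, leaving others alone."""

    def repl(m: re.Match) -> str:
        low, high = m.group(1), int(m.group(2))
        if high != sentinel:
            # A semantically relevant bound, e.g. {1,3}: keep it as written.
            return m.group(0)
        return "*" if low == "0" else "+"

    return _BOUNDED.sub(repl, pattern)


print(unbound(r"Foo/(\d{1,100})\.(\d{1,3})"))  # Foo/(\d+)\.(\d{1,3})
print(unbound(r"a.{0,100}b"))                  # a.*b
```

With a consistent sentinel, the rewrite is a mechanical substitution. Without one, an engine cannot tell a ReDoS mitigation apart from a bound that is part of the intended match.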

Actually document the `regex` dialect and semantics #594
