Add Regex.to_embed/2 #14379
Thank you for the PR. Given
So my suggestion is to add a single function, called
7ab69cd to 0bef2c1
Hi, I have updated the change as you requested. I made one tweak, adding a "strict" option for
FWIW, more modern versions of PCRE may one day support more options than just 'imsx'. Perl itself supports 'u' in embeddable form, although with a slightly different meaning: in Elixir/Erlang/PCRE the /u flag means both "string encoded as Unicode" and "use Unicode semantics", whereas in Perl /u means "use Unicode semantics regardless of the encoding". This is why the exception mentions the current version of PCRE. Hope this is what you had in mind with your feedback!
0bef2c1 to 092407e
to_embed(regex, strict) returns an embeddable representation of regex. For instance, ~r/foo/i can be represented as ~r/(?i-msx:foo)/. If the option :strict is true (the default), it raises an ArgumentError if the regex was compiled with an option/modifier that cannot be represented in an embeddable pattern. If :strict is false, any unembeddable options are silently ignored. This may be perfectly reasonable: for instance, the wrapping pattern may be compiled with the same modifiers as the embedded pattern, or reusing the pattern without the unembeddable modifiers may not change its semantics.
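To make the behaviour described in this commit message concrete, here is a minimal sketch. It is not the implementation in this PR: the module name `EmbedSketch` is hypothetical, and it deliberately takes the pattern source and a modifier string (such as "i" or "xu") instead of a compiled `Regex`, so that it does not depend on how a given Elixir version stores compile options. The trailing newline for the `x` modifier follows the "(?x-ims:foo\n)" example from the PR description.

```elixir
defmodule EmbedSketch do
  # Modifiers that have an inline "(?...)" form, kept in sorted order.
  @embeddable ~c"imsx"

  # Hypothetical helper: `source` is the pattern text, `modifiers` a string
  # of single-letter flags. Returns the embeddable pattern as a string.
  def to_embed(source, modifiers, opts \\ [])
      when is_binary(source) and is_binary(modifiers) do
    strict = Keyword.get(opts, :strict, true)
    mods = modifiers |> String.to_charlist() |> Enum.uniq()

    # Separate the flags we can express inline from the ones we cannot.
    {on, unsupported} = Enum.split_with(mods, &(&1 in @embeddable))

    if strict and unsupported != [] do
      raise ArgumentError,
            "modifiers #{inspect(List.to_string(unsupported))} cannot be embedded"
    end

    # Sort so the output is stable regardless of the input order.
    on = Enum.sort(on)
    off = @embeddable -- on

    disabled = if off == [], do: "", else: "-" <> List.to_string(off)

    # With x enabled, end the source with a newline before the closing ")".
    tail = if ?x in on, do: "\n", else: ""

    "(?" <> List.to_string(on) <> disabled <> ":" <> source <> tail <> ")"
  end
end
```

With this sketch, `EmbedSketch.to_embed("foo", "i")` produces the "(?i-msx:foo)" form mentioned above, while `EmbedSketch.to_embed("foo", "iu")` raises unless `strict: false` is passed, in which case the `u` flag is dropped.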
092407e to 9bce632
Looks great, I have only dropped some minor suggestions now and we can ship it!
Minor fixups and simplifications. Co-authored-by: José Valim <[email protected]>
* Sentences should not start with 'And'.
* Rework sentence about unlisted regex compile options.
* Consistent formatting for the 'strict' option.
also add comment about why we sort the modifiers
Suggestions turned into commits, with one caveat about a compromise wording as noted, and I followed up on your point about to_string(). I didn't squash, so it's easier for you to review; you said previously you didn't mind doing that yourself.
💚 💙 💜 💛 ❤️
This patch, which works on 1.18.x but does not work on 1.19, is an attempt to implement the String.Chars protocol as well as the functions Regex.to_string(), Regex.modifiers(), Regex.to_string!() and Regex.modifiers!().
The idea is to make it possible to safely embed precompiled regexes into other regexes, in a way similar to what Perl supports. The general idea is that ~r/foo/x turns into "(?x-ims:foo\n)", and so on. Thus it should match the same as it would have in its original form when it is embedded into a pattern which has a different set of modifiers.
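As an illustration of the embedding idea, the following sketch builds the embedded form by hand as a plain string and compiles it inside an outer pattern with a different modifier. The patterns and variable names are made up for this example. The comment inside the inner pattern is there to show one plausible reason for the trailing "\n": in extended mode a `#` comment runs to the end of the line, so without the newline the closing parenthesis would presumably become part of the comment.

```elixir
# Embedded form of an extended-mode pattern whose source ends in a comment.
# The "\n" terminates the comment so that ")" still closes the group.
inner = "(?x-ims:foo # literal foo\n)"

# The outer pattern is case-insensitive; the inner group switches that off.
outer = Regex.compile!("^bar" <> inner <> "$", [:caseless])

# "bar" is matched case-insensitively, "foo" is not.
matches_lower_foo = Regex.match?(outer, "BARfoo")
matches_upper_foo = Regex.match?(outer, "BARFOO")

# Without the newline, the ")" is swallowed by the comment and the
# pattern is expected to fail to compile.
broken = Regex.compile("(?x-ims:foo # literal foo)")

IO.inspect({matches_lower_foo, matches_upper_foo, broken})
```

The inner group keeps its own `x` semantics and its case sensitivity regardless of the modifiers of the pattern it is embedded in, which is the property the PR description is after.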
For review by Jose.