docs/Correctly-Using-Regular-Expressions.md
Lines changed: 10 additions & 10 deletions
@@ -23,16 +23,16 @@ When using regexes for secure validation of untrusted input, do the following so
1. If there are any branches (“|”), make sure the alternatives are grouped. You can do this by surrounding them with parentheses like this: “(aa|bb)”. If you don’t need the groups to be captured (you usually don’t), and your platform supports non-capturing groups (most do), it’s usually more efficient to use non-capturing groups: just change “(” into “(?:”.
2. Use a regular expression in its normal mode (not “multiline” mode). Prepend a start-of-string marking (often “^” or “\A”) and append an “end-of-string” marking (often “$” or “\z”, but Python uses “\Z”). Do _not_ use “$” for input validation until you verify that “$” does what you want. See this table for many common platforms:
| Platform | Prepend | Append | $ Permissive? |
| --- | --- | --- | --- |
| Java | “^” or “\A” | “\z”; [“$” works but some documents conflict](./Correctly-Using-Regular-Expressions-Rationale#java)| No |
| PHP | “^” or “\A” | “\z”; “$” with “D” modifier | Yes |
| PCRE | “^” or “\A” | “\z”; “$” with PCRE2_DOLLAR_ENDONLY | Yes |
| Golang, Rust crate regex, and RE2 | “^” or “\A” | “\z” or “$” | No |
| Python | “^” or “\A” | “\Z” (not “$” nor “\z”) | Yes |
| Ruby | “\A” (not “^”) | “\z” (not “$”) | Yes |
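
As a minimal sketch of why step 1 matters, the following Python snippet compares an ungrouped alternation with a grouped one. The names `ungrouped`, `grouped`, and `is_valid` are illustrative only and do not come from the guide; the anchors follow the Python row of the table.

```python
import re

# Ungrouped: parsed as (\Aaa) | (bb\Z), so each anchor binds to only one branch.
ungrouped = re.compile(r"\Aaa|bb\Z")

# Grouped with a non-capturing group: both anchors apply to both alternatives.
grouped = re.compile(r"\A(?:aa|bb)\Z")


def is_valid(pattern, text):
    """Return True if the pattern matches anywhere in text (hypothetical helper)."""
    return pattern.search(text) is not None


print(is_valid(ungrouped, "aaXYZ"))  # True: only the start anchor constrains "aa"
print(is_valid(ungrouped, "XYZbb"))  # True: only the end anchor constrains "bb"
print(is_valid(grouped, "aaXYZ"))    # False: the whole input must be "aa" or "bb"
print(is_valid(grouped, "aa"))       # True
```

Without the group, input that merely starts with “aa” or ends with “bb” passes validation, which is rarely what is intended.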
For example, to validate in JavaScript that the input is only “ab” or “de”, use the regex “<tt>^(ab|de)$</tt>”. To validate the same thing in Python, use “<tt>^(ab|de)\Z</tt>” or “<tt>\A(ab|de)\Z</tt>”. Note that the “$” anchor has different meanings among platforms and is often misunderstood; on many platforms it’s permissive by default and doesn’t match only the end of the input. On a platform where “$” is permissive, consider using an explicit form in its place (e.g., “`\n?\z`” if you really do want to allow an optional trailing newline). Consider preferring “\A” and “\z” where they’re supported (this is necessary when using Ruby).
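
The permissive behavior of “$” can be seen in a short Python sketch. The helper `accepts` and the variable names are hypothetical and used only for illustration; both patterns are compiled in normal (non-multiline) mode, as recommended above.

```python
import re

# "$" in Python also matches just before a single trailing newline.
permissive = re.compile(r"^(ab|de)$")

# "\Z" in Python matches only at the very end of the input.
strict = re.compile(r"\A(ab|de)\Z")


def accepts(pattern, text):
    """Return True if the pattern finds a match in text (hypothetical helper)."""
    return pattern.search(text) is not None


for candidate in ["ab", "de", "ab\n", "abde"]:
    print(repr(candidate), accepts(permissive, candidate), accepts(strict, candidate))
```

Under these assumptions, the input `"ab\n"` is accepted by the “$” pattern but rejected by the “\Z” pattern, while `"ab"` and `"de"` are accepted by both.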