This converts the HTML table to a Markdown table, which allows
including links in cells, whereas this does not seem to work in HTML
tables.
Signed-off-by: Georg Kunz <[email protected]>
docs/Correctly-Using-Regular-Expressions.md
10 additions & 93 deletions
@@ -23,99 +23,16 @@ When using regexes for secure validation of untrusted input, do the following so
1. If there are any branches (“|”), make sure the alternatives are grouped. You can do this by surrounding them with parentheses like this: “(aa|bb)”. If you don’t need the groups to be captured (you usually don’t), and your platform supports non-capturing groups (most do), it’s usually more efficient to use non-capturing groups - just change “(” into “(?:”.
2. Use a regular expression in its normal mode (not “multiline” mode). Prepend a start-of-string marking (often “^” or “\A”) and append an “end-of-string” marking (often “$” or “\z”, but Python uses “\Z”). Do _not_ use “$” for input validation until you verify that “$” does what you want. See this table for many common platforms:
-<table>
-<tr>
-<td>
-Platform
-</td>
-<td>Prepend
-</td>
-<td>Append
-</td>
-<td>$ Permissive?
-</td>
-</tr>
-<tr>
-<td>POSIX BRE, POSIX ERE, and ECMAScript (JavaScript)
-</td>
-<td>“^” (not “\A”)
-</td>
-<td>“$” (not “\z” nor “\Z”)
-</td>
-<td>No
-</td>
-</tr>
-<tr>
-<td>Perl, .NET/C#
-</td>
-<td>“^” or “\A”
-</td>
-<td>“\z” (not “$”)
-</td>
-<td>Yes
-</td>
-</tr>
-<tr>
-<td>Java
-</td>
-<td>“^” or “\A”
-</td>
-<td>“\z”; [“$” works but some documents conflict](./Correctly-Using-Regular-Expressions-Rationale#java)
+| Java | “^” or “\A” | “\z”; [“$” works but some documents conflict](./Correctly-Using-Regular-Expressions-Rationale#java) | No |
+| PHP | “^” or “\A” | “\z”; “$” with “D” modifier | Yes |
+| PCRE | “^” or “\A” | “\z”; “$” with PCRE2_DOLLAR_ENDONLY | Yes |
+| Golang, Rust crate regex, and RE2 | “^” or “\A” | “\z” or “$” | No |
+| Python | “^” or “\A” | “\Z” (not “$” nor “\z”) | Yes |
+| Ruby | “\A” (not “^”) | “\z” (not “$”) | Yes |
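As a rough illustration of what a “Yes” in the “$ Permissive?” column means, here is a minimal sketch using Python's standard `re` module. The variable names and the sample input are hypothetical, chosen only for this example. In Python's default (non-multiline) mode, “$” also matches just before a single trailing newline, while “\Z” matches only at the very end of the string.

```python
import re

# "$" is permissive in Python: it matches at the end of the string AND
# just before a single trailing newline.
permissive = re.compile(r"^(?:ab|de)$")

# "\Z" matches only at the very end of the string.
strict = re.compile(r"\A(?:ab|de)\Z")

untrusted = "ab\n"  # hypothetical input that ends with a newline

print(bool(permissive.search(untrusted)))  # True: the newline slips through
print(bool(strict.search(untrusted)))      # False: rejected as intended
print(bool(strict.search("ab")))           # True: exact input still accepted
```

On platforms marked “No”, the equivalent “$” pattern would be expected to reject the newline-terminated input; that behavior is not exercised by this sketch.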
For example, to validate in JavaScript that the input is only “ab” or “de”, use the regex “<tt>^(ab|de)$</tt>”. To validate the same thing in Python, use “<tt>^(ab|de)\Z</tt>” or “<tt>\A(ab|de)\Z</tt>”. Note that the “$” anchor has different meanings among platforms and is often misunderstood; on many platforms it’s permissive by default and doesn’t match only the end of the input (typically it also matches just before a trailing newline). On a platform where “$” is permissive, consider using an explicit form instead (e.g., “`\n?\z`”). Consider preferring “\A” and “\z” where they’re supported (this is necessary when using Ruby).
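The grouping advice in step 1 can be sketched in Python as follows. This is only an illustration: `is_valid`, `ungrouped`, and `grouped` are hypothetical names, and the non-capturing group “(?:” is used in place of the capturing group shown in the text. Without the group, each anchor binds to only one alternative, so the pattern accepts far more than intended.

```python
import re

# Ungrouped: parsed as "\Aab" OR "de\Z", so each anchor constrains only
# one branch of the alternation.
ungrouped = re.compile(r"\Aab|de\Z")

# Grouped (non-capturing): both anchors apply to every alternative.
grouped = re.compile(r"\A(?:ab|de)\Z")


def is_valid(text: str) -> bool:
    """Return True only if text is exactly "ab" or "de" (hypothetical helper)."""
    return grouped.search(text) is not None


for candidate in ["ab", "de", "abXYZ", "XYZde"]:
    # Columns: input, result of the ungrouped pattern, result of the grouped one.
    print(repr(candidate), bool(ungrouped.search(candidate)), is_valid(candidate))
```

The ungrouped pattern accepts all four inputs, while the grouped pattern accepts only “ab” and “de”.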