Commit 54a8f9e

Swift: Copy qhelp from Ruby.
1 parent 4a46946 commit 54a8f9e

3 files changed

+82
-0
lines changed

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+<!DOCTYPE qhelp PUBLIC "-//Semmle//qhelp//EN" "qhelp.dtd">
+<qhelp>
+<include src="ReDoSIntroduction.inc.qhelp" />
+<example>
+<p>Consider this regular expression:</p>
+<sample language="ruby">
+/^_(__|.)+_$/</sample>
+<p>
+Its sub-expression <code>"(__|.)+"</code> can match the string
+<code>"__"</code> either by the first alternative <code>"__"</code> to the
+left of the <code>"|"</code> operator, or by two repetitions of the second
+alternative <code>"."</code> to the right. Thus, a string consisting of an
+odd number of underscores followed by some other character will cause the
+regular expression engine to run for an exponential amount of time before
+rejecting the input.
+</p>
+<p>
+This problem can be avoided by rewriting the regular expression to remove
+the ambiguity between the two branches of the alternative inside the
+repetition:
+</p>
+<sample language="ruby">
+/^_(__|[^_])+_$/</sample>
+</example>
+<include src="ReDoSReferences.inc.qhelp"/>
+</qhelp>
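
The ambiguity described in the example can be made concrete by counting how many ways the sub-expression can split a run of underscores. The following is a minimal sketch in Ruby, the language of the samples. The helper name `ambiguous_parses` is hypothetical and is not part of this commit, and the count models the number of candidate splits rather than the exact steps any particular engine performs.

```ruby
# Hypothetical helper (not part of this commit): counts the distinct ways the
# sub-expression (__|.)+ can consume a run of n underscores.
def ambiguous_parses(n)
  return 0 if n < 1 # "+" needs at least one repetition

  ways = Array.new(n + 1, 0)
  ways[0] = 1 # base case: nothing left to consume
  (1..n).each do |i|
    ways[i] = ways[i - 1]            # last repetition matched "." (one character)
    ways[i] += ways[i - 2] if i >= 2 # last repetition matched "__" (two characters)
  end
  ways[n]
end

puts ambiguous_parses(2)  # 2
puts ambiguous_parses(20) # 10946
```

The counts follow the Fibonacci sequence, so they grow exponentially with the length of the run. A backtracking engine may have to try each of these splits before rejecting an input that cannot match. With the rewritten expression, only the `__` branch can match an underscore, so a run of underscores has at most one split.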
Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+<!DOCTYPE qhelp PUBLIC "-//Semmle//qhelp//EN" "qhelp.dtd">
+<qhelp>
+<overview>
+<p>
+Some regular expressions take a long time to match certain input strings,
+to the point where the time it takes to match a string of length <i>n</i>
+is proportional to <i>n<sup>k</sup></i> or even <i>2<sup>n</sup></i>.
+Such regular expressions can negatively affect performance, or even allow
+a malicious user to perform a Denial of Service ("DoS") attack by crafting
+an expensive input string for the regular expression to match.
+</p>
+<p>
+The regular expression engine used by the Ruby interpreter (MRI) uses
+backtracking non-deterministic finite automata to implement regular
+expression matching. While this approach is space-efficient and allows
+supporting advanced features like capture groups, it is not time-efficient
+in general. The worst-case time complexity of such an automaton can be
+polynomial or even exponential, meaning that for strings of a certain
+shape, increasing the input length by ten characters may make the
+automaton about 1000 times slower.
+</p>
+<p>
+Note that Ruby 3.2 and later implement a caching mechanism that
+eliminates this worst-case behavior for the regular
+expressions flagged by this query. The regular expressions flagged by this
+query are therefore only problematic for Ruby versions prior to 3.2.
+</p>
+<p>
+Typically, a regular expression is affected by this problem if it contains
+a repetition of the form <code>r*</code> or <code>r+</code> where the
+sub-expression <code>r</code> is ambiguous in the sense that it can match
+some string in multiple ways. More information about the precise
+circumstances can be found in the references.
+</p>
+</overview>
+<recommendation>
+<p>
+Modify the regular expression to remove the ambiguity, or ensure that the
+strings matched with the regular expression are short enough that the
+time complexity does not matter.
+</p>
+</recommendation>
+</qhelp>
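
The second half of the recommendation, keeping matched strings short, can be sketched as a length check in front of the match. This is a minimal illustration in Ruby. The constant `MAX_INPUT_LENGTH`, its value of 64, and the helper `bounded_match?` are assumptions chosen for the example and do not come from this commit.

```ruby
# Hypothetical limit; choose a bound that fits the application's inputs.
MAX_INPUT_LENGTH = 64

# The rewritten, unambiguous pattern from the example in the first file.
SAFE_PATTERN = /^_(__|[^_])+_$/

# Returns true only for inputs within the limit that match the pattern.
# Over-long inputs are rejected before the regular expression engine runs.
def bounded_match?(input)
  return false if input.length > MAX_INPUT_LENGTH

  SAFE_PATTERN.match?(input)
end
```

Here the length check is combined with the unambiguous pattern, so the bound acts as a second line of defense. For a pattern that cannot be rewritten, the bound alone limits how much work a single match can cost.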
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+<!DOCTYPE qhelp PUBLIC "-//Semmle//qhelp//EN" "qhelp.dtd">
+<qhelp>
+<references>
+<li>OWASP:
+<a href="https://www.owasp.org/index.php/Regular_expression_Denial_of_Service_-_ReDoS">Regular expression Denial of Service - ReDoS</a>.
+</li>
+<li>Wikipedia: <a href="https://en.wikipedia.org/wiki/ReDoS">ReDoS</a>.</li>
+<li>Wikipedia: <a href="https://en.wikipedia.org/wiki/Time_complexity">Time complexity</a>.</li>
+<li>James Kirrage, Asiri Rathnayake, Hayo Thielecke:
+<a href="http://www.cs.bham.ac.uk/~hxt/research/reg-exp-sec.pdf">Static Analysis for Regular Expression Denial-of-Service Attack</a>.
+</li>
+</references>
+</qhelp>
