Skip to content

Commit a536069

Browse files
authored
Merge pull request github#3408 from esbena/js/unsafe-html-expansion
Approved by asgerf, mchammer01
2 parents c06680a + c6fa88a commit a536069

File tree

9 files changed

+216
-0
lines changed

9 files changed

+216
-0
lines changed

change-notes/1.25/analysis-javascript.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -15,6 +15,7 @@
1515
|---------------------------------------------------------------------------------|-------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
1616
| Cross-site scripting through DOM (`js/xss-through-dom`) | security, external/cwe/cwe-079, external/cwe/cwe-116 | Highlights potential XSS vulnerabilities where existing text from the DOM is used as HTML. Results are not shown on LGTM by default. |
1717
| Incomplete HTML attribute sanitization (`js/incomplete-html-attribute-sanitization`) | security, external/cwe/cwe-20, external/cwe/cwe-079, external/cwe/cwe-116 | Highlights potential XSS vulnerabilities due to incomplete sanitization of HTML meta-characters. Results are shown on LGTM by default. |
18+
| Unsafe expansion of self-closing HTML tag (`js/unsafe-html-expansion`) | security, external/cwe/cwe-079, external/cwe/cwe-116 | Highlights potential XSS vulnerabilities caused by unsafe expansion of self-closing HTML tags. |
1819

1920
## Changes to existing queries
2021

Lines changed: 99 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,99 @@
1+
<!DOCTYPE qhelp PUBLIC
2+
"-//Semmle//qhelp//EN"
3+
"qhelp.dtd">
4+
<qhelp>
5+
6+
<overview>
7+
<p>
8+
9+
Sanitizing untrusted input for HTML meta-characters is an
10+
important technique for preventing cross-site scripting attacks. But
11+
even a sanitized input can be dangerous to use if it is modified
12+
further before a browser treats it as HTML.
13+
14+
A seemingly innocent transformation that expands a
15+
self-closing HTML tag from <code>&lt;div attr="{sanitized}"/&gt;</code>
16+
to <code>&lt;div attr="{sanitized}"&gt;&lt;/div&gt;</code> may
17+
in fact cause cross-site scripting vulnerabilities.
18+
19+
</p>
20+
21+
</overview>
22+
23+
<recommendation>
24+
<p>
25+
26+
Use a well-tested sanitization library if at all
27+
possible, and avoid modifying sanitized values further before treating
28+
them as HTML.
29+
30+
</p>
31+
</recommendation>
32+
33+
<example>
34+
35+
<p>
36+
37+
The following function transforms a self-closing HTML tag
38+
to a pair of open/close tags. It does so for all non-<code>img</code>
39+
and non-<code>area</code> tags, by using a regular expression with two
40+
capture groups. The first capture group corresponds to the name of the
41+
tag, and the second capture group to the content of the tag.
42+
43+
</p>
44+
45+
<sample src="examples/UnsafeHtmlExpansion.js" />
46+
47+
<p>
48+
49+
While it is generally known regular expressions are
50+
ill-suited for parsing HTML, variants of this particular transformation
51+
pattern have long been considered safe.
52+
53+
</p>
54+
55+
<p>
56+
57+
However, the function is not safe. As an example, consider
58+
the following string:
59+
60+
61+
</p>
62+
63+
<sample src="examples/UnsafeHtmlExpansion-original.html" />
64+
65+
<p>
66+
67+
When the above function transforms the string, it becomes
68+
a string that results in an alert when a browser treats it as HTML.
69+
70+
</p>
71+
72+
<sample src="examples/UnsafeHtmlExpansion-transformed.html" />
73+
74+
</example>
75+
76+
<references>
77+
<li>jQuery:
78+
<a href="https://blog.jquery.com/2020/04/10/jquery-3-5-0-released/">Security fixes in jQuery 3.5.0</a>
79+
</li>
80+
<li>
81+
OWASP:
82+
<a href="https://cheatsheetseries.owasp.org/cheatsheets/DOM_based_XSS_Prevention_Cheat_Sheet.html">DOM based
83+
XSS Prevention Cheat Sheet</a>.
84+
</li>
85+
<li>
86+
OWASP:
87+
<a href="https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html">XSS
88+
(Cross Site Scripting) Prevention Cheat Sheet</a>.
89+
</li>
90+
<li>
91+
OWASP
92+
<a href="https://owasp.org/www-community/Types_of_Cross-Site_Scripting">Types of Cross-Site</a>.
93+
</li>
94+
<li>
95+
Wikipedia: <a href="http://en.wikipedia.org/wiki/Cross-site_scripting">Cross-site scripting</a>.
96+
</li>
97+
</references>
98+
99+
</qhelp>
Lines changed: 58 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,58 @@
1+
/**
2+
* @name Unsafe expansion of self-closing HTML tag
3+
* @description Using regular expressions to expand self-closing HTML
4+
* tags may lead to cross-site scripting vulnerabilities.
5+
* @kind problem
6+
* @problem.severity warning
7+
* @precision very-high
8+
* @id js/unsafe-html-expansion
9+
* @tags correctness
10+
* security
11+
* external/cwe/cwe-079
12+
* external/cwe/cwe-116
13+
*/
14+
15+
import javascript
16+
17+
/**
18+
* A regular expression that captures the name and content of a
19+
* self-closing HTML tag such as `<div id='foo'/>`.
20+
*/
21+
class SelfClosingTagRecognizer extends DataFlow::RegExpCreationNode {
22+
SelfClosingTagRecognizer() {
23+
exists(RegExpSequence seq, RegExpGroup name, RegExpGroup content |
24+
// `/.../g`
25+
RegExp::isGlobal(this.getFlags()) and
26+
this.getRoot() = seq.getRootTerm() and
27+
// `/<.../`
28+
seq.getChild(0).getConstantValue() = "<" and
29+
// `/...\/>/`
30+
seq.getLastChild().getPredecessor().getConstantValue() = "/" and
31+
seq.getLastChild().getConstantValue() = ">" and
32+
// `/...((...)...).../`
33+
seq.getAChild() = content and
34+
content.getNumber() = 1 and
35+
name.getNumber() = 2 and
36+
name = content.getChild(0).(RegExpSequence).getChild(0) and
37+
// `/...(([a-z]+)...).../` or `/...(([a-z][...]*)...).../`
38+
exists(RegExpQuantifier quant | name.getAChild*() = quant |
39+
quant instanceof RegExpStar or
40+
quant instanceof RegExpPlus
41+
) and
42+
// `/...((...)[^>]*).../`
43+
exists(RegExpCharacterClass lazy |
44+
name.getSuccessor().(RegExpStar).getChild(0) = lazy and
45+
lazy.isInverted() and
46+
lazy.getAChild().getConstantValue() = ">"
47+
)
48+
)
49+
}
50+
}
51+
52+
from SelfClosingTagRecognizer regexp, StringReplaceCall replace
53+
where
54+
regexp.getAReference().flowsTo(replace.getArgument(0)) and
55+
replace.getRawReplacement().mayHaveStringValue("<$1></$2>")
56+
select replace,
57+
"This self-closing HTML tag expansion invalidates prior sanitization as $@ may match part of an attribute value.",
58+
regexp, "this regular expression"
Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
<div alt="
2+
<x" title="/>
3+
<img src=url404 onerror=alert(1)>"/>
Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
<img alt="
2+
<x" title="></x" >
3+
<img src=url404 onerror=alert(1)>"/>
Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,4 @@
1+
function expandSelfClosingTags(html) {
2+
var rxhtmlTag = /<(?!img|area)(([a-z][^\w\/>]*)[^>]*)\/>/gi;
3+
return html.replace(rxhtmlTag, "<$1></$2>"); // BAD
4+
}
Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
| UnsafeHtmlExpansion.js:6:2:9:2 | html.re ... nded\\n\\t) | This self-closing HTML tag expansion invalidates prior sanitization as $@ may match part of an attribute value. | UnsafeHtmlExpansion.js:7:3:7:95 | /<(?!ar ... )\\/>/gi | this regular expression |
2+
| UnsafeHtmlExpansion.js:10:2:10:68 | html.re ... panded) | This self-closing HTML tag expansion invalidates prior sanitization as $@ may match part of an attribute value. | UnsafeHtmlExpansion.js:10:15:10:57 | /<(([a- ... )\\/>/gi | this regular expression |
3+
| UnsafeHtmlExpansion.js:13:2:16:2 | html.re ... nded\\n\\t) | This self-closing HTML tag expansion invalidates prior sanitization as $@ may match part of an attribute value. | UnsafeHtmlExpansion.js:14:3:14:75 | /<(?!ar ... )\\/>/gi | this regular expression |
4+
| UnsafeHtmlExpansion.js:17:2:17:48 | html.re ... panded) | This self-closing HTML tag expansion invalidates prior sanitization as $@ may match part of an attribute value. | UnsafeHtmlExpansion.js:17:15:17:37 | /<(([\\w ... )\\/>/gi | this regular expression |
5+
| UnsafeHtmlExpansion.js:20:2:23:2 | html.re ... nded\\n\\t) | This self-closing HTML tag expansion invalidates prior sanitization as $@ may match part of an attribute value. | UnsafeHtmlExpansion.js:21:3:21:76 | /<(?!ar ... )\\/>/gi | this regular expression |
6+
| UnsafeHtmlExpansion.js:24:2:24:49 | html.re ... panded) | This self-closing HTML tag expansion invalidates prior sanitization as $@ may match part of an attribute value. | UnsafeHtmlExpansion.js:24:15:24:38 | /<(([\\w ... )\\/>/gi | this regular expression |
7+
| UnsafeHtmlExpansion.js:26:2:26:39 | html.re ... panded) | This self-closing HTML tag expansion invalidates prior sanitization as $@ may match part of an attribute value. | UnsafeHtmlExpansion.js:2:23:2:45 | /<(([\\w ... )\\/>/gi | this regular expression |
8+
| UnsafeHtmlExpansion.js:30:2:30:37 | html.re ... panded) | This self-closing HTML tag expansion invalidates prior sanitization as $@ may match part of an attribute value. | UnsafeHtmlExpansion.js:2:23:2:45 | /<(([\\w ... )\\/>/gi | this regular expression |
Lines changed: 39 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,39 @@
1+
(function(){
2+
let defaultPattern = /<(([\w:]+)[^>]*)\/>/gi;
3+
let expanded = "<$1></$2>";
4+
5+
// lib1
6+
html.replace(
7+
/<(?!area|br|col|embed|hr|img|input|link|meta|param)(([a-z][^\/\0>\x20\t\r\n\f]*)[^>]*)\/>/gi,
8+
expanded
9+
); // NOT OK
10+
html.replace(/<(([a-z][^\/\0>\x20\t\r\n\f]*)[^>]*)\/>/gi, expanded); // NOT OK
11+
12+
// lib2
13+
html.replace(
14+
/<(?!area|br|col|embed|hr|img|input|link|meta|param)(([\w:]+)[^>]*)\/>/gi,
15+
expanded
16+
); // NOT OK
17+
html.replace(/<(([\w:]+)[^>]*)\/>/gi, expanded); // NOT OK
18+
19+
// lib3
20+
html.replace(
21+
/<(?!area|br|col|embed|hr|img|input|link|meta|param)(([\w:-]+)[^>]*)\/>/gi,
22+
expanded
23+
); // NOT OK
24+
html.replace(/<(([\w:-]+)[^>]*)\/>/gi, expanded); // NOT OK
25+
26+
html.replace(defaultPattern, expanded); // NOT OK
27+
function getPattern() {
28+
return defaultPattern;
29+
}
30+
html.replace(getPattern(), expanded); // NOT OK
31+
32+
function getExpanded() {
33+
return expanded;
34+
}
35+
html.replace(defaultPattern, getExpanded()); // NOT OK (but not tracking the expansion string)
36+
html.replace(defaultPattern, something); // OK (possibly)
37+
defaultPattern.match(something); // OK (possibly)
38+
getPattern().match(something); // OK (possibly)
39+
});
Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
Security/CWE-116/UnsafeHtmlExpansion.ql

0 commit comments

Comments
 (0)