Commit cbcd16f

lxml/libxml2 no longer drops control characters while parsing HTML input
Newer libxml2 releases, such as libxml2 2.14, keep control characters in an encoded, neutralized form instead of dropping them, which defuses some evil attempts. Adjust the expected result text so that the doctest passes.
1 parent 7af4b51 commit cbcd16f

1 file changed, +1 -1 lines changed

tests/test_clean.txt

Lines changed: 1 addition & 1 deletion
@@ -84,7 +84,7 @@
 <body onload="evil_function()">
 <!-- I am interpreted for EVIL! -->
 <a href="javascript:evil_function()">a link</a>
-<a href="javascrip%20t%20:evil_function()">a control char link</a>
+<a href="j%01a%02v%03a%04s%05c%06r%07i%0Ep%20t%20:evil_function()">a control char link</a>
 <a href="data:text/html;base64,PHNjcmlwdD5hbGVydCgidGVzdCIpOzwvc2NyaXB0Pg==">data</a>
 <a href="#" onclick="evil_function()">another link</a>
 <p onclick="evil_function()">a paragraph</p>
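To illustrate the difference between the two expected lines, here is a minimal standard-library sketch. It only approximates the two behaviors with `urllib.parse.quote`; it is not libxml2's actual escaping code. The raw `href` value, the `SAFE` character set, and both function names are assumptions made for illustration, inferred from the diff rather than taken from the test file.

```python
import re
from urllib.parse import quote

# Hypothetical raw href, inferred from the diff; the real test input is not shown here.
RAW_HREF = "j\x01a\x02v\x03a\x04s\x05c\x06r\x07i\x0ep t :evil_function()"

# Characters left unescaped in this sketch (an assumption, not libxml2's rule set).
SAFE = ":()"


def escape_dropping_controls(href: str) -> str:
    """Approximate the older behavior: control characters vanish before escaping."""
    without_controls = re.sub(r"[\x00-\x1f]", "", href)
    return quote(without_controls, safe=SAFE)


def escape_keeping_controls(href: str) -> str:
    """Approximate the newer behavior: control characters survive as %XX escapes."""
    return quote(href, safe=SAFE)


print(escape_dropping_controls(RAW_HREF))
print(escape_keeping_controls(RAW_HREF))
```

Under these assumptions, the first call reproduces the removed line's `href` and the second reproduces the added line's `href`, which is why only the expected text in the doctest had to change.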
