Commit e4c3037

Merge pull request #13557 from am0o0/amammad-python-bombs
Python: Decompression Bombs
2 parents a9abba5 + 09d8a75 commit e4c3037

11 files changed, with 742 additions and 0 deletions
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
<!DOCTYPE qhelp PUBLIC
  "-//Semmle//qhelp//EN"
  "qhelp.dtd">
<qhelp>
<overview>
<p>Extracting compressed files with any compression algorithm, such as gzip, can lead to denial-of-service attacks.</p>
<p>An attacker can take a huge file made of repeated, similar bytes and compress it into a very small file that expands back to its full size when extracted.</p>

</overview>
<recommendation>

<p>When decompressing a user-provided file, either check the decompression ratio or read the decompressed data in a loop, one chunk at a time, so that the total decompressed size can be checked on each iteration.</p>

</recommendation>
<example>
<p>The Python zipfile library is vulnerable by default:</p>
<sample src="example_bad.py" />

<p>The decompressed size recorded in the zip file lets you compute the decompression ratio, but attackers can forge that size header too,
so you cannot rely on the file_size attribute of the ZipInfo class. Use the "ZipFile.open" method instead, which lets you control how much decompressed data is read.</p>
<p>Reading the decompressed file one chunk at a time and verifying the running total on each loop iteration is the recommended approach for any decompression library.</p>
<sample src="example_good.py" />
</example>
<references>

<li>
<a href="https://nvd.nist.gov/vuln/detail/CVE-2023-22898">CVE-2023-22898</a>
</li>
<li>
<a href="https://www.bamsoftware.com/hacks/zipbomb/">Research on increasing the impact of this kind of attack</a>
</li>

</references>
</qhelp>
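The overview and recommendation in this help file can be illustrated with a short sketch. It uses Python's standard `zlib` module, which is an assumption made for illustration (the help text itself only names gzip and zipfile). The helper name `bounded_decompress` and the size limits are hypothetical and are not part of the commit.

```python
import zlib


def bounded_decompress(data, max_size):
    """Decompress zlib data, refusing to produce more than max_size bytes.

    Hypothetical helper, shown for illustration only.
    """
    decompressor = zlib.decompressobj()
    # max_length caps the output of this call; asking for one extra byte
    # makes it possible to tell "exactly at the limit" from "over the limit".
    output = decompressor.decompress(data, max_size + 1)
    if len(output) > max_size:
        raise ValueError("decompressed data exceeds the allowed size")
    return output


# 10 MB of repeated zero bytes compresses to a very small payload.
original = b"\x00" * (10 * 1024 * 1024)
payload = zlib.compress(original)
ratio = len(original) / len(payload)
print(f"compressed size: {len(payload)} bytes, ratio: {ratio:.0f}:1")
```

Capping the output, rather than trusting a declared size, means the limit holds even when the size fields of the input are forged.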
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
/**
 * @name Decompression Bomb
 * @description Uncontrolled data that flows into decompression library APIs without a check on the compression ratio is dangerous
 * @kind path-problem
 * @problem.severity error
 * @security-severity 7.8
 * @precision high
 * @id py/decompression-bomb
 * @tags security
 *       experimental
 *       external/cwe/cwe-409
 */

import python
import experimental.semmle.python.security.DecompressionBomb
import BombsFlow::PathGraph

from BombsFlow::PathNode source, BombsFlow::PathNode sink
where BombsFlow::flowPath(source, sink)
select sink.getNode(), source, sink, "This uncontrolled file extraction $@.", source.getNode(),
  "depends on this user-controlled data"
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
import zipfile


def Bad(zip_path):
    # Vulnerable: extracts every member with no limit on the decompressed size
    zipfile.ZipFile(zip_path, "r").extractall()
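To see why an unbounded `extractall()` call is risky, the following sketch builds a small archive in memory and compares its size with the size it would expand to. The member name `zeros.bin` and the 10 MB payload are made up for illustration; a real bomb would be far larger.

```python
import io
import zipfile

# Build an archive in memory holding 10 MB of zero bytes (illustrative payload).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("zeros.bin", b"\x00" * (10 * 1024 * 1024))

archive_size = len(buffer.getvalue())

# Sum the declared uncompressed sizes; extractall() would write this much data.
with zipfile.ZipFile(buffer) as archive:
    expanded_size = sum(info.file_size for info in archive.infolist())

print(f"archive: {archive_size} bytes, expands to: {expanded_size} bytes")
```

The declared sizes are accurate here only because the archive was built locally; in an untrusted archive they can be forged, so they are not a reliable safeguard.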
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
import zipfile


def safeUnzip(zipFileName):
    '''
    safeUnzip reads each file inside the zip file in 1 MB chunks and,
    while extracting or reading these files, checks that the total
    decompressed size doesn't exceed SIZE_THRESHOLD
    '''
    buffer_size = 1024 * 1024 * 1  # 1 MB
    total_size = 0
    SIZE_THRESHOLD = 1024 * 1024 * 10  # 10 MB
    with zipfile.ZipFile(zipFileName) as myzip:
        for fileinfo in myzip.infolist():
            with myzip.open(fileinfo.filename, mode="r") as myfile:
                content = b''
                chunk = myfile.read(buffer_size)
                # keep reading until the end of the uncompressed data
                while chunk:
                    # count the bytes actually read, not the requested buffer size
                    total_size += len(chunk)
                    if total_size > SIZE_THRESHOLD:
                        print("Bomb detected")
                        return False  # it isn't a successful extract or read
                    content += chunk
                    chunk = myfile.read(buffer_size)

                # An example of extracting or reading each decompressed file here
                print(bytes.decode(content, 'utf-8'))
    return True  # it is a successful extract or read
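The same chunked-read pattern carries over to other decompression libraries. Below is a minimal sketch for gzip, assuming the standard `gzip` module. The function name `safe_gunzip`, its parameters, and the default limits are hypothetical and not part of the commit.

```python
import gzip
import io


def safe_gunzip(stream, max_size=10 * 1024 * 1024, chunk_size=1024 * 1024):
    """Return the decompressed bytes of a gzip stream, or None if too large.

    Hypothetical helper, shown for illustration only.
    """
    total_size = 0
    chunks = []
    with gzip.GzipFile(fileobj=stream, mode="rb") as compressed:
        while True:
            chunk = compressed.read(chunk_size)
            if not chunk:
                break  # end of the decompressed data
            total_size += len(chunk)
            if total_size > max_size:
                return None  # treat as a possible decompression bomb
            chunks.append(chunk)
    return b"".join(chunks)


# Usage with an in-memory stream standing in for a user-provided file.
small = io.BytesIO(gzip.compress(b"hello world"))
result = safe_gunzip(small)
print(result)
```

Returning `None` instead of raising mirrors the `return False` convention of the sample above; raising an exception would be an equally reasonable design.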
