Commit cfa761f

fix(skip_list): extend default skip list with file types generating noise due to the presence of compressed streams.
The following magic types are now skipped by default:
- Macromedia Flash data (swf files holding zlib streams)
- MPEG (audio/video files holding compressed streams)
- Printer Job Language (HP pjl files with compressed content)
- Erlang BEAM file
- Microsoft OOXML (open document format, can contain compressed streams)
- PE32 (extended the rule to cover any kind of PE file)
- python (specifically byte-compiled python files like .pyo and .pyc, as they can contain compressed streams)
- Composite Document File V2 Document (Thumbs.db files)
- Windows Embedded CE binary image (will be removed when we have a dedicated handler)
1 parent ef4c909 commit cfa761f

1 file changed: +9 -1 lines changed

unblob/processing.py

Lines changed: 9 additions & 1 deletion
@@ -53,14 +53,22 @@
     "PDF document",
     "magic binary file",
     "MS Windows icon resource",
-    "PE32+ executable (EFI application)",
+    "PE32",
     "Web Open Font Format",
     "GNU message catalog",
     "Xilinx BIT data",
     "Microsoft Excel",
     "Microsoft Word",
     "Microsoft PowerPoint",
+    "Microsoft OOXML",
     "OpenDocument",
+    "Macromedia Flash data",
+    "MPEG",
+    "HP Printer Job Language",
+    "Erlang BEAM file",
+    "python", # (e.g. python 2.7 byte-compiled)
+    "Composite Document File V2 Document",
+    "Windows Embedded CE binary image",
 )

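The commit message says that replacing "PE32+ executable (EFI application)" with "PE32" extends the rule to any kind of PE file, which is what one would expect if each entry is compared against the start of the file's magic description. The sketch below illustrates that idea only. It assumes prefix matching, and the names `SKIP_MAGIC_PREFIXES` and `should_skip` are hypothetical, not taken from unblob/processing.py.

```python
# Hypothetical illustration; these names are NOT from unblob/processing.py.
# Assumption: a skip entry matches when the magic description starts with it.
SKIP_MAGIC_PREFIXES = (
    "PE32",
    "Microsoft OOXML",
    "Macromedia Flash data",
    "MPEG",
    "HP Printer Job Language",
    "Erlang BEAM file",
    "python",
    "Composite Document File V2 Document",
    "Windows Embedded CE binary image",
)


def should_skip(magic_description: str, prefixes=SKIP_MAGIC_PREFIXES) -> bool:
    """Return True if the description starts with any of the skip prefixes."""
    # str.startswith accepts a tuple and checks each candidate prefix in turn.
    return magic_description.startswith(tuple(prefixes))


if __name__ == "__main__":
    for description in (
        "PE32+ executable (EFI application) x86-64",
        "PE32 executable (GUI) Intel 80386, for MS Windows",
        "python 2.7 byte-compiled",
        "gzip compressed data",
    ):
        print(f"{description!r}: skip={should_skip(description)}")
```

Under this assumption the shorter "PE32" entry covers both "PE32" and "PE32+" descriptions, while unrelated descriptions such as "gzip compressed data" are still processed.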