Address more extraction edge cases; improve naming and consistency #733
stevebeattie merged 14 commits into chainguard-dev:main from
ac5f66d to
b815c32
Signed-off-by: egibs <20933572+egibs@users.noreply.github.com>
0009e80 to
0acf3ec
Uncovered a few more edge cases while working on this, namely:
FYI, this specific instance looks to be a bug in the wolfi package for calico, in the build here https://github.com/wolfi-dev/os/blob/4e9ab93174c17cae8e9defb012b4a731621be1af/calico-3.29.yaml#L206-L212 ; it's trying to replicate the gpl ebpf tarball build from the upstream build process at https://github.com/projectcalico/calico/blob/810091a6acf98ecc891c5bc3c07374a786d68743/node/Makefile#L286-L290 but it dropped the
Do you have specific examples? The only example I could find on a running Ubuntu system was
Overall, though probably outside the scope of this PR, I'd like to see such files flagged by malcontent, as they are pretty rarely legit; they are probably either:
Thanks.
The example you found was the one I came across. TBD if there are others in other distros. I like the idea of flagging files for further review, but the changes in this PR will at least avoid trying to extract archives that aren't valid.
stevebeattie left a comment
Thanks for making these changes, this looks much better.
In mimicking the behavior of upstream calico creating a tarball of the felix bpf gpl components, a bug was duplicated that named the tarball with a .tar.gz extension but didn't actually compress the tarball. Upstream eventually fixed this as part of their conversion to using xz compression on the tarball in projectcalico/calico#9364.

Fix in the calico builds by converting to use xz. Busybox's version of tar does not support the --use-compress-program=COMMAND option from GNU tar that upstream used, so instead use the -J option to get xz compression.

(This also was tripping up malcontent scans, but a fix for that landed upstream in chainguard-dev/malcontent#733.)

Signed-off-by: Steve Beattie <steve.beattie@chainguard.dev>
This PR addresses another subtle bug when extracting .tar.gz archives that are actually uncompressed tar archives, as well as other invalid .gz files, and also adds better symlink handling.
I noticed this failure:
And inspected the file locally:
To fix this, I moved the .gz case statement below the .tar.gz and .tgz case statement so that it wouldn't preempt ExtractTar, and also added a check to validate that a given file is actually a valid gzip file (falling back to the default tar reader if not).
With these changes, the file extracts and scans correctly: