Description
I want to use FastSPDX in the "Merging SPDX files" step of the scan.sh script. With many large SPDX 2.2 files, the default SPDX merging algorithm can take days to finish (see spdx/tools-python#818).
To use FastSPDX for parsing SPDX 2.2 files, you have to enable the IGNORE_PARSING_ERRORS variable, whose name does not suggest that it also selects the parser:
vulnscout/src/bin/spdx_merge.py
Lines 28 to 30 in 0c50220

    if os.getenv('IGNORE_PARSING_ERRORS', 'false') == 'true':
        use_fastspdx = True
        verbose("spdx_merge: Using FastSPDX parser")

The variable is used further down, at
vulnscout/src/bin/spdx_merge.py
Lines 50 to 55 in 0c50220

    if os.getenv('IGNORE_PARSING_ERRORS', 'false') != 'true':
        print(f"Error parsing SPDX file: {file} {e}")
        print("Hint: set IGNORE_PARSING_ERRORS=true to ignore this error")
        raise e
    else:
        print(f"Ignored: Error parsing SPDX file: {file} {e}")

Perhaps FastSPDX is required in order to ignore parsing errors, but even then it would be nice to have a separate environment variable (such as USE_FAST_SPDX) so that FastSPDX can be used without ignoring parsing errors.
EDIT: I found the reason for this behavior in ec1c60c. According to the commit message, verification is what takes the time in spdx-tools, so if you are willing to skip verification you can speed up the process by using the fast algorithm. That makes sense, but a separate env variable would still be useful.
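To illustrate the request, here is a minimal sketch of how the two settings could be decoupled. It is not taken from spdx_merge.py: USE_FAST_SPDX is the variable proposed above and does not exist yet, and the helper names `env_flag` and `resolve_parser_options` are made up for this example. It assumes IGNORE_PARSING_ERRORS should keep implying FastSPDX so that existing setups behave as before.

```python
import os
from typing import Tuple


def env_flag(name: str, default: str = "false") -> bool:
    """Return True only when the variable equals 'true', like the existing checks."""
    return os.getenv(name, default) == "true"


def resolve_parser_options() -> Tuple[bool, bool]:
    """Return (use_fastspdx, ignore_errors) from the environment.

    USE_FAST_SPDX is hypothetical (proposed in this issue).
    IGNORE_PARSING_ERRORS still enables FastSPDX, preserving current behavior.
    """
    ignore_errors = env_flag("IGNORE_PARSING_ERRORS")
    use_fastspdx = env_flag("USE_FAST_SPDX") or ignore_errors
    return use_fastspdx, ignore_errors
```

With something like this, `USE_FAST_SPDX=true` alone would select the fast parser while a parsing error would still be raised, and `IGNORE_PARSING_ERRORS=true` would keep working exactly as it does today.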