Mismatched env var name for FastSPDX #196

@SkytAsul

I want to use FastSPDX in the "Merging SPDX files" step of the scan.sh script, because with a large number of big SPDX 2.2 files the default SPDX merging algorithm can take days to finish (see spdx/tools-python#818).

To use FastSPDX for parsing SPDX 2.2 files, you currently have to set the IGNORE_PARSING_ERRORS variable to true, which is a rather misleading name for that purpose:

    if os.getenv('IGNORE_PARSING_ERRORS', 'false') == 'true':
        use_fastspdx = True
        verbose("spdx_merge: Using FastSPDX parser")

The same variable is checked again further down, where parsing errors are handled:

    if os.getenv('IGNORE_PARSING_ERRORS', 'false') != 'true':
        print(f"Error parsing SPDX file: {file} {e}")
        print("Hint: set IGNORE_PARSING_ERRORS=true to ignore this error")
        raise e
    else:
        print(f"Ignored: Error parsing SPDX file: {file} {e}")

Perhaps FastSPDX is required in order to ignore parsing errors. Even so, it would be nice to have a separate environment variable (such as USE_FAST_SPDX) so that FastSPDX can be used without ignoring parsing errors.

EDIT: I found the reason for this behavior in ec1c60c. According to the commit message, verification is what takes the time in spdx-tools, so if you are willing to skip verification you can speed up the process by using the fast algorithm. That makes sense, but a separate env variable would still be useful.
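
A minimal sketch of how the two switches could be separated. USE_FAST_SPDX is only the name suggested above and does not exist in the project, and the helper names `env_flag` and `select_parser_options` are hypothetical. The sketch assumes the existing convention that a variable counts as enabled only when it equals the string 'true', and it keeps today's behavior when USE_FAST_SPDX is unset. Whether the default parser can actually ignore parsing errors without FastSPDX is not confirmed here.

```python
import os


def env_flag(name, environ=None, default="false"):
    """Return True only when the variable equals 'true' (same test as the existing code)."""
    env = os.environ if environ is None else environ
    return env.get(name, default) == "true"


def select_parser_options(environ=None):
    """Return (use_fastspdx, ignore_parsing_errors) from the environment.

    USE_FAST_SPDX is a hypothetical variable name taken from this issue.
    When it is not set at all, FastSPDX follows IGNORE_PARSING_ERRORS,
    so existing setups keep their current behavior.
    """
    env = os.environ if environ is None else environ
    ignore_errors = env_flag("IGNORE_PARSING_ERRORS", env)
    if "USE_FAST_SPDX" in env:
        use_fastspdx = env_flag("USE_FAST_SPDX", env)
    else:
        use_fastspdx = ignore_errors
    return use_fastspdx, ignore_errors


if __name__ == "__main__":
    use_fastspdx, ignore_errors = select_parser_options()
    print(f"use_fastspdx={use_fastspdx} ignore_parsing_errors={ignore_errors}")
```

With this split, running scan.sh with only USE_FAST_SPDX=true would select the fast parser while parsing errors still abort the merge, and setting only IGNORE_PARSING_ERRORS=true would behave as it does today.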
