Commit e028191

Merge branch 'main' into file-search
2 parents a2dd550 + eb8d4bb commit e028191

40 files changed: +793 -344 lines changed

CHANGELOG.rst

Lines changed: 44 additions & 0 deletions
@@ -1,6 +1,50 @@
 Changelog
 =========

+v35.4.1 (2025-10-24)
+--------------------
+
+- Add ability to download all output results formats as a zipfile for a given project.
+  https://github.com/aboutcode-org/scancode.io/issues/1880
+
+- Add support for tagging inputs in the run management command
+  Add ability to skip the SQLite auto db in combined_run
+  Add documentation to leverage PostgreSQL service
+  https://github.com/aboutcode-org/scancode.io/pull/1916
+
+- Refine d2d pipeline for scala and kotlin.
+  https://github.com/aboutcode-org/scancode.io/issues/1898
+
+- Add utilities to create/init FederatedCode data repo.
+  https://github.com/aboutcode-org/scancode.io/issues/1896
+
+- Add a verify-project CLI management command.
+  https://github.com/aboutcode-org/scancode.io/issues/1903
+
+- Add support for multiple inputs in the run management command.
+  https://github.com/aboutcode-org/scancode.io/issues/1916
+
+- Add the django-htmx app to the stack.
+  https://github.com/aboutcode-org/scancode.io/issues/1917
+
+- Adjust the resource tree view table rendering.
+  https://github.com/aboutcode-org/scancode.io/issues/1840
+
+- Add ".." navigation option in table to navigate to parent resource.
+  https://github.com/aboutcode-org/scancode.io/issues/1869
+
+- Add ability to download all output results formats.
+  https://github.com/aboutcode-org/scancode.io/issues/1880
+
+- Update Java D2D Pipeline to Include Checksum Mapped Sources for Accurate Java Mapping.
+  https://github.com/aboutcode-org/scancode.io/issues/1870
+
+- Auto-detect pipeline from provided input.
+  https://github.com/aboutcode-org/scancode.io/issues/1883
+
+- Migrate SCA workflows verification to new verify-project management command.
+  https://github.com/aboutcode-org/scancode.io/issues/1902
+
 v35.4.0 (2025-09-30)
 --------------------

docs/command-line-interface.rst

Lines changed: 4 additions & 3 deletions
@@ -419,10 +419,11 @@ Displays status information about the ``PROJECT`` project.

 .. _cli_output:

-`$ scanpipe output --project PROJECT --format {json,csv,xlsx,spdx,cyclonedx,attribution}`
------------------------------------------------------------------------------------------
+`$ scanpipe output --project PROJECT --format {json,csv,xlsx,spdx,cyclonedx,attribution,...}`
+---------------------------------------------------------------------------------------------

-Outputs the ``PROJECT`` results as JSON, XLSX, CSV, SPDX, CycloneDX, and Attribution.
+Outputs the ``PROJECT`` results as JSON, XLSX, CSV, SPDX, CycloneDX,
+ORT package-list.yml, and Attribution.
 The output files are created in the ``PROJECT`` :guilabel:`output/` directory.

 Multiple formats can be provided at once::

docs/output-files.rst

Lines changed: 0 additions & 2 deletions
@@ -285,7 +285,6 @@ Additional sheets are included **only when relevant** (i.e., when data is availa

 SPDX
 ^^^^
-
 ScanCode.io can generate Software Bill of Materials (SBOM) in the **SPDX** format,
 which is an open standard for communicating software component information.
 SPDX is widely used for license compliance, security analysis, and software supply
@@ -309,7 +308,6 @@ The SPDX output includes:

 CycloneDX
 ^^^^^^^^^
-
 ScanCode.io can generate **CycloneDX** SBOMs, a lightweight standard designed for
 security and dependency management. CycloneDX is optimized for vulnerability analysis
 and software supply chain risk assessment.

docs/quickstart.rst

Lines changed: 116 additions & 4 deletions
@@ -3,8 +3,8 @@
 QuickStart
 ==========

-Run a Scan (no installation required!)
---------------------------------------
+Run a Local Directory Scan (no installation required!)
+------------------------------------------------------

 The **fastest way** to get started and **scan a codebase** —
 **no installation needed** — is by using the latest
@@ -52,8 +52,120 @@ See the :ref:`RUN command <cli_run>` section for more details on this command.
 .. note::
     Not sure which pipeline to use? Check out :ref:`faq_which_pipeline`.

-Next Step: Local Installation
------------------------------
+Run a Remote Package Scan
+-------------------------
+
+Let's look at another example — this time scanning a **remote package archive** by
+providing its **download URL**:
+
+.. code-block:: bash
+
+    docker run --rm \
+      ghcr.io/aboutcode-org/scancode.io:latest \
+      run scan_single_package https://github.com/aboutcode-org/python-inspector/archive/refs/tags/v0.14.4.zip \
+      > results.json
+
+Let's break down what's happening here:
+
+- ``docker run --rm``
+  Runs a temporary container that is automatically removed after the scan completes.
+
+- ``ghcr.io/aboutcode-org/scancode.io:latest``
+  Uses the latest ScanCode.io image from GitHub Container Registry.
+
+- ``run scan_single_package <URL>``
+  Executes the ``scan_single_package`` pipeline, automatically fetching and analyzing
+  the package archive from the provided URL.
+
+- ``> results.json``
+  Writes the scan results to a local ``results.json`` file.
+
+Notice that the ``-v "$(pwd)":/codedrop`` option is **not required** in this case
+because the input is downloaded directly from the provided URL, rather than coming
+from your local filesystem.
+
+The result? A **complete scan of a remote package archive — no setup, one command!**
+
+Use PostgreSQL for Better Performance
+-------------------------------------
+
+By default, ScanCode.io uses a **temporary SQLite database** for simplicity.
+While this works well for quick scans, it has a few limitations — such as
+**no multiprocessing** and slower performance on large codebases.
+
+For improved speed and scalability, you can run your pipelines using a
+**PostgreSQL database** instead.
+
+Start a PostgreSQL Database Service
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+First, start a PostgreSQL container in the background:
+
+.. code-block:: bash
+
+    docker run -d \
+      --name scancodeio-run-db \
+      -e POSTGRES_DB=scancodeio \
+      -e POSTGRES_USER=scancodeio \
+      -e POSTGRES_PASSWORD=scancodeio \
+      -e POSTGRES_INITDB_ARGS="--encoding=UTF-8 --lc-collate=en_US.UTF-8 --lc-ctype=en_US.UTF-8" \
+      -v scancodeio_pgdata:/var/lib/postgresql/data \
+      -p 5432:5432 \
+      postgres:17
+
+This command starts a new PostgreSQL service named ``scancodeio-run-db`` and stores its
+data in a named Docker volume called ``scancodeio_pgdata``.
+
+.. note::
+    You can stop and remove the PostgreSQL service once you are done using:
+
+    .. code-block:: bash
+
+        docker rm -f scancodeio-run-db
+
+.. tip::
+    The named volume ``scancodeio_pgdata`` ensures that your database data
+    **persists across runs**.
+    You can remove it later with ``docker volume rm scancodeio_pgdata`` if needed.
+
+Run a Docker Image Analysis Using PostgreSQL
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Once PostgreSQL is running, you can start a ScanCode.io pipeline
+using the same Docker image, connecting it to the PostgreSQL database container:
+
+.. code-block:: bash
+
+    docker run --rm \
+      --network host \
+      -e SCANCODEIO_NO_AUTO_DB=1 \
+      ghcr.io/aboutcode-org/scancode.io:latest \
+      run analyze_docker_image docker://alpine:3.22.1 \
+      > results.json
+
+Here’s what’s happening:
+
+- ``--network host``
+  Ensures the container can connect to the PostgreSQL service running on your host.
+
+- ``-e SCANCODEIO_NO_AUTO_DB=1``
+  Tells ScanCode.io **not** to create a temporary SQLite database, and instead use
+  the configured PostgreSQL connection defined in its default settings.
+
+- ``ghcr.io/aboutcode-org/scancode.io:latest``
+  Uses the latest ScanCode.io image from GitHub Container Registry.
+
+- ``run analyze_docker_image docker://alpine:3.22.1``
+  Runs the ``analyze_docker_image`` pipeline, scanning the given Docker image.
+
+- ``> results.json``
+  Saves the scan results to a local ``results.json`` file.
+
+The result? A **faster, multiprocessing-enabled scan** backed by PostgreSQL — ideal
+for large or complex analyses.
+
+Next Step: Installation
+-----------------------

 Install ScanCode.io, to **unlock all features**:

docs/rest-api.rst

Lines changed: 7 additions & 1 deletion
@@ -694,10 +694,16 @@ Finally, use this action to download the project results in the provided
 ``output_format`` as an attachment file.

 Data:
-    - ``output_format``: ``json``, ``xlsx``, ``spdx``, ``cyclonedx``, ``attribution``
+    - ``output_format``: ``json``, ``xlsx``, ``spdx``, ``cyclonedx``, ``attribution``,
+      ``all_formats``, ``all_outputs``

 ``GET /api/projects/d4ed9405-5568-45ad-99f6-782a9b82d1d2/results_download/?output_format=cyclonedx``

+.. note::
+    Use ``all_formats`` to generate a zip file containing all output formats for a
+    project, while ``all_outputs`` can be used to obtain a zip file of all existing
+    output files for that project.
+
 .. tip::
     Refer to :ref:`output_files` to learn more about the available output formats.
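
The ``results_download`` request documented in this hunk could be assembled programmatically along the following lines. This is a minimal sketch, not part of the commit: ``results_download_url`` and ``OUTPUT_FORMATS`` are hypothetical names introduced here, the base URL is whatever your own server uses, and the set only mirrors the values listed in the documentation change.

```python
from urllib.parse import urlencode

# Values listed in the updated rest-api.rst text (assumed to be exhaustive here).
OUTPUT_FORMATS = {
    "json",
    "xlsx",
    "spdx",
    "cyclonedx",
    "attribution",
    "all_formats",
    "all_outputs",
}


def results_download_url(base_url, project_uuid, output_format):
    """Build the results_download URL for a project (hypothetical helper)."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"Unsupported output_format: {output_format!r}")
    query = urlencode({"output_format": output_format})
    # Strip a trailing slash so the path is not doubled up.
    return f"{base_url.rstrip('/')}/api/projects/{project_uuid}/results_download/?{query}"
```

Passing ``all_formats`` or ``all_outputs`` yields a URL that, per the note above, should return a zip file rather than a single output file.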

pyproject.toml

Lines changed: 2 additions & 1 deletion
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

 [project]
 name = "scancodeio"
-version = "35.4.0"
+version = "35.4.1"
 description = "Automate software composition analysis pipelines"
 readme = "README.rst"
 requires-python = ">=3.10,<3.14"
@@ -44,6 +44,7 @@ dependencies = [
     "django-filter==25.1",
     "djangorestframework==3.16.1",
     "django-taggit==6.1.0",
+    "django-htmx==1.26.0",
     # Database
     "psycopg[binary]==3.2.10",
     # wait_for_database Django management command

scancodeio/__init__.py

Lines changed: 10 additions & 5 deletions
@@ -28,7 +28,7 @@

 import git

-VERSION = "35.4.0"
+VERSION = "35.4.1"

 PROJECT_DIR = Path(__file__).resolve().parent
 ROOT_DIR = PROJECT_DIR.parent
@@ -106,6 +106,9 @@ def combined_run():
     configuration.
     It combines the creation, execution, and result retrieval of the project into a
     single process.
+
+    Set SCANCODEIO_NO_AUTO_DB=1 to use the database configuration from the settings
+    instead of SQLite.
     """
     from django.core.checks.security.base import SECRET_KEY_INSECURE_PREFIX
     from django.core.management import execute_from_command_line
@@ -114,10 +117,12 @@ def combined_run():
     os.environ.setdefault("DJANGO_SETTINGS_MODULE", "scancodeio.settings")
     secret_key = SECRET_KEY_INSECURE_PREFIX + get_random_secret_key()
     os.environ.setdefault("SECRET_KEY", secret_key)
-    os.environ.setdefault("SCANCODEIO_DB_ENGINE", "django.db.backends.sqlite3")
-    os.environ.setdefault("SCANCODEIO_DB_NAME", "scancodeio.sqlite3")
-    # Disable multiprocessing
-    os.environ.setdefault("SCANCODEIO_PROCESSES", "0")
+
+    # Default to SQLite unless SCANCODEIO_NO_AUTO_DB is provided
+    if not os.getenv("SCANCODEIO_NO_AUTO_DB"):
+        os.environ.setdefault("SCANCODEIO_DB_ENGINE", "django.db.backends.sqlite3")
+        os.environ.setdefault("SCANCODEIO_DB_NAME", "scancodeio.sqlite3")
+        os.environ.setdefault("SCANCODEIO_PROCESSES", "0")  # Disable multiprocessing

     sys.argv.insert(1, "run")
     execute_from_command_line(sys.argv)
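
The effect of the new ``SCANCODEIO_NO_AUTO_DB`` switch can be sketched outside Django. This is an illustration only, not the project's code: ``apply_run_defaults`` is a hypothetical helper that works on a plain dict instead of ``os.environ``, and it assumes that all three SQLite-related defaults sit inside the ``if`` branch, as the quickstart text about multiprocessing suggests.

```python
def apply_run_defaults(env):
    """Mimic the defaulting logic of combined_run() on a plain dict.

    Hypothetical helper for illustration; the real function mutates os.environ.
    """
    env = dict(env)  # Work on a copy so the caller's mapping is left untouched.
    # Same truthiness test as `if not os.getenv("SCANCODEIO_NO_AUTO_DB")`.
    if not env.get("SCANCODEIO_NO_AUTO_DB"):
        env.setdefault("SCANCODEIO_DB_ENGINE", "django.db.backends.sqlite3")
        env.setdefault("SCANCODEIO_DB_NAME", "scancodeio.sqlite3")
        env.setdefault("SCANCODEIO_PROCESSES", "0")  # Disable multiprocessing
    return env
```

Because the check is a plain truthiness test on a string, any non-empty value (even ``0``) skips the SQLite defaults, while an empty value behaves like an unset variable. Values already present in the environment are never overwritten, since ``setdefault`` only fills in missing keys.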

scancodeio/settings.py

Lines changed: 2 additions & 0 deletions
@@ -198,6 +198,7 @@
     "django_rq",
     "django_probes",
     "taggit",
+    "django_htmx",
 ]

 MIDDLEWARE = [
@@ -208,6 +209,7 @@
     "django.contrib.auth.middleware.AuthenticationMiddleware",
     "django.contrib.messages.middleware.MessageMiddleware",
     "django.middleware.clickjacking.XFrameOptionsMiddleware",
+    "django_htmx.middleware.HtmxMiddleware",
     "scancodeio.middleware.TimezoneMiddleware",
 ]

scancodeio/static/htmx-2.0.4.min.js

Lines changed: 0 additions & 1 deletion
This file was deleted.

scancodeio/static/htmx-2.0.4.min.js.ABOUT

Lines changed: 0 additions & 13 deletions
This file was deleted.
