Commit 075c0b7

Add documentation to leverage PostgreSQL service
Signed-off-by: tdruez <[email protected]>
1 parent e2ff44b commit 075c0b7

1 file changed: +116 -4 lines changed

docs/quickstart.rst

Lines changed: 116 additions & 4 deletions
@@ -3,8 +3,8 @@
33
QuickStart
44
==========
55

6-
Run a Scan (no installation required!)
7-
--------------------------------------
6+
Run a Local Directory Scan (no installation required!)
7+
------------------------------------------------------
88

99
The **fastest way** to get started and **scan a codebase** —
1010
**no installation needed** — is by using the latest
@@ -52,8 +52,120 @@ See the :ref:`RUN command <cli_run>` section for more details on this command.
5252
.. note::
5353
Not sure which pipeline to use? Check out :ref:`faq_which_pipeline`.
5454

55-
Next Step: Local Installation
56-
-----------------------------
55+
Run a Remote Package Scan
56+
-------------------------
57+
58+
Let's look at another example — this time scanning a **remote package archive** by
59+
providing its **download URL**:
60+
61+
.. code-block:: bash
62+
63+
docker run --rm \
64+
ghcr.io/aboutcode-org/scancode.io:latest \
65+
run scan_single_package https://github.com/aboutcode-org/python-inspector/archive/refs/tags/v0.14.4.zip \
66+
> results.json
67+
68+
Let's break down what's happening here:
69+
70+
- ``docker run --rm``
71+
Runs a temporary container that is automatically removed after the scan completes.
72+
73+
- ``ghcr.io/aboutcode-org/scancode.io:latest``
74+
Uses the latest ScanCode.io image from GitHub Container Registry.
75+
76+
- ``run scan_single_package <URL>``
77+
Executes the ``scan_single_package`` pipeline, automatically fetching and analyzing
78+
the package archive from the provided URL.
79+
80+
- ``> results.json``
81+
Writes the scan results to a local ``results.json`` file.
82+
83+
Notice that the ``-v "$(pwd)":/codedrop`` option is **not required** in this case
84+
because the input is downloaded directly from the provided URL, rather than coming
85+
from your local filesystem.
86+
87+
The result? A **complete scan of a remote package archive — no setup, one command!**
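
To scan several remote archives, the one-liner above can be wrapped in a small shell function. This is only a sketch: ``scan_remote_package``, ``SCIO_IMAGE`` and ``DRY_RUN`` are hypothetical names introduced here for illustration and are not part of ScanCode.io. The ``docker run`` arguments are the ones shown in this section.

```bash
# Hypothetical wrapper around the remote package scan shown above.
# SCIO_IMAGE and DRY_RUN are illustrative variables, not ScanCode.io settings.
SCIO_IMAGE="${SCIO_IMAGE:-ghcr.io/aboutcode-org/scancode.io:latest}"

scan_remote_package() {
  url="$1"                      # download URL of the package archive
  output="${2:-results.json}"   # where to write the scan results

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command that would run, without calling Docker.
    echo "docker run --rm $SCIO_IMAGE run scan_single_package $url > $output"
    return 0
  fi

  # Same command as in the documentation, with URL and output parameterized.
  docker run --rm "$SCIO_IMAGE" run scan_single_package "$url" > "$output"
}
```

Calling ``DRY_RUN=1 scan_remote_package <URL> results.json`` prints the command so it can be reviewed before any image is pulled.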
88+
89+
Use PostgreSQL for Better Performance
90+
-------------------------------------
91+
92+
By default, ScanCode.io uses a **temporary SQLite database** for simplicity.
93+
While this works well for quick scans, it has a few limitations — such as
94+
**no multiprocessing** and slower performance on large codebases.
95+
96+
For improved speed and scalability, you can run your pipelines using a
97+
**PostgreSQL database** instead.
98+
99+
Start a PostgreSQL Database Service
100+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
101+
102+
First, start a PostgreSQL container in the background:
103+
104+
.. code-block:: bash
105+
106+
docker run -d \
107+
--name scancodeio-run-db \
108+
-e POSTGRES_DB=scancodeio \
109+
-e POSTGRES_USER=scancodeio \
110+
-e POSTGRES_PASSWORD=scancodeio \
111+
-e POSTGRES_INITDB_ARGS="--encoding=UTF-8 --lc-collate=en_US.UTF-8 --lc-ctype=en_US.UTF-8" \
112+
-v scancodeio_pgdata:/var/lib/postgresql/data \
113+
-p 5432:5432 \
114+
postgres:17
115+
116+
This command starts a new PostgreSQL service named ``scancodeio-run-db`` and stores its
117+
data in a named Docker volume called ``scancodeio_pgdata``.
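
A freshly started PostgreSQL container may need a few seconds before it accepts connections. The sketch below waits for it by retrying ``pg_isready``, a client utility shipped with PostgreSQL, inside the container. The ``retry`` and ``wait_for_postgres`` helpers and the limit of 30 attempts are assumptions made for this example. The container, user, and database names are the ones used in the command above.

```bash
# retry ATTEMPTS DELAY COMMAND...
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted (illustrative helper).
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Polls the scancodeio-run-db container once per second, up to 30 times.
wait_for_postgres() {
  retry 30 1 docker exec scancodeio-run-db \
    pg_isready -U scancodeio -d scancodeio
}
```

Running ``wait_for_postgres`` right after ``docker run -d ...`` returns a non-zero status if the database never becomes ready, so a script can stop before launching a pipeline.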
118+
119+
.. note::
120+
Once you are done, you can stop and remove the PostgreSQL service with:
121+
122+
.. code-block:: bash
123+
124+
docker rm -f scancodeio-run-db
125+
126+
.. tip::
127+
The named volume ``scancodeio_pgdata`` ensures that your database data
128+
**persists across runs**.
129+
You can remove it later with ``docker volume rm scancodeio_pgdata`` if needed.
130+
131+
Run a Docker Image Analysis Using PostgreSQL
132+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
133+
134+
Once PostgreSQL is running, you can start a ScanCode.io pipeline
135+
using the same Docker image, connecting it to the PostgreSQL database container:
136+
137+
.. code-block:: bash
138+
139+
docker run --rm \
140+
--network host \
141+
-e SCANCODEIO_NO_AUTO_DB=1 \
142+
ghcr.io/aboutcode-org/scancode.io:latest \
143+
run analyze_docker_image docker://alpine:3.22.1 \
144+
> results.json
145+
146+
Here’s what’s happening:
147+
148+
- ``--network host``
149+
Lets the container reach the PostgreSQL service through the port published on your host (``5432``).
150+
151+
- ``-e SCANCODEIO_NO_AUTO_DB=1``
152+
Tells ScanCode.io **not** to create a temporary SQLite database and to use
153+
the PostgreSQL connection defined in its default settings instead.
154+
155+
- ``ghcr.io/aboutcode-org/scancode.io:latest``
156+
Uses the latest ScanCode.io image from GitHub Container Registry.
157+
158+
- ``run analyze_docker_image docker://alpine:3.22.1``
159+
Runs the ``analyze_docker_image`` pipeline, scanning the given Docker image.
160+
161+
- ``> results.json``
162+
Saves the scan results to a local ``results.json`` file.
163+
164+
The result? A **faster, multiprocessing-enabled scan** backed by PostgreSQL — ideal
165+
for large or complex analyses.
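
Because the results are captured with a shell redirect, ``results.json`` is created even when the command fails, and it may then be empty. The sketch below guards against that. ``run_and_check`` is a hypothetical helper, and it assumes the ``run`` command exits with a non-zero status when a pipeline fails, which this example does not verify.

```bash
# run_and_check OUTPUT COMMAND...
# Runs COMMAND with stdout redirected to OUTPUT, then reports an error if the
# command failed or if OUTPUT is empty (illustrative helper).
run_and_check() {
  output="$1"
  shift
  if ! "$@" > "$output"; then
    echo "scan command failed; $output may be empty or partial" >&2
    return 1
  fi
  if [ ! -s "$output" ]; then
    echo "scan produced no output in $output" >&2
    return 1
  fi
}

# Usage with the command from this section:
# run_and_check results.json docker run --rm --network host \
#   -e SCANCODEIO_NO_AUTO_DB=1 ghcr.io/aboutcode-org/scancode.io:latest \
#   run analyze_docker_image docker://alpine:3.22.1
```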
166+
167+
Next Step: Installation
168+
-----------------------
57169

58170
Install ScanCode.io to **unlock all features**:
59171
