@@ -8,6 +8,7 @@ This repo consists of four main tools:
88- MineCode that contains utilities to mine package repositories
99- MatchCode that contains utilities to index package metadata and resources for
1010 matching
11+ - MatchCode.io that provides package matching functionality for codebases
1112- ClearCode that contains utilities to mine ClearlyDefined for package data
1213
1314These are designed to be used first for reference such that one can query for
@@ -39,6 +40,7 @@ Once the prerequisites have been installed, set up PurlDB with the following com
3940 make dev
4041 make envfile
4142 make postgres
43+ make postgres_matchcodeio
4244
4345Once PurlDB and the database have been set up, run tests to ensure functionality:
4446::
@@ -53,6 +55,11 @@ Start the PurlDB server by running:
5355
5456 make run
5557
58+ Start the MatchCode.io server by running:
59+ ::
60+
61+ make run_matchcodeio
62+
5663To start visiting upstream package repositories for package metadata:
5764::
5865
@@ -69,33 +76,13 @@ Populating Package Resource Data
6976The Resources of Packages can be collected using the scan queue. By default, a
7077scan request will be created for each mapped Package.
7178
72- The following environment variables will have to be set for the scan queue
73- commands to work:
79+ If you have access to a ScanCode.io instance, set the following environment
80+ variables for the scan queue commands to work:
7481::
7582
7683 SCANCODEIO_URL=<ScanCode.io API URL>
7784 SCANCODEIO_API_KEY=<ScanCode.io API Key>
7885
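As a minimal sketch, the presence of these two variables can be checked before running the scan queue commands. The variable names come from the text above; the helper ``missing_scancodeio_settings`` is hypothetical and not part of PurlDB.

```python
import os

# Variable names taken from the README; the helper itself is a hypothetical
# illustration and is not part of PurlDB.
REQUIRED_SETTINGS = ("SCANCODEIO_URL", "SCANCODEIO_API_KEY")


def missing_scancodeio_settings(environ=None):
    """Return the required variable names that are unset or empty."""
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_SETTINGS if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_scancodeio_settings()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("ScanCode.io settings found.")
```

Passing a plain dictionary instead of the real environment makes the check easy to exercise in tests.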
79- ``matchcode-toolkit`` will also have to be installed in the same environment as
80- ScanCode.io. If running ScanCode.io in a virtual environment from a git
81- checkout, you can install ``matchcode-toolkit`` in editable mode:
82- ::
83-
84- pip install -e <Path to purldb/matchcode-toolkit>
85-
86- Otherwise, you can create a wheel from ``matchcode-toolkit`` and install it in
87- the ScanCode.io virtual environment or modify the ScanCode.io Dockerfile to
88- install the ``matchcode-toolkit`` wheel.
89-
90- To build the ``matchcode-toolkit`` wheel:
91- ::
92-
93- # From the matchcode-toolkit directory
94- python setup.py bdist_wheel
95-
96- The wheel ``matchcode_toolkit-0.0.1-py3-none-any.whl`` will be created in the
97- ``matchcode-toolkit/dist/`` directory.
98-
9986The scan queue is run using two commands:
10087::
10188
@@ -136,8 +123,8 @@ matching indices from the collected Package data:
136123 make index_packages
137124
138125
139- API Endpoints
140- -------------
126+ PurlDB API Endpoints
127+ --------------------
141128
142129* ``api/packages``
143130
@@ -172,6 +159,51 @@ API Endpoints
172159 * Used to check the SHA1 values of archives from a scan to determine if they are known Packages
173160
174161
162+ MatchCode.io
163+ ------------
164+
165+ MatchCode.io is a Django app, based on ScanCode.io, that exposes one API
166+ endpoint, ``api/matching``, which takes a ScanCode.io codebase scan and
167+ performs Package matching on it.
168+
169+ Currently, it performs three matching steps:
170+
171+ * Match codebase resources against the Packages in the PackageDB
172+ * Match codebase resources against the Resources in the PackageDB
173+ * Match codebase directories against the directory matching indices of
174+ MatchCode
175+
176+
177+ MatchCode.io API Endpoints
178+ --------------------------
179+
180+ * ``api/matching``
181+
182+ * Performs Package matching on an uploaded ScanCode.io scan
183+ * Intended to be used with the ``match_to_purldb`` pipeline in ScanCode.io
184+
185+
186+ Docker Setup for Local Development and Testing
187+ ----------------------------------------------
188+
189+ PurlDB and MatchCode.io are two separate Django apps. In order to run both of
190+ these Django apps on the same host, we need to use Traefik.
191+
192+ Traefik is an edge router that receives requests and finds out which services
193+ are responsible for handling them. In the docker-compose.yml files for PurlDB
194+ and MatchCode.io, we have made these two services part of the same Docker
195+ network and set up the routes for each service.
196+
197+ By default, requests to the host go to the PurlDB service; requests to the
198+ ``api/matching`` endpoint are routed to the MatchCode.io service instead.
199+
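The routing rule described above can be modelled in a few lines of Python. This is only an illustration of the decision, not the Traefik configuration itself: the real routes live in the docker-compose files, the service names ``purldb`` and ``matchcodeio`` are placeholders, and matching on a path prefix is an assumption.

```python
def route(path):
    """Return the name of the service that would handle the request path.

    Hypothetical model: anything at or under /api/matching goes to
    MatchCode.io, and everything else goes to PurlDB.
    """
    # Normalize so that "api/matching" and "/api/matching" are treated alike.
    normalized = "/" + path.lstrip("/")
    if normalized == "/api/matching" or normalized.startswith("/api/matching/"):
        return "matchcodeio"
    return "purldb"


if __name__ == "__main__":
    for example in ("/api/packages/", "/api/matching/", "/"):
        print(example, "->", route(example))
```

The explicit check for a trailing slash keeps a path such as ``/api/matchingx`` from being sent to MatchCode.io by accident.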
200+ To run PurlDB and MatchCode.io with Docker:
201+ ::
202+
203+ docker compose -f docker-compose_traefik.yml up -d
204+ docker compose -f docker-compose_purldb.yml up -d
205+ docker compose -f docker-compose_matchcodeio.yml up -d
206+
175207Funding
176208-------
177209