Commit e94108c

Add GSoC 2025 Project Report
Signed-off-by: Michael Ehab Mikhail <[email protected]>
1 parent 0ca1881 commit e94108c

2 files changed

+208
-0
lines changed

docs/source/archive/gsoc-toc.rst

Lines changed: 8 additions & 0 deletions
@@ -8,6 +8,14 @@ designed to encourage university student participation in open source
 software development. It was started by Google in 2005. More about GSoC -
 `<https://summerofcode.withgoogle.com/about/>`_
 
+GSoC 2025
+---------
+
+.. toctree::
+   :maxdepth: 2
+
+   gsoc/reports/2025/vulnerablecode_michael
+
 GSoC 2024
 ---------
 

Lines changed: 200 additions & 0 deletions
@@ -0,0 +1,200 @@
VulnerableCode: On-demand live evaluation of packages and Integration with VulnTotal and its browser extension
==============================================================================================================

Organization - `AboutCode <https://www.aboutcode.org>`_
-----------------------------------------------------------

| **Michael Ehab Mikhail**
| GitHub: `michaelehab <https://github.com/michaelehab>`_
| LinkedIn: `@michaelehab16 <https://www.linkedin.com/in/michaelehab16/>`_
| Project: `VulnerableCode
  <https://github.com/aboutcode-org/vulnerablecode>`_
| Official GSoC project page: `Project Link
  <https://summerofcode.withgoogle.com/programs/2025/projects/uF0kzMAg>`_
| GSoC Proposal: `Proposal Link
  <https://docs.google.com/document/d/1Tkk4MoPWXFj9r_U5cp3E4AhJW6QlHxTElyzpII_f4LM/edit?usp=sharing>`_

Overview
--------

VulnerableCode traditionally relied on **batch importers** to fetch and store
all advisories from a source at once. While effective for building complete
databases, batch importers are slow and resource-heavy for developers who only
need vulnerability data for a **single package**.

This project introduces **live importers**, a new class of importers that
operate in a *package-first* mode. Instead of pulling all advisories, they run
against a single PackageURL (PURL), returning only the advisories affecting that
package. This makes vulnerability evaluation **faster, more efficient, and more
personalized**, since the database is gradually filled with only the advisories
that matter to each user.

To support this, I added:

* A new **LIVE_IMPORTERS_REGISTRY** that tracks available live importers.
* A new **API endpoint** that accepts a PURL and runs all compatible live
  importers in parallel (unless the ``no_threading`` flag is set).
* Integration with **VulnTotal** and its **browser extension**, enabling users
  to evaluate packages in real time through a seamless interface.

This work bridges the gap between **batch-first databases** and
**package-first queries**, improving VulnerableCode's flexibility and enabling
better integration with developer workflows.

.. note::
   A PURL (Package URL) is a universal way to identify and locate software
   packages. `More on PURL <https://github.com/package-url>`_

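
To make the parts of a PURL concrete, here is a minimal sketch that splits a PURL string into its type, namespace, name, and version. The helper name ``split_purl`` is hypothetical and the parsing is deliberately simplified (no qualifier handling, no percent-decoding); it is not the parser VulnerableCode uses.

```python
def split_purl(purl_string):
    """Split a PURL of the form pkg:type/namespace/name@version (simplified)."""
    if not purl_string.startswith("pkg:"):
        raise ValueError("a PURL must start with the 'pkg:' scheme")

    remainder = purl_string[len("pkg:"):]
    # Drop the optional subpath and qualifiers; this sketch ignores them.
    remainder = remainder.split("#", 1)[0]
    remainder = remainder.split("?", 1)[0]

    # The version, when present, follows the last "@".
    if "@" in remainder:
        path, version = remainder.rsplit("@", 1)
    else:
        path, version = remainder, None

    segments = path.strip("/").split("/")
    if len(segments) < 2:
        raise ValueError("a PURL needs at least a type and a name")

    return {
        "type": segments[0].lower(),
        "namespace": "/".join(segments[1:-1]) or None,
        "name": segments[-1],
        "version": version,
    }


print(split_purl("pkg:pypi/django@4.2"))
print(split_purl("pkg:maven/org.apache.commons/commons-lang3@3.12.0"))
```

The ``type`` component (``pypi``, ``maven``, and so on) is the part that a package-first importer would need in order to decide whether it can handle a given package.
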


Project Design and Architecture
-------------------------------

The new live importer system builds on the existing batch importers while
introducing a parallel registry and execution model for package-first runs.

Importer Registries
^^^^^^^^^^^^^^^^^^^

* ``IMPORTERS_REGISTRY`` continues to hold batch importers (V1/V2).
* ``LIVE_IMPORTERS_REGISTRY`` holds live importers.

Each live importer:

* Inherits from its batch importer (when logic can be reused), or directly
  from ``VulnerableCodeBaseImporterPipelineV2`` when a separate
  implementation is needed.
* Declares a ``supported_types`` array, defining compatible package
  ecosystems (``"pypi"``, ``"npm"``, ``"maven"``, ``"generic"``, etc.).
* Implements a package-first ``collect_advisories()`` method, which
  restricts results to advisories relevant to the given PURL.

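
The relationship between a batch importer, a live importer, and the live registry can be sketched roughly as follows. This is an illustration under stated assumptions, not the VulnerableCode code: ``ExampleBatchImporter``, ``ExampleLiveImporter``, ``fetch_source_records``, the ``pipeline_id`` values, the record layout, and the dict-shaped registry are all hypothetical. Only ``supported_types``, ``collect_advisories()``, and ``LIVE_IMPORTERS_REGISTRY`` are names taken from the report.

```python
class ExampleBatchImporter:
    """Hypothetical stand-in for an existing batch importer."""

    pipeline_id = "example_importer_v2"

    def fetch_source_records(self):
        # Stand-in for downloading and parsing the whole advisory source.
        return [
            {"advisory_id": "EX-0001", "type": "pypi", "name": "django"},
            {"advisory_id": "EX-0002", "type": "pypi", "name": "flask"},
            {"advisory_id": "EX-0003", "type": "npm", "name": "lodash"},
        ]

    def collect_advisories(self):
        # Batch mode: yield every advisory the source knows about.
        yield from self.fetch_source_records()


class ExampleLiveImporter(ExampleBatchImporter):
    """Hypothetical live importer reusing the batch importer's fetching logic."""

    pipeline_id = "example_live_importer_v2"
    supported_types = ["pypi", "npm"]

    def __init__(self, purl_type, package_name):
        self.purl_type = purl_type
        self.package_name = package_name

    def collect_advisories(self):
        # Package-first mode: keep only advisories for the requested package.
        for record in super().collect_advisories():
            if record["type"] == self.purl_type and record["name"] == self.package_name:
                yield record


# Separate registry for live importers, keyed by pipeline id (assumed layout).
LIVE_IMPORTERS_REGISTRY = {
    importer.pipeline_id: importer for importer in [ExampleLiveImporter]
}

live = LIVE_IMPORTERS_REGISTRY["example_live_importer_v2"]("pypi", "django")
print([record["advisory_id"] for record in live.collect_advisories()])
```

In this sketch the live importer still fetches the whole source and filters afterwards; an importer whose source can be queried per package would more likely override the fetching step itself, which is the case the report describes as needing a separate implementation.
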
.. figure:: https://private-user-images.githubusercontent.com/29122581/480716687-1ffa16ba-fbce-41bd-b71a-674620a2fec3.png?jwt=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NTU4MTMxNDQsIm5iZiI6MTc1NTgxMjg0NCwicGF0aCI6Ii8yOTEyMjU4MS80ODA3MTY2ODctMWZmYTE2YmEtZmJjZS00MWJkLWI3MWEtNjc0NjIwYTJmZWMzLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTA4MjElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwODIxVDIxNDcyNFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWI4MWQ2ZmY4MzkxY2MzOTYwNTI3MTViZThiZTk1Yzc0Y2Y0Y2E3YWNhY2Q5OTU5OTE5MTgxOGI5NGM1OTlkODcmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.Mt9-TVHRNDduOHHMnLuhr-vzKSxrbfmAz1JtWfksFA4
   :alt: Class architecture of importers registries
   :align: center
   :width: 70%

   Class architecture showing relationship between ``IMPORTERS_REGISTRY`` and
   ``LIVE_IMPORTERS_REGISTRY``.


API Endpoint
^^^^^^^^^^^^

The new API endpoint handles live evaluation requests.

* Input:

  * ``purl_string`` (required)
  * ``no_threading`` (optional, default ``false``)

* Execution:

  * Checks ``LIVE_IMPORTERS_REGISTRY`` for importers whose ``supported_types``
    match the PURL's type.
  * Runs compatible importers in parallel unless ``no_threading`` is true.

* Output:

  * A set of advisories affecting the requested PURL, imported directly into
    the database and returned as JSON.

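
The selection and execution steps above can be sketched as a plain function. This is a simplified illustration, not the actual endpoint: the stub importer classes, ``get_purl_type``, ``run_importer``, and ``live_evaluate`` are hypothetical names, the registry is assumed to be a dict, and a thread pool is assumed as the parallel mechanism (suggested by the ``no_threading`` flag, but not confirmed here). Database import and JSON serialization are left out.

```python
from concurrent.futures import ThreadPoolExecutor


class StubLiveImporter:
    """Hypothetical base for stub importers used only in this sketch."""

    supported_types = []
    advisory_id = None

    def __init__(self, purl_string):
        self.purl_string = purl_string

    def collect_advisories(self):
        return [{"advisory_id": self.advisory_id, "purl": self.purl_string}]


class FirstPypiImporter(StubLiveImporter):
    supported_types = ["pypi"]
    advisory_id = "EX-A"


class PypiAndNpmImporter(StubLiveImporter):
    supported_types = ["pypi", "npm"]
    advisory_id = "EX-B"


class MavenImporter(StubLiveImporter):
    supported_types = ["maven"]
    advisory_id = "EX-C"


LIVE_IMPORTERS_REGISTRY = {
    "first_pypi": FirstPypiImporter,
    "pypi_and_npm": PypiAndNpmImporter,
    "maven": MavenImporter,
}


def get_purl_type(purl_string):
    # Simplified: the type is the first path segment after the "pkg:" scheme.
    if not purl_string.startswith("pkg:"):
        raise ValueError("a PURL must start with the 'pkg:' scheme")
    return purl_string[len("pkg:"):].split("/", 1)[0].lower()


def run_importer(importer_class, purl_string):
    return list(importer_class(purl_string).collect_advisories())


def live_evaluate(purl_string, no_threading=False):
    purl_type = get_purl_type(purl_string)
    compatible = [
        importer_class
        for importer_class in LIVE_IMPORTERS_REGISTRY.values()
        if purl_type in importer_class.supported_types
    ]
    if not compatible:
        return []

    if no_threading:
        # Sequential fallback: run the importers one after another.
        results = [run_importer(cls, purl_string) for cls in compatible]
    else:
        # One worker per compatible importer; map() keeps registry order.
        with ThreadPoolExecutor(max_workers=len(compatible)) as pool:
            results = list(pool.map(lambda cls: run_importer(cls, purl_string), compatible))

    return [advisory for result in results for advisory in result]


print(live_evaluate("pkg:pypi/django@4.2"))
```

Threads fit this kind of workload because each importer spends most of its time waiting on network requests to its advisory source; a sequential path remains useful for debugging and for environments where threads are undesirable.
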
.. figure:: https://private-user-images.githubusercontent.com/29122581/480716572-a29dbafc-3290-49dd-8cca-20afd0291d68.png?jwt=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NTU4MTMxNDQsIm5iZiI6MTc1NTgxMjg0NCwicGF0aCI6Ii8yOTEyMjU4MS80ODA3MTY1NzItYTI5ZGJhZmMtMzI5MC00OWRkLThjY2EtMjBhZmQwMjkxZDY4LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTA4MjElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwODIxVDIxNDcyNFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWE1NDczZGNhN2JhYjA2YmQzNWMwNGUzMzI1NmY5MTc3YzJjZmM4YTk2MWE2MjAwYWE0YmQzYWU0YmJiNGI5MzAmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.DWqFUO5SnUVedEvTL3i5i-eI6tGHNaGKWTXT9yQt9Cs
   :alt: Live Importers API request flow
   :align: center
   :width: 70%

   Flow of API endpoint: selecting compatible live importers and executing
   them in parallel.


Integration with VulnTotal
^^^^^^^^^^^^^^^^^^^^^^^^^^

The new API was integrated into VulnTotal as an optional datasource:

* VulnTotal now checks the local environment for the ``VCIO_HOST``,
  ``VCIO_PORT``, and ``ENABLE_LIVE_EVAL`` settings in ``.env``.
* If enabled, VulnTotal queries VulnerableCode in package-first mode.
* This allows VulnTotal to use both its proprietary datasources **and**
  the user's gradually built local database, improving coverage and
  personalization.
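
As a rough sketch of how a client might read these three settings, the function below takes a mapping of environment values and derives whether live evaluation is on and where the local VulnerableCode instance lives. The function name, the default host and port, the accepted truthy spellings, and the ``http`` scheme are assumptions made for illustration; only the three variable names come from the report.

```python
import os


def live_eval_settings(env=None):
    """Read the live evaluation settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env

    # Assumed convention: a few common truthy spellings enable the feature.
    raw_flag = env.get("ENABLE_LIVE_EVAL", "false")
    enabled = raw_flag.strip().lower() in {"1", "true", "yes"}

    # Assumed defaults for a locally running instance.
    host = env.get("VCIO_HOST", "localhost")
    port = env.get("VCIO_PORT", "8000")

    return {"enabled": enabled, "base_url": f"http://{host}:{port}"}


example_env = {
    "VCIO_HOST": "127.0.0.1",
    "VCIO_PORT": "8001",
    "ENABLE_LIVE_EVAL": "true",
}
print(live_eval_settings(example_env))
```

Passing the mapping in explicitly keeps the function easy to test; in a real setup the values would typically be loaded from ``.env`` into the process environment before this runs.
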

Integration with VulnTotal Browser Extension
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The VulnTotal browser extension was updated to support live importers:

* Users can enable the "Local VulnerableCode" datasource and the live
  evaluation option.
* When enabled, package lookups are forwarded to the new API, retrieving
  advisories in real time.
* This reduces setup effort: developers can get live vulnerability checks
  directly in their browser, provided they have a local VulnerableCode
  instance.

.. figure:: https://private-user-images.githubusercontent.com/29122581/480717461-29806bc6-faf5-48c9-8632-608c23d96e83.gif?jwt=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NTU4MTMzNTksIm5iZiI6MTc1NTgxMzA1OSwicGF0aCI6Ii8yOTEyMjU4MS80ODA3MTc0NjEtMjk4MDZiYzYtZmFmNS00OGM5LTg2MzItNjA4YzIzZDk2ZTgzLmdpZj9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTA4MjElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwODIxVDIxNTA1OVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWUwOTRkNWI1OTViNzYxODM4MjAyYTBjYTdmY2QyMzQ1Mzg2MTVmM2M5N2Q0M2I1MDQwMGRiNWJjZDllNmRjODQmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.lmRcQBxwP1kbhcQ8Siq6vvm1GrBfd_BhkIbIte2NuYs
   :alt: Live evaluation demo in VulnTotal browser extension
   :align: center
   :width: 70%

   VulnTotal and its browser extension consuming the new live evaluation API.

130+
Linked Pull Requests
131+
--------------------
132+
133+
.. list-table::
134+
:widths: 10 40 20
135+
:header-rows: 1
136+
137+
* - Sr. no
138+
- Name
139+
- Link
140+
* - 1
141+
- Add Live Evaluation API endpoint and PyPa live pipeline importer
142+
- `aboutcode-org/vulnerablecode#1969
143+
<https://github.com/aboutcode-org/vulnerablecode/pull/1969>`_
144+
* - 2
145+
- Add Gitlab Live V2 Importer
146+
- `aboutcode-org/vulnerablecode#1910
147+
<https://github.com/aboutcode-org/vulnerablecode/pull/1910>`_
148+
* - 3
149+
- Add Curl Live Importer V2
150+
- `aboutcode-org/vulnerablecode#1923
151+
<https://github.com/aboutcode-org/vulnerablecode/pull/1923>`_
152+
* - 4
153+
- Add Elixir Security Live V2 Importer
154+
- `aboutcode-org/vulnerablecode#1935
155+
<https://github.com/aboutcode-org/vulnerablecode/pull/1935>`_
156+
* - 5
157+
- Add NPM Live Importer V2
158+
- `aboutcode-org/vulnerablecode#1941
159+
<https://github.com/aboutcode-org/vulnerablecode/pull/1941>`_
160+
* - 6
161+
- Add GitHub OSV Live V2 Importer Pipeline
162+
- `aboutcode-org/vulnerablecode#1977
163+
<https://github.com/aboutcode-org/vulnerablecode/pull/1977>`_
164+
* - 7
165+
- Add Postgres Live V2 Importer Pipeline
166+
- `aboutcode-org/vulnerablecode#1982
167+
<https://github.com/aboutcode-org/vulnerablecode/pull/1982>`_
168+
* - 8
169+
- Add PySec Live V2 Importer Pipeline
170+
- `aboutcode-org/vulnerablecode#1983
171+
<https://github.com/aboutcode-org/vulnerablecode/pull/1983>`_
172+
* - 9
173+
- Add Local VulnerableCode Datasource in VulnTotal and allow live evaluation
174+
- `aboutcode-org/vulnerablecode#1985
175+
<https://github.com/aboutcode-org/vulnerablecode/pull/1985>`_
176+
* - 10
177+
- Integrate Local VulnerableCode datasource and live evaluation
178+
- `aboutcode-org/vulntotal-extension#17
179+
<https://github.com/aboutcode-org/vulntotal-extension/pull/17>`_
180+
181+
182+
Closing Thoughts
183+
-------------------
184+
185+
This project was an exciting step forward from my 2024 GSoC work. By moving
186+
from batch importers to package-first live importers, We enabled a faster,
187+
more personalized, and more flexible way of building vulnerability databases.
188+
189+
I especially enjoyed designing the **registry + API architecture** and
190+
discussing it with mentors and integrating it seamlessly across **VulnerableCode, VulnTotal, and the
191+
browser extension**. This work lays the foundation for even richer
192+
interactivity in the ecosystem and brings vulnerability evaluation closer
193+
to developers' workflows.
194+
195+
I appreciated the weekly status calls and the feedback I received from my
196+
mentors and the amazing team. They were really helpful and supportive. -
197+
`Philippe Ombredanne <https://github.com/pombredanne>`_ - `Ayan Sinha
198+
Mahapatra <https://github.com/AyanSinhaMahapatra>`_ - `Tushar Goel
199+
<https://github.com/TG1999>`_ - `Keshav Priyadarshi
200+
<https://github.com/keshav-space>`_
