| 1 | +VulnerableCode: On-demand live evaluation of packages and Integration with VulnTotal and its browser extension |
| 2 | +============================================================================================================== |
| 3 | + |
| 4 | +Organization - `AboutCode <https://www.aboutcode.org>`_ |
| 5 | +----------------------------------------------------------- |
| 6 | +| **Michael Ehab Mikhail** |
| 7 | +| GitHub: `michaelehab <https://github.com/michaelehab>`_ |
| 8 | +| LinkedIn: `@michaelehab16 <https://www.linkedin.com/in/michaelehab16/>`_ |
| 9 | +| Project: `VulnerableCode |
| 10 | + <https://github.com/aboutcode-org/vulnerablecode>`_ |
| 11 | +| Official GSoC project page: `Project Link |
| 12 | + <https://summerofcode.withgoogle.com/programs/2025/projects/uF0kzMAg>`_ |
| 13 | +| GSoC Proposal: `Proposal Link |
| 14 | + <https://docs.google.com/document/d/1Tkk4MoPWXFj9r_U5cp3E4AhJW6QlHxTElyzpII_f4LM/edit?usp=sharing>`_ |
| 15 | + |
| 16 | +Overview |
| 17 | +-------- |
| 18 | + |
| 19 | +VulnerableCode traditionally relied on **batch importers** to fetch and store |
| 20 | +all advisories from a source at once. While effective for building complete |
| 21 | +databases, batch importers are slow and resource-heavy for developers who only |
| 22 | +need vulnerability data for a **single package**. |
| 23 | + |
| 24 | +This project introduces **live importers**, a new class of importers that |
| 25 | +operate in a *package-first* mode. Instead of pulling all advisories, they run |
| 26 | +against a single PackageURL (PURL), returning only the advisories affecting that |
| 27 | +package. This makes vulnerability evaluation **faster, more efficient, and more |
| 28 | +personalized**, since the database is gradually filled with only the advisories |
| 29 | +that matter to each user. |
| 30 | + |
| 31 | +To support this, I added: |
| 32 | + |
| 33 | +* A new **LIVE_IMPORTERS_REGISTRY** that tracks available live importers. |
| 34 | +* A new **API endpoint** that accepts a PURL and runs all compatible live |
| 35 | + importers in parallel (unless the ``no_threading`` flag is set). |
| 36 | +* Integration with **VulnTotal** and its **browser extension**, enabling users |
| 37 | + to evaluate packages in real time through a seamless interface. |
| 38 | + |
| 39 | +This work bridges the gap between **batch-first databases** and |
| 40 | +**package-first queries**, improving VulnerableCode's flexibility and enabling |
| 41 | +better integration with developer workflows. |
| 42 | + |
| 43 | +.. note:: |
| 44 | + A PURL (Package URL) is a universal way to identify and locate software |
| 45 | + packages. `More on PURL <https://github.com/package-url>`_ |
| 46 | + |
| 47 | + |
| 48 | +Project Design and Architecture |
| 49 | +------------------------------- |
| 50 | + |
| 51 | +The new live importers system builds on existing batch importers, while |
| 52 | +introducing a parallel registry and execution model for package-first runs. |
| 53 | + |
| 54 | +Importer Registries |
| 55 | +^^^^^^^^^^^^^^^^^^^ |
| 56 | + |
| 57 | +* ``IMPORTERS_REGISTRY`` continues to hold batch importers (V1/V2). |
| 58 | +* ``LIVE_IMPORTERS_REGISTRY`` holds live importers. |
| 59 | + |
| 60 | +Each live importer: |
| 61 | + |
| 62 | +* Inherits from its batch importer (when logic can be reused), or directly |
| 63 | + from ``VulnerableCodeBaseImporterPipelineV2`` when a separate |
| 64 | + implementation is needed. |
| 65 | +* Declares a ``supported_types`` array, defining compatible package |
| 66 | + ecosystems (``"pypi"``, ``"npm"``, ``"maven"``, ``"generic"``, etc.). |
| 67 | +* Implements a package-first ``collect_advisories()`` method, which |
| 68 | + restricts results to advisories relevant to the given PURL. |
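
The inheritance pattern above can be sketched in plain Python. This is a minimal illustration, not the VulnerableCode code: ``Advisory``, ``ExampleBatchImporter``, ``ExampleLiveImporter``, the registry keys, and the hard-coded sample data are all hypothetical stand-ins, and the real importers derive from the pipeline base classes instead.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Advisory:
    """Stand-in for an advisory record; not the VulnerableCode model."""
    advisory_id: str
    package_type: str
    package_name: str


class ExampleBatchImporter:
    """Hypothetical batch importer: yields every advisory from its source."""

    # Hard-coded sample data replaces a real network fetch.
    SOURCE = [
        Advisory("EX-1", "pypi", "django"),
        Advisory("EX-2", "pypi", "flask"),
        Advisory("EX-3", "pypi", "django"),
    ]

    def collect_advisories(self) -> Iterable[Advisory]:
        yield from self.SOURCE


class ExampleLiveImporter(ExampleBatchImporter):
    """Hypothetical live importer that reuses the batch importer's logic."""

    supported_types = ["pypi"]

    def __init__(self, purl_type: str, purl_name: str):
        self.purl_type = purl_type
        self.purl_name = purl_name

    def collect_advisories(self) -> Iterable[Advisory]:
        # Package-first: keep only advisories for the requested package.
        wanted = (self.purl_type, self.purl_name)
        for advisory in super().collect_advisories():
            if (advisory.package_type, advisory.package_name) == wanted:
                yield advisory


# Two separate registries, mirroring the split described above.
IMPORTERS_REGISTRY = {"example_batch": ExampleBatchImporter}
LIVE_IMPORTERS_REGISTRY = {"example_live": ExampleLiveImporter}

live = LIVE_IMPORTERS_REGISTRY["example_live"]("pypi", "django")
found = [advisory.advisory_id for advisory in live.collect_advisories()]
print(found)
```

In this sketch the live importer filters the batch output after the fact, which is only the simplest way to show the package-first contract; a real live importer would presumably query its source for the single package instead of scanning everything.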
| 69 | + |
| 70 | +.. figure:: https://private-user-images.githubusercontent.com/29122581/480716687-1ffa16ba-fbce-41bd-b71a-674620a2fec3.png?jwt=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NTU4MTMxNDQsIm5iZiI6MTc1NTgxMjg0NCwicGF0aCI6Ii8yOTEyMjU4MS80ODA3MTY2ODctMWZmYTE2YmEtZmJjZS00MWJkLWI3MWEtNjc0NjIwYTJmZWMzLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTA4MjElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwODIxVDIxNDcyNFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWI4MWQ2ZmY4MzkxY2MzOTYwNTI3MTViZThiZTk1Yzc0Y2Y0Y2E3YWNhY2Q5OTU5OTE5MTgxOGI5NGM1OTlkODcmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.Mt9-TVHRNDduOHHMnLuhr-vzKSxrbfmAz1JtWfksFA4 |
| 71 | + :alt: Class architecture of importers registries |
| 72 | + :align: center |
| 73 | + :width: 70% |
| 74 | + |
| 75 | + Class architecture showing relationship between ``IMPORTERS_REGISTRY`` and |
| 76 | + ``LIVE_IMPORTERS_REGISTRY``. |
| 77 | + |
| 78 | +API Endpoint |
| 79 | +^^^^^^^^^^^^ |
| 80 | + |
| 81 | +The new API endpoint is responsible for handling live evaluation requests. |
| 82 | + |
| 83 | +* **Input:** ``purl_string`` (required) and ``no_threading`` (optional, |
| 84 | + default ``false``). |
| 85 | +* **Execution:** the endpoint checks ``LIVE_IMPORTERS_REGISTRY`` for |
| 86 | + importers whose ``supported_types`` include the PURL's type, then runs |
| 87 | + the compatible importers in parallel unless ``no_threading`` is true. |
| 88 | +* **Output:** the advisories affecting the requested PURL, which are |
| 89 | + imported directly into the database and returned to the caller as |
| 90 | + JSON. |
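
The selection and execution steps can be sketched with the standard library alone. Everything named here is an assumption made for illustration: ``purl_type`` is a deliberately simplified parser (the real project would use a proper PURL library), the three importer functions return placeholder strings, and ``live_evaluate`` is not the actual view code.

```python
from concurrent.futures import ThreadPoolExecutor


def purl_type(purl_string: str) -> str:
    """Return the type of a PURL such as pkg:pypi/django@4.2 (simplified)."""
    if not purl_string.startswith("pkg:"):
        raise ValueError(f"not a PURL: {purl_string!r}")
    return purl_string[len("pkg:"):].split("/", 1)[0].lower()


# Hypothetical importers: each returns placeholder advisory strings.
def pypi_importer(purl_string):
    return [f"PYPI-ADVISORY for {purl_string}"]


def osv_importer(purl_string):
    return [f"OSV-ADVISORY for {purl_string}"]


def npm_importer(purl_string):
    return [f"NPM-ADVISORY for {purl_string}"]


# Registry entries pair the supported types with the callable to run.
LIVE_IMPORTERS_REGISTRY = {
    "pypi_live": (["pypi"], pypi_importer),
    "osv_live": (["pypi", "npm"], osv_importer),
    "npm_live": (["npm"], npm_importer),
}


def live_evaluate(purl_string: str, no_threading: bool = False) -> list:
    """Run every compatible importer and merge the advisories."""
    wanted = purl_type(purl_string)
    compatible = [
        run for types, run in LIVE_IMPORTERS_REGISTRY.values() if wanted in types
    ]
    if no_threading or not compatible:
        # Sequential path, also used when nothing matches.
        batches = [run(purl_string) for run in compatible]
    else:
        with ThreadPoolExecutor(max_workers=len(compatible)) as pool:
            batches = list(pool.map(lambda run: run(purl_string), compatible))
    # Sort so both paths return the same order.
    return sorted(advisory for batch in batches for advisory in batch)


threaded = live_evaluate("pkg:pypi/django@4.2")
sequential = live_evaluate("pkg:pypi/django@4.2", no_threading=True)
print(threaded)
```

Threads are a natural fit in this sketch because each importer would spend most of its time waiting on network I/O; the sequential path stays available for debugging, as the ``no_threading`` option suggests.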
| 91 | + |
| 92 | +.. figure:: https://private-user-images.githubusercontent.com/29122581/480716572-a29dbafc-3290-49dd-8cca-20afd0291d68.png?jwt=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NTU4MTMxNDQsIm5iZiI6MTc1NTgxMjg0NCwicGF0aCI6Ii8yOTEyMjU4MS80ODA3MTY1NzItYTI5ZGJhZmMtMzI5MC00OWRkLThjY2EtMjBhZmQwMjkxZDY4LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTA4MjElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwODIxVDIxNDcyNFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWE1NDczZGNhN2JhYjA2YmQzNWMwNGUzMzI1NmY5MTc3YzJjZmM4YTk2MWE2MjAwYWE0YmQzYWU0YmJiNGI5MzAmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.DWqFUO5SnUVedEvTL3i5i-eI6tGHNaGKWTXT9yQt9Cs |
| 93 | + :alt: Live Importers API request flow |
| 94 | + :align: center |
| 95 | + :width: 70% |
| 96 | + |
| 97 | + Flow of API endpoint: selecting compatible live importers and executing |
| 98 | + them in parallel. |
| 99 | + |
| 100 | +Integration with VulnTotal |
| 101 | +^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 102 | + |
| 103 | +The new API was integrated into VulnTotal as an optional datasource: |
| 104 | + |
| 105 | +* VulnTotal now reads the ``VCIO_HOST``, ``VCIO_PORT``, and |
| 106 | + ``ENABLE_LIVE_EVAL`` settings from the local ``.env`` file. |
| 107 | +* If live evaluation is enabled, VulnTotal queries VulnerableCode in package-first mode. |
| 108 | +* This allows VulnTotal to use both its existing datasources **and** |
| 109 | + the user's gradually built local database, improving coverage and |
| 110 | + personalization. |
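
As a rough sketch of how these three settings might be interpreted, the helper below turns an environment mapping into connection settings. Only the variable names come from this report; the ``load_vcio_settings`` helper, the accepted truthy strings, and the ``http`` scheme are assumptions, and the actual VulnTotal datasource may handle them differently.

```python
# Assumed set of values that switch live evaluation on.
TRUTHY = {"1", "true", "yes", "on"}


def load_vcio_settings(env):
    """Return settings for a local VulnerableCode instance, or None."""
    host = env.get("VCIO_HOST", "").strip()
    port = env.get("VCIO_PORT", "").strip()
    if not host or not port:
        # Without a host and port the local datasource stays disabled.
        return None
    live_eval = env.get("ENABLE_LIVE_EVAL", "").strip().lower() in TRUTHY
    # The http scheme is an assumption for a local instance.
    return {"base_url": f"http://{host}:{port}", "live_eval": live_eval}


settings = load_vcio_settings(
    {"VCIO_HOST": "localhost", "VCIO_PORT": "8000", "ENABLE_LIVE_EVAL": "true"}
)
print(settings)
```

Passing a plain mapping keeps the helper easy to test; in practice ``os.environ`` could be passed once the ``.env`` file has been loaded.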
| 111 | + |
| 112 | +Integration with VulnTotal Browser Extension |
| 113 | +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 114 | + |
| 115 | +The VulnTotal browser extension was updated to support live importers: |
| 116 | + |
| 117 | +* Users can enable the "Local VulnerableCode" datasource and live evaluation option. |
| 118 | +* When enabled, package lookups are forwarded to the new API, retrieving |
| 119 | + advisories in real time. |
| 120 | +* This reduces setup effort: developers can get live vulnerability checks |
| 121 | + directly in their browser, provided they have a local VulnerableCode instance. |
| 122 | + |
| 123 | +.. figure:: https://private-user-images.githubusercontent.com/29122581/480717461-29806bc6-faf5-48c9-8632-608c23d96e83.gif?jwt=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NTU4MTMzNTksIm5iZiI6MTc1NTgxMzA1OSwicGF0aCI6Ii8yOTEyMjU4MS80ODA3MTc0NjEtMjk4MDZiYzYtZmFmNS00OGM5LTg2MzItNjA4YzIzZDk2ZTgzLmdpZj9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTA4MjElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwODIxVDIxNTA1OVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWUwOTRkNWI1OTViNzYxODM4MjAyYTBjYTdmY2QyMzQ1Mzg2MTVmM2M5N2Q0M2I1MDQwMGRiNWJjZDllNmRjODQmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.lmRcQBxwP1kbhcQ8Siq6vvm1GrBfd_BhkIbIte2NuYs |
| 124 | + :alt: Live evaluation demo in VulnTotal browser extension |
| 125 | + :align: center |
| 126 | + :width: 70% |
| 127 | + |
| 128 | + VulnTotal and its browser extension consuming the new live evaluation API. |
| 129 | + |
| 130 | +Linked Pull Requests |
| 131 | +-------------------- |
| 132 | + |
| 133 | +.. list-table:: |
| 134 | + :widths: 10 40 20 |
| 135 | + :header-rows: 1 |
| 136 | + |
| 137 | + * - Sr. no |
| 138 | + - Name |
| 139 | + - Link |
| 140 | + * - 1 |
| 141 | + - Add Live Evaluation API endpoint and PyPa live pipeline importer |
| 142 | + - `aboutcode-org/vulnerablecode#1969 |
| 143 | + <https://github.com/aboutcode-org/vulnerablecode/pull/1969>`_ |
| 144 | + * - 2 |
| 145 | + - Add Gitlab Live V2 Importer |
| 146 | + - `aboutcode-org/vulnerablecode#1910 |
| 147 | + <https://github.com/aboutcode-org/vulnerablecode/pull/1910>`_ |
| 148 | + * - 3 |
| 149 | + - Add Curl Live Importer V2 |
| 150 | + - `aboutcode-org/vulnerablecode#1923 |
| 151 | + <https://github.com/aboutcode-org/vulnerablecode/pull/1923>`_ |
| 152 | + * - 4 |
| 153 | + - Add Elixir Security Live V2 Importer |
| 154 | + - `aboutcode-org/vulnerablecode#1935 |
| 155 | + <https://github.com/aboutcode-org/vulnerablecode/pull/1935>`_ |
| 156 | + * - 5 |
| 157 | + - Add NPM Live Importer V2 |
| 158 | + - `aboutcode-org/vulnerablecode#1941 |
| 159 | + <https://github.com/aboutcode-org/vulnerablecode/pull/1941>`_ |
| 160 | + * - 6 |
| 161 | + - Add GitHub OSV Live V2 Importer Pipeline |
| 162 | + - `aboutcode-org/vulnerablecode#1977 |
| 163 | + <https://github.com/aboutcode-org/vulnerablecode/pull/1977>`_ |
| 164 | + * - 7 |
| 165 | + - Add Postgres Live V2 Importer Pipeline |
| 166 | + - `aboutcode-org/vulnerablecode#1982 |
| 167 | + <https://github.com/aboutcode-org/vulnerablecode/pull/1982>`_ |
| 168 | + * - 8 |
| 169 | + - Add PySec Live V2 Importer Pipeline |
| 170 | + - `aboutcode-org/vulnerablecode#1983 |
| 171 | + <https://github.com/aboutcode-org/vulnerablecode/pull/1983>`_ |
| 172 | + * - 9 |
| 173 | + - Add Local VulnerableCode Datasource in VulnTotal and allow live evaluation |
| 174 | + - `aboutcode-org/vulnerablecode#1985 |
| 175 | + <https://github.com/aboutcode-org/vulnerablecode/pull/1985>`_ |
| 176 | + * - 10 |
| 177 | + - Integrate Local VulnerableCode datasource and live evaluation |
| 178 | + - `aboutcode-org/vulntotal-extension#17 |
| 179 | + <https://github.com/aboutcode-org/vulntotal-extension/pull/17>`_ |
| 180 | + |
| 181 | + |
| 182 | +Closing Thoughts |
| 183 | +------------------- |
| 184 | + |
| 185 | +This project was an exciting step forward from my 2024 GSoC work. By moving |
| 186 | +from batch importers to package-first live importers, we enabled a faster, |
| 187 | +more personalized, and more flexible way of building vulnerability databases. |
| 188 | + |
| 189 | +I especially enjoyed designing the **registry + API architecture**, |
| 190 | +discussing it with my mentors, and integrating it across **VulnerableCode, |
| 191 | +VulnTotal, and the browser extension**. This work lays the foundation for |
| 192 | +richer interactivity in the ecosystem and brings vulnerability evaluation |
| 193 | +closer to developers' workflows. |
| 194 | + |
| 195 | +I appreciated the weekly status calls and the feedback I received from my |
| 196 | +mentors and the amazing team. They were really helpful and supportive: |
| 197 | + |
| 198 | +* `Philippe Ombredanne <https://github.com/pombredanne>`_ |
| 199 | +* `Ayan Sinha Mahapatra <https://github.com/AyanSinhaMahapatra>`_ |
| 200 | +* `Tushar Goel <https://github.com/TG1999>`_ |
| 201 | +* `Keshav Priyadarshi <https://github.com/keshav-space>`_ |