@@ -20,16 +20,16 @@ Pipeline
 We use `aboutcode.pipeline <https://github.com/aboutcode-org/scancode.io/tree/main/aboutcode/pipeline>`_
 for importing and improving data. At a very high level, a working pipeline contains a classmethod
 ``steps`` that defines what steps to run and in what order. These steps are essentially just
-functions. Pipeline provides an easy and effective way to log events inside these steps (it
-automatically handles rendering and dissemination for these logs.)
+functions. Pipeline provides an easy and effective way to log events inside these steps (it
+automatically handles rendering and dissemination for these logs).

 It also includes a built-in progress indicator, which is essential since some of the jobs we run
 in the pipeline are long-running tasks that require proper progress indicators. Pipeline provides
 a way to seamlessly record progress (it automatically takes care of rendering and disseminating
 it).

 Additionally, the pipeline offers a consistent structure, making it easy to run these pipeline steps
-with message queue like RQ and store all events related to a particular pipeline for
+with a message queue like RQ and store all events related to a particular pipeline for
 debugging/improvements.

 This tutorial contains all the things one should know to quickly implement an improver pipeline.
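
The idea of a ``steps`` classmethod that lists plain functions to run in order, with log events collected along the way, can be sketched in a few lines of standalone Python. This is only an illustration of the pattern, not the ``aboutcode.pipeline`` API: ``MiniPipeline``, ``execute``, ``events``, and ``DemoImprover`` are hypothetical names invented for this sketch.

```python
# Stand-in for illustration only: this is NOT the aboutcode.pipeline API.
# MiniPipeline, execute, events, and DemoImprover are hypothetical names.


class MiniPipeline:
    """Run the methods returned by ``steps`` in order and record log events."""

    def __init__(self):
        self.events = []

    @classmethod
    def steps(cls):
        # Subclasses return a tuple of methods, in execution order.
        raise NotImplementedError

    def log(self, message):
        # A real pipeline would also render and store the message somewhere.
        self.events.append(message)

    def execute(self):
        steps = self.steps()
        for index, step in enumerate(steps, start=1):
            self.log(f"Step [{step.__name__}] starting ({index}/{len(steps)})")
            step(self)
            self.log(f"Step [{step.__name__}] completed")
        return self.events


class DemoImprover(MiniPipeline):
    @classmethod
    def steps(cls):
        return (cls.fetch_response, cls.flag_ghost_packages)

    def fetch_response(self):
        self.log("fetching")

    def flag_ghost_packages(self):
        self.log("flagging")


events = DemoImprover().execute()
```

Because every step is an ordinary method and the order lives in one place, a runner (or a queue worker) only needs to call ``execute`` and keep the collected events for later debugging.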
@@ -42,10 +42,10 @@ The new improver design lets you do all sorts of cool improvements and enhanceme
 Some of those are:

 * Let's suppose you have a certain number of packages and vulnerabilities in your database,
-  and you want to make sure that the packages being shown in VulnerableCode do indeed exist upstream.
-  Oftentimes, we come across advisory data that contains made-up package versions. We can write
-  (well, we already have) a pipeline that iterates through all the packages in VulnerableCode and
-  labels them as ghost packages if they don't exist upstream.
+  and you want to make sure that the packages being shown in VulnerableCode do indeed exist
+  upstream. Oftentimes, we come across advisory data that contains made-up package versions.
+  We can write (well, we already have) a pipeline that iterates through all the packages in
+  VulnerableCode and labels them as ghost packages if they don't exist upstream.


 - A basic security advisory only contains CVE/aliases, summary, fixed/affected version, and
@@ -64,17 +64,20 @@ be absolutely sure of what you're doing and should have robust tests for these p
 Writing an Improver Pipeline
 -----------------------------

-**Scenario:** Suppose we come around a source that curates and stores the list of packages that don't
-exist upstream and makes it available through the REST API endpoint https://example.org/api/non-existent-packages,
-which gives a JSON response with a list of non-existent packages.
-Let's write a pipeline that will use this source to flag these non-existent package as ghost package.
+**Scenario:** Suppose we come across a source that curates and stores the list of packages that
+don't exist upstream and makes it available through the REST API endpoint
+https://example.org/api/non-existent-packages, which gives a JSON response with a list of
+non-existent packages.
+
+Let's write a pipeline that will use this source to flag these non-existent packages as
+ghost packages.
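
Before wiring this into a pipeline, the core logic of the scenario can be sketched without any network or database access. The sketch below assumes the endpoint returns a JSON array of package URL strings (the text only says "a list of non-existent packages"), and it uses plain dictionaries with a hypothetical ``is_ghost`` key in place of real database records. ``parse_ghost_purls`` and ``flag_ghost_packages`` are illustrative helper names, not part of VulnerableCode.

```python
import json

# Assumed response shape: a JSON array of package URL strings.
# The sample data below is made up for illustration.
SAMPLE_RESPONSE = '["pkg:npm/foo@9.9.9", "pkg:pypi/bar@0.0.1"]'


def parse_ghost_purls(response_text):
    """Return the set of package URLs listed in the JSON response."""
    return set(json.loads(response_text))


def flag_ghost_packages(packages, ghost_purls):
    """Mark every package whose purl is in ghost_purls; return how many matched."""
    flagged = 0
    for package in packages:
        if package["purl"] in ghost_purls:
            package["is_ghost"] = True  # hypothetical field name
            flagged += 1
    return flagged


# Dictionaries stand in for package records stored in the database.
packages = [
    {"purl": "pkg:npm/foo@9.9.9", "is_ghost": False},
    {"purl": "pkg:npm/foo@1.0.0", "is_ghost": False},
]
flagged = flag_ghost_packages(packages, parse_ghost_purls(SAMPLE_RESPONSE))
```

Keeping the parsing and the flagging as two separate functions mirrors the two-step shape used for the pipeline in this tutorial, and makes each part easy to test on its own.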


 Create file for the new improver pipeline
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 All pipelines, including the improver pipeline, are located in the
-`vulnerabilities/pipelines/
+`vulnerabilities/pipelines/
 <https://github.com/aboutcode-org/vulnerablecode/tree/main/vulnerabilities/pipelines>`_ directory.

 The improver pipeline is implemented by subclassing `VulnerableCodePipeline`.
@@ -124,7 +127,7 @@ At this point improver will look like this:

         def fetch_response(self):
             raise NotImplementedError
-
+
         def flag_ghost_packages(self):
             raise NotImplementedError

@@ -194,7 +197,7 @@ Register the Improver Pipeline
 ------------------------------

 Finally, register your improver in the improver registry at
-`vulnerabilities/improvers/__init__.py
+`vulnerabilities/improvers/__init__.py
 <https://github.com/aboutcode-org/vulnerablecode/blob/main/vulnerabilities/improvers/__init__.py>`_

200203
@@ -253,8 +256,8 @@ See :ref:`command_line_interface` for command line usage instructions.

 .. tip::

-   If you need to improve package vulnerability relations created using a certain pipeline,
-   simply use the **pipeline_id** to filter out only those items. For example, if you want
+   If you need to improve package vulnerability relations created using a certain pipeline,
+   simply use the **pipeline_id** to select only those items. For example, if you want
    to improve only those **AffectedByPackageRelatedVulnerability** entries that were created
    by the npm_importer pipeline, you can do so with the following query:
