Commit 8324538

Fix doc lint errors
Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
1 parent 6c6b25a commit 8324538

2 files changed

+75
-46
lines changed

Lines changed: 74 additions & 46 deletions
Original file line numberDiff line numberDiff line change
@@ -1,13 +1,15 @@
11
Google Summer of Code 2021 Final report
22
=========================================
33

4+
45
Organisation - `AboutCode <https://www.aboutcode.org/>`_
56
---------------------------------------------------------
67

7-
Akanksha Garg <[email protected]>
8+
Akanksha Garg <[email protected]>
89

910
`GITHUB <https://github.com/akugarg>`_
1011

12+
1113
Project: Detect Unknown Licenses and Indirect License References in Scancode
1214
-----------------------------------------------------------------------------
1315

@@ -17,58 +19,84 @@ Project: Detect Unknown Licenses and Indirect License References in Scancode
1719

1820
`Proposal <https://docs.google.com/document/d/1Dp0Hgk38RIMwITTiS-kqfikpkHRi2rjtkotA9CLw8j0/edit?usp=sharing>`_
1921

22+
2023
Description
2124
------------
22-
- The main motive of this project was to improve license detection of unknown licenses and follow references to indirect license references in Scancode-TK
23-
24-
**Improvement in the License Data Model Definition**
25-
- Unknown Licenses are the ones which are matched to a license rule tagged with 'unknown' license key . Since these are some of the 'special' licenses , reporting them with special attributes will
26-
provide more clarification. Now unknown licenses are tagged with a new flag **"is_unknown"** to identify them beyond just the naming convention of having "unknown" as part of their name.
27-
Rules that match at least one unknown license have a flag **"has_unknown"** set
28-
in the returned match results.
29-
30-
`nexB/scancode-toolkit#2548 <https://github.com/nexB/scancode-toolkit/pull/2548>`_
31-
32-
**Reporting known and Unknown licenses separately**
33-
- We considered having a separate section for of scan results to report 'unknown licenses' separately and not mixed with main license detection results. But after implementing a separate section for
34-
unknown ones ,it doesn't seem to be good idea to have currently.
35-
36-
`nexB/scancode-toolkit#2578 <https://github.com/nexB/scancode-toolkit/pull/2578>`_
37-
38-
**Follow License References to another file**
39-
- Some license references such as "see license in file LICENSE.txt" e.g. mentions to look for license details in another file are reported as unknown license references and we could instead follow
40-
the referenced file to find what was detected there. The approach was to use already contained attribute ```refrenced_filenames``` in license RULE data files. Since this was a ```process_codebase```
41-
step in scan plugin , it was needed that our API function should return ```refrenced_filenames``` to keep track of these files corresponding to licenses detected. This was tracked in -
42-
43-
`nexB/scancode-toolkit#2632 <https://github.com/nexB/scancode-toolkit/pull/2632>`_
44-
45-
- The ```process_codebase``` step is tracked in -
46-
47-
`nexB/scancode-toolkit#2616 <https://github.com/nexB/scancode-toolkit/pull/2616>`_
48-
49-
**Improve license detection of Unknown Licenses**
50-
- The approach was to use index of n-grams for detecting unknowns besides having our actual detection of "unknown" license rules. Firstly matches were filtered after running our normal procedure
51-
of license detection and the remaining spans are run through a automaton index containing n-grams from all regular license texts and rules. This is tracked in-
52-
53-
`nexB/scancode-toolkit#2592 <https://github.com/nexB/scancode-toolkit/pull/2592>`_
54-
55-
**Addition of some new Licenses**
56-
- There were some licenses that were not present in Scancode-toolkit as for now. They have been added now.
57-
58-
`nexB/scancode-toolkit#2625 <https://github.com/nexB/scancode-toolkit/pull/2625>`_
59-
60-
25+
26+
The main goal of this project was to improve the detection of unknown licenses
27+
and to follow indirect license references in ScanCode Toolkit.
28+
29+
**Improvement in the License Data Model Definition**
30+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
31+
32+
Unknown licenses are those matched to a license rule tagged with an "unknown" license
33+
key. Because these are special cases, reporting them with dedicated attributes
34+
makes the results clearer. Unknown licenses are now tagged with a new flag **"is_unknown"**
35+
that identifies them beyond the naming convention of having "unknown" as part of their name.
36+
37+
Rules that match at least one unknown license have a flag **"has_unknown"** set
38+
in the returned match results.
39+
40+
`nexB/scancode-toolkit#2548 <https://github.com/nexB/scancode-toolkit/pull/2548>`_
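
The relationship between the two flags can be sketched as follows. This is a minimal illustration only: the ``License`` and ``Rule`` classes and the rule identifiers below are hypothetical stand-ins, not the actual ScanCode data model. Only the flag names ``is_unknown`` and ``has_unknown`` come from the report.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class License:
    """Hypothetical, simplified stand-in for a license record."""
    key: str
    is_unknown: bool = False


@dataclass
class Rule:
    """Hypothetical, simplified stand-in for a license detection rule."""
    identifier: str
    licenses: List[License] = field(default_factory=list)

    @property
    def has_unknown(self) -> bool:
        # True when at least one license matched by this rule is flagged unknown.
        return any(lic.is_unknown for lic in self.licenses)


mit = License(key="mit")
unknown_ref = License(key="unknown-license-reference", is_unknown=True)

# Illustrative rule identifiers, assumed for this example.
known_rule = Rule("mit_1.RULE", [mit])
mixed_rule = Rule("mit_or_unknown_1.RULE", [mit, unknown_ref])
```

With an explicit flag, a consumer can check ``rule.has_unknown`` instead of searching for the substring "unknown" in license keys.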
41+
42+
**Reporting known and Unknown licenses separately**
43+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
44+
45+
We considered adding a separate section to the scan results to report 'unknown licenses'
46+
apart from the main license detection results. After implementing this separate
47+
section, however, it does not seem to be a good idea to include it for now.
48+
49+
`nexB/scancode-toolkit#2578 <https://github.com/nexB/scancode-toolkit/pull/2578>`_
50+
51+
**Follow License References to another file**
52+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
53+
54+
Some license references, such as "see license in file LICENSE.txt", point to another file
55+
for the license details and are reported as unknown license references. Instead,
56+
we can follow the reference and report what was detected in the referenced file. The approach
57+
was to use the existing ``referenced_filenames`` attribute in license RULE data files.
58+
Since this is done in a ``process_codebase`` step of a scan plugin, our API function needed
59+
to return ``referenced_filenames`` to keep track of the files referenced by the detected
60+
licenses. This was tracked in -
61+
62+
`nexB/scancode-toolkit#2632 <https://github.com/nexB/scancode-toolkit/pull/2632>`_
63+
64+
The ``process_codebase`` step is tracked in -
65+
66+
`nexB/scancode-toolkit#2616 <https://github.com/nexB/scancode-toolkit/pull/2616>`_
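
The idea of following a reference after all files have been scanned can be sketched as below. This is a simplified illustration under stated assumptions: the ``scan`` dictionary layout, the ``license_key`` field and the ``follow_references`` function are hypothetical and are not the ScanCode plugin API. The sketch also assumes the referenced file sits in the same directory as the referencing file.

```python
from pathlib import PurePosixPath
from typing import Dict, List


def follow_references(scan: Dict[str, List[dict]]) -> Dict[str, List[str]]:
    """Map each scanned path to license keys, resolving file references.

    `scan` is a hypothetical structure: path -> list of match dicts, each with
    a "license_key" and an optional "referenced_filenames" list.
    """
    resolved = {}
    for path, matches in scan.items():
        parent = PurePosixPath(path).parent
        keys = []
        for match in matches:
            followed = []
            for name in match.get("referenced_filenames", []):
                # Assumption: the referenced file is a sibling of this file.
                target = str(parent / name)
                followed.extend(m["license_key"] for m in scan.get(target, []))
            # Keep the original key only when no referenced file had detections.
            for key in followed or [match["license_key"]]:
                if key not in keys:
                    keys.append(key)
        resolved[path] = keys
    return resolved


scan = {
    "project/src/main.c": [
        {
            "license_key": "unknown-license-reference",
            "referenced_filenames": ["LICENSE.txt"],
        },
    ],
    "project/src/LICENSE.txt": [
        {"license_key": "apache-2.0", "referenced_filenames": []},
    ],
}
resolved = follow_references(scan)
```

This has to run as a post-scan step because the detections for the referenced file must already be available when the reference is resolved.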
67+
68+
**Improve license detection of Unknown Licenses**
69+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
70+
71+
The approach was to use an index of n-grams to detect unknown licenses, in addition to the existing
72+
detection of "unknown" license rules. First, matches are filtered after running the normal
73+
license detection procedure; the remaining spans are then run through an automaton index
74+
containing n-grams from all regular license texts and rules. This is tracked in -
75+
76+
`nexB/scancode-toolkit#2592 <https://github.com/nexB/scancode-toolkit/pull/2592>`_
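
The n-gram lookup over unmatched spans can be sketched roughly as below. This is not the ScanCode implementation: a plain Python ``set`` is swapped in for the automaton index, and the n-gram length, whitespace tokenization and function names are assumptions made for illustration.

```python
from typing import Iterable, List, Sequence, Set, Tuple

NGRAM_LEN = 3  # assumed n-gram length, for illustration only


def ngrams(tokens: Sequence[str], n: int = NGRAM_LEN) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_index(texts: Iterable[str]) -> Set[Tuple[str, ...]]:
    """Collect the n-grams of all known license texts (a set, not an automaton)."""
    index = set()
    for text in texts:
        index.update(ngrams(text.lower().split()))
    return index


def unknown_spans(tokens, matched, index, n=NGRAM_LEN):
    """Return (start, end) token spans that hit the index outside matched positions.

    `matched` is the set of token positions already covered by regular matches.
    """
    hits = set()
    for i in range(len(tokens) - n + 1):
        window = range(i, i + n)
        if any(pos in matched for pos in window):
            continue  # skip windows overlapping a regular license match
        if tuple(tokens[i:i + n]) in index:
            hits.update(window)
    # Merge adjacent hit positions into contiguous inclusive spans.
    spans = []
    for pos in sorted(hits):
        if spans and pos == spans[-1][1] + 1:
            spans[-1] = (spans[-1][0], pos)
        else:
            spans.append((pos, pos))
    return spans


index = build_index([
    "permission is hereby granted free of charge",
    "redistribution and use in source and binary forms",
])
tokens = "this code is cool permission is hereby granted to everyone".split()
spans = unknown_spans(tokens, matched=set(), index=index)
```

Here ``spans`` marks the token range that looks like license text even though no regular rule matched it. A real automaton index would find the same hits in a single pass over the tokens instead of one set lookup per window.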
77+
78+
**Addition of some new Licenses**
79+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
80+
81+
Some licenses were not yet present in ScanCode Toolkit.
82+
They have now been added.
83+
84+
`nexB/scancode-toolkit#2625 <https://github.com/nexB/scancode-toolkit/pull/2625>`_
85+
86+
6187
Pre-GSoC
62-
----------
63-
88+
--------
89+
6490
**Contributions**
65-
91+
6692
- `nexB/scancode-toolkit#2423 <https://github.com/nexB/scancode-toolkit/pull/2423>`_
6793
- `nexB/scancode-toolkit#2473 <https://github.com/nexB/scancode-toolkit/pull/2473>`_
6894
- `nexB/scancode-toolkit#2464 <https://github.com/nexB/scancode-toolkit/pull/2464>`_
6995
- `nexB/scancode-toolkit#2381 <https://github.com/nexB/scancode-toolkit/pull/2381>`_
70-
71-
I’ve had a wonderful summer during these 10 weeks journey and have learned plenty of things. I am thankful to Google and Aboutcode for giving me this opportunity to work with such an amazing
72-
community. I am fortunate to have mentors `Philippe Ombredanne <https://github.com/pombredanne>`_ and `Ayan Sinha Mahapatra <https://github.com/AyanSinhaMahapatra>`_ who helped me a lot throughout
96+
97+
I’ve had a wonderful summer during this 10-week journey and have learned plenty of things.
98+
I am thankful to Google and AboutCode for giving me this opportunity to work with such an amazing
99+
community. I am fortunate to have mentors `Philippe Ombredanne <https://github.com/pombredanne>`_
100+
and `Ayan Sinha Mahapatra <https://github.com/AyanSinhaMahapatra>`_ who helped me a lot throughout
73101
my GSoC project and provided constant support.
74102

docs/source/contribute/index.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -10,4 +10,5 @@
1010
roadmap
1111
gsoc17_final_report
1212
gsoc19_final_report
13+
gsoc21_final_report
1314
long_running_issues
