- The main goal of this project was to improve the detection of unknown licenses and to follow indirect license references in ScanCode Toolkit.
**Improvement in the License Data Model Definition**
- Unknown licenses are those matched to a license rule tagged with the 'unknown' license key. Since these are 'special' licenses, reporting them with dedicated attributes makes the results clearer. Unknown licenses are now tagged with a new flag **"is_unknown"**, which identifies them beyond the naming convention of having "unknown" as part of their name. Rules that match at least one unknown license have the flag **"has_unknown"** set.
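
To illustrate how the two flags relate, here is a minimal sketch. The ``License`` and ``Rule`` classes, the ``set_has_unknown`` helper, and the sample keys are hypothetical simplifications for this example, not the actual ScanCode Toolkit data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class License:
    key: str
    # Explicit marker, so callers do not have to test for "unknown" in the key.
    is_unknown: bool = False


@dataclass
class Rule:
    identifier: str
    license_keys: List[str] = field(default_factory=list)
    has_unknown: bool = False

    def set_has_unknown(self, licenses_by_key: Dict[str, License]) -> None:
        # The rule is flagged as soon as one of its licenses is unknown.
        self.has_unknown = any(
            licenses_by_key[key].is_unknown for key in self.license_keys
        )


# Hypothetical sample data, for illustration only.
licenses_by_key = {
    "mit": License(key="mit"),
    "unknown-license-reference": License(
        key="unknown-license-reference", is_unknown=True
    ),
}

mixed_rule = Rule("mixed_1.RULE", ["mit", "unknown-license-reference"])
mixed_rule.set_has_unknown(licenses_by_key)

plain_rule = Rule("mit_1.RULE", ["mit"])
plain_rule.set_has_unknown(licenses_by_key)

print(mixed_rule.has_unknown, plain_rule.has_unknown)
```

Deriving the rule-level flag from the license-level flag keeps the two consistent: the rule never needs its own list of "unknown" names.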
**Reporting Known and Unknown Licenses Separately**
- We considered adding a separate section to the scan results to report 'unknown licenses' on their own instead of mixing them with the main license detection results. After implementing this separate section, however, it does not seem to be a good idea to have it for now.
- Some license references, such as "see license in file LICENSE.txt", point to another file for the license details. These are reported as unknown license references, and we could instead follow the referenced file to find what was detected there. The approach was to use the existing ``referenced_filenames`` attribute in license RULE data files. Since this is a ``process_codebase`` step in a scan plugin, our API function needed to return ``referenced_filenames`` to keep track of the files referenced by the detected licenses. This was tracked in -
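
The idea of following a referenced file can be sketched as a post-scan step over the whole codebase. This is a simplified illustration, not the plugin's actual code: the ``follow_references`` function and the shape of the ``results`` mapping are assumptions made for this example, and only one level of references, resolved against the referencing file's own directory, is followed.

```python
import posixpath
from typing import Dict, List

# Hypothetical, simplified scan results: file path -> list of license matches.
ScanResults = Dict[str, List[dict]]


def follow_references(results: ScanResults) -> Dict[str, List[str]]:
    """Return license keys per path, replacing a match that references
    another file with the keys detected in that file when it was scanned."""
    resolved: Dict[str, List[str]] = {}
    for path, matches in results.items():
        keys: List[str] = []
        for match in matches:
            followed = False
            for name in match.get("referenced_filenames", []):
                # Assumption: look for the referenced file next to this one.
                target = posixpath.join(posixpath.dirname(path), name)
                target_matches = results.get(target)
                if target_matches:
                    keys.extend(m["key"] for m in target_matches)
                    followed = True
            if not followed:
                # Nothing to follow: keep the original detection.
                keys.append(match["key"])
        resolved[path] = keys
    return resolved


results = {
    "pkg/LICENSE.txt": [{"key": "apache-2.0", "referenced_filenames": []}],
    "pkg/setup.py": [
        {
            "key": "unknown-license-reference",
            "referenced_filenames": ["LICENSE.txt"],
        }
    ],
}

resolved = follow_references(results)
print(resolved["pkg/setup.py"])
```

This has to run after every file is scanned, which is why it fits a codebase-level step: the referenced file's detections must already be available when the reference is resolved.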
- The approach was to use an index of n-grams to detect unknowns, in addition to the regular detection of "unknown" license rules. First, the matches from the normal license detection procedure are filtered, and then the remaining unmatched spans are run through an automaton index containing n-grams from all regular license texts and rules. This is tracked in -
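
The two-stage idea can be sketched as follows. Note the substitution: this example uses a plain Python ``set`` of n-grams in place of an automaton, which is enough to show the logic but is not how the real index is built. The function names, the n-gram length of 3, and the ``min_hits`` threshold are all assumptions chosen for illustration.

```python
import re
from typing import Iterable, List, Set, Tuple

NGRAM_LEN = 3  # assumed length, for illustration only


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens: List[str], n: int = NGRAM_LEN) -> List[Tuple[str, ...]]:
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_index(license_texts: Iterable[str]) -> Set[Tuple[str, ...]]:
    """Collect the n-grams of all known license texts (a set, not an automaton)."""
    index: Set[Tuple[str, ...]] = set()
    for text in license_texts:
        index.update(ngrams(tokenize(text)))
    return index


def unmatched_runs(token_count: int, matched: Set[int]) -> List[Tuple[int, int]]:
    """Return (start, end) ranges of token positions not covered by a match."""
    runs, start = [], None
    for pos in range(token_count):
        if pos in matched:
            if start is not None:
                runs.append((start, pos))
                start = None
        elif start is None:
            start = pos
    if start is not None:
        runs.append((start, token_count))
    return runs


def detect_unknown(query_tokens, matched, index, min_hits=2):
    """Report unmatched runs that share at least min_hits n-grams with the index."""
    found = []
    for start, end in unmatched_runs(len(query_tokens), matched):
        hits = sum(1 for gram in ngrams(query_tokens[start:end]) if gram in index)
        if hits >= min_hits:
            found.append((start, end, hits))
    return found


index = build_index(
    ["Permission is hereby granted, free of charge, to any person obtaining a copy"]
)
query = tokenize("MIT License. Use is hereby granted free of charge for everyone")
# Assume regular detection already matched the first two tokens ("MIT License").
unknown_spans = detect_unknown(query, matched={0, 1}, index=index)
print(unknown_spans)
```

Running the n-gram pass only on spans left over after regular detection means that text already explained by a known rule is never reported a second time as unknown.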
I’ve had a wonderful summer during this 10-week journey and have learned plenty of things. I am thankful to Google and AboutCode for giving me this opportunity to work with such an amazing community. I am fortunate to have mentors `Philippe Ombredanne <https://github.com/pombredanne>`_ and `Ayan Sinha Mahapatra <https://github.com/AyanSinhaMahapatra>`_ who helped me a lot throughout