Commit 8d7b498

Support equivalent words in license detection #4190
Handle similar words in license detection by allowing multiple "legalese words" to have the same token id. Regenerate the token ids accordingly.

Convert Index.tokens_by_tid to a computed property, available on demand, and change it from a list to a dictionary. Ensure that all code relying on tokens_by_tid is updated as needed. All of these locations were used only for testing and debugging.

Deprecate all rules that are duplicated under this new regime, where tokens like "license" and "licence" are now treated as identical.

Update the test suite to test the detection of all deprecated licenses and rules as a sanity check. A deprecated rule with "relevance" set to 0 is not tested, as some rules are deprecated because they are false positives and should no longer be detected.

Also improve the validation and loading of rule relevance, including the case of zero relevance.

Update ambiguous or conflicting rules as needed. In particular, ensure that all rules in the style of "MIT or GPL" without a GPL version are now reported consistently as "mit or gpl-1.0-plus".

Reference: #4190
Signed-off-by: Philippe Ombredanne <[email protected]>
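The token id scheme described in the commit message can be illustrated with a minimal sketch. This is not the code from the commit: `EQUIVALENT_WORDS`, `build_token_ids` and `SketchIndex` are hypothetical names, and the id assignment order is an assumption. It only shows how several words can share one token id and why a reverse mapping then fits a dictionary computed on demand better than a list.

```python
# Hypothetical groups of words treated as equivalent (assumption, for illustration).
EQUIVALENT_WORDS = [
    ("license", "licence"),
    ("licensed", "licenced"),
]


def build_token_ids(legalese_words, equivalents=EQUIVALENT_WORDS):
    """Return a {word: token id} mapping where equivalent words share one id."""
    # Map every word of a group to the first word of that group.
    canonical = {}
    for group in equivalents:
        for word in group:
            canonical[word] = group[0]

    ids_by_canonical = {}
    token_ids = {}
    for word in sorted(legalese_words):
        key = canonical.get(word, word)
        if key not in ids_by_canonical:
            # Assign the next free id to a word (or group) seen for the first time.
            ids_by_canonical[key] = len(ids_by_canonical)
        token_ids[word] = ids_by_canonical[key]
    return token_ids


class SketchIndex:
    """Hypothetical stand-in for an index holding a {word: token id} dictionary."""

    def __init__(self, dictionary):
        self.dictionary = dictionary

    @property
    def tokens_by_tid(self):
        """Computed on demand: {token id: sorted list of words with that id}."""
        by_tid = {}
        for word, tid in self.dictionary.items():
            by_tid.setdefault(tid, []).append(word)
        return {tid: sorted(words) for tid, words in by_tid.items()}
```

With `build_token_ids(["license", "licence", "copyright"])`, "license" and "licence" receive the same id, and `SketchIndex(...).tokens_by_tid` lists both words under that id. Computing the reverse mapping in a property keeps it out of the stored index, which matches the commit's note that it is used only for testing and debugging.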
1 parent e830934 commit 8d7b498

613 files changed, +5372 -4525 lines changed

src/licensedcode/data/rules/agpl-1.0-plus_3.RULE

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 ---
 license_expression: agpl-1.0-plus
 is_license_reference: yes
+is_deprecated: yes
 relevance: 100
 ---

src/licensedcode/data/rules/agpl-1.0-plus_4.RULE

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 ---
 license_expression: agpl-1.0-plus
 is_license_reference: yes
+is_deprecated: yes
 relevance: 100
 ---

src/licensedcode/data/rules/agpl-1.0-plus_43.RULE

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 ---
 license_expression: agpl-1.0-plus
 is_license_tag: yes
+is_deprecated: yes
 relevance: 100
 ---

src/licensedcode/data/rules/agpl-1.0-plus_44.RULE

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 ---
 license_expression: agpl-1.0-plus
 is_license_tag: yes
+is_deprecated: yes
 relevance: 100
 ---

src/licensedcode/data/rules/agpl-1.0_24.RULE

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 ---
 license_expression: agpl-1.0
 is_license_tag: yes
+is_deprecated: yes
 relevance: 100
 ---

src/licensedcode/data/rules/agpl-1.0_25.RULE

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 ---
 license_expression: agpl-1.0
 is_license_tag: yes
+is_deprecated: yes
 relevance: 100
 ---

src/licensedcode/data/rules/agpl-1.0_7.RULE

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 ---
 license_expression: agpl-1.0
 is_license_reference: yes
+is_deprecated: yes
 relevance: 95
 ---

src/licensedcode/data/rules/agpl-1.0_9.RULE

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 ---
 license_expression: agpl-1.0
 is_license_reference: yes
+is_deprecated: yes
 relevance: 100
 ---

src/licensedcode/data/rules/agpl-2.0_25.RULE

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 ---
 license_expression: agpl-2.0
 is_license_tag: yes
+is_deprecated: yes
 relevance: 100
 ---

src/licensedcode/data/rules/agpl-2.0_26.RULE

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 ---
 license_expression: agpl-2.0
 is_license_tag: yes
+is_deprecated: yes
 relevance: 100
 ---
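
The rule files above gain an `is_deprecated: yes` flag, and the commit message states that deprecated rules are still tested for detection unless their relevance is 0. A rough sketch of that selection logic follows. The helper names (`parse_frontmatter`, `validate_relevance`, `should_test_detection`) are hypothetical, the frontmatter parser handles only flat `key: value` lines, and the default of 100 for a missing relevance is an assumption, not something the commit confirms.

```python
def parse_frontmatter(text):
    """Parse the flat 'key: value' block between the two '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---'")
    data = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return data
        key, sep, value = line.partition(":")
        if sep:
            data[key.strip()] = value.strip()
    raise ValueError("missing closing '---'")


def validate_relevance(raw):
    """Return relevance as an int in 0..100; zero is a valid value."""
    if raw is None:
        # Assumption for this sketch: a missing relevance means 100.
        return 100
    value = int(raw)
    if not 0 <= value <= 100:
        raise ValueError(f"relevance out of range: {value}")
    return value


def should_test_detection(rule):
    """Skip only rules that are both deprecated and have zero relevance."""
    is_deprecated = rule.get("is_deprecated") == "yes"
    relevance = validate_relevance(rule.get("relevance"))
    return not (is_deprecated and relevance == 0)
```

Checking `raw is None` rather than the truthiness of `raw` keeps an explicit relevance of 0 distinct from a missing value, which is the zero-relevance case the commit message calls out.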
