Improve MIT detection #4269

@pombredanne

Following up from comments posted in @alok1304's #4121 (comment):

You can improve this further this way:

  1. Create tests by adding a test file and an expected file in https://github.com/aboutcode-org/scancode-toolkit/tree/develop/tests/licensedcode/data/datadriven/lic4 ... see all the examples of test file pairs there.

The test for #3860 and #3861 would be the same, using this text (like for https://github.com/aboutcode-org/scancode-toolkit/blob/develop/tests/licensedcode/data/datadriven/lic4/2675-sqlite.cpp ):

# Copyright: (c) 2020, Jordan Borean (@jborean93) <[email protected]>
# MIT License (see LICENSE or https://opensource.org/licenses/MIT)

And the expected YAML file (like for https://github.com/aboutcode-org/scancode-toolkit/blob/develop/tests/licensedcode/data/datadriven/lic4/2675-sqlite.cpp.yml ):

license_expressions:
  - mit
  2. Also add a few new rules with these related contents (this can be a separate PR):
---
license_expression: mit
is_license_notice: yes
relevance: 100
referenced_filenames:
    - LICENSE
ignorable_urls:
    - https://opensource.org/licenses/MIT
---

{{MIT License (see LICENSE or https://opensource.org/licenses/MIT) }}

And another:

---
license_expression: mit
is_license_notice: yes
relevance: 100
referenced_filenames:
    - LICENSE
---

{{MIT License (see LICENSE) }}
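
Since these rules differ only in the referenced filename and the optional URL, the variations could be generated rather than typed by hand. Here is a minimal sketch, assuming the rule layout shown in the two examples above (frontmatter between `---` lines, then the text wrapped in double braces). `render_rule` and `mit_notice_variants` are hypothetical helpers, not part of ScanCode, and the `LICENSE.txt` wording is only a guess at what such a variant might look like. Where the rule files go and how they are named is not covered here.

```python
from typing import List, Sequence


def render_rule(
    text: str,
    license_expression: str = "mit",
    referenced_filenames: Sequence[str] = (),
    ignorable_urls: Sequence[str] = (),
    relevance: int = 100,
) -> str:
    """Return rule content: a frontmatter block followed by the rule text.

    Hypothetical helper; it only mirrors the layout of the examples above.
    """
    lines = [
        "---",
        f"license_expression: {license_expression}",
        "is_license_notice: yes",
        f"relevance: {relevance}",
    ]
    # Optional list attributes are emitted only when they have entries.
    if referenced_filenames:
        lines.append("referenced_filenames:")
        lines.extend(f"    - {name}" for name in referenced_filenames)
    if ignorable_urls:
        lines.append("ignorable_urls:")
        lines.extend(f"    - {url}" for url in ignorable_urls)
    lines.append("---")
    lines.append("")
    # Wrap the text in double braces, as in the examples above.
    lines.append("{{" + text + " }}")
    return "\n".join(lines) + "\n"


def mit_notice_variants(
    filenames: Sequence[str] = ("LICENSE", "LICENSE.txt"),
) -> List[str]:
    """Build one rule with and one without the URL for each filename.

    The set of filenames and the exact wording are assumptions.
    """
    url = "https://opensource.org/licenses/MIT"
    rules = []
    for name in filenames:
        rules.append(
            render_rule(
                f"MIT License (see {name})",
                referenced_filenames=[name],
            )
        )
        rules.append(
            render_rule(
                f"MIT License (see {name} or {url})",
                referenced_filenames=[name],
                ignorable_urls=[url],
            )
        )
    return rules


if __name__ == "__main__":
    for rule in mit_notice_variants():
        print(rule)
```

Each returned string would then be saved as its own rule file and checked against the test file pair from step 1.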

And a few variations that can be seen in the wild if we do not detect these exactly:

And all the variations where there is a LICENSE.txt:

And a few rst:

And more variants:
