67 changes: 67 additions & 0 deletions free_packages_foss.md
@@ -0,0 +1,67 @@
### FOSS-Compliant and Public Domain Licenses
This section lists packages with standard, legally clear FOSS or public-domain licenses, along with a few packages whose license is unstated but which are treated as implicitly FOSS.

* **averaged_perceptron_tagger**: License: MIT License
* **averaged_perceptron_tagger_eng**: License: MIT License
* **averaged_perceptron_tagger_ru**: License: MIT License
* **averaged_perceptron_tagger_rus**: License: MIT License
* **bcp47**: License: IETF Trust and Unicode Inc.
* **biocreative_ppi**: License: Public Domain
* **comparative_sentences**: License: Creative Commons Attribution 4.0 International
* **crubadan**: License: GPLv3
* **english_wordnet**: License: Creative Commons Attribution 4.0 International
* **extended_omw**: License: CC BY-SA 3.0 and Unicode, Inc.
* **framenet_v17**: License: Creative Commons Attribution 3.0 Unported
* **genesis**: License: public domain
* **gutenberg**: License: public domain
* **inaugural**: License: public domain
* **machado**: License: Public Domain
* **masc_tagged**: License: "use for... commercial development" (FOSS-compatible).
* **mock_corpus**: License: Public Domain
* **movie_reviews**: License: Creative Commons Attribution 4.0 International
* **mwa_ppdb**: License: Creative Commons Attribution 3.0 Unported
* **nonbreaking_prefixes**: License: GNU LGPL
* **opinion_lexicon**: License: Creative Commons Attribution 4.0 International
* **panlex_swadesh**: License: CC0 1.0 Universal
* **pl196x**: License: GNU General Public License
* **porter_test**: License: Unstated (Implicitly FOSS).
* **pros_cons**: License: Creative Commons Attribution 4.0 International
* **product_reviews_1**: License: Creative Commons Attribution 4.0 International
* **product_reviews_2**: License: Creative Commons Attribution 4.0 International
* **sentence_polarity**: License: Creative Commons Attribution 4.0 International
* **sentiwordnet**: License: Creative Commons Attribution ShareAlike 3.0
* **shakespeare**: License: public domain
* **snowball_data**: License: Unstated (Implicitly FOSS).
* **state_union**: License: public domain
* **stopwords**: License: public domain
* **subjectivity**: License: Creative Commons Attribution 4.0 International
* **swadesh**: License: GNU Free Documentation License (FOSS-compatible).
* **tagsets**: License: Unstated (Implicitly FOSS).
* **tagsets_json**: License: Unstated (Implicitly FOSS).
* **udhr**: License: public domain
* **udhr2**: License: public domain
* **unicode_samples**: License: Unstated (Implicitly FOSS).
* **universal_tagset**: License: CC-BY-SA-4.0
* **vader_lexicon**: License: MIT License
* **webtext**: License: Unstated (Implicitly FOSS).
* **wmt15_eval**: License: Unstated (Implicitly FOSS).
* **word2vec_sample**: License: Unstated (Implicitly FOSS).
* **wordnet**: License: Princeton Open Source License
* **wordnet2021**: License: Creative Commons Attribution 4.0 International
* **wordnet2022**: License: Creative Commons Attribution 4.0 International
* **wordnet31**: License: Princeton Open Source License
* **wordnet_ic**: License: Unstated (Implicitly FOSS).
* **words**: License: public domain
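
Because every entry above follows the same `* **name**: License: text` shape, the list can be read mechanically, for example to find the entries whose license is unstated. The following is a minimal sketch under that assumption; `ENTRY_RE` and `parse_entries` are hypothetical names used for illustration and are not part of this change.

```python
import re

# Matches bullets of the form "* **name**: License: text".
ENTRY_RE = re.compile(
    r"^\*\s+\*\*(?P<name>[^*]+)\*\*:\s*License:\s*(?P<license>.+)$"
)

def parse_entries(markdown_text):
    """Return {package_name: license_text} for every matching bullet."""
    entries = {}
    for line in markdown_text.splitlines():
        match = ENTRY_RE.match(line.strip())
        if match:
            entries[match.group("name")] = match.group("license")
    return entries

# A three-line excerpt of the list above, used as sample input.
sample = """\
### FOSS-Compliant and Public Domain Licenses
* **crubadan**: License: GPLv3
* **vader_lexicon**: License: MIT License
* **porter_test**: License: Unstated (Implicitly FOSS).
"""

parsed = parse_entries(sample)
unstated = sorted(
    name for name, lic in parsed.items() if lic.startswith("Unstated")
)
```

Headings and blank lines are skipped because they do not match the bullet pattern, so the same function can be pointed at the whole file.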

### Rescued Packages (Assumed Free)
This section lists widely used packages that are clearly intended to be free, but whose license is unstated or ambiguous from a strict FOSS-compliance standpoint.

* **alpino**: License: "Distributed with permission." (FOSS-compatible, but non-standard).
* **cmudict**: License: "Unrestricted use." (FOSS-compatible, but non-standard).
* **omw**: License: "Please consult the LICENSE files... Note that all permit redistribution." (This is a known "license trap." The project intends the data to be free, but the summary defers to the individual LICENSE files, which prevents simple redistribution.)
* **omw-1.4**: License: "Please consult the LICENSE files... Note that all permit redistribution." (Same as above, a known "license trap" assumed to be free based on community trust).
* **punkt**: License: Unstated (This package's license is a known issue. While widely assumed to be free, a formal FOSS-compliant license is needed for clarity).
* **punkt_tab**: License: Unstated (Same as above, assumed free based on community trust).
* **rslp**: License: Unstated (Implicitly FOSS).
* **rte**: License: Unstated (Implicitly FOSS).
* **smultron**: License: Unstated (Likely FOSS-compatible, but needs a clear license).
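
One way to act on the split between the FOSS list and the rescued list is to treat them as two allow-lists and make the rescued one opt-in. The sketch below assumes that policy; `partition_requests`, `FREE`, and `RESCUED` are hypothetical names, and the sets hold only a small excerpt of the packages listed in this file.

```python
def partition_requests(requested, free, rescued, allow_rescued=False):
    """Split requested package names into (allowed, rejected).

    Packages on the free list are always allowed. Packages on the
    rescued list are allowed only when allow_rescued is True.
    Anything else is rejected.
    """
    allowed, rejected = [], []
    for name in requested:
        if name in free or (allow_rescued and name in rescued):
            allowed.append(name)
        else:
            rejected.append(name)
    return allowed, rejected

# Excerpts of the two lists in this file (illustrative, not complete).
FREE = {"stopwords", "wordnet", "vader_lexicon"}
RESCUED = {"punkt", "punkt_tab", "cmudict"}

requested = ["stopwords", "punkt", "brown"]
strict = partition_requests(requested, FREE, RESCUED)
lenient = partition_requests(requested, FREE, RESCUED, allow_rescued=True)
```

With the strict default only `stopwords` is allowed. With `allow_rescued=True`, `punkt` is allowed as well, and `brown` is rejected in both cases because it appears on neither list.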
56 changes: 56 additions & 0 deletions nonfree_packages_foss.md
@@ -0,0 +1,56 @@
### Fundamentally Non-Compliant Licenses
This section lists packages with licenses that explicitly forbid commercial use, restrict distribution, or are otherwise incompatible with FOSS principles.

* **brown**: License: LDC - non-commercial use only.
* **semcor**: License: LDC - non-commercial use only (annotations by Princeton).
* **brown_tei**: License: "May be used for non-commercial purposes."
* **chat80**: License: "only for academic purposes" and forbids commercial use.
* **conll2007**: License: Creative Commons Attribution-NonCommercial-NoDerivativeWorks.
* **dependency_treebank**: License: "for non-commercial use only."
* **floresta**: License: "Non-commercial use only."
* **framenet_v15**: License: "May be used for non-commercial purposes."
* **mte_teip5**: License: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 (Non-Commercial clause).
* **nps_chat**: License: "solely for non-commercial, non-profit educational and research use."
* **reuters**: License: "for research purposes only."
* **sinica_treebank**: License: Creative Commons Attribution-NonCommercial-ShareAlike.
* **timit**: License: Creative Commons Attribution, Non-Commercial, ShareAlike.
* **treebank**: License: "for non-commercial use only."
* **twitter_samples**: License: "Must be used subject to Twitter Developer Agreement."
* **universal_treebanks_v20**: License: Creative Commons Attribution-NonCommercial-ShareAlike.

### Restricted/Ambiguous Licenses
This section lists packages with licenses that are either non-standard, require special permission, or are too ambiguous to be considered FOSS-compliant for redistribution.

* **basque_grammars**: License: Unstated (Ambiguous).
* **bllip_wsj_no_aux**: License: Unstated (Ambiguous).
* **book_grammars**: License: Unstated (Ambiguous).
* **conll2000**: License: Unstated (Ambiguous).
* **conll2002**: License: Unstated (Ambiguous).
* **europarl_raw**: License: Unstated (Ambiguous).
* **gazetteers**: License: "GNU Free Documentation License; or public domain (depending on the file)" (Ambiguous).
* **ieer**: License: Unstated (Ambiguous).
* **indian**: License: "Distributed with permission" (Ambiguous).
* **jeita**: License: "re-distributable under the same license as the original" (Ambiguous).
* **kimmo**: License: Unstated (Ambiguous).
* **knbc**: License: "re-distributable under the same license as the original" (Ambiguous).
* **large_grammars**: License: "See the individual grammar files" (Ambiguous).
* **lin_thesaurus**: License: "Distributed with permission of Dekang Lin" (Restricted).
* **mac_morpho**: License: "Distributed with permission" (Restricted).
* **moses_sample**: License: Unstated (Ambiguous).
* **names**: License: "I retain the copyright... but are freely redistributable." (Ambiguous).
* **nombank.1.0**: License: "Distributed with permission" (Restricted).
* **paradigms**: License: "Distributed with the permission of the author" (Restricted).
* **pe08**: License: "Distributed with permission" (Restricted).
* **perluniprops**: License: Unstated (Ambiguous).
* **pil**: License: "Distributed with permission" (Restricted).
* **propbank**: License: "Distributed with permission" (Restricted).
* **ppattach**: License: "Distributed with the permission of the author." (Restricted).
* **qc**: License: Unstated (Ambiguous).
* **sample_grammars**: License: Unstated (Ambiguous).
* **senseval**: License: "Distributed with permission." (Restricted).
* **spanish_grammars**: License: Unstated (Ambiguous).
* **switchboard**: License: Open Content License (not FOSS).
* **toolbox**: License: Unstated (Ambiguous).
* **verbnet**: License: "Distributed with permission of the author." (Restricted).
* **verbnet3**: License: "Distributed with permission of the author." (Restricted).
* **ycoe**: License: Unstated (Ambiguous).
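
The groupings in this file follow recurring phrases in the license text: non-commercial or research-only wording, "Distributed with permission", and "Unstated". A rough keyword triage along those lines could look like the sketch below. It is a heuristic for a first pass only, not a legal determination; `triage` and `NONCOMMERCIAL_MARKERS` are hypothetical names, and the marker list is an assumption drawn from the wording of the entries above.

```python
# Phrases that signal a non-commercial or research-only restriction.
NONCOMMERCIAL_MARKERS = (
    "non-commercial",
    "noncommercial",
    "research purposes only",
    "academic purposes",
)

def triage(license_text):
    """Return a coarse bucket for a license string from this file."""
    text = license_text.lower()
    if any(marker in text for marker in NONCOMMERCIAL_MARKERS):
        return "non-compliant"
    if "distributed with" in text and "permission" in text:
        return "restricted"
    if text.startswith("unstated"):
        return "ambiguous"
    # Anything else needs a human to read the actual license.
    return "needs-review"

examples = {
    "brown": "LDC - non-commercial use only.",
    "sinica_treebank": "Creative Commons Attribution-NonCommercial-ShareAlike.",
    "lin_thesaurus": '"Distributed with permission of Dekang Lin" (Restricted).',
    "ycoe": "Unstated (Ambiguous).",
    "twitter_samples": '"Must be used subject to Twitter Developer Agreement."',
}
buckets = {name: triage(text) for name, text in examples.items()}
```

Entries such as **twitter_samples** or **switchboard** fall through to `needs-review`, which matches the fact that their restrictions are not expressed with any of the common phrases.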