Commit 449da5c

Make a pass over setup, docs, and top-level things (#9)
* README.rst: move customization from main.py here
* Makefile: LoadModule name has changed
* __init__.py: list imports, go over docstring which forms chapter information
* main.py: numerous small doc changes; fix some type errors
* setup.py: spacy, langid, and pyenchant are no longer optional. Note that pyenchant rather than enchant is now used.
1 parent ca6dfbf commit 449da5c

6 files changed: +224 -152 lines changed

.github/workflows/osx.yml

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@ jobs:
         python-version: ${{ matrix.python-version }}
     - name: Install dependencies
       run: |
-        brew install llvm@11
+        brew install llvm@11 enchant
         python -m pip install --upgrade pip
         LLVM_CONFIG=/usr/local/Cellar/llvm@11/11.1.0/bin/llvm-config pip install llvmlite
         brew install mariadb

Makefile

Lines changed: 1 addition & 1 deletion
@@ -72,7 +72,7 @@ pytest:


 doctest:
-	MATHICS_CHARACTER_ENCODING="ASCII" $(PYTHON) -m mathics.docpipeline -l pymathics.natlang -c "Pymathics Natlang" $o
+	MATHICS_CHARACTER_ENCODING="ASCII" $(PYTHON) -m mathics.docpipeline -l pymathics.natlang -c "Natural Language Processing" $o


 # #: Make Mathics PDF manual

README.rst

Lines changed: 28 additions & 0 deletions
@@ -64,6 +64,34 @@ You might be able to fix this running:

 Adjust "python" and "en" (the language you want) above as needed.

+
+User customization
+------------------
+
+For nltk, use the environment variable ``NLTK_DATA`` to specify a
+custom data path (instead of $HOME/.nltk). For spacy, set
+'MATHICS3_SPACY_DATA', a Mathics3-specific variable.
+
+In order to use the Extended Open Multilingual Wordnet (OMW) with 'NLTK'
+and use even more languages, you need to install them manually.
+
+Go to http://compling.hss.ntu.edu.sg/omw/summx.html, download the data, and then create a new folder under
+``$HOME/nltk_data/corpora/omw/your_language`` where you put the file from
+wiki/wn-wikt-your_language.tab, and rename it to
+wn-data-your_language.tab.
+
+Adding more languages to Open Multilingual Wordnet:
+
+In order to use the Extended Open Multilingual Wordnet with NLTK and
+use even more languages, you need to install them manually. Go to
+http://compling.hss.ntu.edu.sg/omw/summx.html, download the data, and
+then create a new folder under
+$HOME/nltk_data/corpora/omw/your_language where you put the file from
+wiki/wn-wikt-your_language.tab, and rename it to
+wn-data-your_language.tab.
+
+
+
 .. |Latest Version| image:: https://badge.fury.io/py/pymathics-natlang.svg
    :target: https://badge.fury.io/py/pymathics-natlang
 .. |Pypi Installs| image:: https://pepy.tech/badge/pymathics-natlang
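
Note on the README addition: the manual OMW installation it describes (create ``corpora/omw/your_language`` under the NLTK data folder, then place ``wiki/wn-wikt-your_language.tab`` there under the name ``wn-data-your_language.tab``) could be scripted roughly as follows. This is only a sketch and not part of the commit. The function name ``install_omw_language`` and its parameters are hypothetical, and it assumes the data has already been downloaded and unpacked into a local directory that contains the ``wiki`` subfolder.

```python
import shutil
from pathlib import Path


def install_omw_language(extracted_dir, language, nltk_data=None):
    """Copy wiki/wn-wikt-<language>.tab into corpora/omw/<language>/.

    Hypothetical helper: `extracted_dir` is assumed to be the unpacked
    download, and `nltk_data` defaults to $HOME/nltk_data as in the README.
    """
    root = Path(nltk_data) if nltk_data is not None else Path.home() / "nltk_data"
    source = Path(extracted_dir) / "wiki" / f"wn-wikt-{language}.tab"
    if not source.is_file():
        raise FileNotFoundError(f"expected OMW file not found: {source}")

    # Create the per-language folder, then copy the file under its new name.
    target_dir = root / "corpora" / "omw" / language
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"wn-data-{language}.tab"
    shutil.copyfile(source, target)
    return target
```

Copying rather than moving leaves the downloaded data untouched, so the step can be repeated for another language or another data folder.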

pymathics/natlang/__init__.py

Lines changed: 86 additions & 8 deletions
@@ -1,18 +1,96 @@
-"""Pymathics Natlang
+"""
+Natural Language Processing
+
+Mathics3 Module module provides functions and variables to work with \
+expressions in natural language, using the libraries:
+
+<ul>
+<li><url>:spacy:
+https://spacy.io/</url> for parsing natural languages</url>
+<li><url>
+:nltk:
+https://www.nltk.org/</url> for functions using WordNet-related builtins
+<li><url>
+:pyenchant:
+https://pyenchant.github.io/pyenchant/</url> and <url>
+:pycountry:
+https://pypi.org/project/pycountry/</url> for language identification
+</ul>
+
+Examples:
+
+>> LoadModule["pymathics.natlang"]
+= pymathics.natlang
+
+>> Pluralize["try"]
+= tries

-This module provides Mathics functions and variables to work with \
-expressions in natural language, using the libraries 'nltk' and \
-'spacy'.
+>> LanguageIdentify["eins zwei drei"]
+= German
+
+>> WordFrequency["Apple Tree and apple", "apple", IgnoreCase -> True]
+= 0.5
+
+>> TextCases["I was in London last year.", "Pronoun"]
+= {I}
+
+>> DeleteStopwords["There was an Old Man of Apulia, whose conduct was very peculiar"]
+= Old Man Apulia, conduct peculiar
 """


-from pymathics.natlang.main import *
+from pymathics.natlang.main import (
+    DeleteStopwords,
+    DictionaryLookup,
+    DictionaryWordQ,
+    LanguageIdentify,
+    Pluralize,
+    RandomWord,
+    SpellingCorrectionList,
+    TextCases,
+    TextPosition,
+    TextSentences,
+    TextStructure,
+    TextWords,
+    WordCount,
+    WordData,
+    WordDefinition,
+    WordFrequency,
+    WordFrequencyData,
+    WordList,
+    WordSimilarity,
+    WordStem,
+)
 from pymathics.natlang.version import __version__

-
 pymathics_version_data = {
-    "author": "The Mathics Team",
+    "author": "The Mathics3 Team",
     "version": __version__,
     "name": "Natlang",
-    "requires": ["nltk", "spacy"],
+    "requires": ["langid", "pyenchant", "nltk", "spacy"],
 }
+
+__all__ = [
+    "DeleteStopwords",
+    "DictionaryLookup",
+    "DictionaryWordQ",
+    "LanguageIdentify",
+    "Pluralize",
+    "RandomWord",
+    "SpellingCorrectionList",
+    "TextCases",
+    "TextPosition",
+    "TextSentences",
+    "TextStructure",
+    "TextWords",
+    "WordCount",
+    "WordData",
+    "WordDefinition",
+    "WordFrequency",
+    "WordFrequencyData",
+    "WordList",
+    "WordSimilarity",
+    "WordStem",
+    "__version__",
+    "pymathics_version_data",
+]
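
Note on the ``__init__.py`` change: replacing ``from pymathics.natlang.main import *`` with explicit imports plus an ``__all__`` list makes the package's public names explicit. The general Python behavior can be illustrated with a toy module; this is a sketch, not code from the commit. The module name ``demo_natlang`` and its function bodies are made up for the demonstration and do not reflect the real implementations.

```python
import sys
import types

# Build a throwaway module in memory (hypothetical stand-in for a package).
demo = types.ModuleType("demo_natlang")
exec(
    "def Pluralize(word):\n"
    "    return word + 's'\n"
    "def WordStem(word):\n"
    "    return word\n"
    "__all__ = ['Pluralize']\n",
    demo.__dict__,
)
sys.modules["demo_natlang"] = demo

# A star import copies only the names listed in __all__.
namespace = {}
exec("from demo_natlang import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)  # ['Pluralize']
```

Here ``WordStem`` is defined but not listed in ``__all__``, so the star import does not bring it in. Listing every builtin in ``__all__``, as the commit does, keeps star imports and explicit imports in agreement.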
