Commit 3de94e6

updated docs
1 parent d30c729 commit 3de94e6

5 files changed

+131
-22
lines changed

docs/api/index.rst

Lines changed: 1 addition & 0 deletions
@@ -9,4 +9,5 @@ API Reference
     niaarm
     rule
     rule_list
+    text
     visualize

docs/api/text.rst

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+Text
+====
+
+.. automodule:: niaarm.text
+    :members:
+    :show-inheritance:

docs/getting_started.rst

Lines changed: 62 additions & 0 deletions
@@ -217,6 +217,68 @@ presented in `this paper <https://link.springer.com/chapter/10.1007/978-3-030-62
 
 .. image:: _static/hill_slopes.png
 
+Text Mining (Experimental)
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+An experimental implementation of association rule text mining using nature-inspired algorithms
+is also provided. The :mod:`niaarm.text` module contains the :class:`~niaarm.text.Corpus` and :class:`~niaarm.text.Document` classes for loading and preprocessing corpora,
+a :class:`~niaarm.text.TextRule` class, representing a text rule, and the :class:`~niaarm.text.NiaARTM` class, implementing association rule text mining
+as a continuous optimization problem. The :func:`~niaarm.mine.get_text_rules` function, equivalent to :func:`~niaarm.mine.get_rules`, but for text mining, was also
+added to the :mod:`niaarm.mine` module.
+
+.. code:: python
+
+    import pandas as pd
+    from niaarm.text import Corpus
+    from niaarm.mine import get_text_rules
+    from niapy.algorithms.basic import ParticleSwarmOptimization
+
+    df = pd.read_json('datasets/text/artm_test_dataset.json', orient='records')
+    documents = df['text'].tolist()
+    corpus = Corpus.from_list(documents)
+
+    algorithm = ParticleSwarmOptimization(population_size=200, seed=123)
+    metrics = ('support', 'confidence', 'aws')
+    rules, time = get_text_rules(corpus, max_terms=5, algorithm=algorithm, metrics=metrics, max_evals=10000, logging=True)
+
+    if len(rules):
+        print(rules)
+        print(f'Run time: {time:.2f}s')
+        rules.to_csv('output.csv')
+    else:
+        print('No rules generated')
+        print(f'Run time: {time:.2f}s')
+
+**Output:**
+
+.. code:: text
+
+    Fitness: 0.53345778328699, Support: 0.1111111111111111, Confidence: 1.0, Aws: 0.48926223874985886
+    Fitness: 0.7155830770302328, Support: 0.1111111111111111, Confidence: 1.0, Aws: 1.0356381199795872
+    Fitness: 0.7279963436805833, Support: 0.1111111111111111, Confidence: 1.0, Aws: 1.072877919930639
+    Fitness: 0.7875917299029188, Support: 0.1111111111111111, Confidence: 1.0, Aws: 1.251664078597645
+    Fitness: 0.8071206688346807, Support: 0.1111111111111111, Confidence: 1.0, Aws: 1.310250895392931
+    STATS:
+    Total rules: 52
+    Average fitness: 0.5179965084882088
+    Average support: 0.11538461538461527
+    Average confidence: 0.7115384615384616
+    Average lift: 5.524038461538462
+    Average coverage: 0.17948717948717943
+    Average consequent support: 0.1517094017094015
+    Average conviction: 1568561408678185.8
+    Average amplitude: nan
+    Average inclusion: 0.007735042735042727
+    Average interestingness: 0.6170069642291859
+    Average comprehensibility: 0.6763685578758655
+    Average netconf: 0.6675824175824177
+    Average Yule's Q: 0.9670329670329672
+    Average antecedent length: 1.6346153846153846
+    Average consequent length: 1.8461538461538463
+
+    Run time: 13.37s
+    Rules exported to output.csv
+
 Interest Measures
 -----------------
 
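
Note on the Support and Confidence values logged in the text-mining output above: they can be read against the classical association-rule definitions. The following is only a minimal sketch of those textbook definitions, assuming each document is reduced to a plain set of terms. The helper names `support` and `confidence` and the toy corpus are hypothetical and are not part of `niaarm`; the library's own computation for text rules may differ.

```python
from typing import Iterable, Sequence, Set


def support(documents: Sequence[Set[str]], terms: Iterable[str]) -> float:
    """Fraction of documents that contain every term in `terms`."""
    wanted = set(terms)
    if not documents:
        return 0.0
    hits = sum(1 for doc in documents if wanted <= doc)
    return hits / len(documents)


def confidence(
    documents: Sequence[Set[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """support(antecedent and consequent) divided by support(antecedent)."""
    antecedent = set(antecedent)
    antecedent_support = support(documents, antecedent)
    if antecedent_support == 0.0:
        return 0.0  # rule never fires; treat confidence as 0 in this sketch
    return support(documents, antecedent | set(consequent)) / antecedent_support


# Hypothetical toy corpus (not the dataset used in the commit).
docs = [
    {"swim", "bike", "run"},
    {"swim", "run"},
    {"bike", "race"},
]

print(support(docs, {"swim", "run"}))       # 2 of the 3 documents contain both terms
print(confidence(docs, {"swim"}, {"run"}))  # every "swim" document also contains "run"
```

Under these definitions a rule can have low support and still reach a confidence of 1.0, which is the pattern visible in the logged lines: the rule matches few documents, but whenever the antecedent matches, the consequent matches too.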

docs/index.rst

Lines changed: 2 additions & 1 deletion
@@ -24,7 +24,8 @@ The current version includes (but is not limited to) the following functions:
 - searching for association rules,
 - providing output of mined association rules,
 - generating statistics about mined association rules,
-- visualization of association rules.
+- visualization of association rules,
+- association rule text mining (experimental).
 
 Documentation
 =============

docs/refs.bib

Lines changed: 60 additions & 21 deletions
@@ -1,27 +1,66 @@
-@inproceedings{fister2018differential,
-  title={Differential evolution for association rule mining using categorical and numerical attributes},
-  author={Fister Jr., Iztok and Iglesias, Andres and Galvez, Akemi and Ser, Javier Del and Osaba, Eneko and Fister, Iztok},
-  booktitle={International conference on intelligent data engineering and automated learning},
-  pages={79--88},
-  year={2018},
-  organization={Springer}
+@inproceedings{fister_differential_2018,
+  address = {Cham},
+  title = {Differential {Evolution} for {Association} {Rule} {Mining} {Using} {Categorical} and {Numerical} {Attributes}},
+  isbn = {9783030034931},
+  doi = {10.1007/978-3-030-03493-1_9},
+  language = {en},
+  booktitle = {Intelligent {Data} {Engineering} and {Automated} {Learning} – {IDEAL} 2018},
+  publisher = {Springer International Publishing},
+  author = {Fister, Iztok and Iglesias, Andres and Galvez, Akemi and Del Ser, Javier and Osaba, Eneko and Fister, Iztok},
+  editor = {Yin, Hujun and Camacho, David and Novais, Paulo and Tallón-Ballesteros, Antonio J.},
+  year = {2018},
+  pages = {79--88},
 }
 
-@inproceedings{fister2020improved,
-  title={Improved nature-inspired algorithms for numeric association rule mining},
-  author={Fister Jr, Iztok and Podgorelec, Vili and Fister, Iztok},
-  booktitle={International Conference on Intelligent Computing \& Optimization},
-  pages={187--195},
-  year={2020},
-  organization={Springer}
+@inproceedings{fister_jr_improved_2021,
+  address = {Cham},
+  title = {Improved {Nature}-{Inspired} {Algorithms} for {Numeric} {Association} {Rule} {Mining}},
+  isbn = {9783030681548},
+  doi = {10.1007/978-3-030-68154-8_19},
+  language = {en},
+  booktitle = {Intelligent {Computing} and {Optimization}},
+  publisher = {Springer International Publishing},
+  author = {Fister Jr., Iztok and Podgorelec, Vili and Fister, Iztok},
+  editor = {Vasant, Pandian and Zelinka, Ivan and Weber, Gerhard-Wilhelm},
+  year = {2021},
+  pages = {187--195},
 }
 
+@article{fister_jr_brief_2020,
+  title = {A brief overview of swarm intelligence-based algorithms for numerical association rule mining},
+  doi = {10.48550/ARXIV.2010.15524},
+  abstract = {Numerical Association Rule Mining is a popular variant of Association Rule Mining, where numerical attributes are handled without discretization. This means that the algorithms for dealing with this problem can operate directly, not only with categorical, but also with numerical attributes. Until recently, a big portion of these algorithms were based on a stochastic nature-inspired population-based paradigm. As a result, evolutionary and swarm intelligence-based algorithms showed big efficiency for dealing with the problem. In line with this, the main mission of this chapter is to make a historical overview of swarm intelligence-based algorithms for Numerical Association Rule Mining, as well as to present the main features of these algorithms for the observed problem. A taxonomy of the algorithms was proposed on the basis of the applied features found in this overview. Challenges, waiting in the future, finish this paper.},
+  journal = {arXiv:2010.15524 [cs]},
+  author = {Fister Jr. , Iztok and Fister, Iztok},
+  month = oct,
+  year = {2020},
+}
+
+@inproceedings{fister_population-based_2020,
+  address = {New York, NY, USA},
+  series = {{ISMSI} '20},
+  title = {Population-based metaheuristics for {Association} {Rule} {Text} {Mining}},
+  isbn = {9781450377614},
+  doi = {10.1145/3396474.3396493},
+  booktitle = {Proceedings of the 2020 4th {International} {Conference} on {Intelligent} {Systems}, {Metaheuristics} \& {Swarm} {Intelligence}},
+  publisher = {Association for Computing Machinery},
+  author = {Fister, Iztok and Deb, Suash and Fister, Iztok},
+  month = mar,
+  year = {2020},
+  keywords = {association rule text mining, particle swarm optimization, triathlon, natural language processing, optimization},
+  pages = {19--23},
+}
 
-@article{fister2021brief,
-  title={A Brief Overview of Swarm Intelligence-Based Algorithms for Numerical Association Rule Mining},
-  author={Fister Jr, Iztok and Fister, Iztok},
-  journal={Applied Optimization and Swarm Intelligence},
-  pages={47--59},
-  year={2021},
-  publisher={Springer}
+@inproceedings{fister_visualization_2020,
+  address = {Cham},
+  title = {Visualization of {Numerical} {Association} {Rules} by {Hill} {Slopes}},
+  isbn = {9783030623623},
+  doi = {10.1007/978-3-030-62362-3_10},
+  language = {en},
+  booktitle = {Intelligent {Data} {Engineering} and {Automated} {Learning} – {IDEAL} 2020},
+  publisher = {Springer International Publishing},
+  author = {Fister, Iztok and Fister, Dušan and Iglesias, Andres and Galvez, Akemi and Osaba, Eneko and Del Ser, Javier and Fister, Iztok},
+  editor = {Analide, Cesar and Novais, Paulo and Camacho, David and Yin, Hujun},
+  year = {2020},
+  pages = {101--111},
 }
