Commit f96c58c

update README.md
1 parent 96264e0 commit f96c58c

1 file changed, 37 additions and 1 deletion

README.md

Lines changed: 37 additions & 1 deletion
@@ -28,7 +28,8 @@ The current version includes (but is not limited to) the following functions:
 - searching for association rules,
 - providing output of mined association rules,
 - generating statistics about mined association rules,
-- visualization of association rules.
+- visualization of association rules,
+- association rule text mining (experimental).
 
 ## Installation
 
@@ -159,6 +160,37 @@ plt.show()
 </p>
 
 
+### Text Mining (Experimental)
+
+An experimental implementation of association rule text mining using nature-inspired algorithms, based on ideas from [5]
+is also provided. The `niaarm.text` module contains the `Corpus` and `Document` classes for loading and preprocessing corpora,
+a `TextRule` class, representing a text rule, and the `NiaARTM` class, implementing association rule text mining
+as a continuous optimization problem. The `get_text_rules` function, equivalent to `get_rules`, but for text mining, was also
+added to the `niaarm.mine` module.
+
+```python
+import pandas as pd
+from niaarm.text import Corpus
+from niaarm.mine import get_text_rules
+from niapy.algorithms.basic import ParticleSwarmOptimization
+
+df = pd.read_json('datasets/text/artm_test_dataset.json', orient='records')
+documents = df['text'].tolist()
+corpus = Corpus.from_list(documents)
+
+algorithm = ParticleSwarmOptimization(population_size=200, seed=123)
+metrics = ('support', 'confidence', 'aws')
+rules, time = get_text_rules(corpus, max_terms=5, algorithm=algorithm, metrics=metrics, max_evals=10000, logging=True)
+
+if len(rules):
+    print(rules)
+    print(f'Run time: {time:.2f}s')
+    rules.to_csv('output.csv')
+else:
+    print('No rules generated')
+    print(f'Run time: {time:.2f}s')
+```
+
 For a full list of examples see the [examples folder](https://github.com/firefly-cpp/NiaARM/tree/main/examples)
 in the GitHub repository.

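As a rough illustration of what the `support` and `confidence` metrics requested in the added example measure, here is a minimal pure-Python sketch. It is not part of this commit and is not NiaARM's implementation: it assumes the plain document-frequency definitions (a rule is "supported" by a document that contains all of its terms), which may differ from how `niaarm.text` weights terms. The helper names `tokenize`, `support`, and `confidence` and the toy documents are hypothetical. The third metric, `aws`, is left out because its definition is not given here.

```python
def tokenize(text):
    """Lowercase a document and split it on whitespace into a set of terms."""
    return set(text.lower().split())


def support(docs, terms):
    """Fraction of documents that contain every term in `terms`."""
    terms = set(terms)
    if not docs:
        return 0.0
    return sum(terms <= doc for doc in docs) / len(docs)


def confidence(docs, antecedent, consequent):
    """Support of antecedent and consequent together, divided by support of the antecedent."""
    antecedent_support = support(docs, antecedent)
    if antecedent_support == 0:
        return 0.0
    return support(docs, set(antecedent) | set(consequent)) / antecedent_support


# Hypothetical toy corpus; real input would come from a dataset file.
documents = [
    "swarm intelligence optimization",
    "swarm optimization rules",
    "association rules mining",
    "swarm intelligence rules",
]
docs = [tokenize(d) for d in documents]

# Rule under test: {swarm} => {intelligence}
rule_support = support(docs, {"swarm", "intelligence"})
rule_confidence = confidence(docs, {"swarm"}, {"intelligence"})

print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

Under these assumed definitions, a rule whose terms co-occur in many documents scores a high support, while confidence measures how often the consequent terms appear given that the antecedent terms do.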

@@ -218,6 +250,10 @@ Ideas are based on the following research papers:
 In: Analide, C., Novais, P., Camacho, D., Yin, H. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2020.
 IDEAL 2020. Lecture Notes in Computer Science(), vol 12489. Springer, Cham. https://doi.org/10.1007/978-3-030-62362-3_10
 
+[5] I. Fister, S. Deb, I. Fister, „Population-based metaheuristics for Association Rule Text Mining“,
+In: Proceedings of the 2020 4th International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence,
+New York, NY, USA, mar. 2020, pp. 19–23. doi: 10.1145/3396474.3396493.
+
 ## License
 
 This package is distributed under the MIT License. This license can be found online at <http://www.opensource.org/licenses/MIT>.
