@@ -28,7 +28,8 @@ The current version includes (but is not limited to) the following functions:
 - searching for association rules,
 - providing output of mined association rules,
 - generating statistics about mined association rules,
-- visualization of association rules.
+- visualization of association rules,
+- association rule text mining (experimental).

 ## Installation

@@ -159,6 +160,37 @@ plt.show()
 </p>


+### Text Mining (Experimental)
+
+An experimental implementation of association rule text mining using nature-inspired algorithms, based on ideas from [5]
+is also provided. The `niaarm.text` module contains the `Corpus` and `Document` classes for loading and preprocessing corpora,
+a `TextRule` class, representing a text rule, and the `NiaARTM` class, implementing association rule text mining
+as a continuous optimization problem. The `get_text_rules` function, equivalent to `get_rules`, but for text mining, was also
+added to the `niaarm.mine` module.
+
+```python
+import pandas as pd
+from niaarm.text import Corpus
+from niaarm.mine import get_text_rules
+from niapy.algorithms.basic import ParticleSwarmOptimization
+
+df = pd.read_json('datasets/text/artm_test_dataset.json', orient='records')
+documents = df['text'].tolist()
+corpus = Corpus.from_list(documents)
+
+algorithm = ParticleSwarmOptimization(population_size=200, seed=123)
+metrics = ('support', 'confidence', 'aws')
+rules, time = get_text_rules(corpus, max_terms=5, algorithm=algorithm, metrics=metrics, max_evals=10000, logging=True)
+
+if len(rules):
+    print(rules)
+    print(f'Run time: {time:.2f}s')
+    rules.to_csv('output.csv')
+else:
+    print('No rules generated')
+    print(f'Run time: {time:.2f}s')
+```
+
 For a full list of examples see the [examples folder](https://github.com/firefly-cpp/NiaARM/tree/main/examples)
 in the GitHub repository.

@@ -218,6 +250,10 @@ Ideas are based on the following research papers:
 In: Analide, C., Novais, P., Camacho, D., Yin, H. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2020.
 IDEAL 2020. Lecture Notes in Computer Science(), vol 12489. Springer, Cham. https://doi.org/10.1007/978-3-030-62362-3_10

+[5] I. Fister, S. Deb, I. Fister, „Population-based metaheuristics for Association Rule Text Mining“,
+In: Proceedings of the 2020 4th International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence,
+New York, NY, USA, mar. 2020, pp. 19–23. doi: 10.1145/3396474.3396493.
+
 ## License

 This package is distributed under the MIT License. This license can be found online at <http://www.opensource.org/licenses/MIT>.
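
The text mining example added in this diff passes `'support'` and `'confidence'` in its `metrics` tuple. As a reading aid, the sketch below computes the textbook versions of those two measures for a term-based rule over a tiny, made-up corpus. It is an illustration only and makes no claim about how NiaARM computes them internally (the library may, for instance, weight terms); the helper names `tokenize`, `support`, and `confidence` are hypothetical and not part of the NiaARM API, and the `aws` metric is not covered.

```python
from typing import Iterable, List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase a document and split it on whitespace into a set of terms."""
    return set(text.lower().split())


def support(terms: Iterable[str], documents: List[Set[str]]) -> float:
    """Fraction of documents that contain every term in `terms`."""
    required = set(terms)
    if not documents:
        return 0.0
    hits = sum(1 for doc in documents if required <= doc)
    return hits / len(documents)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               documents: List[Set[str]]) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    antecedent_support = support(antecedent, documents)
    if antecedent_support == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(both, documents) / antecedent_support


# Made-up toy corpus, used only for this illustration.
docs = [tokenize(text) for text in [
    "swarm intelligence for rule mining",
    "rule mining with swarm algorithms",
    "text mining of association rules",
    "swarm robotics",
]]

# Rule {swarm} => {mining}
rule_support = support({"swarm", "mining"}, docs)
rule_confidence = confidence({"swarm"}, {"mining"}, docs)
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

Here two of the four documents contain both terms, and three contain `swarm`, so the rule's confidence is the ratio of those two counts.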