Commit 671bfd1

Merge pull request #43 from zStupan/main
Updated RuleList
2 parents 881e573 + 3b0dc3d commit 671bfd1

4 files changed: +111 -148 lines changed
README.md

Lines changed: 52 additions & 29 deletions
@@ -15,7 +15,7 @@
 [![Average time to resolve an issue](http://isitmaintained.com/badge/resolution/firefly-cpp/niaarm.svg)](http://isitmaintained.com/project/firefly-cpp/niaarm "Average time to resolve an issue")

 ## General outline of the framework
-NiaARM is a framework for Association Rule Mining based on nature-inspired algorithms for optimization. The framework is written fully in Python and runs on all platforms. NiaARM allows users to preprocess the data in a transaction database automatically, to search for association rules and provide a pretty output of the rules found. This framework also supports numerical and real-valued types of attributes besides the categorical ones. Mining the association rules is defined as an optimization problem, and solved using the nature-inspired algorithms that come from the related framework called [NiaPy](https://github.com/NiaOrg/NiaPy).
+NiaARM is a framework for Association Rule Mining based on nature-inspired algorithms for optimization. The framework is written fully in Python and runs on all platforms. NiaARM allows users to preprocess the data in a transaction database automatically, to search for association rules and provide a pretty output of the rules found. This framework also supports integral and real-valued types of attributes besides the categorical ones. Mining the association rules is defined as an optimization problem, and solved using the nature-inspired algorithms that come from the related framework called [NiaPy](https://github.com/NiaOrg/NiaPy).

 ## Detailed insights
 The current version includes (but is not limited to) the following functions:
@@ -44,19 +44,63 @@ $ apk add py3-niaarm

 ## Usage

-### Basic example
+### Loading data

-In this example we'll use Differential Evolution to mine association rules on the Abalone Dataset.
+In NiaARM, data loading is done via the `Dataset` class. There are two options for loading data:
+
+#### Option 1: From a pandas DataFrame (recommended)
+
+```python
+import pandas as pd
+from niaarm import Dataset
+
+
+df = pd.read_csv('datasets/Abalone.csv')
+# preprocess data...
+data = Dataset(df)
+print(data) # printing the dataset will generate a feature report
+```
+
+#### Option 2: From CSV file directly
+
+```python
+from niaarm import Dataset
+
+
+data = Dataset('datasets/Abalone.csv')
+print(data)
+```
+
+### Mining association rules the easy way (recommended)
+
+Association rule mining can be easily performed using the `get_rules` function:
+
+```python
+
+from niaarm import get_rules
+from niapy.algorithms.basic import DifferentialEvolution
+
+algo = DifferentialEvolution(population_size=50, differential_weight=0.5, crossover_probability=0.9)
+metrics = ('support', 'confidence')
+
+rules, run_time = get_rules(data, algo, metrics, max_iters=30, logging=True)
+
+print(rules) # Prints basic stats about the mined rules
+print(f'Run Time: {run_time}')
+rules.to_csv('output.csv')
+```
+
+### Mining association rules the hard way
+
+The above example can be also be implemented using a more low level interface,
+with the `NiaARM` class directly:

 ```python
 from niaarm import NiaARM, Dataset
 from niapy.algorithms.basic import DifferentialEvolution
 from niapy.task import Task, OptimizationType


-# load and preprocess the dataset from csv
-data = Dataset("datasets/Abalone.csv")
-
 # Create a problem:::
 # dimension represents the dimension of the problem;
 # features represent the list of features, while transactions depicts the list of transactions
@@ -82,29 +126,8 @@ problem.rules.sort()
 problem.rules.to_csv('output.csv')
 ```

-#### Simplified
-
-The above example can be further simplified with the use of ``niaarm.mine.get_rules()``:
-
-```python
-
-from niaarm import Dataset, get_rules
-from niapy.algorithms.basic import DifferentialEvolution
-
-
-data = Dataset("datasets/Abalone.csv")
-algo = DifferentialEvolution(population_size=50, differential_weight=0.5, crossover_probability=0.9)
-metrics = ('support', 'confidence')
-
-rules, run_time = get_rules(data, algo, metrics, max_iters=30, logging=True)
-
-print(rules)
-print(f'Run Time: {run_time}')
-rules.to_csv('output.csv')
-
-```
-
-For a full list of examples see the [examples folder](examples/).
+For a full list of examples see the [examples folder](https://github.com/firefly-cpp/NiaARM/tree/main/examples)
+in the GitHub repository.

 ### Command line interface

examples/stats.py

Lines changed: 0 additions & 21 deletions
This file was deleted.

examples/working_with_rule_list.py

Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,30 @@
1+
"""
2+
Example usage of the RuleList class. The RuleList class is a wrapper around a python list, with some added features, mainly
3+
getting statistical data of rule metrics and sorting by metric.
4+
"""
5+
6+
7+
from niaarm import NiaARM, Dataset
8+
from niapy.algorithms.basic import DifferentialEvolution
9+
from niapy.task import Task, OptimizationType
10+
11+
12+
if __name__ == '__main__':
13+
# Load the dataset and run the algorithm
14+
data = Dataset("datasets/Abalone.csv")
15+
problem = NiaARM(data.dimension, data.features, data.transactions, metrics=('support', 'confidence'))
16+
task = Task(problem=problem, max_iters=30, optimization_type=OptimizationType.MAXIMIZATION)
17+
algo = DifferentialEvolution(population_size=50, differential_weight=0.5, crossover_probability=0.9)
18+
algo.run(task=task)
19+
20+
# print the RuleList to get basic data about the mined rules.
21+
print(problem.rules)
22+
23+
# RuleList also provides methods for getting the min, max, mean and std. dev. of metrics:
24+
print('Min support', problem.rules.min('support'))
25+
print('Max support', problem.rules.max('support'))
26+
print('Mean support', problem.rules.mean('support'))
27+
print('Std support', problem.rules.std('support'))
28+
29+
# you can also use RuleList.get to get all values of a metric as a numpy array:
30+
print(problem.rules.get('support'))
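
The `RuleList` calls in this example all look a metric up by name on each rule. A minimal, standard-library-only sketch of that pattern follows, so the behaviour can be tried without running the optimizer. `FakeRule` and `MiniRuleList` are hypothetical stand-ins, not niaarm classes, and the population standard deviation used in `std` is an assumption about the convention rather than something this commit shows.

```python
from collections import UserList
from dataclasses import dataclass
from statistics import pstdev


@dataclass
class FakeRule:
    """Hypothetical stand-in for a mined rule; not part of niaarm."""
    fitness: float
    support: float
    confidence: float


class MiniRuleList(UserList):
    """Illustrative list that resolves metrics by attribute name."""

    def get(self, metric):
        # The RuleList.get in this commit returns a numpy array;
        # a plain list keeps this sketch dependency-free.
        return [getattr(rule, metric) for rule in self.data]

    def mean(self, metric):
        values = self.get(metric)
        return sum(values) / len(values)

    def std(self, metric):
        # Population standard deviation (ddof=0); assumed convention.
        return pstdev(self.get(metric))

    def sort(self, by="fitness", reverse=True):
        # Descending by default, mirroring the signature shown in the diff.
        self.data.sort(key=lambda rule: getattr(rule, by), reverse=reverse)


rules = MiniRuleList([
    FakeRule(fitness=0.5, support=0.25, confidence=0.75),
    FakeRule(fitness=0.75, support=0.5, confidence=1.0),
    FakeRule(fitness=0.25, support=0.75, confidence=0.5),
])
rules.sort(by="support")
print(rules.get("support"))
print(rules.mean("support"))
print(rules.std("support"))
```

Because every statistic goes through `getattr`, adding a new metric to the rule object needs no change in the list class, which appears to be the motivation for dropping the per-metric `mean_*` properties.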

niaarm/rule_list.py

Lines changed: 29 additions & 98 deletions
Original file line numberDiff line numberDiff line change
@@ -5,33 +5,26 @@
@@ -5,33 +5,26 @@


 class RuleList(UserList):
-    """A wrapper around a list of rules.
-
-    Attributes:
-        mean_fitness (float): Mean fitness.
-        mean_support (float): Mean support.
-        mean_confidence (float): Mean confidence.
-        mean_lift (float): Mean lift.
-        mean_coverage (float): Mean coverage.
-        mean_rhs_support (float): Mean consequent support.
-        mean_conviction (float): Mean conviction.
-        mean_inclusion (float): Mean inclusion.
-        mean_amplitude (float): Mean amplitude.
-        mean_interestingness (float): Mean interestingness.
-        mean_comprehensibility (float): Mean comprehensibility.
-        mean_netconf (float): Mean netconf.
-        mean_yulesq (float): Mean Yule's Q.
-        mean_antecedent_length (float): Mean antecedent length.
-        mean_consequent_length (float): Mean consequent length.
-
-    """
+    """A list of rules."""
+
+    def get(self, metric):
+        """Get values of `metric` for each rule as a numpy array.
+
+        Args:
+            metric (str): Metric.
+
+        Returns:
+            numpy.ndarray: Array of `metric` for all rules.
+
+        """
+        return np.array([getattr(rule, metric) for rule in self.data])

     def sort(self, by='fitness', reverse=True):
         """Sort rules by metric.

         Args:
             by (str): Metric to sort rules by. Default: ``'fitness'``.
-            reverse (bool): Sort in descending order. Default: ``True``
+            reverse (bool): Sort in descending order. Default: ``True``.

         """
         self.data.sort(key=lambda rule: getattr(rule, by), reverse=reverse)
@@ -46,7 +39,7 @@ def mean(self, metric):
             float: Mean value of metric in rule list.

         """
-        return np.mean([getattr(rule, metric) for rule in self.data])
+        return sum(getattr(rule, metric) for rule in self.data) / len(self.data)

     def min(self, metric):
         """Get min value of metric.
@@ -97,87 +90,25 @@ def to_csv(self, filename):
             # write header
             writer.writerow(("antecedent", "consequent", "fitness") + Rule.metrics)

-            for rule in self:
+            for rule in self.data:
                 writer.writerow(
                     [rule.antecedent, rule.consequent, rule.fitness] + [getattr(rule, metric) for metric in Rule.metrics])
         print(f"Rules exported to {filename}")

-    @property
-    def mean_fitness(self):
-        return np.mean([rule.fitness for rule in self.data])
-
-    @property
-    def mean_support(self):
-        return np.mean([rule.support for rule in self.data])
-
-    @property
-    def mean_confidence(self):
-        return np.mean([rule.confidence for rule in self.data])
-
-    @property
-    def mean_lift(self):
-        return np.mean([rule.lift for rule in self.data])
-
-    @property
-    def mean_coverage(self):
-        return np.mean([rule.coverage for rule in self.data])
-
-    @property
-    def mean_rhs_support(self):
-        return np.mean([rule.rhs_support for rule in self.data])
-
-    @property
-    def mean_conviction(self):
-        return np.mean([rule.conviction for rule in self.data])
-
-    @property
-    def mean_inclusion(self):
-        return np.mean([rule.inclusion for rule in self.data])
-
-    @property
-    def mean_amplitude(self):
-        return np.mean([rule.amplitude for rule in self.data])
-
-    @property
-    def mean_interestingness(self):
-        return np.mean([rule.interestingness for rule in self.data])
-
-    @property
-    def mean_comprehensibility(self):
-        return np.mean([rule.comprehensibility for rule in self.data])
-
-    @property
-    def mean_netconf(self):
-        return np.mean([rule.netconf for rule in self.data])
-
-    @property
-    def mean_yulesq(self):
-        return np.mean([rule.yulesq for rule in self.data])
-
-    @property
-    def mean_antecedent_length(self):
-        return np.mean([len(rule.antecedent) for rule in self.data])
-
-    @property
-    def mean_consequent_length(self):
-        return np.mean([len(rule.consequent) for rule in self.data])
-
     def __str__(self):
         string = f'STATS:\n' \
                  f'Total rules: {len(self)}\n' \
-                 f'Average fitness: {self.mean_fitness}\n' \
-                 f'Average support: {self.mean_support}\n' \
-                 f'Average confidence: {self.mean_confidence}\n' \
-                 f'Average lift: {self.mean_lift}\n' \
-                 f'Average coverage: {self.mean_coverage}\n' \
-                 f'Average consequent support: {self.mean_rhs_support}\n' \
-                 f'Average conviction: {self.mean_conviction}\n' \
-                 f'Average amplitude: {self.mean_amplitude}\n' \
-                 f'Average inclusion: {self.mean_inclusion}\n' \
-                 f'Average interestingness: {self.mean_interestingness}\n' \
-                 f'Average comprehensibility: {self.mean_comprehensibility}\n' \
-                 f'Average netconf: {self.mean_netconf}\n' \
-                 f'Average Yule\'s Q: {self.mean_yulesq}\n' \
-                 f'Average length of antecedent: {self.mean_antecedent_length}\n' \
-                 f'Average length of consequent: {self.mean_consequent_length}'
+                 f'Average fitness: {self.mean("fitness")}\n' \
+                 f'Average support: {self.mean("support")}\n' \
+                 f'Average confidence: {self.mean("confidence")}\n' \
+                 f'Average lift: {self.mean("lift")}\n' \
+                 f'Average coverage: {self.mean("coverage")}\n' \
+                 f'Average consequent support: {self.mean("rhs_support")}\n' \
+                 f'Average conviction: {self.mean("conviction")}\n' \
+                 f'Average amplitude: {self.mean("amplitude")}\n' \
+                 f'Average inclusion: {self.mean("inclusion")}\n' \
+                 f'Average interestingness: {self.mean("interestingness")}\n' \
+                 f'Average comprehensibility: {self.mean("comprehensibility")}\n' \
+                 f'Average netconf: {self.mean("netconf")}\n' \
+                 f'Average Yule\'s Q: {self.mean("yulesq")}\n'
         return string
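
One consequence of the rewritten `mean` may be worth a look in review: `sum(...) / len(self.data)` raises `ZeroDivisionError` when the list is empty, and `__str__` calls `mean` for every metric, so printing an empty `RuleList` would appear to fail as this hunk is written. The `np.mean` call it replaces returns `nan` (with a `RuntimeWarning`) for an empty input. Below is a minimal sketch of a guarded variant, written as a free function so it runs on its own. `safe_mean` is a hypothetical helper, not part of niaarm, and returning `nan` for the empty case is only one possible choice.

```python
import math
from types import SimpleNamespace


def safe_mean(rules, metric):
    """Mean of `metric` over `rules`, or nan when there are no rules.

    Hypothetical helper for illustration; not part of niaarm.
    """
    if not rules:
        # Avoid dividing by zero; nan mirrors what np.mean gives for [].
        return math.nan
    return sum(getattr(rule, metric) for rule in rules) / len(rules)


# SimpleNamespace objects stand in for mined rules here.
rules = [SimpleNamespace(support=0.25), SimpleNamespace(support=0.75)]
print(safe_mean(rules, "support"))
print(safe_mean([], "support"))
```

Raising a descriptive error instead of returning `nan` would be just as reasonable; the point is only that the empty case probably needs an explicit decision.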
