Commit f8131a7

Apriori: implement prune step of apriori-gen
The apriori-gen function described in section 2.1.1 of the Apriori paper has two steps; the first (join) step was implemented in the previous commit. The second step, called the prune step, takes each candidate c from the first step and checks that every (k-1)-tuple built by removing a single element from c is in L(k-1). As NumPy arrays are not hashable, we cannot use set() for itemset lookup, so we define a very simple prefix tree class instead.
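The prune condition described above can be sketched in isolation. This is a minimal illustration, not the code from this commit: the helper name `prune` and the sample data are hypothetical, and it assumes itemsets are plain hashable tuples, which lets it use a `set` (the very thing the commit cannot do with NumPy arrays).

```python
from itertools import combinations


def prune(candidates, frequent_prev):
    """Keep candidates whose (k-1)-subsets are all in frequent_prev.

    Hypothetical helper for illustration only. It relies on hashable
    tuples and a set, unlike the commit, which works on NumPy arrays.
    """
    frequent = set(frequent_prev)
    kept = []
    for c in candidates:
        # combinations(c, k-1) enumerates c with one element removed.
        if all(sub in frequent for sub in combinations(c, len(c) - 1)):
            kept.append(c)
    return kept


# Sample L(2) and the candidates a join step would build from it.
L2 = [(1, 2), (1, 3), (2, 3), (2, 4)]
C3 = [(1, 2, 3), (2, 3, 4)]

# (2, 3, 4) is dropped because its subset (3, 4) is not in L2.
print(prune(C3, L2))  # [(1, 2, 3)]
```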
1 parent e34ff8c commit f8131a7

1 file changed, +52 -4 lines changed

mlxtend/frequent_patterns/apriori.py

Lines changed: 52 additions & 4 deletions
@@ -9,6 +9,44 @@
 from ..frequent_patterns import fpcommon as fpc
 
 
+class _FixedLengthTrie:
+
+    """Fixed-length trie (prefix tree).
+
+    Parameters
+    ----------
+    combinations: list of itemsets
+        All combinations with enough support in the last step
+
+    Attributes
+    ----------
+    root : dict
+        Root node
+    """
+    __slots__ = ("root")
+
+    def __init__(self, combinations):
+        self.root = dict()
+        for combination in combinations:
+            current = self.root
+            for item in combination:
+                try:
+                    current = current[item]
+                except KeyError:
+                    next_node = dict()
+                    current[item] = next_node
+                    current = next_node
+
+    def __contains__(self, combination):
+        current = self.root
+        try:
+            for item in combination:
+                current = current[item]
+            return True
+        except KeyError:
+            return False
+
+
 def generate_new_combinations(old_combinations):
     """
     Generator of all combinations based on the last state of Apriori algorithm
@@ -32,8 +70,7 @@ def generate_new_combinations(old_combinations):
     -----------
     Generator of combinations based on the last state of Apriori algorithm.
     In order to reduce number of candidates, this function implements the
-    join step of apriori-gen described in section 2.1.1 of Apriori paper.
-    Prune step is not yet implemented.
+    apriori-gen function described in section 2.1.1 of Apriori paper.
 
     Examples
     -----------
@@ -43,15 +80,26 @@ def generate_new_combinations(old_combinations):
     """
 
     length = len(old_combinations)
+    trie = _FixedLengthTrie(old_combinations)
     for i, old_combination in enumerate(old_combinations):
         head_i = list(old_combination[:-1])
         j = i + 1
         while j < length:
             *head_j, tail_j = old_combinations[j]
             if head_i != head_j:
                 break
-            yield from old_combination
-            yield tail_j
+            # Prune old_combination+(item,) if any subset is not frequent
+            candidate = tuple(old_combination) + (tail_j,)
+            # No need to check the last two values, because test_candidate
+            # is then old_combinations[i] and old_combinations[j]
+            for idx in range(len(candidate) - 2):
+                test_candidate = list(candidate)
+                del test_candidate[idx]
+                if test_candidate not in trie:
+                    # early exit from for-loop skips else clause just below
+                    break
+            else:
+                yield from candidate
             j = j + 1
 
 
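To see how the trie lookup and the `for`/`else` prune loop interact, the following standalone sketch condenses the logic of the diff into a runnable form. It is an approximation under stated assumptions, not the committed code: the class is renamed `FixedLengthTrie`, the inner `while` is written as a `for`, the subset is built by slicing instead of `del`, and the input is a hypothetical sorted list of plain tuples, not NumPy arrays.

```python
class FixedLengthTrie:
    """Nested-dict prefix tree, modelled on _FixedLengthTrie in the diff."""

    def __init__(self, combinations):
        self.root = {}
        for combination in combinations:
            node = self.root
            for item in combination:
                # Descend into the child for item, creating it if missing.
                node = node.setdefault(item, {})

    def __contains__(self, combination):
        node = self.root
        for item in combination:
            if item not in node:
                return False
            node = node[item]
        return True


def generate_new_combinations(old_combinations):
    """Join then prune; yields a flat stream of items, as in the diff."""
    trie = FixedLengthTrie(old_combinations)
    length = len(old_combinations)
    for i, old_combination in enumerate(old_combinations):
        head_i = list(old_combination[:-1])
        for j in range(i + 1, length):
            *head_j, tail_j = old_combinations[j]
            if head_i != head_j:
                break
            candidate = tuple(old_combination) + (tail_j,)
            # Removing either of the last two items gives back
            # old_combinations[j] or old_combinations[i], so skip those.
            for idx in range(len(candidate) - 2):
                if candidate[:idx] + candidate[idx + 1:] not in trie:
                    break  # an infrequent subset: skip the else clause
            else:
                yield from candidate


# Hypothetical sorted 2-itemsets standing in for L(k-1).
L2 = [(1, 2), (1, 3), (2, 3), (2, 4)]
flat = list(generate_new_combinations(L2))
triples = [tuple(flat[k:k + 3]) for k in range(0, len(flat), 3)]
print(triples)  # [(1, 2, 3)]
```

Note that `__contains__` also returns `True` for any prefix of a stored itemset, for example `(1,) in FixedLengthTrie(L2)`. That is harmless here because every lookup has the same length as the stored itemsets, which is presumably why the class is described as fixed-length.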