Commit 4349b20

Further improve generate_new_combinations
The apriori-gen function described in section 2.1.1 of the Apriori paper has two steps; the first step was implemented in 96dfd4d. The second step, called the prune step, takes each candidate c from the first step and checks that every (k-1)-itemset obtained by removing a single element from c is in L(k-1). Efficient lookups require a dedicated data structure; the Apriori paper describes how to do this with hash trees, but it is also possible to use prefix trees (also known as tries). This commit uses the third-party pygtrie module to check whether this step provides a performance improvement in our case. It can then be decided whether to keep this import or to write a stripped-down implementation.
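For illustration, here is a minimal sketch of the prune-step lookup, assuming L(k-1) is stored as keys of a pygtrie.Trie as in the diff below; the toy itemsets and the helper name survives_prune are hypothetical and not part of the commit:

import pygtrie

# Toy frequent 2-itemsets L(2); the committed code builds the trie from
# old_combinations instead of a hard-coded list.
frequent_k_minus_1 = [(0, 1), (0, 2), (0, 3), (1, 2)]
trie = pygtrie.Trie(list(zip(frequent_k_minus_1, [1] * len(frequent_k_minus_1))))

def survives_prune(candidate):
    # Keep a candidate k-itemset only if every (k-1)-subset obtained by
    # dropping one element is itself frequent, i.e. present in the trie.
    for idx in range(len(candidate)):
        subset = candidate[:idx] + candidate[idx + 1:]
        if subset not in trie:
            return False
    return True

print(survives_prune((0, 1, 2)))  # True: (1, 2), (0, 2) and (0, 1) are all in L(2)
print(survives_prune((0, 1, 3)))  # False: (1, 3) is not in L(2), so the candidate is pruned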
1 parent 712c8d4 commit 4349b20

File tree

1 file changed: 12 additions, 2 deletions


mlxtend/frequent_patterns/apriori.py

Lines changed: 12 additions & 2 deletions
@@ -6,6 +6,7 @@

 import numpy as np
 import pandas as pd
+import pygtrie
 from ..frequent_patterns import fpcommon as fpc


@@ -43,15 +44,24 @@ def generate_new_combinations(old_combinations):
     """

     length = len(old_combinations)
+    trie = pygtrie.Trie(list(zip(old_combinations, [1]*length)))
     for i, old_combination in enumerate(old_combinations):
         *head_i, _ = old_combination
         j = i + 1
         while j < length:
             *head_j, tail_j = old_combinations[j]
             if head_i != head_j:
                 break
-            yield from old_combination
-            yield tail_j
+            # Prune old_combination+(item,) if any subset is not frequent
+            candidate = tuple(old_combination) + (tail_j,)
+            for idx in range(len(candidate)):
+                test_candidate = list(candidate)
+                del test_candidate[idx]
+                if tuple(test_candidate) not in trie:
+                    # early exit from for-loop skips else clause just below
+                    break
+            else:
+                yield from candidate
             j = j + 1

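For reference, a rough usage sketch on toy data; the input itemsets are hypothetical, and consuming the flattened output with np.fromiter and reshape is an assumption about how the surrounding apriori() code uses the generator, not something shown in this commit:

import numpy as np
from mlxtend.frequent_patterns.apriori import generate_new_combinations

# Toy frequent 2-itemsets, lexicographically sorted (hypothetical data).
old_combinations = [(0, 1), (0, 2), (0, 3), (1, 2)]
flat = np.fromiter(generate_new_combinations(old_combinations), dtype=int)
print(flat.reshape(-1, 3))  # [[0 1 2]] -- (0, 1, 3) and (0, 2, 3) are pruned away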
