@@ -9,30 +9,142 @@ class Rule:
99 Args:
1010 antecedent (list[Feature]): A list of antecedents of the association rule.
1111 consequent (list[Feature]): A list of consequents of the association rule.
12- fitness (Optional[float]): Value of the fitness function .
12+ fitness (Optional[float]): Fitness value of the association rule.
1313 transactions (Optional[pandas.DataFrame]): Transactional database.
1414
1515 Attributes:
16- cls.metrics (tuple[str]): List of all available metrics.
17- support (float): Support of the rule i.e. proportion of transactions containing
18- both the antecedent and the consequent.
19- confidence (float): Confidence of the rule, defined as the proportion of transactions that contain
20- the consequent in the set of transactions that contain the antecedent.
21- lift (float): Lift of the rule. Lift measures how many times more often the antecedent and the consequent Y
16+ cls.metrics (tuple[str]): Names of all available interest measures.
17+ support: Support is defined on an itemset as the proportion of transactions that contain the attribute :math:`X`.
18+
19+ :math:`supp(X) = \frac{n_{X}}{|D|},`
20+
21+ where :math:`n_{X}` is the number of transactions that contain :math:`X` and :math:`|D|` is the number of records in the transactional database.
22+
23+ For an association rule, support is defined as the support of all the attributes in the rule.
24+
25+ :math:`supp(X \implies Y) = \frac{n_{XY}}{|D|}`
26+
27+ **Range:** :math:`[0, 1]`
28+
29+ **Reference:** Michael Hahsler, A Probabilistic Comparison of Commonly Used Interest Measures for Association Rules,
30+ 2015, URL: https://mhahsler.github.io/arules/docs/measures
31+ confidence: Confidence of the rule, defined as the proportion of transactions that contain
32+ the consequent in the set of transactions that contain the antecedent. This proportion is an estimate
33+ of the probability of seeing the consequent, if the antecedent is present in the transaction.
34+
35+ :math:`conf(X \implies Y) = \frac{supp(X \implies Y)}{supp(X)}`
36+
37+ **Range:** :math:`[0, 1]`
38+
39+ **Reference:** Michael Hahsler, A Probabilistic Comparison of Commonly Used Interest Measures for Association Rules,
40+ 2015, URL: https://mhahsler.github.io/arules/docs/measures
41+ lift: Lift measures how many times more often the antecedent :math:`X` and the consequent :math:`Y`
2242 occur together than expected if they were statistically independent.
23- coverage (float): Coverage of the rule, also known as antecedent support. It measures the probability that
24- the rule applies to a randomly selected transaction.
25- rhs_support (float): Support of the consequent.
26- conviction (float): Conviction of the rule.
27- inclusion (float): Inclusion of the rule is defined as the ratio between the number of attributes of the rule
28- and all attributes in the dataset.
29- amplitude (float): Amplitude of the rule.
30- interestingness (float): Interestingness of the rule.
31- comprehensibility (float): Comprehensibility of the rule.
32- netconf (float): The netconf metric evaluates the interestingness of
43+
44+ :math:`lift(X \implies Y) = \frac{conf(X \implies Y)}{supp(Y)}`
45+
46+ **Range:** :math:`[0, \infty]` (1 means independence)
47+
48+ **Reference:** Michael Hahsler, A Probabilistic Comparison of Commonly Used Interest Measures for Association Rules,
49+ 2015, URL: https://mhahsler.github.io/arules/docs/measures
50+ coverage: Coverage, also known as antecedent support, is an estimate of the probability that
51+ the rule applies to a randomly selected transaction. It is the proportion of transactions
52+ that contain the antecedent.
53+
54+ :math:`cover(X \implies Y) = supp(X)`
55+
56+ **Range:** :math:`[0, 1]`
57+
58+ **Reference:** Michael Hahsler, A Probabilistic Comparison of Commonly Used Interest Measures for Association Rules,
59+ 2015, URL: https://mhahsler.github.io/arules/docs/measures
60+ rhs_support: Support of the consequent.
61+
62+ :math:`RHSsupp(X \implies Y) = supp(Y)`
63+
64+ **Range:** :math:`[0, 1]`
65+
66+ **Reference:** Michael Hahsler, A Probabilistic Comparison of Commonly Used Interest Measures for Association Rules,
67+ 2015, URL: https://mhahsler.github.io/arules/docs/measures
68+ conviction: Conviction can be interpreted as the ratio of the expected frequency that the antecedent occurs without
69+ the consequent (if the two were independent) to the observed frequency of incorrect predictions.
70+
71+ :math:`conv(X \implies Y) = \frac{1 - supp(Y)}{1 - conf(X \implies Y)}`
72+
73+ **Range:** :math:`[0, \infty]` (1 means independence, :math:`\infty` means the rule always holds)
74+
75+ **Reference:** Michael Hahsler, A Probabilistic Comparison of Commonly Used Interest Measures for Association Rules,
76+ 2015, URL: https://mhahsler.github.io/arules/docs/measures
77+ inclusion: Inclusion is defined as the ratio between the number of attributes of the rule
78+ and all attributes in the database.
79+
80+ :math:`inclusion(X \implies Y) = \frac{|X \cup Y|}{m},`
81+
82+ where :math:`m` is the total number of attributes in the transactional database.
83+
84+
85+ **Range:** :math:`[0, 1]`
86+
87+ **Reference:** I. Fister Jr., V. Podgorelec, I. Fister. Improved Nature-Inspired Algorithms for Numeric Association
88+ Rule Mining. In: Vasant P., Zelinka I., Weber GW. (eds) Intelligent Computing and Optimization. ICO 2020. Advances in
89+ Intelligent Systems and Computing, vol 1324. Springer, Cham.
90+ amplitude: Amplitude measures the quality of a rule, preferring attributes with smaller intervals.
91+
92+ :math:`ampl(X \implies Y) = 1 - \frac{1}{n}\sum_{k = 1}^{n}{\frac{Ub_k - Lb_k}{max(o_k) - min(o_k)}},`
93+
94+ where :math:`n` is the total number of attributes in the rule, :math:`Ub_k` and :math:`Lb_k` are upper and lower
95+ bounds of the selected attribute, and :math:`max(o_k)` and :math:`min(o_k)` are the maximum and minimum
96+ feasible values of the attribute :math:`o_k` in the transactional database.
97+
98+ **Range:** :math:`[0, 1]`
99+
100+ **Reference:** I. Fister Jr., I. Fister. A brief overview of swarm intelligence-based algorithms for numerical
101+ association rule mining. arXiv preprint arXiv:2010.15524 (2020).
102+ interestingness: Interestingness of the rule, defined as:
103+
104+ :math:`interest(X \implies Y) = \frac{supp(X \implies Y)}{supp(X)} \cdot \frac{supp(X \implies Y)}{supp(Y)}
105+ \cdot (1 - \frac{supp(X \implies Y)}{|D|})`
106+
107+ Here, the first part gives us the probability of generating the rule based on the antecedent, the second part
108+ gives us the probability of generating the rule based on the consequent and the third part is the probability
109+ that the rule won't be generated. Thus, rules with very high support will be deemed uninteresting.
110+
111+ **Range:** :math:`[0, 1]`
112+
113+ **Reference:** I. Fister Jr., I. Fister. A brief overview of swarm intelligence-based algorithms for numerical
114+ association rule mining. arXiv preprint arXiv:2010.15524 (2020).
115+ comprehensibility: Comprehensibility of the rule. Rules with fewer attributes in the consequent are more
116+ comprehensible.
117+
118+ :math:`comp(X \implies Y) = \frac{log(1 + |Y|)}{log(1 + |X \cup Y|)}`
119+
120+ **Range:** :math:`[0, 1]`
121+
122+ **Reference:** I. Fister Jr., I. Fister. A brief overview of swarm intelligence-based algorithms for numerical
123+ association rule mining. arXiv preprint arXiv:2010.15524 (2020).
124+ netconf: The netconf metric evaluates the interestingness of
33125 association rules depending on the support of the rule and the
34126 support of the antecedent and consequent of the rule.
35- yulesq (float): Yule's Q metric.
127+
128+ :math:`netconf(X \implies Y) = \frac{supp(X \implies Y) - supp(X)supp(Y)}{supp(X)(1 - supp(X))}`
129+
130+ **Range:** :math:`[-1, 1]` (Negative values represent negative dependence, positive values represent positive
131+ dependence and 0 represents independence)
132+
133+ **Reference:** E. V. Altay and B. Alatas, "Sensitivity Analysis of MODENAR Method for Mining of Numeric Association
134+ Rules," 2019 1st International Informatics and Software Engineering Conference (UBMYK), 2019, pp. 1-6,
135+ doi: 10.1109/UBMYK48245.2019.8965539.
136+ yulesq: Yule's Q measures the association between two possibly related dichotomous events.
137+
138+ :math:`yulesq(X \implies Y) =
139+ \frac{supp(X \implies Y)supp(\neg X \implies \neg Y) - supp(X \implies \neg Y)supp(\neg X \implies Y)}
140+ {supp(X \implies Y)supp(\neg X \implies \neg Y) + supp(X \implies \neg Y)supp(\neg X \implies Y)}`
141+
142+ **Range:** :math:`[-1, 1]` (-1 reflects total negative association, 1 reflects perfect positive association
143+ and 0 reflects independence)
144+
145+ **Reference:** E. V. Altay and B. Alatas, "Sensitivity Analysis of MODENAR Method for Mining of Numeric Association
146+ Rules," 2019 1st International Informatics and Software Engineering Conference (UBMYK), 2019, pp. 1-6,
147+ doi: 10.1109/UBMYK48245.2019.8965539.
36148
37149 """
38150
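As a sanity check on the formulas documented above, here is a minimal sketch that evaluates them on toy data. It is not the library's implementation: the function names `support_measures` and `structural_measures`, the boolean-list inputs and the `(lower, upper)` interval dictionaries are all hypothetical, chosen only to mirror the notation in the docstring. The sketch assumes purely numeric attributes, :math:`0 < supp(X) < 1`, :math:`supp(Y) > 0` and a non-zero Yule's Q denominator.

```python
import math


def support_measures(has_x, has_y):
    """Support-based measures for one rule X => Y.

    has_x[i] / has_y[i] state whether transaction i satisfies the
    antecedent X / the consequent Y (hypothetical inputs for this sketch).
    """
    n = len(has_x)
    pairs = list(zip(has_x, has_y))
    supp_x = sum(1 for x, _ in pairs if x) / n
    supp_y = sum(1 for _, y in pairs if y) / n
    supp_xy = sum(1 for x, y in pairs if x and y) / n
    supp_x_not_y = sum(1 for x, y in pairs if x and not y) / n
    supp_not_x_y = sum(1 for x, y in pairs if not x and y) / n
    supp_not_x_not_y = sum(1 for x, y in pairs if not x and not y) / n

    confidence = supp_xy / supp_x
    agree = supp_xy * supp_not_x_not_y
    disagree = supp_x_not_y * supp_not_x_y
    return {
        "support": supp_xy,
        "confidence": confidence,
        "lift": confidence / supp_y,
        "coverage": supp_x,
        "rhs_support": supp_y,
        # A rule that always holds has confidence 1, i.e. infinite conviction.
        "conviction": math.inf if confidence == 1 else (1 - supp_y) / (1 - confidence),
        "netconf": (supp_xy - supp_x * supp_y) / (supp_x * (1 - supp_x)),
        "yulesq": (agree - disagree) / (agree + disagree),
    }


def structural_measures(antecedent, consequent, feature_ranges):
    """Measures that depend only on the shape of the rule.

    antecedent / consequent map attribute names to (lower, upper) bounds;
    feature_ranges maps every attribute in the database to its (min, max).
    All three arguments are hypothetical stand-ins for the real classes.
    """
    rule = {**antecedent, **consequent}
    relative_widths = [
        (ub - lb) / (feature_ranges[name][1] - feature_ranges[name][0])
        for name, (lb, ub) in rule.items()
    ]
    return {
        "inclusion": len(rule) / len(feature_ranges),
        "amplitude": 1 - sum(relative_widths) / len(rule),
        "comprehensibility": math.log(1 + len(consequent)) / math.log(1 + len(rule)),
    }


# Toy database of 8 transactions: X holds in the first four, Y in five of them.
has_x = [True, True, True, True, False, False, False, False]
has_y = [True, True, True, False, True, False, False, False]
print(support_measures(has_x, has_y))

ranges = {"a": (0, 10), "b": (0, 4), "c": (0, 1), "d": (0, 100)}
print(structural_measures({"a": (2, 7)}, {"b": (1, 2)}, ranges))
```

Working from the four cells of the contingency table keeps the Yule's Q terms explicit. A real implementation would also have to decide what to return when a denominator is zero, which this sketch does not handle.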