Maybe an error in the book 

CHAPTER 11 Association analysis with the Apriori algorithm, page 234:
In section 11.3, we quantified an itemset as frequent if it met our minimum support level. We have a similar measurement for association rules. This measurement is called the confidence. The confidence for a rule P ➞ H is defined as support(P | H)/support(P). Remember, in Python, the | symbol is the set union; the mathematical symbol is ∪. P | H means all the items in set P or in set H. We calculated the support for all the frequent itemsets in the previous section. Now, when we want to calculate the confidence, all we have to do is call up those support values and do one divide.
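To make the quoted definition concrete, here is a minimal sketch of it in Python. The function names `support` and `confidence` and the toy transactions are my own assumptions for illustration; they are not taken from the book's listings.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(P, H, transactions):
    """Confidence of the rule P -> H, computed as support(P | H) / support(P)."""
    P, H = frozenset(P), frozenset(H)
    # P | H is the Python set union: one itemset holding the items of P and of H.
    return support(P | H, transactions) / support(P, transactions)


# Hypothetical toy transactions, for illustration only.
transactions = [[1, 3, 4], [2, 3, 5], [1, 2, 3, 5], [2, 5]]

print(support({2, 5}, transactions))       # support of the combined itemset
print(confidence({2}, {5}, transactions))  # confidence of the rule {2} -> {5}
```

In this sketch the only operation applied to P and H is `|`, and the combined itemset is then looked up against the transactions like any other itemset.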

Is this paragraph wrong? I think it contradicts Listing 11.3 (Association rule-generation functions) and this passage on page 226:
The confidence is defined for an association rule like {diapers} ➞ {wine}. The confidence for this rule is defined as support({diapers, wine})/support({diapers}). From figure 11.1, the support of {diapers, wine} is 3/5. The support for diapers is 4/5, so the confidence for diapers ➞ wine is 3/4 = 0.75. That means that in 75% of the items in our dataset containing diapers, our rule is correct.
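The arithmetic in the page 226 passage can be reproduced with a short script. The transaction list below is an assumption: I chose it so that the supports match the quoted values (3/5 and 4/5), and it is not guaranteed to be an exact copy of figure 11.1. The helper name `support` is also hypothetical.

```python
from fractions import Fraction

# Assumed transactions, chosen so the supports match the values quoted above.
transactions = [
    {"soy milk", "lettuce"},
    {"lettuce", "diapers", "wine", "chard"},
    {"soy milk", "diapers", "wine", "orange juice"},
    {"lettuce", "soy milk", "diapers", "wine"},
    {"lettuce", "soy milk", "diapers", "orange juice"},
]


def support(itemset):
    """Exact fraction of transactions containing every item in `itemset`."""
    count = sum(1 for t in transactions if itemset <= t)
    return Fraction(count, len(transactions))


P = frozenset({"diapers"})
H = frozenset({"wine"})

sup_both = support(P | H)  # P | H is the itemset {diapers, wine}
sup_p = support(P)
conf = sup_both / sup_p

print(sup_both, sup_p, conf)  # prints: 3/5 4/5 3/4
```

Using `Fraction` keeps the values exact, so the result can be compared directly with the 3/4 in the quoted text.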

So I think the correct version is as follows. Page 234: The confidence for a rule P ➞ H is defined as support(P , H)/support(P). Remember, in Python, the , symbol is the set INTERSECTION (not union); the mathematical symbol is ∩. P , H means all the items in set P AND (not or) in set H.

Am I right or wrong?

