Maybe an error in the book  #11

@helloworld1973

Chapter 11, Association analysis with the Apriori algorithm, page 234:
In section 11.3, we quantified an itemset as frequent if it met our minimum support level. We have a similar measurement for association rules. This measurement is called the confidence. The confidence for a rule P ➞ H is defined as support(P | H)/support(P). Remember, in Python, the | symbol is the set union; the mathematical symbol is ∪. P | H means all the items in set P or in set H. We calculated the support for all the frequent itemsets in the previous section. Now, when we want to calculate the confidence, all we have to do is call up those support values and do one divide.

Is this paragraph wrong? I think it contradicts Listing 11.3 (Association rule-generation functions) and this passage on page 226:
The confidence is defined for an association rule like {diapers} ➞ {wine}. The confidence for this rule is defined as support({diapers, wine})/support({diapers}). From figure 11.1, the support of {diapers, wine} is 3/5. The support for diapers is 4/5, so the confidence for diapers ➞ wine is 3/4 = 0.75. That means that in 75% of the items in our dataset containing diapers, our rule is correct.
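
To make the page 226 arithmetic concrete, here is a minimal Python sketch. The five transactions are hypothetical, chosen only so that the supports match the quoted values (3/5 and 4/5); they are not copied from figure 11.1. The helper names `support` and `confidence` are also my own and are not the functions from Listing 11.3. It shows what `P | H` evaluates to for P = {diapers} and H = {wine}, and which transactions count toward its support.

```python
from fractions import Fraction

# Hypothetical transactions, chosen only so the supports match the quoted
# values (3/5 for {diapers, wine}, 4/5 for {diapers}); not taken from the book.
transactions = [
    {"soy milk", "lettuce"},
    {"lettuce", "diapers", "wine", "chard"},
    {"soy milk", "diapers", "wine", "orange juice"},
    {"lettuce", "soy milk", "diapers", "wine"},
    {"lettuce", "soy milk", "diapers", "orange juice"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)  # subset test
    return Fraction(hits, len(transactions))

def confidence(P, H, transactions):
    """Confidence of the rule P -> H, computed as support(P | H) / support(P)."""
    P, H = frozenset(P), frozenset(H)
    return support(P | H, transactions) / support(P, transactions)

P = frozenset({"diapers"})
H = frozenset({"wine"})

union = P | H  # union of the two itemsets: the items diapers and wine together
conf = confidence(P, H, transactions)

print(sorted(union))                 # items in P | H
print(support(union, transactions))  # share of transactions holding both items
print(float(conf))                   # confidence of diapers -> wine
```

In this sketch the union is taken over the items of the two itemsets, and a transaction counts toward the support of `P | H` only if it contains every item in that union, that is, both diapers and wine.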

So I think the correct text for page 234 would be as follows: The confidence for a rule P ➞ H is defined as support(P , H)/support(P). Remember, in Python, the , symbol is the set INTERSECTION (not union); the mathematical symbol is ∩. P , H means all the items in set P AND (not or) in set H.

Am I right or wrong?
