Skip to content

[FIX] Naive Bayes: Ignore existing classes in Laplacian smoothing#3575

Merged
lanzagar merged 1 commit intobiolab:masterfrom
janezd:naive-bayes-ignore-empty-classes
Feb 15, 2019
Merged

[FIX] Naive Bayes: Ignore existing classes in Laplacian smoothing#3575
lanzagar merged 1 commit intobiolab:masterfrom
janezd:naive-bayes-ignore-empty-classes

Conversation

@janezd
Copy link
Contributor

@janezd janezd commented Feb 2, 2019

Issue

As suggested by @lanzagar, issue #2943 can be fixed by ignoring empty classes in Laplacian smoothing.

Plainly, say that y has values male, female, yes and no because y appeared in two different data sets loaded in the same session. Of course, only two values appear in each set. With this PR, only 2 is added to denominators in computation of probabilities, and the probabilities of empty classes are set to 0.

Variable reuse remains a large open issue.

Description of changes

The change also required a minor change in computation of probabilities: instead of computing exp(log(class_probs) + sum(conditional_probs)), the code now computes class_probs * exp(sum(conditional_probls)) because class_probs can now be 0. This should not affect numerical stability.

Includes
  • Code changes
  • Tests

@janezd janezd force-pushed the naive-bayes-ignore-empty-classes branch from 325575e to 6fcd7cf Compare February 2, 2019 15:45
@codecov
Copy link

codecov bot commented Feb 2, 2019

Codecov Report

Merging #3575 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #3575      +/-   ##
==========================================
+ Coverage   83.97%   83.97%   +<.01%     
==========================================
  Files         370      370              
  Lines       66944    66945       +1     
==========================================
+ Hits        56213    56214       +1     
  Misses      10731    10731

@janezd janezd force-pushed the naive-bayes-ignore-empty-classes branch from 6fcd7cf to de273b1 Compare February 2, 2019 16:52
@janezd janezd force-pushed the naive-bayes-ignore-empty-classes branch from de273b1 to 5e0fa06 Compare February 2, 2019 17:56
@janezd janezd changed the title Naive Bayes: Ignore existing classes in Laplacian smoothing [RFC] Naive Bayes: Ignore existing classes in Laplacian smoothing Feb 2, 2019
@lanzagar
Copy link
Contributor

Looks good to me.
It still has the [RFC] tag, but unless others have additional comments, I am fine with changing this to [FIX] and merging.

@janezd janezd changed the title [RFC] Naive Bayes: Ignore existing classes in Laplacian smoothing Naive Bayes: Ignore existing classes in Laplacian smoothing Feb 15, 2019
@lanzagar lanzagar changed the title Naive Bayes: Ignore existing classes in Laplacian smoothing [FIX] Naive Bayes: Ignore existing classes in Laplacian smoothing Feb 15, 2019
@lanzagar lanzagar merged commit 5fd880c into biolab:master Feb 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants