Commit b2a4671

opt: Only mark spam at a high confidence to reduce false positives
1 parent 4420ece commit b2a4671

2 files changed, +12 -1 lines changed

app/services/spam_classifier_service.rb

Lines changed: 10 additions & 1 deletion
@@ -89,7 +89,16 @@ def classify(message_text)
       ham_score += Math.log(ham_count.to_f / (@classifier_state.total_ham_words + vocab_size))
     end
 
-    is_spam = spam_score > ham_score
+    diff = spam_score - ham_score
+    # stable logistic conversion
+    p_spam = if diff.abs > 700
+      diff > 0 ? 1.0 : 0.0
+    else
+      1.0 / (1.0 + Math.exp(-diff))
+    end
+
+    confidence_threshold = Rails.application.config.probability_threshold
+    is_spam = p_spam >= confidence_threshold
     [ is_spam, spam_score, ham_score ]
   end
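
For readers who want to try the new decision rule outside the Rails app, here is a minimal standalone sketch. The method names `spam_probability` and `spam?` and the `threshold:` keyword are hypothetical and not part of this commit. It also assumes that `spam_score` and `ham_score` are log scores whose difference can be read as log-odds, which the hunk suggests but does not state.

```ruby
# Hypothetical helper, not part of the commit: turns the gap between two
# log scores into a probability using the logistic function.
def spam_probability(spam_score, ham_score)
  diff = spam_score - ham_score
  if diff.abs > 700
    # Same guard as in the commit: saturate instead of calling Math.exp
    # on a very large argument.
    diff > 0 ? 1.0 : 0.0
  else
    1.0 / (1.0 + Math.exp(-diff))
  end
end

# Hypothetical wrapper: the default of 0.95 mirrors the value the commit
# adds to config/application.rb.
def spam?(spam_score, ham_score, threshold: 0.95)
  spam_probability(spam_score, ham_score) >= threshold
end

puts spam?(-10.0, -11.0) # small gap between scores, stays below 0.95
puts spam?(-10.0, -14.0) # larger gap, crosses 0.95
```

With equal scores the probability is exactly 0.5, so a message now needs a clear margin in favor of spam before it is flagged.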

config/application.rb

Lines changed: 2 additions & 0 deletions
@@ -29,5 +29,7 @@ class Application < Rails::Application
     config.spam_ban_threshold = 3
     # Delete the warning message in x minutes to keep chat clean
     config.delete_message_delay = 5
+    # Spam blocked probability threshold
+    config.probability_threshold = 0.95
   end
 end
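
To get a feel for what `0.95` means in terms of the raw scores, a probability threshold `t` on a logistic output is equivalent to requiring a score gap of at least `log(t / (1 - t))`. The sketch below computes that margin. The helper name `required_log_odds` is hypothetical, and the reading of `spam_score - ham_score` as log-odds is an assumption about the classifier.

```ruby
# Hypothetical helper, not part of the commit: the smallest value of
# (spam_score - ham_score) for which the logistic probability reaches
# the given threshold (0 < threshold < 1).
def required_log_odds(threshold)
  Math.log(threshold / (1.0 - threshold))
end

puts required_log_odds(0.5)  # the old rule's boundary: no margin required
puts required_log_odds(0.95) # log(19), about 2.944
```

Under that assumption, the old rule `spam_score > ham_score` corresponds to a threshold of 0.5, and the new default asks for a gap of roughly 2.944 before a message is marked as spam.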
