Commit 7f3d3da

[scripts] Fixed the possible zero discounting constant issue in make_kn_lm.py (#4687)
1 parent 4609ea1 commit 7f3d3da

egs/wsj/s5/utils/lang/make_kn_lm.py

Lines changed: 3 additions & 1 deletion
@@ -165,7 +165,9 @@ def cal_discounting_constants(self):
                 n1 += stat[1]
                 n2 += stat[2]
             assert n1 + 2 * n2 > 0
-            self.d.append(n1 * 1.0 / (n1 + 2 * n2))
+            self.d.append(max(0.001, n1 * 1.0) / (n1 + 2 * n2))  # We are doing this max(0.001, xxx) to avoid zero discounting constant D due to n1=0,
+            # which could happen if the number of symbols is small.
+            # Otherwise, zero discounting constant can cause division by zero in computing BOW.
 
     def cal_f(self):
         # f(a_z) is a probability distribution of word sequence a_z.
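
The effect of the change can be checked in isolation. The following is a minimal sketch, not code from make_kn_lm.py: the helper names `unfloored_discounting_constant` and `floored_discounting_constant` and the parameter `numerator_floor` are hypothetical, and only the two formulas are taken from the diff. Note that the floor is applied to the numerator n1, so with n1 = 0 the resulting D is 0.001 / (2 * n2), not 0.001.

```python
def unfloored_discounting_constant(n1, n2):
    """Formula before the commit: D = n1 / (n1 + 2 * n2)."""
    denominator = n1 + 2 * n2
    # Mirrors the assert in the original method.
    assert denominator > 0
    return n1 * 1.0 / denominator


def floored_discounting_constant(n1, n2, numerator_floor=0.001):
    """Formula after the commit: D = max(0.001, n1) / (n1 + 2 * n2).

    `numerator_floor` is a hypothetical parameter; the commit hard-codes 0.001.
    """
    denominator = n1 + 2 * n2
    assert denominator > 0
    return max(numerator_floor, n1 * 1.0) / denominator


if __name__ == "__main__":
    # n1 = 0: no N-gram of this order was seen exactly once.
    print(unfloored_discounting_constant(0, 5))  # zero discounting constant
    print(floored_discounting_constant(0, 5))    # small but strictly positive
    # n1 >= 1: the floor has no effect, both formulas agree.
    print(unfloored_discounting_constant(3, 1))
    print(floored_discounting_constant(3, 1))
```

For any n1 >= 1 the two functions return the same value, so the commit only changes behavior in the n1 = 0 case that its comment describes.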
