Commit c904322

Merge pull request #11 from jason-liew/update-decision-tree-cal-h-d
Move calc_H_D out of the for loop
2 parents 158a9ca + 42a9943 commit c904322

1 file changed: +3 -2 lines changed

DecisionTree/DecisionTree.py

Lines changed: 3 additions & 2 deletions
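@@ -142,11 +142,12 @@ def calcBestFeature(trainDataList, trainLabelList):
     maxG_D_A = -1
     # Initialize the feature with the maximum information gain
     maxFeature = -1
+    # 1. Compute the empirical entropy H(D) of dataset D
+    H_D = calc_H_D(trainLabelArr)
     # Iterate over every feature
     for feature in range(featureNum):
         # Step 1 of "Algorithm 5.1 (information gain algorithm)" in "5.2.2 Information gain":
-        # 1. Compute the empirical entropy H(D) of dataset D
-        H_D = calc_H_D(trainLabelArr)
+
         # 2. Compute the conditional empirical entropy H(D|A)
         # Computing the conditional empirical entropy involves only the labels and the current
         # feature, so the dataset matrix is sliced to speed things up (operating on a matrix of
         # all samples is too slow, so the unneeded parts are stripped out)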
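
Why the hoist is safe: H(D) depends only on the labels, not on the feature being evaluated, so it is loop-invariant, and the information gain g(D, A) = H(D) - H(D|A) can reuse a single H(D) across all features. The sketch below illustrates this under stated assumptions. It is not the repository's code: `empirical_entropy`, `conditional_entropy`, and `best_feature` are hypothetical names, and NumPy plus integer-coded features are assumed.

```python
import numpy as np


def empirical_entropy(labels):
    """H(D) = -sum_k p_k * log2(p_k), computed from the labels alone."""
    _, counts = np.unique(labels, return_counts=True)
    probs = counts / counts.sum()
    return float(-np.sum(probs * np.log2(probs)))


def conditional_entropy(feature_column, labels):
    """H(D|A) = sum_v |D_v| / |D| * H(D_v) for a single feature column A."""
    total = len(labels)
    h = 0.0
    for value in np.unique(feature_column):
        mask = feature_column == value
        h += mask.sum() / total * empirical_entropy(labels[mask])
    return h


def best_feature(data, labels):
    """Return (index, gain) of the feature with the largest information gain."""
    data = np.asarray(data)
    labels = np.asarray(labels)
    # Loop-invariant: H(D) does not depend on the feature, so compute it once.
    h_d = empirical_entropy(labels)
    best_index, best_gain = -1, -1.0
    for index in range(data.shape[1]):
        gain = h_d - conditional_entropy(data[:, index], labels)
        if gain > best_gain:
            best_index, best_gain = index, gain
    return best_index, best_gain
```

With F features, computing H(D) inside the loop repeats the same label-only calculation F times. Moving it out leaves the result unchanged and performs it once.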

0 commit comments