Commit 5402416
fix: remove double-scaling of factorWeight in FactorTransferDistillationStrategy
The standard distillation term was being scaled by (1 - _factorWeight) twice:
1. Once when computing softLoss (line 82)
2. Again when multiplying combinedLoss (line 92 for trueLabels case)
This double-scaling incorrectly reduced the soft component by (1-factorWeight)^2
instead of (1-factorWeight).
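
To make the size of the error concrete, here is a small Python sketch of the arithmetic only. It is not the project's C# code, and the values (a factor weight of 0.3 and a raw soft loss of 1.0) are hypothetical, chosen purely for illustration.

```python
# Hypothetical numbers, chosen only to illustrate the arithmetic.
factor_weight = 0.3
raw_soft_loss = 1.0

# Buggy path: the (1 - factor_weight) scale is applied at both steps.
scaled_once = (1.0 - factor_weight) * raw_soft_loss
scaled_twice = (1.0 - factor_weight) * scaled_once

# Intended path: the scale is applied exactly once.
intended = (1.0 - factor_weight) * raw_soft_loss

print(f"buggy: {scaled_twice:.2f}, intended: {intended:.2f}")
```

With these values the soft component shrinks to 0.49 of its raw size instead of 0.70, so the standard distillation term is under-weighted relative to the factor transfer term.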
Fixed by:
- Removing factorWeight from initial softLoss scaling (line 82)
- Computing finalLoss (either combinedLoss or softLoss)
- Applying (1.0 - _factorWeight) scaling exactly once at the end
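
The corrected control flow can be sketched roughly as follows. This is a Python approximation under stated assumptions, not the committed C# code: the function name, the `alpha` blend weight, and the exact hard/soft blending formula are hypothetical, since the diff content is not reproduced here. Only the "scale once at the end" structure comes from the commit message.

```python
def distillation_loss(soft_loss, factor_weight, hard_loss=None, alpha=0.5):
    """Standard-distillation term, scaled by (1 - factor_weight) exactly once.

    soft_loss:  unscaled soft (teacher/student) loss
    hard_loss:  loss against true labels, or None when no labels are given
    alpha:      hypothetical blend weight between hard and soft loss
    """
    if hard_loss is not None:
        # trueLabels case: blend hard and soft loss, with no factor weight yet.
        final_loss = alpha * hard_loss + (1.0 - alpha) * soft_loss
    else:
        # No labels: the soft loss is used as-is.
        final_loss = soft_loss

    # The single place where the factor weight is applied.
    return (1.0 - factor_weight) * final_loss
```

Because the scaling lives in one place, both branches are weighted identically and a second scaling cannot be introduced in only one of them.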
Same fix applied to ComputeGradient method:
- Removed factorWeight from soft gradient scaling (line 111)
- Removed factorWeight from hard gradient blending (lines 123-124)
- Applied (1.0 - _factorWeight) scaling exactly once at the end
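
The gradient path follows the same pattern element-wise. The Python sketch below is again an illustration under assumptions rather than the actual ComputeGradient body: the function name, the `alpha` blend weight, and the list-based gradients are hypothetical.

```python
def distillation_gradient(soft_grad, factor_weight, hard_grad=None, alpha=0.5):
    """Gradient of the standard-distillation term, scaled exactly once.

    soft_grad / hard_grad are equal-length sequences of floats;
    alpha is a hypothetical blend weight between hard and soft gradients.
    """
    if hard_grad is not None:
        # Blend hard and soft gradients without touching the factor weight.
        blended = [alpha * h + (1.0 - alpha) * s
                   for h, s in zip(hard_grad, soft_grad)]
    else:
        blended = list(soft_grad)

    # Single scaling pass, matching the scaling applied to the loss.
    scale = 1.0 - factor_weight
    return [scale * g for g in blended]
```

Keeping the gradient's scale factor identical to the loss's scale factor matters: if the two diverge, the optimizer follows a gradient that does not correspond to the reported loss.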
Now factorWeight correctly balances:
- (1 - factorWeight) * standard_distillation
- factorWeight * factor_transfer
1 parent 6f8b5a3 commit 5402416
1 file changed: 19 additions & 8 deletions
- src/KnowledgeDistillation/Strategies
| Hunk | Original file lines | New file lines | Removed (original numbering) | Added (new numbering) |
|---|---|---|---|---|
| 1 | 75-98 | 75-103 | 78, 82, 89, 92, 95 | 78, 82, 84, 90, 93-96, 99-100 |
| 2 | 108-114 | 113-119 | 111 | 116 |
| 3 | 120-130 | 125-141 | 123, 124 | 128, 129, 133-138 |