Commit 606ca4e
Option to return hard and soft loss when using distillation (#895)
## Summary
Proposition to add `return_soft_hard_loss` parameter to enable logging
of soft and hard losses separately. Useful for monitoring and analysis
during training
## Testing Done
- [x] test_jsd_loss.py
- [x] test_cosine_loss.py
---------
Co-authored-by: Shao Tang <tangshao28@gmail.com>1 parent d5648bf commit 606ca4e
File tree
3 files changed
+44
-11
lines changed- src/liger_kernel/chunked_loss
3 files changed
+44
-11
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
1 | 4 | | |
2 | 5 | | |
3 | 6 | | |
| |||
41 | 44 | | |
42 | 45 | | |
43 | 46 | | |
44 | | - | |
| 47 | + | |
| 48 | + | |
45 | 49 | | |
46 | 50 | | |
47 | 51 | | |
| |||
59 | 63 | | |
60 | 64 | | |
61 | 65 | | |
| 66 | + | |
62 | 67 | | |
63 | 68 | | |
64 | 69 | | |
65 | | - | |
66 | | - | |
| 70 | + | |
| 71 | + | |
67 | 72 | | |
68 | 73 | | |
69 | 74 | | |
| |||
75 | 80 | | |
76 | 81 | | |
77 | 82 | | |
| 83 | + | |
78 | 84 | | |
79 | 85 | | |
80 | 86 | | |
| |||
88 | 94 | | |
89 | 95 | | |
90 | 96 | | |
| 97 | + | |
91 | 98 | | |
92 | 99 | | |
93 | 100 | | |
| |||
98 | 105 | | |
99 | 106 | | |
100 | 107 | | |
| 108 | + | |
101 | 109 | | |
102 | 110 | | |
103 | 111 | | |
| |||
108 | 116 | | |
109 | 117 | | |
110 | 118 | | |
111 | | - | |
| 119 | + | |
112 | 120 | | |
113 | 121 | | |
114 | 122 | | |
| |||
124 | 132 | | |
125 | 133 | | |
126 | 134 | | |
| 135 | + | |
127 | 136 | | |
Lines changed: 13 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
| 3 | + | |
| 4 | + | |
3 | 5 | | |
4 | 6 | | |
5 | 7 | | |
| |||
157 | 159 | | |
158 | 160 | | |
159 | 161 | | |
| 162 | + | |
160 | 163 | | |
161 | | - | |
| 164 | + | |
162 | 165 | | |
163 | 166 | | |
164 | 167 | | |
| |||
180 | 183 | | |
181 | 184 | | |
182 | 185 | | |
| 186 | + | |
183 | 187 | | |
184 | 188 | | |
185 | 189 | | |
186 | 190 | | |
187 | 191 | | |
188 | 192 | | |
189 | 193 | | |
| 194 | + | |
| 195 | + | |
190 | 196 | | |
191 | 197 | | |
192 | 198 | | |
| |||
247 | 253 | | |
248 | 254 | | |
249 | 255 | | |
| 256 | + | |
| 257 | + | |
| 258 | + | |
250 | 259 | | |
251 | 260 | | |
252 | 261 | | |
| |||
268 | 277 | | |
269 | 278 | | |
270 | 279 | | |
| 280 | + | |
| 281 | + | |
271 | 282 | | |
272 | 283 | | |
273 | 284 | | |
274 | | - | |
| 285 | + | |
275 | 286 | | |
276 | 287 | | |
277 | 288 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
3 | 6 | | |
4 | 7 | | |
5 | 8 | | |
| |||
56 | 59 | | |
57 | 60 | | |
58 | 61 | | |
| 62 | + | |
59 | 63 | | |
60 | 64 | | |
61 | 65 | | |
| |||
72 | 76 | | |
73 | 77 | | |
74 | 78 | | |
| 79 | + | |
75 | 80 | | |
76 | | - | |
| 81 | + | |
77 | 82 | | |
78 | 83 | | |
79 | 84 | | |
| |||
92 | 97 | | |
93 | 98 | | |
94 | 99 | | |
| 100 | + | |
95 | 101 | | |
96 | 102 | | |
97 | 103 | | |
98 | | - | |
99 | | - | |
| 104 | + | |
| 105 | + | |
100 | 106 | | |
101 | 107 | | |
102 | 108 | | |
| |||
108 | 114 | | |
109 | 115 | | |
110 | 116 | | |
| 117 | + | |
111 | 118 | | |
112 | 119 | | |
113 | 120 | | |
| |||
125 | 132 | | |
126 | 133 | | |
127 | 134 | | |
| 135 | + | |
128 | 136 | | |
129 | 137 | | |
130 | 138 | | |
| |||
135 | 143 | | |
136 | 144 | | |
137 | 145 | | |
| 146 | + | |
138 | 147 | | |
139 | 148 | | |
140 | 149 | | |
| |||
145 | 154 | | |
146 | 155 | | |
147 | 156 | | |
| 157 | + | |
148 | 158 | | |
149 | 159 | | |
150 | 160 | | |
| |||
155 | 165 | | |
156 | 166 | | |
157 | 167 | | |
158 | | - | |
| 168 | + | |
159 | 169 | | |
160 | 170 | | |
161 | 171 | | |
| |||
167 | 177 | | |
168 | 178 | | |
169 | 179 | | |
170 | | - | |
| 180 | + | |
| 181 | + | |
| 182 | + | |
171 | 183 | | |
172 | 184 | | |
173 | 185 | | |
| |||
184 | 196 | | |
185 | 197 | | |
186 | 198 | | |
| 199 | + | |
187 | 200 | | |
0 commit comments