Commit 2b8020f
Creating medusa2.
It turns out that creating full weights for each of the lm_heads costs a huge
amount of VRAM (especially for multilingual models like Gemma) and is not
necessary at all to get good speculation.
This PR modifies the legacy code to create new medusa models without
duplicating this lm_head, making them much more efficient to run.
It also increments the version number of the config
so users can know how to actually run the model.

1 parent 5e98053 commit 2b8020f
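To make the VRAM argument concrete, here is a rough parameter-count sketch. It assumes each medusa head is a stack of residual blocks (one `hidden x hidden` linear layer with bias) optionally followed by its own `hidden x vocab` lm_head without bias. The function names and the sizes used (hidden size 3072, vocabulary 256,000, 4 heads) are illustrative assumptions, not values taken from this commit.

```python
def resblock_params(hidden: int) -> int:
    """Parameters of one residual block: a hidden x hidden weight plus a bias."""
    return hidden * hidden + hidden


def lm_head_params(hidden: int, vocab: int) -> int:
    """Parameters of one lm_head projection (hidden x vocab, no bias)."""
    return hidden * vocab


def medusa_extra_params(
    hidden: int,
    vocab: int,
    num_heads: int,
    num_layers: int = 1,
    duplicate_lm_head: bool = True,
) -> int:
    """Extra parameters added on top of the base model by the medusa heads.

    duplicate_lm_head=True models heads that each carry their own lm_head;
    duplicate_lm_head=False models heads that reuse the base model's lm_head.
    """
    per_head = num_layers * resblock_params(hidden)
    if duplicate_lm_head:
        per_head += lm_head_params(hidden, vocab)
    return num_heads * per_head


# Illustrative (assumed) sizes for a large-vocabulary model.
HIDDEN, VOCAB, HEADS = 3072, 256_000, 4

with_copy = medusa_extra_params(HIDDEN, VOCAB, HEADS, duplicate_lm_head=True)
shared = medusa_extra_params(HIDDEN, VOCAB, HEADS, duplicate_lm_head=False)

print(f"duplicated lm_head: {with_copy:,} extra parameters")
print(f"shared lm_head:     {shared:,} extra parameters")
print(f"ratio:              {with_copy / shared:.1f}x")
```

Under these assumed sizes, the duplicated layout adds about 3.18 billion parameters while the shared layout adds about 37.8 million, a difference of roughly 84 times. The gap grows with vocabulary size, which is why large multilingual vocabularies are hit hardest.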
File tree
- medusa
- model
- train

3 files changed: +27 -23 lines changed
Diff (line contents not preserved; per-file counts only):

| File | Lines added | Lines removed |
|---|---|---|
| 1 of 3 | 7 | 6 |
| 2 of 3 | 5 | 9 |
| 3 of 3 | 15 | 8 |