Commit 5df002d
[megatron] feat: Share actor and ref in LoRA
For `compute_ref_log_prob`, the actor can serve as the reference
model: we temporarily disable the LoRA layers for the forward pass,
since the base weights are frozen and only the LoRA layers are trained.
This is already supported in FSDP LoRA.
Signed-off-by: Hollow Man <hollowman@opensuse.org>
1 parent 5d0eac0 commit 5df002d
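The idea in the commit message can be illustrated with a minimal, framework-free sketch. It is not the code from this commit: `ToyLoRALinear` and `lora_disabled` are hypothetical names, and a scalar weight stands in for a real Megatron layer. It only shows why a forward pass with the adapters switched off gives the reference output when the base weights are frozen.

```python
from contextlib import contextmanager


class ToyLoRALinear:
    """Scalar stand-in for a linear layer with a LoRA adapter (hypothetical)."""

    def __init__(self, base_weight, lora_a, lora_b, scale=1.0):
        self.base_weight = base_weight  # frozen base weight, never updated
        self.lora_a = lora_a            # trainable LoRA factors
        self.lora_b = lora_b
        self.scale = scale
        self.lora_enabled = True

    def forward(self, x):
        out = self.base_weight * x
        if self.lora_enabled:
            # Low-rank update added on top of the frozen base output.
            out += self.scale * self.lora_b * self.lora_a * x
        return out


@contextmanager
def lora_disabled(layers):
    """Temporarily switch off the adapters; restore the previous state on exit."""
    previous = [layer.lora_enabled for layer in layers]
    for layer in layers:
        layer.lora_enabled = False
    try:
        yield
    finally:
        for layer, state in zip(layers, previous):
            layer.lora_enabled = state


layer = ToyLoRALinear(base_weight=2.0, lora_a=0.5, lora_b=3.0)
actor_out = layer.forward(1.0)      # base + LoRA: the actor's output
with lora_disabled([layer]):
    ref_out = layer.forward(1.0)    # base only: the reference output
restored_out = layer.forward(1.0)   # adapters are active again
```

Restoring the flags in a `finally` clause keeps the actor usable even if the reference forward pass raises, which matters when one set of weights plays both roles.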
File tree
5 files changed, +41 -46 lines changed
- recipe
  - fully_async_policy
  - one_step_off_policy
  - transfer_queue
- verl
  - trainer/ppo
  - workers