Commit 114151a
[SAC] Centralize selective AC policy and remove per-model op save lists (#2357)
### Summary
- Remove layer-frequency selective activation checkpointing
(`selective_ac_option` and `_layer_sac_count`) — per-op SAC is now the
only selective mode
- Centralize the op save list into
`default_activation_checkpoint_policy()` in `activation_checkpoint.py`,
removing duplicated `_op_sac_save_list` sets from per-model
`parallelize.py` files (llama3, llama4, deepseek_v3, qwen3, gpt_oss,
graph_trainer)
- Remove the `op_sac_save_list` parameter from `apply_ac` — models no
longer need to pass their own op sets
- Build the centralized policy from `get_default_op_list()` (upstream
PyTorch) plus explicit compute ops (SDPA, FlexAttention, inductor,
varlen_attn) and communication ops (reduce_scatter, all_to_all, deepep,
hybridep), with conditional resolution for optional dependencies
- Use `@lru_cache` with `cache_hash` on the policy factory to avoid
dynamo recompilation and to stay compatible with the AOTAutograd cache
- Add `--activation_checkpoint.mode full` to PP integration tests
(`InterleavedZeroBubble`, `ZBVZeroBubble`, `PipelineScheduleMulti`)
since they relied on layer_sac
- Clean up deepep imports: they now come from
`torchtitan.distributed.deepep.deepep` or
`torchtitan.distributed.deepep.hybridep`, keeping the two symmetrical.
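
To illustrate the centralization idea described above, here is a minimal, torch-free sketch of a cached policy factory that resolves optional ops conditionally. It is not the code in `activation_checkpoint.py`: the names `Decision`, `resolve_optional`, and `default_policy_sketch` are hypothetical, `math` functions stand in for real ops, and the `cache_hash` part is not modelled.

```python
import importlib
from enum import Enum
from functools import lru_cache


class Decision(Enum):
    # Hypothetical stand-in for a save/recompute checkpoint decision.
    SAVE = "save"
    RECOMPUTE = "recompute"


def resolve_optional(qualified_names):
    """Import each "module:attr" name, skipping ones that are unavailable."""
    resolved = []
    for name in qualified_names:
        module_name, _, attr = name.partition(":")
        try:
            module = importlib.import_module(module_name)
            resolved.append(getattr(module, attr))
        except (ImportError, AttributeError):
            # Optional dependency is missing: leave it out of the save list.
            continue
    return resolved


@lru_cache(maxsize=None)
def default_policy_sketch():
    """Build the single shared policy once; later calls return the same object."""
    save_ops = frozenset(
        resolve_optional(
            (
                "math:sqrt",  # stands in for an explicit compute op
                "math:fsum",  # stands in for a communication op
                "titan_sketch_missing_dep:dispatch",  # absent optional dependency
            )
        )
    )

    def policy(op):
        return Decision.SAVE if op in save_ops else Decision.RECOMPUTE

    return policy
```

Because the factory takes no arguments and is cached, every caller receives the same policy object, so no model needs to pass its own op set.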
### Test
Added `test_force_recompute_mm_fqns`: verifies that
`per_op_sac_force_recompute_mm_shapes_by_fqns` controls exactly which
matmuls are recomputed vs stored during backward. Uses a
TorchDispatchMode tracker to count aten.mm calls per weight tensor.

1 parent 1f02964 commit 114151a
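
The tracking approach named in the Test section can be sketched roughly as follows. This is not the test added in the commit: the class name `MmCounter` is hypothetical, it keys counts by the shape of the second operand rather than by weight tensor, and it only exercises a plain forward pass without activation checkpointing.

```python
from collections import Counter

import torch
from torch.utils._python_dispatch import TorchDispatchMode


class MmCounter(TorchDispatchMode):
    """Count aten.mm calls, keyed by the shape of the second operand."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def __torch_dispatch__(self, func, types, args=(), kwargs=None):
        if func == torch.ops.aten.mm.default:
            # Simplification: use the operand shape as a proxy for "which weight".
            self.counts[tuple(args[1].shape)] += 1
        # Always run the real op so results are unchanged.
        return func(*args, **(kwargs or {}))


a = torch.randn(4, 8)
w = torch.randn(8, 16)
with MmCounter() as counter:
    torch.mm(a, w)
    torch.mm(a, w)
```

After the `with` block, `counter.counts` maps each operand shape to the number of `aten.mm` calls seen. A real test would compare such counts between forward and backward to tell recomputed matmuls from stored ones.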
File tree
27 files changed, +158 -324 lines changed
- docs
- tests
- integration_tests
- unit_tests
- torchtitan
- config
- distributed
- deepep
- experiments
- autoparallel
- deepseek_v3
- llama3
- local_map_deepseek_v3
- ft/llama3
- transformers_modeling_backend
- vlm
- infra
- models
- common/moe
- deepseek_v3
- gpt_oss
- llama3
- llama4
- qwen3