Commit 791373f
Enable logging for the plan() function, ShardEstimators and TrainingPipeline class constructors (#3565)
Summary:
Pull Request resolved: #3565
This diff enables the static logging functionality to collect data for:
plan() - This will allow us to log the inputs and outputs to the planner to help with use issue debugging
ShardEstimators - This will allow us to log the inputs and outputs to the ShardEstimators, which gives us the bandwidth inputs to verify if the planner is generating expected values as well as help with debugging OOMs
TrainingPipeline - The class type here will be an indicator of which pipeline was used by the training job. The training pipeline has implications on the memory usage and is an important data point to collect to investigate OOMs.
Reviewed By: kausv, nimaelyasi
Differential Revision: D87488015
fbshipit-source-id: db8754e5e4c7b5e5a2b2e6bf7c6a4e0c6171cf711 parent 0ce549d commit 791373f
File tree
6 files changed
+16
-0
lines changed- torchrec
- distributed
- planner
- train_pipeline
- modules
6 files changed
+16
-0
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
55 | 55 | | |
56 | 56 | | |
57 | 57 | | |
| 58 | + | |
58 | 59 | | |
59 | 60 | | |
60 | 61 | | |
| |||
466 | 467 | | |
467 | 468 | | |
468 | 469 | | |
| 470 | + | |
469 | 471 | | |
470 | 472 | | |
471 | 473 | | |
| |||
2021 | 2023 | | |
2022 | 2024 | | |
2023 | 2025 | | |
| 2026 | + | |
2024 | 2027 | | |
2025 | 2028 | | |
2026 | 2029 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
18 | 18 | | |
19 | 19 | | |
20 | 20 | | |
| 21 | + | |
21 | 22 | | |
22 | 23 | | |
23 | 24 | | |
| |||
498 | 499 | | |
499 | 500 | | |
500 | 501 | | |
| 502 | + | |
501 | 503 | | |
502 | 504 | | |
503 | 505 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
16 | 16 | | |
17 | 17 | | |
18 | 18 | | |
| 19 | + | |
19 | 20 | | |
20 | 21 | | |
21 | 22 | | |
| |||
955 | 956 | | |
956 | 957 | | |
957 | 958 | | |
| 959 | + | |
958 | 960 | | |
959 | 961 | | |
960 | 962 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
15 | 15 | | |
16 | 16 | | |
17 | 17 | | |
| 18 | + | |
18 | 19 | | |
19 | 20 | | |
20 | 21 | | |
| |||
146 | 147 | | |
147 | 148 | | |
148 | 149 | | |
| 150 | + | |
149 | 151 | | |
150 | 152 | | |
151 | 153 | | |
| |||
194 | 196 | | |
195 | 197 | | |
196 | 198 | | |
| 199 | + | |
197 | 200 | | |
198 | 201 | | |
199 | 202 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
32 | 32 | | |
33 | 33 | | |
34 | 34 | | |
| 35 | + | |
35 | 36 | | |
36 | 37 | | |
37 | 38 | | |
| |||
106 | 107 | | |
107 | 108 | | |
108 | 109 | | |
| 110 | + | |
| 111 | + | |
109 | 112 | | |
110 | 113 | | |
111 | 114 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
12 | 12 | | |
13 | 13 | | |
14 | 14 | | |
| 15 | + | |
15 | 16 | | |
16 | 17 | | |
17 | 18 | | |
| |||
125 | 126 | | |
126 | 127 | | |
127 | 128 | | |
| 129 | + | |
128 | 130 | | |
129 | 131 | | |
130 | 132 | | |
| |||
164 | 166 | | |
165 | 167 | | |
166 | 168 | | |
| 169 | + | |
167 | 170 | | |
168 | 171 | | |
169 | 172 | | |
| |||
0 commit comments