Commit c7fe14b
Fix ModelCheckpoint file_exists OOM in DDP (Lightning-AI#21380)
* Fix ModelCheckpoint.file_exists OOM in DDP
* Document ModelCheckpoint.file_exists DDP memory fix
* Update src/lightning/pytorch/callbacks/model_checkpoint.py
---------
Co-authored-by: Justus Schock <[email protected]>
1 parent ca73908 commit c7fe14b
File tree
3 files changed: +33 -3 lines changed
- src/lightning/pytorch
  - callbacks
- tests/tests_pytorch/checkpointing

| Hunk | Original file lines | Diff lines | Diff line change |
|---|---|---|---|
| 1 | 82-87 | 82-90 | 3 lines added (diff lines 85-87) |
| 2 | 1001-1010 | 1001-1012 | 3 lines removed (original lines 1004, 1006-1007); 5 lines added (diff lines 1004, 1006-1009) |
| 3 | 121-123 | 121-148 | 25 additions & 0 deletions (diff lines 124-148) |