Commit 1c20b38
Fix ddp_notebook CUDA fork check to allow passive initialization
The previous implementation used torch.cuda.is_initialized() which returns
True even when CUDA is passively initialized (e.g., during library imports
or device availability checks). This caused false positives in environments
like Kaggle notebooks where libraries may query CUDA without creating a
context.
This fix uses PyTorch's internal torch.cuda._is_in_bad_fork() function,
which more accurately detects when we're in an actual bad fork state (i.e.,
CUDA was initialized with a context and then the process was forked).
The change allows passive CUDA initialization while still catching genuinely
problematic cases. It falls back to the old check on older PyTorch versions
that don't have _is_in_bad_fork.
Fixes #213891
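Below is a minimal sketch of the check described above, assuming a hypothetical helper name `_check_bad_cuda_fork` and error message; the actual launcher code in Lightning differs.

```python
import torch


def _check_bad_cuda_fork() -> None:
    """Raise if CUDA had an active context before this process was forked.

    Prefers the internal ``torch.cuda._is_in_bad_fork()``, which ignores passive
    initialization (e.g. ``torch.cuda.is_available()`` queries during imports).
    Falls back to ``torch.cuda.is_initialized()`` on older PyTorch versions that
    do not expose ``_is_in_bad_fork``; that fallback may still report false
    positives after passive initialization.
    """
    if hasattr(torch.cuda, "_is_in_bad_fork"):
        in_bad_fork = torch.cuda._is_in_bad_fork()
    else:
        # Conservative fallback for older PyTorch (assumed behavior).
        in_bad_fork = torch.cuda.is_initialized()

    if in_bad_fork:
        raise RuntimeError(
            "CUDA was initialized before the notebook launcher forked worker "
            "processes. Restart the kernel or use a non-forking strategy."
        )
```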
File tree: 1 file changed, 23 additions and 11 deletions, in src/lightning/fabric/strategies/launchers
[Diff hunk: original lines 198-208 replaced by new lines 198-220; code content not preserved in this capture]