Commit 5d6ce21
[CELEBORN-1983][FOLLOWUP] Fix fetch fail not throw due to reach spark maxTaskFailures
### What changes were proposed in this pull request?
Fix the fetch failure not being thrown when the current task attempt has reached Spark's `maxTaskFailures` limit.
### Why are the changes needed?
The check `ti.attemptNumber() >= maxTaskFails - 1` may never be reached. Suppose `taskAttempts` contains index0, index1, index2, and index3, where index0 and index1 have already failed and index2 and index3 are still running. If the attempt that calls `reportFetchFailed` is index3, the loop can return before it reaches index3, so the final result is false although the expected result is true.
Therefore, the `attemptNumber` of the current task should be checked separately, before the loop starts.
<img width="3558" height="608" alt="image" src="https://github.com/user-attachments/assets/2a0af3e7-912e-420e-a864-4c525d07e251" />
<img width="2332" height="814" alt="image" src="https://github.com/user-attachments/assets/bf832091-56d5-41b8-b58a-502e409d67a8" />
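The scenario above can be reproduced with a small, simplified model. This is only a sketch: `Attempt`, `shouldReportInsideLoop`, and `shouldReportCheckedFirst` are hypothetical names introduced for illustration, the `maxTaskFails` value of 4 is an assumption, and the real Celeborn method contains additional conditions that are omitted here.

```java
import java.util.Arrays;
import java.util.List;

public class FetchFailureCheckSketch {

  /** Hypothetical stand-in for Spark's TaskInfo; only the fields this sketch needs. */
  static final class Attempt {
    final int attemptNumber;
    final boolean running;

    Attempt(int attemptNumber, boolean running) {
      this.attemptNumber = attemptNumber;
      this.running = running;
    }
  }

  /** Old shape: the max-failures check lives inside the loop, so an earlier running attempt returns first. */
  static boolean shouldReportInsideLoop(List<Attempt> attempts, int current, int maxTaskFails) {
    for (Attempt ti : attempts) {
      if (ti.attemptNumber == current) {
        if (ti.attemptNumber >= maxTaskFails - 1) {
          return true;
        }
      } else if (ti.running) {
        // Another attempt is still running, so the failure is not reported.
        return false;
      }
    }
    return true;
  }

  /** Fixed shape: the current attempt is checked once, before the loop starts. */
  static boolean shouldReportCheckedFirst(List<Attempt> attempts, int current, int maxTaskFails) {
    if (current >= maxTaskFails - 1) {
      return true;
    }
    for (Attempt ti : attempts) {
      if (ti.attemptNumber != current && ti.running) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // index0 and index1 have failed; index2 and index3 are running.
    List<Attempt> attempts = Arrays.asList(
        new Attempt(0, false), new Attempt(1, false),
        new Attempt(2, true), new Attempt(3, true));
    int maxTaskFails = 4; // assumed value for this example

    System.out.println(shouldReportInsideLoop(attempts, 3, maxTaskFails));   // false
    System.out.println(shouldReportCheckedFirst(attempts, 3, maxTaskFails)); // true
  }
}
```

In the old shape the loop hits the running index2 and returns before index3 is ever examined; moving the check ahead of the loop makes the result independent of iteration order.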
### Does this PR resolve a correctness bug?
No.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Existing UTs.
Closes #3531 from leixm/follow_CELEBORN-1983.
Authored-by: Xianming Lei <[email protected]>
Signed-off-by: SteNicholas <[email protected]>
1 parent cc0d1ba commit 5d6ce21
File tree
2 files changed (+66, -20)
- client-spark
  - spark-2/src/main/java/org/apache/spark/shuffle/celeborn
  - spark-3/src/main/java/org/apache/spark/shuffle/celeborn
Lines changed: 33 additions & 10 deletions
Changed line ranges (original → new):
- 335-350 → 335-353 (4 additions, 1 deletion)
- 362-383 → 365-406 (29 additions, 9 deletions)
Lines changed: 33 additions & 10 deletions
Changed line ranges (original → new):
- 450-465 → 450-468 (4 additions, 1 deletion)
- 477-498 → 480-521 (29 additions, 9 deletions)