Commit 6fc5565
[CELEBORN-2291] Support fsync on commit to ensure shuffle data durability
### What changes were proposed in this pull request?
Add a new configuration `celeborn.worker.commitFiles.fsync` (default `false`). When enabled, the worker calls `FileChannel.force(false)` (fdatasync) before closing the channel in `LocalTierWriter.closeStreams()`.
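
For illustration only, the force-before-close pattern described above can be sketched in plain Java NIO. This is not the code from this PR: the class `FsyncOnCommitSketch`, the method `writeAndClose`, and the `fsyncOnCommit` parameter are hypothetical names standing in for `LocalTierWriter.closeStreams()` and the `celeborn.worker.commitFiles.fsync` setting.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch; not Celeborn's LocalTierWriter.
class FsyncOnCommitSketch {

    // Writes data to file, optionally forces it to the storage device,
    // then closes the channel. Returns the resulting file size in bytes.
    static long writeAndClose(Path file, byte[] data, boolean fsyncOnCommit)
            throws IOException {
        FileChannel channel = FileChannel.open(
                file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING);
        try {
            ByteBuffer buffer = ByteBuffer.wrap(data);
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            if (fsyncOnCommit) {
                // force(false) flushes file content but not necessarily
                // all metadata; force(true) would flush metadata as well.
                channel.force(false);
            }
        } finally {
            channel.close();
        }
        return Files.size(file);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("fsync-sketch", ".data");
        byte[] payload = "hello".getBytes(StandardCharsets.UTF_8);
        long size = writeAndClose(file, payload, true);
        System.out.println("committed " + size + " bytes");
        Files.deleteIfExists(file);
    }
}
```

The force call has to happen before `close()`, because a closed channel can no longer be forced. Passing `false` skips the extra metadata flush, which is usually the cheaper of the two options.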
### Why are the changes needed?
Without this, committed shuffle data can sit in the OS page cache until the kernel flushes it to disk. A hard crash in that window can lose data even though Celeborn considers it committed. This option lets operators opt into stronger durability guarantees.
### Does this PR resolve a correctness bug?
No. It adds an optional durability enhancement.
### Does this PR introduce _any_ user-facing change?
Yes. New configuration key `celeborn.worker.commitFiles.fsync` (boolean, default `false`).
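
As a rough illustration, assuming the worker reads its settings from a properties-style file such as `conf/celeborn-defaults.conf` (the file name is an assumption here, not something this PR states), opting in might look like:

```properties
# Force shuffle file content to disk before the channel is closed on commit.
# Defaults to false.
celeborn.worker.commitFiles.fsync true
```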
### How was this patch tested?
Ran the existing unit tests. Verified the configuration via `ConfigurationSuite`, added a new `LocalTierWriter` test with fsync enabled, and ran `TierWriterSuite`.
Additional context: [slack](https://apachecelebor-kw08030.slack.com/archives/C04B1FYS6SY/p1774259245973229)
Closes #3635 from kaybhutani/kartikay/fsync-on-commit.
Authored-by: Kartikay Bhutani <kbhutani0001@gmail.com>
Signed-off-by: 子懿 <ziyi.jxf@antgroup.com>

1 parent 3773c65 commit 6fc5565
File tree
4 files changed, +43 -3 lines changed
- common/src/main/scala/org/apache/celeborn/common
- docs/configuration
- worker/src
  - main/scala/org/apache/celeborn/service/deploy/worker/storage
  - test/scala/org/apache/celeborn/service/deploy/worker/storage

Lines changed: 11 additions & 0 deletions
Diff body not preserved; additions at new lines 862 and 3774-3783.

Diff body not preserved; addition at new line 66.

Lines changed: 10 additions & 1 deletion
Diff body not preserved; addition at new line 421, and old line 461 replaced by new lines 462-470.

Lines changed: 21 additions & 2 deletions
Diff body not preserved; old lines 178-179 replaced by new lines 178-180, and additions at new lines 318-335.