Description
Thanks for creating dSQ!
If I understand correctly, when dSQ saves a job status entry it assumes that appending to a file is atomic. However, this is not true on NFS filesystems. Although unlikely, it is theoretically possible to corrupt a status file if several tasks append to it at once without additional locking (e.g. flock/fcntl). I just want to raise this potential issue since Grace now uses NFSv3 for home folders and scratch.
https://www.man7.org/linux/man-pages/man2/open.2.html
O_APPEND may lead to corrupted files on NFS filesystems if more than one process appends data to a file at once. This is because NFS does not support appending to a file, so the client kernel has to simulate it, which can't be done without a race condition.
Lines 121 to 129 in 2d37334
with open(
    path.join(args.status_dir[0], "job_{}_status.tsv".format(jid)), "a"
) as out_status:
    print(
        "{Array_Task_ID}\t{Exit_Code}\t{Hostname}\t{T_Start}\t{T_End}\t{T_Elapsed:.02f}\t{Task}".format(
            **out_dict
        ),
        file=out_status,
    )
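
For illustration, here is a minimal sketch of what the extra locking could look like. This is not dSQ's code: `append_status_line` is a hypothetical helper, and it assumes that POSIX byte-range locks (`fcntl.lockf`) are honored on the cluster's NFS mounts, which depends on the NFS lock manager being available on both client and server. The idea is to take an exclusive lock on the status file before writing, push the line out to storage, and only then release the lock.

```python
import fcntl
import os


def append_status_line(status_path, line):
    """Append one line to status_path while holding an exclusive lock.

    Hypothetical helper, not part of dSQ. Assumes fcntl-style locks
    work on the filesystem that holds status_path.
    """
    with open(status_path, "a") as out_status:
        # Block until no other process holds a lock on this file.
        fcntl.lockf(out_status, fcntl.LOCK_EX)
        try:
            out_status.write(line + "\n")
            # Flush Python's buffer and ask the OS to commit the data
            # before the lock is released.
            out_status.flush()
            os.fsync(out_status.fileno())
        finally:
            fcntl.lockf(out_status, fcntl.LOCK_UN)
```

With something like this, the `print(..., file=out_status)` call above could be replaced by `append_status_line(status_file, formatted_line)`, where both argument names are placeholders. Writing one status file per task and merging them afterwards would be an alternative that avoids locking entirely.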