Duplicate mmap log file at the beginning of darshan_core_shutdown #1053
Changes from 4 commits
dad0e9b
e081fe5
99f568b
7af3cb4
14a755c
6a5c60f
56f0c82
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
|
|
@@ -433,6 +433,10 @@ void darshan_core_shutdown(int write_log) | |||||||
| darshan_record_id *mod_shared_recs = NULL; | ||||||||
| int shared_rec_cnt = 0; | ||||||||
| #endif | ||||||||
| #ifdef __DARSHAN_ENABLE_MMAP_LOGS | ||||||||
| char dup_log_fame[__DARSHAN_PATH_MAX + 5]; | ||||||||
| dup_log_fame[0] = '\0'; | ||||||||
| #endif | ||||||||
|
|
||||||||
| /* disable darhan-core while we shutdown */ | ||||||||
| __DARSHAN_CORE_LOCK(); | ||||||||
|
|
@@ -470,13 +474,34 @@ void darshan_core_shutdown(int write_log) | |||||||
| internal_timing_flag = final_core->config.internal_timing_flag; | ||||||||
|
|
||||||||
| #ifdef __DARSHAN_ENABLE_MMAP_LOGS | ||||||||
| /* remove the temporary mmap log files */ | ||||||||
| /* NOTE: this unlink is not immediate as it must wait for the mapping | ||||||||
| * to no longer be referenced, which in our case happens when the | ||||||||
| * executable exits. If the application terminates mid-shutdown, then | ||||||||
| * there will be no mmap files and no final log file. | ||||||||
| /* Flush memory-mapped data to the underlying file and then duplicate the | ||||||||
| * mmap log file, in case an interrupt happens before the completion of | ||||||||
| * this subroutine, leaving the log file corrupted. See github issue #1052. | ||||||||
| */ | ||||||||
| unlink(final_core->mmap_log_name); | ||||||||
| msync(final_core->log_hdr_p, final_core->mmap_size, MS_SYNC); | ||||||||
|
|
||||||||
| /* Duplicate the log file, Below we on purpose ignore errors from open(), | ||||||||
| * lseek(), read(), write(), and close(), as they are not fatal and should | ||||||||
| * not affect the remaining tasks of this subroutine, i.e. log data file | ||||||||
| * processing. Note if any of these system calls failed, then the contents | ||||||||
| * of duplicated file will become useless, but such problem will be even | ||||||||
| * more serious than being unable to duplicate the file. | ||||||||
| */ | ||||||||
| int mmap_fd = open(final_core->mmap_log_name, O_RDONLY, 0644); | ||||||||
| if (mmap_fd != -1) { | ||||||||
| int dup_mmap_fd; | ||||||||
| void *buf = (void*) malloc(final_core->mmap_size); | ||||||||
| read(mmap_fd, buf, final_core->mmap_size); | ||||||||
| close(mmap_fd); | ||||||||
| snprintf(dup_log_fame, strlen(final_core->mmap_log_name), "%s.dup", | ||||||||
|
||||||||
| snprintf(core->mmap_log_name, __DARSHAN_PATH_MAX, | |
| "/%s/%s_%s_id%d_mmap-log-%" PRIu64 "-%d.darshan", | |
| core->config.mmap_log_path, cuser, __progname, jobid, logmod, my_rank); |
Where do you suggest inserting the substring "dup"?
There can be other things in /tmp with "darshan" in the name that aren't log files, though (I have a bunch for sure, though I'm an outlier because I've been experimenting with a lot of Darshan things :))
I'm not picky about where to put dup in the name (other than not at the end). Maybe just making the extension .dup.darshan?
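For illustration only (this is not the code in the PR), the `.dup.darshan` naming could be sketched as below. `make_dup_name` is a hypothetical helper, and the sketch assumes the mmap log name always ends in `.darshan`, as in the `snprintf` format string quoted above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of Darshan): derive "<base>.dup.darshan"
 * from "<base>.darshan". Returns 0 on success, -1 if the source name lacks
 * the expected suffix or the result does not fit in dst. */
static int make_dup_name(const char *src, char *dst, size_t dst_size)
{
    static const char suffix[] = ".darshan";
    size_t suffix_len = sizeof(suffix) - 1;
    size_t src_len;
    int n;

    assert(src != NULL && dst != NULL);
    src_len = strlen(src);

    /* Reject names that do not end in ".darshan". */
    if (src_len < suffix_len ||
        strcmp(src + src_len - suffix_len, suffix) != 0)
        return -1;

    /* Bound the write by the size of the destination buffer and treat
     * truncation as an error. */
    n = snprintf(dst, dst_size, "%.*s.dup%s",
                 (int)(src_len - suffix_len), src, suffix);
    if (n < 0 || (size_t)n >= dst_size)
        return -1;
    return 0;
}
```

Keeping `.darshan` as the final extension means tools that match on that extension would still pick up the duplicate, while `.dup` marks it as a copy.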
Updated. Please give it a try.
FYI, this block of code produces the following warning with gcc 14.2:
I ignored them on purpose: this code only duplicates the file, and I did not
want any error to stop the duplication.
Anyway, I added checks in 14a755c.
Strangely, I also run gcc 14.2.1, but did not get those warnings.
Please see if they are silenced.
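As a point of comparison, a file copy that checks every return value might look like the sketch below. It is not the code added in 14a755c; `copy_file` is a hypothetical helper, and it assumes a plain read/write loop is acceptable for duplicating the log. A failure is reported to the caller, who can ignore it and carry on with the rest of shutdown.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not part of Darshan): copy src to dst, checking every
 * system call. Returns 0 on success and -1 on any failure. */
static int copy_file(const char *src, const char *dst)
{
    char buf[65536];
    int in_fd, out_fd, rc = -1;
    ssize_t nread;

    assert(src != NULL && dst != NULL);

    in_fd = open(src, O_RDONLY);
    if (in_fd == -1)
        return -1;
    out_fd = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out_fd == -1) {
        close(in_fd);
        return -1;
    }

    for (;;) {
        char *p = buf;

        nread = read(in_fd, buf, sizeof(buf));
        if (nread == 0) {           /* end of file: copy is complete */
            rc = 0;
            break;
        }
        if (nread < 0) {
            if (errno == EINTR)     /* interrupted before any data: retry */
                continue;
            break;
        }
        while (nread > 0) {         /* write() may be partial, so loop */
            ssize_t nwritten = write(out_fd, p, (size_t)nread);
            if (nwritten < 0) {
                if (errno == EINTR)
                    continue;
                goto done;
            }
            p += nwritten;
            nread -= nwritten;
        }
    }

done:
    close(in_fd);
    if (close(out_fd) == -1)        /* a failed close can hide a write error */
        rc = -1;
    return rc;
}
```

Using the return values this way also tends to silence unused-result warnings, because each result feeds a branch instead of being discarded.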