Persistent Checkpointing (#2184) #3918
Closed
This is the first version of a PR that attempts to provide the functionality requested by issue #2184.
Both the replay and rerun commands now take --ignore-pcp to ignore any persistent checkpoints (PCPs), and I've made spawning from a PCP the default behavior of both commands.
Two commands have also been added to the spawned GDB: write-checkpoints and load-checkpoints.
The last point, about persistent checkpoints being created at record time, is not covered by this PR, but I'm willing to attempt to add that in a future PR, now that I have a little insight into how it would/could/maybe should work.
At this time, little to no optimization is performed. Each mapping in the process address space is serialized to disk, currently without any compression. Compressing the data that goes into anonymous mappings should be fairly simple to implement, since that data gets copied into memory when a PCP is restored anyway. File-backed mappings (executable data, for instance) cannot be compressed as easily. One wants to map as much file-backed data as possible, because mapped file contents are not necessarily committed to physical memory immediately, whereas data copied into a mapping is.
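To make the trade-off above concrete, here is a minimal Python sketch of the two restore paths. It is an illustration only, not the code in this PR: the function names, the use of zlib, and the file layout are all assumptions made for the example.

```python
import mmap
import zlib


def restore_anonymous(compressed: bytes, length: int) -> mmap.mmap:
    """Restore an anonymous mapping (hypothetical helper).

    The bytes have to be copied into memory anyway, so storing them
    compressed on disk only adds a decompression step before the copy.
    """
    data = zlib.decompress(compressed)
    region = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    region[0:len(data)] = data  # this copy commits the pages
    return region


def restore_file_backed(path: str, length: int) -> mmap.mmap:
    """Restore a file-backed mapping (hypothetical helper).

    The file is mapped privately (copy-on-write) and nothing is copied
    up front; pages are faulted in lazily on first access. This only
    works if the bytes on disk are the bytes to be mapped, which is why
    such data cannot simply be stored compressed.
    """
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), length, access=mmap.ACCESS_COPY)
```

In this sketch, writes to the region returned by `restore_file_backed` stay private to the process and do not modify the file on disk.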
Another possible optimization concerns how PCPs are restored. Instead of creating each checkpoint "from scratch", one could reconstitute the first one (at event N) and then, when reconstituting the following checkpoint, fork the first and make only the changes required for that one. As it stands right now, a new session is created for each checkpoint, which theoretically consumes more memory. I think forking checkpoint N+1 from N and changing the address space only where needed would mean that less memory is used.
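The fork-and-patch idea can be modelled roughly as follows. This is a toy Python model under stated assumptions, not how the PR restores checkpoints: an address space is represented as a dict from (start, end) ranges to contents, copying the dict stands in for fork(), and `diff` and `fork_and_patch` are hypothetical names. Overlapping or resized mappings are not handled.

```python
from typing import Dict, NamedTuple, Set, Tuple

Range = Tuple[int, int]            # (start, end) address of a mapping
AddressSpace = Dict[Range, bytes]  # mapping range -> serialized contents


class Delta(NamedTuple):
    unmap: Set[Range]    # mappings present at N but gone at N+1
    write: AddressSpace  # mappings that are new or changed at N+1


def diff(base: AddressSpace, target: AddressSpace) -> Delta:
    """Work out what must change to turn checkpoint N into N+1."""
    unmap = {r for r in base if r not in target}
    write = {r: data for r, data in target.items() if base.get(r) != data}
    return Delta(unmap, write)


def fork_and_patch(base: AddressSpace, delta: Delta) -> AddressSpace:
    """Build checkpoint N+1 from N instead of from scratch."""
    child = dict(base)  # stands in for fork(): untouched mappings stay shared
    for r in delta.unmap:
        del child[r]
    child.update(delta.write)
    return child
```

Only the mappings listed in the delta cost extra memory in this model; everything else is shared with checkpoint N, which is the saving the paragraph above speculates about.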
Also, if anybody has any ideas on how one could possibly write tests for something like this, they would be most welcome to share those thoughts with me.
This is a new PR that continues from #3406, which kept being reported as containing merge commits.