Description
Hi,
In the general case, inspect logs everything properly, so any interrupted evaluation can easily be resumed and finished with inspect eval-retry. However, I noticed that under some (unusual) circumstances the log files can become corrupted and unrecoverable. One example is a disk-full error: inspect may fail to write part of the log, which can leave a heavily corrupted (unrecoverable) log file, at least in the eval format, because the result is an invalid ZIP archive (any further attempt to process it raises a zip exception).
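For illustration, here is a minimal sketch of how one could detect such a broken log before trying to process it. It relies only on the observation above that an eval-format log is a ZIP archive, and uses nothing but Python's standard zipfile module. It is not part of inspect's API; the function name check_zip_log and the returned status strings are hypothetical.

```python
import zipfile
import zlib


def check_zip_log(path):
    """Classify a ZIP-based log file (hypothetical helper, not an inspect API).

    Returns "ok", "invalid" (not a readable ZIP archive), or
    "corrupt_member:<name>" when a member fails its CRC check.
    """
    # A truncated write usually loses the end-of-central-directory record,
    # in which case the file is no longer recognized as a ZIP archive.
    if not zipfile.is_zipfile(path):
        return "invalid"
    try:
        with zipfile.ZipFile(path) as zf:
            # testzip() reads every member and returns the first bad name,
            # or None if all CRCs and headers check out.
            bad_member = zf.testzip()
    except (zipfile.BadZipFile, zlib.error):
        return "invalid"
    if bad_member is None:
        return "ok"
    return f"corrupt_member:{bad_member}"
```

Running such a check over a log directory after a failure would at least tell you which logs can still be passed to a retry and which are lost.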
This could be a problem when running very long evaluation pipelines across hundreds of thousands of files, where one would expect inspect to be able to fully recover from its log file. I'm not sure whether this is a use case inspect intends to support, though.
Best,