Description
Today all the shard-level operations during snapshot deletion log exceptions on failure, but the deletion process continues regardless. This makes sense in the pre-7.6.0 repository format because the shard-level operations happen after updating the root `RepositoryData` blob, at which point the deletion cannot really fail. But since 7.6.0 we do the shard-level operations first, in order to obtain the names of all the new `BlobStoreIndexShardSnapshots` blobs. That means we could choose to be stricter and bail out, failing the deletion process before updating the root `RepositoryData` blob. IMO there's no great reason to be lenient here: a failure to update the shard-level metadata is surely serious enough to halt the process, and stopping on failure avoids bringing the repository into a state where the shard-level metadata is inconsistent with the root.
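To make the two options concrete, here is a minimal, self-contained sketch of the ordering used since 7.6.0 (shard-level updates first, root update last). It is not the actual `BlobStoreRepository` code: `Repo`, `deleteLenient`, `deleteStrict` and `flakyUpdater` are hypothetical names invented for illustration, and the "blobs" are just strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Illustrative model only; none of these names exist in Elasticsearch.
final class SnapshotDeleteSketch {

    /** Hypothetical stand-in for the root metadata blob. */
    static final class Repo {
        final List<String> rootShardBlobs = new ArrayList<>();
        boolean rootUpdated = false;
    }

    /** Simulated shard-level update that fails for one shard and otherwise returns a new blob name. */
    static Function<String, String> flakyUpdater(String failingShard) {
        return shard -> {
            if (shard.equals(failingShard)) {
                throw new IllegalStateException("simulated failure writing shard metadata");
            }
            return "index-" + shard;
        };
    }

    /** Lenient behaviour: log each shard-level failure, then update the root anyway. */
    static void deleteLenient(Repo repo, List<String> shards, Function<String, String> updateShard) {
        List<String> newBlobs = new ArrayList<>();
        for (String shard : shards) {
            try {
                newBlobs.add(updateShard.apply(shard));
            } catch (RuntimeException e) {
                // Failure is only logged; the root will not reference this shard's new blob.
                System.err.println("shard-level update failed for " + shard + ": " + e.getMessage());
            }
        }
        repo.rootShardBlobs.addAll(newBlobs);
        repo.rootUpdated = true;
    }

    /** Strict behaviour: the first shard-level failure propagates and the root is never touched. */
    static void deleteStrict(Repo repo, List<String> shards, Function<String, String> updateShard) {
        List<String> newBlobs = new ArrayList<>();
        for (String shard : shards) {
            newBlobs.add(updateShard.apply(shard)); // any exception aborts before the root update
        }
        repo.rootShardBlobs.addAll(newBlobs);
        repo.rootUpdated = true;
    }

    public static void main(String[] args) {
        List<String> shards = Arrays.asList("shard-0", "shard-1", "shard-2");

        Repo lenient = new Repo();
        deleteLenient(lenient, shards, flakyUpdater("shard-1"));
        System.out.println("lenient: rootUpdated=" + lenient.rootUpdated + " blobs=" + lenient.rootShardBlobs);

        Repo strict = new Repo();
        try {
            deleteStrict(strict, shards, flakyUpdater("shard-1"));
        } catch (IllegalStateException e) {
            System.out.println("strict: aborted, rootUpdated=" + strict.rootUpdated);
        }
    }
}
```

In the lenient variant the root ends up updated even though one shard's metadata was never rewritten, which is the inconsistency described above; in the strict variant the failure surfaces to the caller and the root is left as it was.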
Opening this for discussion: should we treat these exceptions more seriously now?