144724: backup: fix compaction issue with attempting to write keys outside of span r=msbutler a=kev-cao
After the fix applied by #144652, the test failures outlined in #144216 resurfaced as job failures caused by the `SSTSinkKeyWriter` attempting to write keys outside of the `BackupManifest_File` range. The behavior was unexpected: for some key `/Table/1/1/5`, we would see an attempt to write it to a span starting with `/Table/1/1/5/1`, despite the last `SSTSinkKeyWriter.Reset` call being made on a span starting with `/Table/1/1/1`.
This was ultimately determined to be caused by the compaction processor's `compactSpanEntry` reusing the same underlying memory for the key passed to `SSTSinkKeyWriter.WriteKey` in order to save on memory allocations. If the sink performed a flush due to size constraints in `maybeDoSizeFlush`, the start key of the span used to reset the sink referenced the same memory location as the key that was passed in. As subsequent keys in that span were written, the reuse of that underlying memory in `compactSpanEntry` would repeatedly mutate the span referenced by that `BackupManifest_File`, corrupting it. This usually resulted in job failures, since eventually a key would be written outside of the span, but occasionally the job would succeed and the resulting `BackupManifest_File` would silently report a far smaller span than it actually covered.
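To illustrate the aliasing hazard described above, here is a minimal standalone sketch. It does not use the real backup code: the `span` type and the `aliasedStart` function are hypothetical stand-ins, and the "flush" is reduced to capturing a span on the first key.

```go
package main

import "fmt"

// span is a hypothetical stand-in for a key span; it is NOT the real
// CockroachDB span type.
type span struct {
	Key, EndKey []byte
}

// aliasedStart mimics a caller that reuses one buffer for every key while a
// span captured on the first key keeps a reference to that same buffer.
// It returns the span's start key as seen after all keys were processed.
func aliasedStart(keys []string) string {
	buf := make([]byte, 0, 32) // large enough that append never reallocates here
	var sp span
	for i, k := range keys {
		buf = append(buf[:0], k...) // overwrite the same backing array
		if i == 0 {
			sp = span{Key: buf} // BUG: shares buf's memory instead of copying
		}
	}
	return string(sp.Key)
}

func main() {
	// The span was captured at /Table/1/1/1, but later writes mutated it.
	fmt.Println(aliasedStart([]string{"/Table/1/1/1", "/Table/1/1/5"}))
	// prints /Table/1/1/5
}
```

The captured start key silently follows whatever key was written last, which is the same shape of corruption as the shifting span start described in this commit.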
One solution was to clone the key in `maybeDoSizeFlush` when creating the span to reset the sink with, but we ultimately decided to instead ensure that any time `SSTSinkKeyWriter.Reset` is called, we pass a clone of the span to the `BackupManifest_File`. This ensures that once `Reset` is called, the caller is free to reuse the underlying memory of the span however they wish. The same holds true for the key passed to `WriteKey`: in every instance in which the passed-in key is actually persisted, we persist a copy.
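The ownership rule chosen here (the callee copies on `Reset`, so the caller may reuse its buffers) can be sketched as follows. This is an assumption-laden illustration, not the actual `SSTSinkKeyWriter`: the `span`, `sink`, `clone`, and `startAfterReuse` names are all hypothetical.

```go
package main

import "fmt"

// span is a hypothetical stand-in for a key span.
type span struct {
	Key, EndKey []byte
}

// clone returns a deep copy so the result shares no memory with s.
func (s span) clone() span {
	return span{
		Key:    append([]byte(nil), s.Key...),
		EndKey: append([]byte(nil), s.EndKey...),
	}
}

// sink is a hypothetical stand-in for a writer that tracks its current span.
type sink struct {
	cur span
}

// Reset stores a copy of sp, so callers may reuse sp's memory afterwards.
func (s *sink) Reset(sp span) {
	s.cur = sp.clone()
}

// startAfterReuse resets the sink, then overwrites the caller's buffer, and
// returns the start key the sink still holds.
func startAfterReuse() string {
	buf := []byte("/Table/1/1/1")
	s := &sink{}
	s.Reset(span{Key: buf, EndKey: []byte("/Table/1/2")})
	copy(buf, "/Table/1/1/5") // caller reuses its buffer for the next key
	return string(s.cur.Key)
}

func main() {
	fmt.Println(startAfterReuse()) // prints /Table/1/1/1
}
```

Copying inside `Reset` costs one allocation per reset, but it puts the safety guarantee in a single place instead of relying on every call site to remember to clone.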
Fixes: #144216, #144339
Release note: None
Co-authored-by: Kevin Cao <[email protected]>