159853: kvnemesis: enable node crashes to kvnemesis r=dodeca12 a=dodeca12
Previously, `kvnemesis` could only simulate graceful node stops and restarts, which limited its ability to test crash recovery scenarios. This was inadequate because real-world failures often involve abrupt crashes of nodes, leaving data in an inconsistent state that must be recovered on restart.
To address this, this PR adds crash operation support to `kvnemesis` through three commits:
1. Renames the `Restarter` interface to `ServerController` to better reflect its broader purpose of controlling server lifecycle operations, preparing for the addition of crash operations.
2. Extends `testcluster.TestCluster` with a `CrashServer` method that emulates a crash by stopping a server and creating a snapshot of its in-memory filesystems at the last sync point using `vfs.MemFS.CrashClone`. This simulates what would persist on disk after a real crash. The method also isolates the crashed node from peers by tripping circuit breakers, simulating network partition behavior. The same commit adds `CrashNodeOperation` to the kvnemesis protobuf schema, integrates it into the generator to randomly crash nodes during test runs, implements crash application in the applier, and updates the validator to handle crash scenarios. The generator now tracks crashed nodes separately from stopped nodes.
3. Enables crash operations in kvnemesis test configurations by adding `removeRandCrashed` and `removeRandStoppedOrCrashed` methods to the nodes tracker. The `restartRandNode` function now randomly selects from both stopped and crashed nodes. Adds a new test `TestKVNemesisMultiNode_Crash_Liveness` that exercises crash operations with strict in-memory filesystem mode, which is required for proper crash simulation.
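The crash semantics described in item 2 (only data written before the last sync survives) can be illustrated with a toy model. This is a minimal sketch, not the Pebble `vfs.MemFS` API: the `toyFS` type and its `Write`, `Sync`, `Read`, and `CrashClone` methods are hypothetical names invented for illustration, and real file handling is reduced to whole-file string contents.

```go
package main

import "fmt"

// toyFS is a hypothetical stand-in for an in-memory filesystem that
// distinguishes synced (durable) contents from unsynced writes.
// It is NOT the Pebble vfs.MemFS API.
type toyFS struct {
	synced  map[string]string // contents as of the last Sync of each file
	pending map[string]string // written but not yet synced
}

func newToyFS() *toyFS {
	return &toyFS{synced: map[string]string{}, pending: map[string]string{}}
}

// Write replaces the file's contents without making them durable.
func (fs *toyFS) Write(name, data string) { fs.pending[name] = data }

// Sync makes the latest write to the file durable.
func (fs *toyFS) Sync(name string) {
	if data, ok := fs.pending[name]; ok {
		fs.synced[name] = data
		delete(fs.pending, name)
	}
}

// Read returns what a live process would observe: unsynced writes win.
func (fs *toyFS) Read(name string) (string, bool) {
	if data, ok := fs.pending[name]; ok {
		return data, true
	}
	data, ok := fs.synced[name]
	return data, ok
}

// CrashClone returns a new filesystem holding only the durable state,
// modeling what would remain on disk after an abrupt crash.
func (fs *toyFS) CrashClone() *toyFS {
	clone := newToyFS()
	for name, data := range fs.synced {
		clone.synced[name] = data
	}
	return clone
}

func main() {
	fs := newToyFS()
	fs.Write("wal", "a")
	fs.Sync("wal")
	fs.Write("wal", "ab") // unsynced update, lost on crash
	fs.Write("tmp", "x")  // never synced, lost on crash

	crashed := fs.CrashClone()
	wal, _ := crashed.Read("wal")
	_, tmpExists := crashed.Read("tmp")
	fmt.Printf("after crash: wal=%q tmpExists=%v\n", wal, tmpExists)
}
```

A node restarted on the cloned filesystem sees only synced state, which is why the new test needs the strict in-memory filesystem mode: without tracking of unsynced writes there is nothing for the crash to discard.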
Fixes #64828
Co-authored-by: Swapneeth Gorantla <[email protected]>