Description
Observed behavior
Creating a stream snapshot has an edge case: when a consumer has fewer replicas than the stream and no replica of that consumer is on the same node as the stream leader, the snapshot's tar data does not include the consumer data. The API response still reports the correct consumer_count, but the payload silently omits the relevant data. The backup appears to succeed, but at restore time the consumers are not available to restore.
Expected behavior
The snapshot should include all the consumers, even if there isn't a replica of the consumer on the leader node. At the very least it should report a warning or error that it couldn't include the consumer(s) that didn't have local replicas.
Server and client version
Reproduced on both v2.12.4 and v2.11.11, using natscli v0.3.1. As an aside, natscli#631 reported this issue against server v2.9.6, but the repro mechanism wasn't clear, and because it couldn't be reproduced at the time, it was closed.
Host environment
No response
Steps to reproduce
On a cluster of at least 3 nodes with an R3 stream, create an R1 consumer for that stream.
If the stream leader is on the same node as the consumer leader, step down the stream leader until they differ.
Perform the snapshot operation.
Note that the API responds with the correct consumer_count: 1.
Verify that the backup.json has consumer_count: 0 and that there is no obs/ directory in the tar.
As final evidence, restore the snapshot and see that there is no consumer restored.
Also note that if the consumer and the stream leader are on the same node, the snapshot works as expected.
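For anyone who wants to script the verification from the steps above, here is a minimal sketch in Python. It assumes the snapshot tar has already been decompressed into plain tar bytes (decompression is not covered here) and that consumer data lives under an `obs/<consumer>/` prefix, as observed in this report. The function names `consumer_dirs_in_tar` and `find_consumer_counts` are hypothetical helpers, not part of any NATS tooling, and the JSON search is deliberately layout-agnostic because the exact structure of backup.json is not assumed.

```python
import io
import tarfile


def consumer_dirs_in_tar(tar_bytes: bytes, prefix: str = "obs/") -> list[str]:
    """List consumer directory names found under `prefix` in an uncompressed tar.

    Assumption: consumer state is stored as obs/<consumer name>/..., as seen
    in the snapshots described in this issue.
    """
    names = set()
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tar:
        for member in tar.getmembers():
            path = member.name
            if path.startswith("./"):
                path = path[2:]
            if path.startswith(prefix):
                # First path component after the prefix is the consumer name.
                first = path[len(prefix):].split("/", 1)[0]
                if first:
                    names.add(first)
    return sorted(names)


def find_consumer_counts(doc) -> list[int]:
    """Collect every integer `consumer_count` value anywhere in parsed JSON.

    The search is recursive so that no particular backup.json layout is assumed.
    """
    found = []
    if isinstance(doc, dict):
        for key, value in doc.items():
            if key == "consumer_count" and isinstance(value, int):
                found.append(value)
            else:
                found.extend(find_consumer_counts(value))
    elif isinstance(doc, list):
        for item in doc:
            found.extend(find_consumer_counts(item))
    return found
```

In the failing case described here, `consumer_dirs_in_tar` would return an empty list and `find_consumer_counts` on the parsed backup.json would contain 0, even though the API response reported a consumer_count of 1.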