[BUG] dual-channel-replication-enabled causes "duplicate" replica on Sentinel #2338

@jdheyburn

Describe the bug

We run Valkey on K8s as a StatefulSet. The StatefulSet has 3 pods, each with a Valkey container and a Sentinel container for managing HA. One of these pods is the master and the other 2 are replicas. Pod IPs are ephemeral, so when a pod cycles it comes up with a new IP address, which Sentinel detects as a new replica while marking the old one as down. This leaves stale replicas in the output of sentinel replicas master_name.

To mitigate this we configure K8s Services, one for each pod. An init container on each pod finds the ClusterIP of the pod's Service and then configures the pod with replica-announce-ip for Valkey and announce-ip for Sentinel. This allows the replica IPs to remain the same as pods are cycled.
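As a rough illustration of the init-container step described above, the following sketch appends the two announce directives to the config files. The function name, its arguments and the file paths are hypothetical, and how the ClusterIP is looked up is not shown because the issue does not describe it. Only the directive names replica-announce-ip and sentinel announce-ip are taken from Valkey and Sentinel.

```shell
#!/bin/sh
# Minimal sketch, not the actual init container from this deployment.
# Arguments (all hypothetical):
#   $1  ClusterIP of this pod's Service, already looked up by the caller
#   $2  path to the Valkey config file
#   $3  path to the Sentinel config file
write_announce_config() {
  announce_ip="$1"
  valkey_conf="$2"
  sentinel_conf="$3"

  # Valkey reports this IP to its master instead of the pod IP.
  printf 'replica-announce-ip %s\n' "$announce_ip" >> "$valkey_conf"

  # Sentinel announces this IP to the other Sentinels.
  printf 'sentinel announce-ip %s\n' "$announce_ip" >> "$sentinel_conf"
}
```

For example, write_announce_config 100.67.128.152 /etc/valkey/valkey.conf /etc/valkey/sentinel.conf would pin redis-test-0-server-0 to the ClusterIP of redis-test-0-announce-0 (the paths here are placeholders).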

We recently migrated some of our workloads to Valkey 8.1.1 and are trialling the dual-channel-replication-enabled yes config. With this enabled, during a failover the additional channel of type=rdb-channel appears on the master with the pod IP rather than the announce IP. If Sentinel polls for replicas during this window, it saves this entry as a distinct replica. Once the replica has fully resynced, it is reported with the configured announce IP address again.

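This duplication is harmful for Valkey managed by Sentinel. During a failover, Sentinel elects a new master from its list of replicas and reconfigures the remaining entries to replicate from it. Because a duplicate entry is the same instance under a different IP, Sentinel can end up instructing the new master to replicate from itself, which it cannot do.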

Pod IPs

NAME                  READY   STATUS    RESTARTS   AGE     IP             
redis-test-0-server-0   5/5     Running   0          8m12s   100.104.75.103 
redis-test-0-server-1   5/5     Running   0          7m21s   100.111.250.31 
redis-test-0-server-2   5/5     Running   0          12m     100.99.64.48   

Service Cluster IPs

NAME                    TYPE        CLUSTER-IP     
redis-test-0-announce-0   ClusterIP   100.67.128.152 
redis-test-0-announce-1   ClusterIP   100.70.100.164 
redis-test-0-announce-2   ClusterIP   100.71.105.52  

In this scenario, redis-test-0-server-1 has been promoted to a master after a failover. The output of info replication is:

# Replication
role:master
connected_slaves:2
slave0:ip=100.71.105.52,port=6379,state=online,offset=7448294713549,lag=1,type=replica
slave1:ip=100.104.75.103,port=6379,state=wait_bgsave,offset=0,lag=0,type=rdb-channel
  • slave0 is the ClusterIP for redis-test-0-announce-2, which maps to pod redis-test-0-server-2.
  • slave1 is the Pod IP for redis-test-0-server-0.

When slave1 resyncs, the output of info replication is:

# Replication
role:master
connected_slaves:2
slave0:ip=100.71.105.52,port=6379,state=online,offset=7448295098696,lag=1,type=replica
slave1:ip=100.67.128.152,port=6379,state=online,offset=7448295102193,lag=1,type=replica
  • slave0 is unchanged from the previous output.
  • slave1 is now the ClusterIP for redis-test-0-announce-0, which maps to pod redis-test-0-server-0.

Executing sentinel replicas master_name | grep name -A1 lists 4 replicas instead of the expected 2.

/data $ redis-cli -p 26379 sentinel replicas master_name | grep name -A1
name
100.99.64.48:6379
--
name
100.104.75.103:6379
--
name
100.67.128.152:6379
--
name
100.71.105.52:6379

In order they are:

  • pod IP for redis-test-0-server-2
  • pod IP for redis-test-0-server-0
  • ClusterIP for redis-test-0-announce-0 (redis-test-0-server-0)
  • ClusterIP for redis-test-0-announce-2 (redis-test-0-server-2)

Issuing sentinel reset master_name fixes this list, but it is not an appropriate solution: pods can cycle for any reason, and each time one resyncs with the master under a new pod IP the duplicate can reappear.
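To spot these entries without reading the list by hand, a small filter along the following lines could compare the Sentinel output against the known announce IPs. This is only a sketch: the function name is hypothetical, it assumes IPv4 addresses, and it assumes the line-per-field output that redis-cli prints when piped, where each name line is followed by an ip:port value.

```shell
#!/bin/sh
# Sketch: read `sentinel replicas <master>` output on stdin and print every
# replica name whose IP is NOT one of the expected announce IPs.
#   $1  space-separated list of expected announce (ClusterIP) addresses
# Assumes IPv4 "ip:port" names; IPv6 would need different splitting.
find_unexpected_replicas() {
  expected="$1"
  awk -v expected="$expected" '
    BEGIN {
      # Build a lookup table of the expected announce IPs.
      n = split(expected, ips, " ")
      for (i = 1; i <= n; i++) ok[ips[i]] = 1
    }
    # The line after a bare "name" field holds the replica name (ip:port).
    prev == "name" {
      split($0, parts, ":")
      if (!(parts[1] in ok)) print $0
    }
    { prev = $0 }
  '
}
```

With the Service ClusterIPs from this report, the call would look like: redis-cli -p 26379 sentinel replicas master_name | find_unexpected_replicas "100.67.128.152 100.70.100.164 100.71.105.52". Any line it prints is an entry registered under a pod IP.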

To reproduce

  • Establish any environment with announce IPs, backed by Sentinel
  • Enable dual-channel-replication-enabled yes
  • Execute a failover
  • Retrieve replicas from Sentinel via sentinel replicas master_name
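
The steps above can be sketched as shell helpers. This assumes Valkey listens on port 6379 and Sentinel on port 26379, as in the outputs earlier in this report, and that dual-channel-replication-enabled yes is already set in the Valkey config. The function names and the CLI and MASTER_NAME variables are hypothetical conveniences, not part of any tool.

```shell
#!/bin/sh
# Sketch of the reproduction steps; run from a pod that can reach both ports.
# CLI is overridable; the report itself uses redis-cli.
CLI="${CLI:-redis-cli}"
MASTER_NAME="${MASTER_NAME:-master_name}"

# Confirm the setting on the local Valkey instance.
check_dual_channel() {
  $CLI -p 6379 config get dual-channel-replication-enabled
}

# Ask Sentinel to fail over the monitored master.
trigger_failover() {
  $CLI -p 26379 sentinel failover "$MASTER_NAME"
}

# Show how the master currently sees its replicas (look for type=rdb-channel).
show_replication() {
  $CLI -p 6379 info replication
}

# List the replica names Sentinel currently tracks.
list_replica_names() {
  $CLI -p 26379 sentinel replicas "$MASTER_NAME" | grep name -A1
}
```

Calling check_dual_channel, then trigger_failover, then show_replication on the new master while a replica is still resyncing, and finally list_replica_names, follows the order of the list above. Whether the duplicate appears depends on Sentinel polling during the resync window.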

Expected behavior

Stale/duplicate replicas are not persisted when a failover happens, or when pods are cycled.

Additional information

N/A
