Redis chart: Redis preStop can block pod termination indefinitely if Sentinel terminates early (regression after #35364)

### Name and Version

bitnami/redis 23.2.12

### What steps will reproduce the bug?

1. Deploy the Bitnami Redis Helm chart in a Kubernetes cluster with Sentinel enabled, in master/replica mode (for example, a 3-node StatefulSet).
2. Configure a long termination grace period (e.g. `terminationGracePeriodSeconds: 900`) so the pod has enough time to execute the preStop logic.
3. Trigger pod termination, e.g. `kubectl rollout restart statefulset <redis-sts-name>`
4. Observe that the Sentinel container terminates quickly, while the Redis container remains stuck in Terminating for the pod with the Redis master.

### Are you using any custom parameters or values?

Yes, the following values are relevant:

```yaml
replica:
  terminationGracePeriodSeconds: 900
```

### What is the expected behavior?

When Kubernetes terminates the pod, Redis should terminate gracefully after executing its preStop hook.

If Sentinel terminates earlier than Redis (which can happen depending on container termination order), Redis should still be able to finish preStop and exit, rather than waiting indefinitely.

Redis should not block termination solely because the Sentinel process is no longer reachable.

### What do you see instead?

The pod running the Redis master can remain stuck in Terminating until `terminationGracePeriodSeconds` expires.

The Redis preStop hook keeps retrying indefinitely because Sentinel is no longer running and `get-master-addr-by-name mymaster` returns no output.

The recently introduced check:

```bash
if [[ -z "$REDIS_MASTER_HOST" ]]; then
    echo "WARNING: REDIS_MASTER_HOST is empty, assuming failover not finished"
    return 1
fi
```

treats empty output as “failover not finished”, which causes the retry loop to never complete once Sentinel exits.
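
One possible direction for a fix, as a minimal sketch: treat "Sentinel is unreachable" as a terminal condition and bound the wait with a deadline. The function names (`query_master_host`, `sentinel_reachable`, `wait_for_failover`) and the environment variables are hypothetical and are not the chart's actual helpers. A real script would also need the auth/TLS flags the chart uses, and would have to compare hosts in whatever form (IP or hostname) Sentinel announces.

```bash
#!/usr/bin/env bash
# Hypothetical sketch; names below are illustrative, not the chart's helpers.

# Print the master host reported by the local Sentinel (empty on failure).
query_master_host() {
    redis-cli -h "${SENTINEL_HOST:-127.0.0.1}" -p "${SENTINEL_PORT:-26379}" \
        sentinel get-master-addr-by-name "${MASTER_SET:-mymaster}" 2>/dev/null | head -n 1
}

# Succeed only if the local Sentinel still answers PING.
sentinel_reachable() {
    [[ "$(redis-cli -h "${SENTINEL_HOST:-127.0.0.1}" -p "${SENTINEL_PORT:-26379}" \
        ping 2>/dev/null)" == "PONG" ]]
}

# Wait until the master is no longer this pod.
# Returns 0 when failover finished, 2 when Sentinel is gone, 1 on timeout.
wait_for_failover() {
    local own_host="$1" timeout_s="$2" interval_s="${3:-1}"
    local deadline=$(( SECONDS + timeout_s ))
    local master
    while (( SECONDS < deadline )); do
        if ! sentinel_reachable; then
            echo "WARNING: Sentinel is unreachable, not waiting for failover" >&2
            return 2
        fi
        master="$(query_master_host)"
        if [[ -n "$master" && "$master" != "$own_host" ]]; then
            return 0
        fi
        sleep "$interval_s"
    done
    echo "WARNING: failover did not finish within ${timeout_s}s" >&2
    return 1
}
```

With something like this, the preStop hook could call `wait_for_failover "$own_host" 60` and proceed to exit on any return code, so an early Sentinel exit would no longer hold the pod for the full grace period.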

While blocked, Redis pauses writes:

```
CLIENT PAUSE 892000 WRITE
```

As a result:
- Redis does not receive SIGTERM until forced termination occurs
- Pod stays stuck in Terminating
- Writes are paused for the duration of the grace period
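
To limit the impact on writes, the hook could lift the pause explicitly whenever the wait ends, instead of relying on the pause timeout. The sketch below assumes Redis 6.2 or newer (for `CLIENT PAUSE ... WRITE` and `CLIENT UNPAUSE`). `redis_cmd` and `with_write_pause` are hypothetical names, and the wrapper omits auth/TLS options.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around redis-cli; real deployments need auth/TLS flags.
redis_cmd() {
    redis-cli -h 127.0.0.1 -p "${REDIS_PORT:-6379}" "$@"
}

# Pause writes for at most pause_ms, run the given command, and always
# lift the pause afterwards. Returns the exit status of the command.
with_write_pause() {
    local pause_ms="$1"
    shift
    redis_cmd CLIENT PAUSE "$pause_ms" WRITE >/dev/null || return 1
    local rc=0
    "$@" || rc=$?
    # Best effort: the pause also expires on its own after pause_ms.
    redis_cmd CLIENT UNPAUSE >/dev/null || true
    return "$rc"
}
```

For example, `with_write_pause 30000 some_failover_wait` (where `some_failover_wait` stands for whatever wait logic the hook uses) would resume writes as soon as the wait returns, whether it succeeded or gave up.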

### Additional information

This behavior appears to have been introduced by https://github.com/bitnami/charts/pull/35364 and affects all versions since.

Before that change, the preStop hook could exit even if Sentinel stopped responding, allowing Redis to terminate.

After that change, Redis termination depends on Sentinel remaining alive throughout the preStop hook, which creates a race condition during pod shutdown (Sentinel may terminate before Redis preStop completes).

#### Sentinel failover may also not complete gracefully during shutdown

In addition, due to the logic in the pre-stop script, Sentinel itself can be terminated while it is actively performing a failover (for example after it has selected and promoted a new master, but before it has finished the “reconfigure replicas” and “failover end” stages). If Sentinel receives SIGTERM during this process, the failover may remain incomplete and the new master configuration may not be fully propagated or acknowledged by other Sentinels. This can result in inconsistent cluster state (for example: the promoted node reverting back to replica role, or other Sentinels continuing to believe the old master is still active).
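
As a rough illustration of how the Sentinel container's own pre-stop step might avoid exiting mid-failover, the sketch below polls `SENTINEL MASTER <name>` and waits, with a deadline, while the `flags` field contains `failover_in_progress`. It assumes that Sentinel reports the flag under that name and that `redis-cli` prints alternating field/value lines when not attached to a TTY; both should be verified against the Redis version in use. All function names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch; not the chart's actual pre-stop logic.

# Raw output of SENTINEL MASTER for the monitored set (auth/TLS flags omitted).
fetch_master_info() {
    redis-cli -h 127.0.0.1 -p "${SENTINEL_PORT:-26379}" \
        sentinel master "${MASTER_SET:-mymaster}" 2>/dev/null
}

# Read alternating field/value lines on stdin and print the "flags" value.
sentinel_master_flags() {
    awk 'prev == "flags" { print; exit } { prev = $0 }'
}

# Succeed if the comma-separated flags list contains failover_in_progress.
failover_in_progress() {
    [[ ",$1," == *",failover_in_progress,"* ]]
}

# Wait up to timeout_s for an ongoing failover to end.
# Returns 0 if no failover is in progress (or Sentinel gave no flags), 1 on timeout.
wait_for_failover_end() {
    local timeout_s="$1" interval_s="${2:-1}"
    local deadline=$(( SECONDS + timeout_s ))
    local flags
    while (( SECONDS < deadline )); do
        flags="$(fetch_master_info | sentinel_master_flags)"
        failover_in_progress "$flags" || return 0
        sleep "$interval_s"
    done
    return 1
}
```

The deadline matters here for the same reason as in the Redis hook: an unbounded wait would only move the blocking problem from one container to the other.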
