feat: set default tcp_user_timeout to 5 seconds for replicas (cloudnative-pg#9317)
The default `tcp_user_timeout` for standby replication connections has
been changed from the system default to `5000ms` (5 seconds) for all
replicas.
This new default enhances the robustness of CloudNativePG clusters by
enabling standby instances to detect and recover from network issues
more quickly. Previously, silent network drops could cause standbys to
wait up to ~127 seconds (due to TCP SYN retries) before detecting a
failure. With the new 5-second timeout, standbys will close unresponsive
connections sooner and promptly retry connecting to the primary.
If this default does not meet your requirements, you can override it for
all standbys managed by the operator using the
`STANDBY_TCP_USER_TIMEOUT` configuration option.
PRESERVATION GUIDE FOR EXISTING INSTALLATIONS:
If you have an existing CloudNativePG installation where
`STANDBY_TCP_USER_TIMEOUT` was not explicitly set (thus defaulting to
`0`), and you wish to preserve that behaviour after upgrading, you must
now explicitly set it to `0`.
Example using a `ConfigMap`:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cnpg-controller-manager-config
  namespace: cnpg-system
data:
  STANDBY_TCP_USER_TIMEOUT: "0"
```
If the variable is not explicitly configured, the new default of 5
seconds will automatically apply after the next operator upgrade or pod
restart.
For more information on `tcp_user_timeout`, see the PostgreSQL
documentation:
https://www.postgresql.org/docs/current/runtime-config-connection.html#GUC-TCP-USER-TIMEOUT

Closes cloudnative-pg#9229
Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Signed-off-by: Gabriele Bartolini <gabriele.bartolini@enterprisedb.com>
Co-authored-by: Gabriele Bartolini <gabriele.bartolini@enterprisedb.com>
docs/src/operator_conf.md: 1 addition & 1 deletion
@@ -59,7 +59,7 @@ Name | Description
 `PGBOUNCER_IMAGE_NAME` | The name of the PgBouncer image used by default for new poolers. Defaults to the version specified in the operator.
 `POSTGRES_IMAGE_NAME` | The name of the PostgreSQL image used by default for new clusters. Defaults to the version specified in the operator.
 `PULL_SECRET_NAME` | Name of an additional pull secret to be defined in the operator's namespace and to be used to download images
-`STANDBY_TCP_USER_TIMEOUT` | Defines the [`TCP_USER_TIMEOUT` socket option](https://www.postgresql.org/docs/current/runtime-config-connection.html#GUC-TCP-USER-TIMEOUT) for replication connections from standby instances to the primary. Default is 0 (system's default).
+`STANDBY_TCP_USER_TIMEOUT` | Defines the [`TCP_USER_TIMEOUT` socket option](https://www.postgresql.org/docs/current/runtime-config-connection.html#GUC-TCP-USER-TIMEOUT) in milliseconds for replication connections from standby instances to the primary. Default is 5000 (5 seconds). Set to `0` to use the system's default.
 `DRAIN_TAINTS` | Specifies the taint keys that should be interpreted as indicators of node drain. By default, it includes the taints commonly applied by [kubectl](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/), [Cluster Autoscaler](https://github.com/kubernetes/autoscaler), and [Karpenter](https://github.com/aws/karpenter-provider-aws): `node.kubernetes.io/unschedulable`, `ToBeDeletedByClusterAutoscaler`, `karpenter.sh/disrupted`, `karpenter.sh/disruption`.

 Values in `INHERITED_ANNOTATIONS` and `INHERITED_LABELS` support path-like wildcards. For example, the value `example.com/*` will match