Merged
Changes from 5 commits
2 changes: 1 addition & 1 deletion local-antora-playbook.yml
@@ -17,7 +17,7 @@ content:
- url: https://github.com/redpanda-data/docs
branches: [v/*, shared, site-search,'!v-end-of-life/*']
- url: https://github.com/redpanda-data/cloud-docs
branches: 'main'
branches: 'DOC-1673-single-source-client-connections-in-cloud-docs'
- url: https://github.com/redpanda-data/redpanda-labs
branches: main
start_paths: [docs,'*/docs']
@@ -1,6 +1,7 @@
= Configure Client Connections
:description: Guidelines for configuring Redpanda clusters for optimal availability.
:description: Learn about guidelines for configuring client connections in Redpanda clusters for optimal availability.
:page-categories: Management, Networking
// tag::single-source[]

Optimize the availability of your clusters by configuring and tuning properties.

@@ -10,14 +11,20 @@ A malicious Kafka client application may create many network connections to exec

The following Redpanda cluster properties limit the number of connections:

* xref:reference:cluster-properties.adoc#kafka_connections_max[`kafka_connections_max`]: Similar to Kafka's `max.connections`, this sets the maximum number of connections per broker.
* xref:reference:cluster-properties.adoc#kafka_connections_max_per_ip[`kafka_connections_max_per_ip`]: Similar to Kafka's `max.connections.per.ip`, this sets the maximum number of connections accepted per IP address by a broker.
* xref:reference:cluster-properties.adoc#kafka_connections_max_overrides[`kafka_connections_max_overrides`]: A list of IP addresses for which `kafka_connections_max_per_ip` is overridden and doesn't apply.
* xref:reference:properties/cluster-properties.adoc#kafka_connections_max_per_ip[`kafka_connections_max_per_ip`]: Similar to Kafka's `max.connections.per.ip`, this sets the maximum number of connections accepted per IP address by a broker.
* xref:reference:properties/cluster-properties.adoc#kafka_connections_max_overrides[`kafka_connections_max_overrides`]: A list of IP addresses for which `kafka_connections_max_per_ip` is overridden and doesn't apply.
ifndef::env-cloud[]
* xref:reference:properties/cluster-properties.adoc#kafka_connections_max[`kafka_connections_max`]: Similar to Kafka's `max.connections`, this sets the maximum number of connections per broker.

ifdef::env-cloud[]
IMPORTANT: Per-IP connection controls require Redpanda to see individual client IPs. If clients connect through PrivateLink endpoints, NAT gateways, or other shared-IP egress, the per-IP limit applies to the shared IP, affecting all clients behind it and preventing isolation of a single offending client.
endif::[]

Redpanda also provides properties to manage the rate of connection creation:

* xref:reference:cluster-properties.adoc#kafka_connection_rate_limit[`kafka_connection_rate_limit`]: This property limits the maximum rate of connections created per second. It applies to each CPU core.
* xref:reference:cluster-properties.adoc#kafka_connection_rate_limit_overrides[`kafka_connection_rate_limit_overrides`]: A list of IP addresses for which `kafka_connection_rate_limit` is overridden and doesn't apply.
endif::[]

[NOTE]

@travisdowns is the note below accurate about connection counts? 'Typically two or three' sounds just wrong. Doesn't a client open a connection for each broker it's connected to (or for each partition it's producing to or consuming from)?

It's good to first note that num connections != num clients, but I think the message here is that the max expected # of connections per client is 'on the order of [insert something]'.

cc @micheleRP

Member

Yes, 2 or 3 would be a large underestimate if there are many brokers, and if clients connect to each broker (which is workload dependent).

Here are the full details:

https://redpandadata.atlassian.net/wiki/spaces/CORE/pages/510099463/How+many+connections

That's probably not going to make it into this paragraph, but a conservative estimate is N+2 connections per client where N is the number of brokers.

Contributor Author

@micheleRP micheleRP Sep 17, 2025

Thank you @travisdowns! I changed that bullet (typically 2-3 connections per client) to:
The total number of connections is not equal to the number of clients, because a client can open multiple connections. As a conservative estimate, for a cluster with N brokers, plan for N + 2 connections per client.

Member

LGTM.
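
As a rough illustration of the N + 2 rule of thumb agreed on in this thread, the following Python sketch computes a conservative cluster-wide connection count. The function name and the example numbers are hypothetical; they are not part of Redpanda or of this PR.

```python
def estimate_total_connections(num_clients: int, num_brokers: int) -> int:
    """Conservative cluster-wide estimate of client connections.

    Assumes each client may open up to N + 2 connections, where N is the
    number of brokers (the rule of thumb quoted in the review thread).
    """
    if num_clients < 0 or num_brokers < 1:
        raise ValueError("need num_clients >= 0 and num_brokers >= 1")
    return num_clients * (num_brokers + 2)


if __name__ == "__main__":
    # Hypothetical example: 200 clients against a 3-broker cluster.
    print(estimate_total_connections(200, 3))  # 200 * (3 + 2) = 1000
```

Actual connection counts are workload dependent, so the result is a planning figure rather than a measured value.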

====
@@ -46,10 +53,21 @@ See also: xref:develop:produce-data/configure-producers.adoc[Configure Producers

A Redpanda broker may create log segments at startup. If a broker crashes after startup, and if it gets stuck in a crash loop, it could produce progressively more stored state that uses more disk space and takes more time for each restart to process.

ifndef::env-cloud[]
To prevent infinite crash loops, the Redpanda broker property xref:reference:node-properties.adoc#crash_loop_limit[`crash_loop_limit`] sets an upper limit on the number of consecutive crashes that can happen within one hour of each other. After it reaches the limit, a broker cannot restart until its internal consecutive crash counter is reset to zero by one of the following conditions:
endif::[]

ifdef::env-cloud[]
To prevent infinite crash loops, the Redpanda broker property `crash_loop_limit` sets an upper limit on the number of consecutive crashes that can happen within one hour of each other. After it reaches the limit, a broker cannot restart until its internal consecutive crash counter is reset to zero by one of the following conditions:
Member

@c4milo c4milo Sep 15, 2025

This is not exposed in Redpanda Cloud; we manage it internally. We also don't have plans to expose it, because it makes absolutely no sense for a managed service like ours.

Member

is this in production?

Yes, I'm surprised to see anything related to crash loop prevention on this page, as it's a completely different topic for admins (about brokers, not clients) and is also not planned to be exposed in Cloud. I think this whole second section of the page about crash loops should be removed entirely.

Contributor Author

I've conditionalized out this Prevent crash loops section for Cloud docs. @pgellert: Looks like this was updated with #966. Please see Matt's comment below, and then can you please confirm that this section should remain documented for Self-Managed docs? Is there a better location for this content, some other page?

Contributor

Crash loop tracking is also documented in the Kubernetes troubleshooting docs: https://docs.redpanda.com/current/troubleshoot/errors-solutions/k-resolve-errors/#crash-loop-backoffs

I think those troubleshooting docs + the detailed description around the cluster configs themselves here are sufficient and we can remove these paragraphs from the "Configure Client Connections" page.

An alternative would be to have them on a separate page under Self-managed > Cluster Maintenance (self-managed only; excluded from cloud docs), but I think the cluster config descriptions are detailed enough that they are sufficient and we don't need a separate page for this.

Contributor Author

thank you @pgellert!

endif::[]

* The `redpanda.yaml` configuration file is updated.
ifndef::env-cloud[]
* The `startup_log` file in the broker's xref:reference:node-properties.adoc#data_directory[data_directory] is manually deleted.
endif::[]
ifdef::env-cloud[]
* The `startup_log` file in the broker's `data_directory` is manually deleted.
endif::[]
* One hour has elapsed since the last crash.
* The broker is properly shut down. (This is not possible after `crash_loop_limit` has been reached and the broker cannot be restarted.)

@@ -59,4 +77,12 @@ To prevent infinite crash loops, the Redpanda broker property xref:reference:nod
* If the limit is less than two, the broker is blocked from restarting after every crash, until one of the reset conditions is met.
====

To facilitate debugging in environments where a broker is stuck in a crash loop, set the xref:reference:properties/broker-properties.adoc#crash_loop_sleep_sec[`crash_loop_sleep_sec` configuration]. This setting determines how long the broker sleeps before terminating the process after reaching the crash loop limit. The window during which the broker remains available allows you to troubleshoot the issue. This setting is most useful when xref:troubleshoot:errors-solutions/k-resolve-errors.adoc[troubleshooting in Kubernetes environments].
ifndef::env-cloud[]
To facilitate debugging in environments where a broker is stuck in a crash loop, set the xref:reference:properties/broker-properties.adoc#crash_loop_sleep_sec[`crash_loop_sleep_sec`] broker property. This setting determines how long the broker sleeps before terminating the process after reaching the crash loop limit. The window during which the broker remains available allows you to troubleshoot the issue. This setting is most useful when xref:troubleshoot:errors-solutions/k-resolve-errors.adoc[troubleshooting in Kubernetes environments].
endif::[]

ifdef::env-cloud[]
To facilitate debugging in environments where a broker is stuck in a crash loop, set the `crash_loop_sleep_sec` broker property. This setting determines how long the broker sleeps before terminating the process after reaching the crash loop limit. The window during which the broker remains available allows you to troubleshoot the issue.
endif::[]

// end::single-source[]
11 changes: 11 additions & 0 deletions modules/reference/pages/properties/cluster-properties.adoc
@@ -2862,6 +2862,7 @@ Maximum number of Kafka client connections per broker. If `null`, the property i

---

// tag::kafka_connections_max_overrides[]
=== kafka_connections_max_overrides

A list of IP addresses for which Kafka client connection limits are overridden and don't apply. For example, `(['127.0.0.1:90', '50.20.1.1:40']).`.
@@ -2872,14 +2873,20 @@ A list of IP addresses for which Kafka client connection limits are overridden a

*Type:* array

ifndef::env-cloud[]
*Default*: `{}` (empty list)
endif::[]

*Related topics*:

* xref:manage:cluster-maintenance/configure-availability.adoc#limit-client-connections[Limit client connections]

---


// end::kafka_connections_max_overrides[]

// tag::kafka_connections_max_per_ip[]
=== kafka_connections_max_per_ip

Maximum number of Kafka client connections per IP address, per broker. If `null`, the property is disabled.
@@ -2892,14 +2899,18 @@ Maximum number of Kafka client connections per IP address, per broker. If `null`

*Accepted values:* [`0`, `4294967295`]

ifndef::env-cloud[]
*Default:* `null`
endif::[]

*Related topics*:

* xref:manage:cluster-maintenance/configure-availability.adoc#limit-client-connections[Limit client connections]

---

// end::kafka_connections_max_per_ip[]

=== kafka_enable_authorization

Flag to require authorization for Kafka connections. If `null`, the property is disabled, and authorization is instead enabled by <<enable_sasl,`enable_sasl`>>.