Commit 2e25c77

kbatuigas authored and JakeSCahill committed
Rolling restart Admin API (#1026)
1 parent 2f27473 commit 2e25c77

3 files changed, +54 -4 lines changed

modules/get-started/pages/whats-new.adoc

Lines changed: 7 additions & 0 deletions
@@ -7,6 +7,13 @@ This topic includes new content added in version {page-component-version} Beta.
 * xref:redpanda-cloud:get-started:whats-new-cloud.adoc[]
 * xref:redpanda-cloud:get-started:cloud-overview.adoc#redpanda-cloud-vs-self-managed-feature-compatibility[Redpanda Cloud vs Self-Managed feature compatibility]
 
+== New health probes for broker restarts and upgrades
+
+The Redpanda Admin API now includes new health probes to help you ensure safe broker restarts and upgrades. The xref:api:ROOT:admin-api.adoc#get-/v1/broker/pre_restart_probe[`pre_restart_probe`] endpoint identifies potential risks if a broker is restarted, and xref:api:ROOT:admin-api.adoc#get-/v1/broker/post_restart_probe[`post_restart_probe`] indicates how much of its workloads a broker has reclaimed after the restart. See also:
+
+* xref:manage:cluster-maintenance/rolling-restart.adoc[]
+* xref:upgrade:rolling-upgrade.adoc[]
+
 == Redpanda Console v3.0.0 (beta)
 
 The Redpanda Console v3.0.0 beta release includes the following updates:

modules/upgrade/partials/rolling-upgrades/enable-maintenance-mode.adoc

Lines changed: 31 additions & 3 deletions
@@ -10,7 +10,7 @@ rpk cluster health
 .Example output:
 [%collapsible]
 ====
-[.no-copy]
+[,bash,role=no-copy]
 ----
 CLUSTER HEALTH OVERVIEW
 =======================
@@ -19,12 +19,40 @@ Controller ID: 0
 All nodes: [0 1 2] <2>
 Nodes down: [] <3>
 Leaderless partitions: [] <3>
-Under-replicated partitions: [] <3>
+Under-replicated partitions: [1] <3>
 ----
 <1> The cluster is either healthy (`true`) or unhealthy (`false`).
 <2> The node IDs of all brokers in the cluster.
 <3> If the cluster is unhealthy, these fields will contain data.
-====
+====
+
+. Optional: You can use the Admin API (default port: 9644) to perform additional checks for potential risks with restarting a specific broker.
++
+[,bash]
+----
+curl -X GET "http://<broker-address>:<admin-api-port>/v1/broker/pre_restart_probe" | jq .
+----
++
+.Example output:
+[,json,role=no-copy]
+----
+// Returns tuples of partitions (in the format {namespace}/{topic_name}/{partition_id}) affected by the broker restart.
+
+{
+  "risks": {
+    "rf1_offline": [
+      "kafka/topic_a/0"
+    ],
+    "full_acks_produce_unavailable": [],
+    "unavailable": [],
+    "acks1_data_loss": []
+  }
+}
+----
++
+In this example, the restart probe indicates that there is an under-replicated partition `kafka/topic_a/0` (with a replication factor of 1) at risk of going offline if the broker is restarted.
++
+See the xref:api:ROOT:admin-api.adoc#get-/v1/broker/pre_restart_probe[Admin API reference] for more details on the restart probe endpoint.
 
 ifdef::rolling-upgrade[. Select a broker that has not been upgraded yet and place it into maintenance mode:]
 ifdef::rolling-restart[. Select a broker and place it into maintenance mode:]
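
Reviewer note: the `pre_restart_probe` step added in this file leaves it to the reader to scan the `risks` object by eye. A minimal Python sketch of how a script might surface only the non-empty risk categories is shown below. It assumes the real response body is plain JSON with the shape shown in the example output (without the leading `//` comment line, which is documentation only); the helper name `summarize_restart_risks` is hypothetical and not part of Redpanda or `rpk`.

```python
import json


def summarize_restart_risks(probe_body: str) -> dict[str, list[str]]:
    """Return only the risk categories that list at least one partition.

    probe_body is assumed to be the raw JSON text returned by
    GET /v1/broker/pre_restart_probe, shaped like the example in the diff.
    """
    risks = json.loads(probe_body).get("risks", {})
    return {name: partitions for name, partitions in risks.items() if partitions}


# Sample body copied from the example output in the diff (comment line omitted).
sample = """
{
  "risks": {
    "rf1_offline": ["kafka/topic_a/0"],
    "full_acks_produce_unavailable": [],
    "unavailable": [],
    "acks1_data_loss": []
  }
}
"""

for name, partitions in summarize_restart_risks(sample).items():
    print(f"{name}: {', '.join(partitions)}")
# prints "rf1_offline: kafka/topic_a/0"
```

An empty result would mean the probe reported no partitions at risk for that broker; how to act on a non-empty result is left to the operator.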

modules/upgrade/partials/rolling-upgrades/post-upgrade-tasks.adoc

Lines changed: 16 additions & 1 deletion
@@ -11,4 +11,19 @@ To view additional information about your brokers, run:
 
 ```bash
 rpk redpanda admin brokers list
-```
+```
+
+You can also use the xref:api:ROOT:admin-api.adoc#get-/v1/broker/post_restart_probe[Admin API] to check how much each broker has progressed in recovering its workloads:
+
+```bash
+curl -X GET "http://<broker-address>:<admin-api-port>/v1/broker/post_restart_probe"
+```
+
+.Example output:
+[,json,role=no-copy]
+----
+// Returns the load already reclaimed by broker, as a percentage of in-sync replicas
+{
+  "load_reclaimed_pc": 66
+}
+----
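
Reviewer note: the `post_restart_probe` example returns a single percentage, so a script could compare it against a target before moving on to the next broker. The sketch below is one possible way to do that in Python. It assumes the response is plain JSON with a numeric `load_reclaimed_pc` field, as in the example output; the helper names and the `threshold_pc` parameter are hypothetical, and the docs in this commit do not prescribe a threshold.

```python
import json


def load_reclaimed_pc(probe_body: str) -> int:
    """Extract load_reclaimed_pc from a post_restart_probe response body."""
    return int(json.loads(probe_body)["load_reclaimed_pc"])


def has_recovered(probe_body: str, threshold_pc: int = 100) -> bool:
    """Return True once the broker has reclaimed at least threshold_pc percent.

    threshold_pc is an illustrative parameter, not a Redpanda setting.
    """
    return load_reclaimed_pc(probe_body) >= threshold_pc


# Sample body copied from the example output in the diff (comment line omitted).
sample = '{"load_reclaimed_pc": 66}'

print(load_reclaimed_pc(sample))               # prints 66
print(has_recovered(sample))                   # prints False
print(has_recovered(sample, threshold_pc=50))  # prints True
```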
