Commit 51f3d6f

dashboard: add failover trigger panel
Add a panel to the "Cluster overview" section of the dashboard templates. The update consists of a panel with the following metrics:
- tnt_cartridge_failover_trigger_total

Part of #178
1 parent 41a6dc8 commit 51f3d6f

22 files changed, with 6011 additions and 3869 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Dashboard title customization
 - Panels for transaction operations
 - Panels with net statistics per thread
+- Panel with failover trigger count
 
 ### Changed
 - Replace LuaJit deprecated metrics with new ones

dashboard/panels/cluster.libsonnet

Lines changed: 31 additions & 2 deletions
@@ -424,7 +424,7 @@ local prometheus = grafana.prometheus;
     title=title,
     description=description,
     datasource=datasource,
-    panel_width=12,
+    panel_width=8,
     max=1,
     min=0,
   ).addValueMapping(
@@ -450,6 +450,35 @@ local prometheus = grafana.prometheus;
       .selectField('value').addConverter('last')
     ),
 
+  failovers_per_second(
+    title='Failovers triggered',
+    description=|||
+      Displays the count of failover triggers in a replicaset.
+      Graph shows average per second.
+
+      Panel works with `metrics >= 0.15.0`.
+    |||,
+    datasource_type=null,
+    datasource=null,
+    policy=null,
+    measurement=null,
+    job=null,
+    alias=null,
+  ):: common.default_graph(
+    title=title,
+    description=description,
+    datasource=datasource,
+    labelY1='failovers per second',
+    panel_width=8,
+  ).addTarget(common.default_rps_target(
+    datasource_type,
+    'tnt_cartridge_failover_trigger_total',
+    job,
+    policy,
+    measurement,
+    alias,
+  )),
+
   read_only_status(
     title='Tarantool instance status',
     description=|||
@@ -469,7 +498,7 @@ local prometheus = grafana.prometheus;
     title=title,
     description=description,
     datasource=datasource,
-    panel_width=12,
+    panel_width=8,
     max=1,
     min=0,
   ).addValueMapping(
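
The new `failovers_per_second` panel hands the counter `tnt_cartridge_failover_trigger_total` to `common.default_rps_target`, whose body is not part of this diff. As a rough illustration only, a per-second rate over a Prometheus counter could be expressed as in the sketch below. The helper `rps_expr`, the `job` label filter, and the `5m` window are assumptions made for this example, not the repository's implementation.

```jsonnet
// Hypothetical sketch, NOT the repository's `common.default_rps_target`.
// It only shows one way to build a PromQL per-second rate over a counter.
// The `job` label filter and the default `5m` window are assumptions.
local rps_expr(metric_name, job, window='5m') =
  std.format('rate(%s{job=~"%s"}[%s])', [metric_name, job, window]);

{
  // Query string for the counter that the new panel displays.
  failovers_per_second_expr: rps_expr('tnt_cartridge_failover_trigger_total', '$job'),
}
```

Because the metric is a monotonically increasing counter, plotting a rate rather than the raw value matches the panel's `failovers per second` Y-axis label.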

dashboard/section.libsonnet

Lines changed: 15 additions & 0 deletions
@@ -51,6 +51,14 @@ local tdg_tuples = import 'dashboard/panels/tdg/tuples.libsonnet';
       alias=alias,
     ),
 
+    cluster.failovers_per_second(
+      datasource_type=datasource_type,
+      datasource=datasource,
+      policy=policy,
+      measurement=measurement,
+      alias=alias,
+    ),
+
     cluster.read_only_status(
       datasource_type=datasource_type,
       datasource=datasource,
@@ -143,6 +151,13 @@ local tdg_tuples = import 'dashboard/panels/tdg/tuples.libsonnet';
       alias=alias,
     ),
 
+    cluster.failovers_per_second(
+      datasource_type=datasource_type,
+      datasource=datasource,
+      job=job,
+      alias=alias,
+    ),
+
     cluster.read_only_status(
       datasource_type=datasource_type,
       datasource=datasource,
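
The two call sites in `dashboard/section.libsonnet` pass different arguments: the first supplies `policy` and `measurement`, the second supplies `job` (presumably the InfluxDB and Prometheus dashboard variants, respectively). This works because every parameter of `failovers_per_second` defaults to `null`. A minimal sketch of that pattern follows; `describe_target`, the datasource type strings, and the example argument values are all hypothetical and stand in for the real target builder.

```jsonnet
// Hypothetical stand-in for a target builder; all names and strings here are
// assumptions for illustration, not taken from the repository.
local describe_target(datasource_type, metric_name, job=null, policy=null, measurement=null) =
  if datasource_type == 'prometheus' then
    { kind: 'prometheus', metric: metric_name, job: job }
  else if datasource_type == 'influxdb' then
    { kind: 'influxdb', metric: metric_name, policy: policy, measurement: measurement }
  else
    error 'unsupported datasource type: ' + datasource_type;

{
  // Prometheus-style call: only `job` is supplied, the other arguments stay null.
  prom: describe_target('prometheus', 'tnt_cartridge_failover_trigger_total', job='$job'),
  // InfluxDB-style call: `policy` and `measurement` are supplied instead.
  influx: describe_target(
    'influxdb',
    'tnt_cartridge_failover_trigger_total',
    policy='autogen',
    measurement='tarantool_http',
  ),
}
```

Keeping a single function with optional arguments lets both dashboard variants share one panel definition, at the cost of each call site having to know which arguments its datasource needs.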

supported_metrics.md

Lines changed: 1 addition & 1 deletion
@@ -120,7 +120,7 @@ Based on [tarantool/metrics 0.16.0](https://github.com/tarantool/metrics/release
 - [x] **tnt_vinyl_scheduler_dump_total**: see *Tarantool vinyl statistics/Vinyl scheduler dump count rate* panel ([#133](https://github.com/tarantool/grafana-dashboard/issues/133))
 - [x] **tnt_cartridge_issues**: see *Cluster overview/Cartridge warning issues*, *Cluster overview/Cartridge critical issues* panels ([#55](https://github.com/tarantool/grafana-dashboard/pull/55))
 - **tnt_cartridge_cluster_issues**: unsupported (decided not to support: superseded by **tnt_cartridge_issues**)
-- [ ] **tnt_cartridge_failover_trigger_total**: ([#178](https://github.com/tarantool/grafana-dashboard/issues/178))
+- [x] **tnt_cartridge_failover_trigger_total**: see *Cluster overview/Failovers triggered* panel ([#178](https://github.com/tarantool/grafana-dashboard/issues/178))
 - [ ] **tnt_synchro_queue_owner**: ([#178](https://github.com/tarantool/grafana-dashboard/issues/178))
 - [ ] **tnt_synchro_queue_term**: ([#178](https://github.com/tarantool/grafana-dashboard/issues/178))
 - [ ] **tnt_synchro_queue_len**: ([#178](https://github.com/tarantool/grafana-dashboard/issues/178))
