Commit 78fc86d

Merge pull request #35690 from sbeskin-redhat/BZ1964896_Add_BPG_performance_metrics_to_MTC_Troubleshooting
Bz1964896 add bpg performance metrics to mtc troubleshooting
2 parents a6da76d + 50c52be commit 78fc86d

4 files changed

+137
-0
lines changed

migrating_from_ocp_3_to_4/troubleshooting-3-4.adoc

Lines changed: 2 additions & 0 deletions
@@ -22,6 +22,8 @@ This section describes logs and debugging tools that you can use for troubleshoo
2222
include::modules/migration-viewing-migration-plan-resources.adoc[leveloffset=+2]
2323
include::modules/migration-viewing-migration-plan-log.adoc[leveloffset=+2]
2424
include::modules/migration-using-mig-log-reader.adoc[leveloffset=+2]
25+
include::modules/migration-performance-metrics.adoc[leveloffset=+2]
26+
include::modules/migration-accessing-performance-metrics-in-ocp-web-console.adoc[leveloffset=+2]
2527
include::modules/migration-using-must-gather.adoc[leveloffset=+2]
2628
include::modules/migration-debugging-velero-resources.adoc[leveloffset=+2]
2729
include::modules/migration-partial-failure-velero.adoc[leveloffset=+2]

migration_toolkit_for_containers/troubleshooting-mtc.adoc

Lines changed: 1 addition & 0 deletions
@@ -22,6 +22,7 @@ This section describes logs and debugging tools that you can use for troubleshoo
2222
include::modules/migration-viewing-migration-plan-resources.adoc[leveloffset=+2]
2323
include::modules/migration-viewing-migration-plan-log.adoc[leveloffset=+2]
2424
include::modules/migration-using-mig-log-reader.adoc[leveloffset=+2]
25+
include::modules/migration-performance-metrics.adoc[leveloffset=+2]
2526
include::modules/migration-using-must-gather.adoc[leveloffset=+2]
2627
include::modules/migration-debugging-velero-resources.adoc[leveloffset=+2]
2728
include::modules/migration-partial-failure-velero.adoc[leveloffset=+2]

modules/migration-accessing-performance-metrics-in-ocp-web-console.adoc

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * migrating_from_ocp_3_to_4/troubleshooting-3-4.adoc
4+
// * migration_toolkit_for_containers/troubleshooting-mtc.adoc
5+
6+
[id="migration-accessing-performance-metrics-in-ocp-web-console_{context}"]
7+
= Accessing performance metrics in the {product-title} web console
8+
9+
You can access performance metrics and run queries using the {product-title} web console.
10+
11+
.Procedure
12+
. In the {product-title} 4 web console, click *Monitoring* -> *Metrics*.
13+
14+
. Enter PromQL queries, select a time window to display, and click *Run Queries*.
15+
+
16+
If your web browser does not display all the results, use the Prometheus console.
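
The same PromQL queries can also be sent to the Prometheus HTTP API instead of the web console. The following is a minimal sketch that only builds the request URL for the standard instant-query endpoint (`/api/v1/query`). The host name `prometheus.example.com` and the helper name `instant_query_url` are assumptions made for illustration. A real cluster also requires authentication, such as a bearer token, which is not shown here.

```python
from urllib.parse import urlencode

# Hypothetical base URL: replace it with the Prometheus route of your cluster.
PROMETHEUS_URL = "https://prometheus.example.com"


def instant_query_url(base_url: str, promql: str) -> str:
    """Return the URL of an instant query against the Prometheus HTTP API."""
    # urlencode percent-encodes characters such as ( ) { } " = in the query.
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


if __name__ == "__main__":
    queries = (
        "sum(mtc_client_request_count)",
        'cam_app_workload_migrations{status="running"}',
    )
    for promql in queries:
        print(instant_query_url(PROMETHEUS_URL, promql))
```

Building the URL separately from sending the request makes it easy to check the encoded query before adding cluster-specific authentication.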

modules/migration-performance-metrics.adoc

Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * migrating_from_ocp_3_to_4/troubleshooting-3-4.adoc
4+
// * migration_toolkit_for_containers/troubleshooting-mtc.adoc
5+
6+
[id="migration-performance-metrics_{context}"]
7+
= Performance metrics
8+
9+
The `MigrationController` custom resource (CR) records a set of metrics and pulls them into on-cluster monitoring storage. You can query the metrics by using Prometheus Query Language (PromQL) to diagnose migration performance issues. All metrics are reset when the Migration Controller pod restarts.
10+
11+
[id="provided-metrics_{context}"]
12+
== Provided metrics
13+
14+
[id="cam_app_workload_migrations-metric_{context}"]
15+
=== cam_app_workload_migrations
16+
17+
This metric is a count of `MigMigration` CRs over time. Viewing it alongside the `mtc_client_request_count` and `mtc_client_request_elapsed` metrics helps you correlate API request information with migration status changes. This metric is included in Telemetry.
18+
19+
.cam_app_workload_migrations metric
20+
[%header,cols="3,3,3"]
21+
|===
22+
|Queryable label name |Sample label values |Label description
23+
24+
|status
25+
|`running`, `idle`, `failed`, `completed`
26+
|Status of the `MigMigration` CR
27+
28+
|type
29+
|`stage`, `final`
30+
|Type of the `MigMigration` CR
31+
|===
32+
33+
[id="mtc_client_request_count-metric_{context}"]
34+
=== mtc_client_request_count
35+
36+
This metric is a cumulative count of Kubernetes API requests that `MigrationController` issued. It is not included in Telemetry.
37+
38+
.mtc_client_request_count metric
39+
[%header,cols="3,3,3"]
40+
|===
41+
|Queryable label name |Sample label values |Label description
42+
43+
|cluster
44+
|`\https://migcluster-url:443`
45+
|Cluster that the request was issued against
46+
47+
|component
48+
|`MigPlan`, `MigCluster`
49+
|Sub-controller API that issued the request
50+
51+
|function
52+
|`(*ReconcileMigPlan).Reconcile`
53+
|Function that the request was issued from
54+
55+
|kind
56+
|`SecretList`, `Deployment`
57+
|Kubernetes kind that the request was issued for
58+
|===
59+
60+
[id="mtc_client_request_elapsed-metric_{context}"]
61+
=== mtc_client_request_elapsed
62+
63+
This metric is the cumulative latency, in milliseconds, of Kubernetes API requests that `MigrationController` issued. It is not included in Telemetry.
64+
65+
.mtc_client_request_elapsed metric
66+
[%header,cols="3,3,3"]
67+
|===
68+
|Queryable label name |Sample label values |Label description
69+
70+
|cluster
71+
|`\https://cluster-url.com:443`
72+
|Cluster that the request was issued against
73+
74+
|component
75+
|`migplan`, `migcluster`
76+
|Sub-controller API that issued the request
77+
78+
|function
79+
|`(*ReconcileMigPlan).Reconcile`
80+
|Function that the request was issued from
81+
82+
|kind
83+
|`SecretList`, `Deployment`
84+
|Kubernetes resource that the request was issued for
85+
|===
86+
87+
[id="useful-queries_{context}"]
88+
== Useful queries
89+
90+
The following table lists queries that you can use to monitor performance.
91+
92+
.Useful queries
93+
94+
[%header,cols="3,3"]
95+
|===
96+
|Query |Description
97+
98+
|`mtc_client_request_count`
99+
|Number of API requests issued, broken down by request type
100+
101+
|`sum(mtc_client_request_count)`
102+
|Total number of API requests issued
103+
104+
|`mtc_client_request_elapsed`
105+
|API request latency, broken down by request type
106+
107+
|`sum(mtc_client_request_elapsed)`
108+
|Total latency of API requests
109+
110+
|`sum(mtc_client_request_elapsed) / sum(mtc_client_request_count)`
111+
|Average latency of API requests
112+
113+
|`mtc_client_request_elapsed / mtc_client_request_count`
114+
|Average latency of API requests, broken down by request type
115+
116+
|`cam_app_workload_migrations{status="running"} * 100`
117+
|Count of running migrations, multiplied by 100 for easier viewing alongside request counts
118+
|===
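
The two average-latency queries in the table answer different questions, which a small worked example can make concrete. The following sketch reproduces the arithmetic in plain Python. The sample values and the function names are hypothetical and are not taken from a real cluster. It assumes that both metrics expose series with matching label sets, which is what the per-series division in PromQL relies on.

```python
# Hypothetical samples: (labels, request count, cumulative elapsed milliseconds).
SAMPLES = [
    ({"component": "MigPlan", "kind": "SecretList"}, 40, 1200.0),
    ({"component": "MigCluster", "kind": "Deployment"}, 10, 800.0),
]


def overall_average(samples):
    """Mirror sum(mtc_client_request_elapsed) / sum(mtc_client_request_count)."""
    total_count = sum(count for _, count, _ in samples)
    total_elapsed = sum(elapsed for _, _, elapsed in samples)
    return total_elapsed / total_count if total_count else 0.0


def per_series_average(samples):
    """Mirror mtc_client_request_elapsed / mtc_client_request_count."""
    # One result per label set; series with a zero count are skipped.
    return {
        tuple(sorted(labels.items())): elapsed / count
        for labels, count, elapsed in samples
        if count
    }


if __name__ == "__main__":
    print(overall_average(SAMPLES))
    for labels, average in per_series_average(SAMPLES).items():
        print(dict(labels), average)
```

With these sample values, the overall average is 40 ms per request, while the per-series averages are 30 ms and 80 ms. The overall figure is weighted by request count, so a slow but rarely called request type can be hidden in it and only shows up in the per-series view.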
