Commit b5b31aa

Merge pull request #51059 from bburt-rh/RHDEVDOCS-4472-new-sample-code-for-how-to-view-telemetry-metrics
RHDEVDOCS-4472 -Add query to view metrics collected by Telemetry
2 parents 6bcf712 + 9710832 commit b5b31aa

1 file changed

+155
-15
lines changed

modules/telemetry-showing-data-collected-from-the-cluster.adoc

Lines changed: 155 additions & 15 deletions
@@ -6,28 +6,168 @@
66
[id="showing-data-collected-from-the-cluster_{context}"]
77
= Showing data collected by Telemetry
88

9-
You can see the cluster and components time series data captured by Telemetry.
9+
You can view the cluster and components time series data captured by Telemetry.
1010

1111
.Prerequisites
1212

13-
* Install the OpenShift CLI (`oc`).
14-
* You must log in to the cluster with a user that has either the `cluster-admin` role or the `cluster-monitoring-view` role.
13+
* You have installed the {product-title} CLI (`oc`).
14+
* You have access to the cluster as a user with the `cluster-admin` role or the `cluster-monitoring-view` role.
1515
1616
.Procedure
1717

18-
. Find the URL for the Prometheus service that runs in the {product-title} cluster:
19-
+
20-
[source,terminal]
21-
----
22-
$ oc get route prometheus-k8s -n openshift-monitoring -o jsonpath="{.spec.host}"
23-
----
18+
. Log in to a cluster.
2419

25-
. Navigate to the URL.
26-
27-
. Enter this query in the *Expression* input box and press *Execute*:
20+
. Run the following command, which queries a cluster's Prometheus service and returns the full set of time series data captured by Telemetry:
2821
+
22+
[source,terminal]
2923
----
30-
{__name__=~"cluster:usage:.*|count:up0|count:up1|cluster_version|cluster_version_available_updates|cluster_operator_up|cluster_operator_conditions|cluster_version_payload|cluster_installer|cluster_infrastructure_provider|cluster_feature_set|instance:etcd_object_counts:sum|ALERTS|code:apiserver_request_total:rate:sum|cluster:capacity_cpu_cores:sum|cluster:capacity_memory_bytes:sum|cluster:cpu_usage_cores:sum|cluster:memory_usage_bytes:sum|openshift:cpu_usage_cores:sum|openshift:memory_usage_bytes:sum|workload:cpu_usage_cores:sum|workload:memory_usage_bytes:sum|cluster:virt_platform_nodes:sum|cluster:node_instance_type_count:sum|cnv:vmi_status_running:count|cluster:vmi_request_cpu_cores:sum|node_role_os_version_machine:cpu_capacity_cores:sum|node_role_os_version_machine:cpu_capacity_sockets:sum|subscription_sync_total|olm_resolution_duration_seconds|csv_succeeded|csv_abnormal|cluster:kube_persistentvolumeclaim_resource_requests_storage_bytes:provisioner:sum|cluster:kubelet_volume_stats_used_bytes:provisioner:sum|ceph_cluster_total_bytes|ceph_cluster_total_used_raw_bytes|ceph_health_status|job:ceph_osd_metadata:count|job:kube_pv:count|job:ceph_pools_iops:total|job:ceph_pools_iops_bytes:total|job:ceph_versions_running:count|job:noobaa_total_unhealthy_buckets:sum|job:noobaa_bucket_count:sum|job:noobaa_total_object_count:sum|noobaa_accounts_num|noobaa_total_usage|console_url|cluster:network_attachment_definition_instances:max|cluster:network_attachment_definition_enabled_instance_up:max|cluster:ingress_controller_aws_nlb_active:sum|insightsclient_request_send_total|cam_app_workload_migrations|cluster:apiserver_current_inflight_requests:sum:max_over_time:2m|cluster:alertmanager_integrations:max|cluster:telemetry_selected_series:count|openshift:prometheus_tsdb_head_series:sum|openshift:prometheus_tsdb_head_samples_appended_total:sum|monitoring:container_memory_working_set_bytes:sum|namespace_job:scrape_series_added:topk3_sum1h|namespace_job:scrape_samples_post_metric_relabeling:topk3|monitoring:haproxy_server_http_responses_total:sum|rhmi_status|cluster_legacy_scheduler_policy|cluster_master_schedulable|che_workspace_status|che_workspace_started_total|che_workspace_failure_total|che_workspace_start_time_seconds_sum|che_workspace_start_time_seconds_count|cco_credentials_mode|cluster:kube_persistentvolume_plugin_type_counts:sum|visual_web_terminal_sessions_total|acm_managed_cluster_info|cluster:vsphere_vcenter_info:sum|cluster:vsphere_esxi_version_total:sum|cluster:vsphere_node_hw_version_total:sum|openshift:build_by_strategy:sum|rhods_aggregate_availability|rhods_total_users|instance:etcd_disk_wal_fsync_duration_seconds:histogram_quantile|instance:etcd_mvcc_db_total_size_in_bytes:sum|instance:etcd_network_peer_round_trip_time_seconds:histogram_quantile|instance:etcd_mvcc_db_total_size_in_use_in_bytes:sum|instance:etcd_disk_backend_commit_duration_seconds:histogram_quantile|jaeger_operator_instances_storage_types|jaeger_operator_instances_strategies|jaeger_operator_instances_agent_strategies|appsvcs:cores_by_product:sum|nto_custom_profiles:count|openshift_csi_share_configmap|openshift_csi_share_secret|openshift_csi_share_mount_failures_total|openshift_csi_share_mount_requests_total",alertstate=~"firing|"}
24+
$ curl -G -k -H "Authorization: Bearer $(oc whoami -t)" \
25+
https://$(oc get route prometheus-k8s-federate -n \
26+
openshift-monitoring -o jsonpath="{.spec.host}")/federate \
27+
--data-urlencode 'match[]={__name__=~"cluster:usage:.*"}' \
28+
--data-urlencode 'match[]={__name__="count:up0"}' \
29+
--data-urlencode 'match[]={__name__="count:up1"}' \
30+
--data-urlencode 'match[]={__name__="cluster_version"}' \
31+
--data-urlencode 'match[]={__name__="cluster_version_available_updates"}' \
32+
--data-urlencode 'match[]={__name__="cluster_version_capability"}' \
33+
--data-urlencode 'match[]={__name__="cluster_operator_up"}' \
34+
--data-urlencode 'match[]={__name__="cluster_operator_conditions"}' \
35+
--data-urlencode 'match[]={__name__="cluster_version_payload"}' \
36+
--data-urlencode 'match[]={__name__="cluster_installer"}' \
37+
--data-urlencode 'match[]={__name__="cluster_infrastructure_provider"}' \
38+
--data-urlencode 'match[]={__name__="cluster_feature_set"}' \
39+
--data-urlencode 'match[]={__name__="instance:etcd_object_counts:sum"}' \
40+
--data-urlencode 'match[]={__name__="ALERTS",alertstate="firing"}' \
41+
--data-urlencode 'match[]={__name__="code:apiserver_request_total:rate:sum"}' \
42+
--data-urlencode 'match[]={__name__="cluster:capacity_cpu_cores:sum"}' \
43+
--data-urlencode 'match[]={__name__="cluster:capacity_memory_bytes:sum"}' \
44+
--data-urlencode 'match[]={__name__="cluster:cpu_usage_cores:sum"}' \
45+
--data-urlencode 'match[]={__name__="cluster:memory_usage_bytes:sum"}' \
46+
--data-urlencode 'match[]={__name__="openshift:cpu_usage_cores:sum"}' \
47+
--data-urlencode 'match[]={__name__="openshift:memory_usage_bytes:sum"}' \
48+
--data-urlencode 'match[]={__name__="workload:cpu_usage_cores:sum"}' \
49+
--data-urlencode 'match[]={__name__="workload:memory_usage_bytes:sum"}' \
50+
--data-urlencode 'match[]={__name__="cluster:virt_platform_nodes:sum"}' \
51+
--data-urlencode 'match[]={__name__="cluster:node_instance_type_count:sum"}' \
52+
--data-urlencode 'match[]={__name__="cnv:vmi_status_running:count"}' \
53+
--data-urlencode 'match[]={__name__="cluster:vmi_request_cpu_cores:sum"}' \
54+
--data-urlencode 'match[]={__name__="node_role_os_version_machine:cpu_capacity_cores:sum"}' \
55+
--data-urlencode 'match[]={__name__="node_role_os_version_machine:cpu_capacity_sockets:sum"}' \
56+
--data-urlencode 'match[]={__name__="subscription_sync_total"}' \
57+
--data-urlencode 'match[]={__name__="olm_resolution_duration_seconds"}' \
58+
--data-urlencode 'match[]={__name__="csv_succeeded"}' \
59+
--data-urlencode 'match[]={__name__="csv_abnormal"}' \
60+
--data-urlencode 'match[]={__name__="cluster:kube_persistentvolumeclaim_resource_requests_storage_bytes:provisioner:sum"}' \
61+
--data-urlencode 'match[]={__name__="cluster:kubelet_volume_stats_used_bytes:provisioner:sum"}' \
62+
--data-urlencode 'match[]={__name__="ceph_cluster_total_bytes"}' \
63+
--data-urlencode 'match[]={__name__="ceph_cluster_total_used_raw_bytes"}' \
64+
--data-urlencode 'match[]={__name__="ceph_health_status"}' \
65+
--data-urlencode 'match[]={__name__="odf_system_raw_capacity_total_bytes"}' \
66+
--data-urlencode 'match[]={__name__="odf_system_raw_capacity_used_bytes"}' \
67+
--data-urlencode 'match[]={__name__="odf_system_health_status"}' \
68+
--data-urlencode 'match[]={__name__="job:ceph_osd_metadata:count"}' \
69+
--data-urlencode 'match[]={__name__="job:kube_pv:count"}' \
70+
--data-urlencode 'match[]={__name__="job:odf_system_pvs:count"}' \
71+
--data-urlencode 'match[]={__name__="job:ceph_pools_iops:total"}' \
72+
--data-urlencode 'match[]={__name__="job:ceph_pools_iops_bytes:total"}' \
73+
--data-urlencode 'match[]={__name__="job:ceph_versions_running:count"}' \
74+
--data-urlencode 'match[]={__name__="job:noobaa_total_unhealthy_buckets:sum"}' \
75+
--data-urlencode 'match[]={__name__="job:noobaa_bucket_count:sum"}' \
76+
--data-urlencode 'match[]={__name__="job:noobaa_total_object_count:sum"}' \
77+
--data-urlencode 'match[]={__name__="odf_system_bucket_count", system_type="OCS", system_vendor="Red Hat"}' \
78+
--data-urlencode 'match[]={__name__="odf_system_objects_total", system_type="OCS", system_vendor="Red Hat"}' \
79+
--data-urlencode 'match[]={__name__="noobaa_accounts_num"}' \
80+
--data-urlencode 'match[]={__name__="noobaa_total_usage"}' \
81+
--data-urlencode 'match[]={__name__="console_url"}' \
82+
--data-urlencode 'match[]={__name__="cluster:ovnkube_master_egress_routing_via_host:max"}' \
83+
--data-urlencode 'match[]={__name__="cluster:network_attachment_definition_instances:max"}' \
84+
--data-urlencode 'match[]={__name__="cluster:network_attachment_definition_enabled_instance_up:max"}' \
85+
--data-urlencode 'match[]={__name__="cluster:ingress_controller_aws_nlb_active:sum"}' \
86+
--data-urlencode 'match[]={__name__="cluster:route_metrics_controller_routes_per_shard:min"}' \
87+
--data-urlencode 'match[]={__name__="cluster:route_metrics_controller_routes_per_shard:max"}' \
88+
--data-urlencode 'match[]={__name__="cluster:route_metrics_controller_routes_per_shard:avg"}' \
89+
--data-urlencode 'match[]={__name__="cluster:route_metrics_controller_routes_per_shard:median"}' \
90+
--data-urlencode 'match[]={__name__="cluster:openshift_route_info:tls_termination:sum"}' \
91+
--data-urlencode 'match[]={__name__="insightsclient_request_send_total"}' \
92+
--data-urlencode 'match[]={__name__="cam_app_workload_migrations"}' \
93+
--data-urlencode 'match[]={__name__="cluster:apiserver_current_inflight_requests:sum:max_over_time:2m"}' \
94+
--data-urlencode 'match[]={__name__="cluster:alertmanager_integrations:max"}' \
95+
--data-urlencode 'match[]={__name__="cluster:telemetry_selected_series:count"}' \
96+
--data-urlencode 'match[]={__name__="openshift:prometheus_tsdb_head_series:sum"}' \
97+
--data-urlencode 'match[]={__name__="openshift:prometheus_tsdb_head_samples_appended_total:sum"}' \
98+
--data-urlencode 'match[]={__name__="monitoring:container_memory_working_set_bytes:sum"}' \
99+
--data-urlencode 'match[]={__name__="namespace_job:scrape_series_added:topk3_sum1h"}' \
100+
--data-urlencode 'match[]={__name__="namespace_job:scrape_samples_post_metric_relabeling:topk3"}' \
101+
--data-urlencode 'match[]={__name__="monitoring:haproxy_server_http_responses_total:sum"}' \
102+
--data-urlencode 'match[]={__name__="rhmi_status"}' \
103+
--data-urlencode 'match[]={__name__="status:upgrading:version:rhoam_state:max"}' \
104+
--data-urlencode 'match[]={__name__="state:rhoam_critical_alerts:max"}' \
105+
--data-urlencode 'match[]={__name__="state:rhoam_warning_alerts:max"}' \
106+
--data-urlencode 'match[]={__name__="rhoam_7d_slo_percentile:max"}' \
107+
--data-urlencode 'match[]={__name__="rhoam_7d_slo_remaining_error_budget:max"}' \
108+
--data-urlencode 'match[]={__name__="cluster_legacy_scheduler_policy"}' \
109+
--data-urlencode 'match[]={__name__="cluster_master_schedulable"}' \
110+
--data-urlencode 'match[]={__name__="che_workspace_status"}' \
111+
--data-urlencode 'match[]={__name__="che_workspace_started_total"}' \
112+
--data-urlencode 'match[]={__name__="che_workspace_failure_total"}' \
113+
--data-urlencode 'match[]={__name__="che_workspace_start_time_seconds_sum"}' \
114+
--data-urlencode 'match[]={__name__="che_workspace_start_time_seconds_count"}' \
115+
--data-urlencode 'match[]={__name__="cco_credentials_mode"}' \
116+
--data-urlencode 'match[]={__name__="cluster:kube_persistentvolume_plugin_type_counts:sum"}' \
117+
--data-urlencode 'match[]={__name__="visual_web_terminal_sessions_total"}' \
118+
--data-urlencode 'match[]={__name__="acm_managed_cluster_info"}' \
119+
--data-urlencode 'match[]={__name__="cluster:vsphere_vcenter_info:sum"}' \
120+
--data-urlencode 'match[]={__name__="cluster:vsphere_esxi_version_total:sum"}' \
121+
--data-urlencode 'match[]={__name__="cluster:vsphere_node_hw_version_total:sum"}' \
122+
--data-urlencode 'match[]={__name__="openshift:build_by_strategy:sum"}' \
123+
--data-urlencode 'match[]={__name__="rhods_aggregate_availability"}' \
124+
--data-urlencode 'match[]={__name__="rhods_total_users"}' \
125+
--data-urlencode 'match[]={__name__="instance:etcd_disk_wal_fsync_duration_seconds:histogram_quantile",quantile="0.99"}' \
126+
--data-urlencode 'match[]={__name__="instance:etcd_mvcc_db_total_size_in_bytes:sum"}' \
127+
--data-urlencode 'match[]={__name__="instance:etcd_network_peer_round_trip_time_seconds:histogram_quantile",quantile="0.99"}' \
128+
--data-urlencode 'match[]={__name__="instance:etcd_mvcc_db_total_size_in_use_in_bytes:sum"}' \
129+
--data-urlencode 'match[]={__name__="instance:etcd_disk_backend_commit_duration_seconds:histogram_quantile",quantile="0.99"}' \
130+
--data-urlencode 'match[]={__name__="jaeger_operator_instances_storage_types"}' \
131+
--data-urlencode 'match[]={__name__="jaeger_operator_instances_strategies"}' \
132+
--data-urlencode 'match[]={__name__="jaeger_operator_instances_agent_strategies"}' \
133+
--data-urlencode 'match[]={__name__="appsvcs:cores_by_product:sum"}' \
134+
--data-urlencode 'match[]={__name__="nto_custom_profiles:count"}' \
135+
--data-urlencode 'match[]={__name__="openshift_csi_share_configmap"}' \
136+
--data-urlencode 'match[]={__name__="openshift_csi_share_secret"}' \
137+
--data-urlencode 'match[]={__name__="openshift_csi_share_mount_failures_total"}' \
138+
--data-urlencode 'match[]={__name__="openshift_csi_share_mount_requests_total"}' \
139+
--data-urlencode 'match[]={__name__="cluster:velero_backup_total:max"}' \
140+
--data-urlencode 'match[]={__name__="cluster:velero_restore_total:max"}' \
141+
--data-urlencode 'match[]={__name__="eo_es_storage_info"}' \
142+
--data-urlencode 'match[]={__name__="eo_es_redundancy_policy_info"}' \
143+
--data-urlencode 'match[]={__name__="eo_es_defined_delete_namespaces_total"}' \
144+
--data-urlencode 'match[]={__name__="eo_es_misconfigured_memory_resources_info"}' \
145+
--data-urlencode 'match[]={__name__="cluster:eo_es_data_nodes_total:max"}' \
146+
--data-urlencode 'match[]={__name__="cluster:eo_es_documents_created_total:sum"}' \
147+
--data-urlencode 'match[]={__name__="cluster:eo_es_documents_deleted_total:sum"}' \
148+
--data-urlencode 'match[]={__name__="pod:eo_es_shards_total:max"}' \
149+
--data-urlencode 'match[]={__name__="eo_es_cluster_management_state_info"}' \
150+
--data-urlencode 'match[]={__name__="imageregistry:imagestreamtags_count:sum"}' \
151+
--data-urlencode 'match[]={__name__="imageregistry:operations_count:sum"}' \
152+
--data-urlencode 'match[]={__name__="log_logging_info"}' \
153+
--data-urlencode 'match[]={__name__="log_collector_error_count_total"}' \
154+
--data-urlencode 'match[]={__name__="log_forwarder_pipeline_info"}' \
155+
--data-urlencode 'match[]={__name__="log_forwarder_input_info"}' \
156+
--data-urlencode 'match[]={__name__="log_forwarder_output_info"}' \
157+
--data-urlencode 'match[]={__name__="cluster:log_collected_bytes_total:sum"}' \
158+
--data-urlencode 'match[]={__name__="cluster:log_logged_bytes_total:sum"}' \
159+
--data-urlencode 'match[]={__name__="cluster:kata_monitor_running_shim_count:sum"}' \
160+
--data-urlencode 'match[]={__name__="platform:hypershift_hostedclusters:max"}' \
161+
--data-urlencode 'match[]={__name__="platform:hypershift_nodepools:max"}' \
162+
--data-urlencode 'match[]={__name__="namespace:noobaa_unhealthy_bucket_claims:max"}' \
163+
--data-urlencode 'match[]={__name__="namespace:noobaa_buckets_claims:max"}' \
164+
--data-urlencode 'match[]={__name__="namespace:noobaa_unhealthy_namespace_resources:max"}' \
165+
--data-urlencode 'match[]={__name__="namespace:noobaa_namespace_resources:max"}' \
166+
--data-urlencode 'match[]={__name__="namespace:noobaa_unhealthy_namespace_buckets:max"}' \
167+
--data-urlencode 'match[]={__name__="namespace:noobaa_namespace_buckets:max"}' \
168+
--data-urlencode 'match[]={__name__="namespace:noobaa_accounts:max"}' \
169+
--data-urlencode 'match[]={__name__="namespace:noobaa_usage:max"}' \
170+
--data-urlencode 'match[]={__name__="namespace:noobaa_system_health_status:max"}' \
171+
--data-urlencode 'match[]={__name__="ocs_advanced_feature_usage"}' \
172+
--data-urlencode 'match[]={__name__="os_image_url_override:sum"}'
31173
----
32-
+
33-
This query replicates the request that Telemetry makes against a running {product-title} cluster's Prometheus service and returns the full set of time series captured by Telemetry.
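
The added command repeats one `--data-urlencode 'match[]=...'` pair per series, so a narrower request for a quick check can be assembled from a short list of metric names. The following is a minimal sketch and not part of the commit: `build_match_args` is a hypothetical helper name, it only handles exact-name selectors (no extra labels such as `quantile`), and the commented `curl` call assumes the same `prometheus-k8s-federate` route and bearer-token authentication used in the diff.

```shell
# Hypothetical helper (not part of the commit): for each metric name given,
# print the two curl arguments used in the diff, one argument per line:
#   --data-urlencode
#   match[]={__name__="<metric>"}
build_match_args() {
  for metric in "$@"; do
    printf '%s\n' '--data-urlencode' "match[]={__name__=\"${metric}\"}"
  done
}

# Show the arguments for two of the series listed in the diff.
build_match_args cluster_version cluster_operator_up

# Assumed usage against a live cluster (bash, requires a logged-in `oc` session):
#   mapfile -t match_args < <(build_match_args cluster_version cluster_operator_up)
#   curl -G -k -H "Authorization: Bearer $(oc whoami -t)" \
#     "https://$(oc get route prometheus-k8s-federate -n openshift-monitoring \
#       -o jsonpath="{.spec.host}")/federate" "${match_args[@]}"
```

Emitting one argument per line keeps each selector intact when it is read into an array, which avoids word-splitting problems with the braces and quotes in the selector.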
