Commit d93ecfc

setting up standalone monitoring branch
1 parent 8147cb3 commit d93ecfc

174 files changed: +9493 -4 lines changed

_topic_maps/_topic_map.yml

Lines changed: 92 additions & 4 deletions
@@ -1,7 +1,95 @@
11
---
2-
Name: About OpenShift Standalone
3-
Dir: about
2+
Name: About OpenShift Container Platform monitoring
3+
Dir: about-ocp-monitoring
44
Distros: openshift-monitoring
55
Topics:
6-
- Name: OpenShift Standalone overview
7-
File: about-standalone
6+
- Name: About OpenShift Container Platform monitoring
7+
File: about-ocp-monitoring
8+
- Name: Monitoring stack architecture
9+
File: monitoring-stack-architecture
10+
- Name: Key concepts
11+
File: key-concepts
12+
---
13+
# Name: Release notes
14+
# Dir: release-notes
15+
# Distros: openshift-monitoring
16+
# Topics:
17+
# - Name: Monitoring release notes
18+
# File: monitoring-release-notes
19+
# ---
20+
Name: Getting started
21+
Dir: getting-started
22+
Distros: openshift-monitoring
23+
Topics:
24+
- Name: Maintenance and support for monitoring
25+
File: maintenance-and-support-for-monitoring
26+
- Name: Core platform monitoring first steps
27+
File: core-platform-monitoring-first-steps
28+
- Name: User workload monitoring first steps
29+
File: user-workload-monitoring-first-steps
30+
- Name: Developer and non-administrator steps
31+
File: developer-and-non-administrator-steps
32+
---
33+
Name: Configuring core platform monitoring
34+
Dir: configuring-core-platform-monitoring
35+
Distros: openshift-monitoring
36+
Topics:
37+
- Name: Preparing to configure the monitoring stack
38+
File: preparing-to-configure-the-monitoring-stack
39+
- Name: Configuring performance and scalability
40+
File: configuring-performance-and-scalability
41+
- Name: Storing and recording data
42+
File: storing-and-recording-data
43+
- Name: Configuring metrics
44+
File: configuring-metrics
45+
- Name: Configuring alerts and notifications
46+
File: configuring-alerts-and-notifications
47+
---
48+
Name: Configuring user workload monitoring
49+
Dir: configuring-user-workload-monitoring
50+
Distros: openshift-monitoring
51+
Topics:
52+
- Name: Preparing to configure the monitoring stack
53+
File: preparing-to-configure-the-monitoring-stack-uwm
54+
- Name: Configuring performance and scalability
55+
File: configuring-performance-and-scalability-uwm
56+
- Name: Storing and recording data
57+
File: storing-and-recording-data-uwm
58+
- Name: Configuring metrics
59+
File: configuring-metrics-uwm
60+
- Name: Configuring alerts and notifications
61+
File: configuring-alerts-and-notifications-uwm
62+
---
63+
Name: Accessing metrics
64+
Dir: accessing-metrics
65+
Distros: openshift-monitoring
66+
Topics:
67+
- Name: Accessing metrics as an administrator
68+
File: accessing-metrics-as-an-administrator
69+
- Name: Accessing metrics as a developer
70+
File: accessing-metrics-as-a-developer
71+
- Name: Accessing monitoring APIs by using the CLI
72+
File: accessing-monitoring-apis-by-using-the-cli
73+
---
74+
Name: Managing alerts
75+
Dir: managing-alerts
76+
Distros: openshift-monitoring
77+
Topics:
78+
- Name: Managing alerts as an administrator
79+
File: managing-alerts-as-an-administrator
80+
- Name: Managing alerts as a developer
81+
File: managing-alerts-as-a-developer
82+
---
83+
Name: Troubleshooting monitoring issues
84+
Dir: troubleshooting
85+
Distros: openshift-monitoring
86+
Topics:
87+
- Name: Troubleshooting monitoring issues
88+
File: troubleshooting-monitoring-issues
89+
---
90+
Name: Config map reference for the Cluster Monitoring Operator
91+
Dir: config-map-reference
92+
Distros: openshift-monitoring
93+
Topics:
94+
- Name: Config map reference for the Cluster Monitoring Operator
95+
File: config-map-reference-for-the-cluster-monitoring-operator

about-ocp-monitoring/_attributes

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
../_attributes/
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
1+
:_mod-docs-content-type: ASSEMBLY
2+
include::_attributes/common-attributes.adoc[]
3+
[id="about-ocp-monitoring"]
4+
= About {product-title} monitoring
5+
:context: about-ocp-monitoring
6+
7+
toc::[]
8+
9+
ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
10+
{product-title} includes a preconfigured, preinstalled, and self-updating monitoring stack that provides monitoring for core platform components. You also have the option to xref:../configuring-user-workload-monitoring/preparing-to-configure-the-monitoring-stack-uwm.adoc#enabling-monitoring-for-user-defined-projects-uwm_preparing-to-configure-the-monitoring-stack-uwm[enable monitoring for user-defined projects].
11+
12+
A cluster administrator can xref:../configuring-core-platform-monitoring/preparing-to-configure-the-monitoring-stack.adoc#preparing-to-configure-the-monitoring-stack[configure the monitoring stack] with the supported configurations. {product-title} delivers monitoring best practices out of the box.
13+
14+
A set of alerts is included by default to immediately notify administrators about issues with a cluster. Default dashboards in the {product-title} web console include visual representations of cluster metrics to help you quickly understand the state of your cluster. With the {product-title} web console, you can xref:../accessing-metrics/accessing-metrics-as-an-administrator.adoc#accessing-metrics-as-an-administrator[access metrics] and xref:../managing-alerts/managing-alerts-as-an-administrator.adoc#managing-alerts-as-an-administrator[manage alerts].
15+
16+
After installing {product-title}, cluster administrators can optionally enable monitoring for user-defined projects. By using this feature, cluster administrators, developers, and other users can specify how services and pods are monitored in their own projects.
17+
As a cluster administrator, you can find answers to common problems such as user metrics unavailability and high consumption of disk space by Prometheus in xref:../troubleshooting/troubleshooting-monitoring-issues.adoc#troubleshooting-monitoring-issues[Troubleshooting monitoring issues].
18+
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
19+
20+
ifdef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
21+
In {product-title}, you can monitor your own projects in isolation from Red{nbsp}Hat Site Reliability Engineering (SRE) platform metrics. You can monitor your own projects without the need for an additional monitoring solution.
22+
23+
The {product-title}
24+
ifdef::openshift-rosa,openshift-rosa-hcp[]
25+
(ROSA)
26+
endif::openshift-rosa,openshift-rosa-hcp[]
27+
monitoring stack is based on the link:https://prometheus.io/[Prometheus] open source project and its wider ecosystem.
28+
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
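
Reviewer note: the assembly above says cluster administrators can optionally enable monitoring for user-defined projects. A minimal sketch of what that usually looks like follows. It assumes the `cluster-monitoring-config` config map in the `openshift-monitoring` namespace and the `enableUserWorkload` field. None of these names appear in the portion of the commit shown here, so check them against the linked "enable monitoring for user-defined projects" procedure before relying on them.

```yaml
# Sketch only, not part of this commit.
# Assumed names: cluster-monitoring-config, openshift-monitoring, enableUserWorkload.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    # Turns on the user workload monitoring components.
    enableUserWorkload: true
```

A config map like this would typically be applied with `oc apply -f <file>` by a user with cluster administrator permissions.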
File renamed without changes.
Lines changed: 145 additions & 0 deletions
@@ -0,0 +1,145 @@
1+
:_mod-docs-content-type: ASSEMBLY
2+
include::_attributes/common-attributes.adoc[]
3+
[id="key-concepts"]
4+
= Understanding the monitoring stack - key concepts
5+
:context: key-concepts
6+
7+
toc::[]
8+
9+
Get familiar with the {product-title} monitoring concepts and terms. Learn how you can improve the performance and scale of your cluster, store and record data, manage metrics and alerts, and more.
10+
11+
[id="about-performance-and-scalability_{context}"]
12+
== About performance and scalability
13+
14+
You can optimize the performance and scale of your clusters.
15+
You can configure the monitoring stack by performing any of the following actions:
16+
17+
* Control the placement and distribution of monitoring components:
18+
** Use node selectors to move components to specific nodes.
19+
** Assign tolerations to enable moving components to tainted nodes.
20+
* Use pod topology spread constraints.
21+
* Manage CPU and memory resources.
22+
ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
23+
* Set the body size limit for metrics scraping.
24+
* Use metrics collection profiles.
25+
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
26+
27+
[role="_additional-resources"]
28+
.Additional resources
29+
30+
ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
31+
* xref:../configuring-core-platform-monitoring/configuring-performance-and-scalability.adoc#configuring-performance-and-scalability[Configuring performance and scalability for core platform monitoring]
32+
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
33+
* xref:../configuring-user-workload-monitoring/configuring-performance-and-scalability-uwm.adoc#configuring-performance-and-scalability-uwm[Configuring performance and scalability for user workload monitoring]
34+
35+
include::modules/monitoring-using-node-selectors-to-move-monitoring-components.adoc[leveloffset=+2]
36+
37+
include::modules/monitoring-using-pod-topology-spread-constraints-for-monitoring.adoc[leveloffset=+2]
38+
39+
include::modules/monitoring-about-specifying-limits-and-requests-for-monitoring-components.adoc[leveloffset=+2]
40+
41+
ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
42+
include::modules/monitoring-configuring-metrics-collection-profiles.adoc[leveloffset=+2]
43+
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
44+
45+
[id="about-storing-and-recording-data_{context}"]
46+
== About storing and recording data
47+
48+
You can store and record data to protect it from loss and to use it for troubleshooting.
49+
You can configure the monitoring stack by performing any of the following actions:
50+
51+
* Configure persistent storage:
52+
** Protect your metrics and alerting data from data loss by storing them in a persistent volume (PV). As a result, they can survive pods being restarted or recreated.
53+
** Avoid getting duplicate notifications and losing silences for alerts when the Alertmanager pods are restarted.
54+
* Modify the retention time and size for Prometheus and Thanos Ruler metrics data.
55+
* Configure logging to help you troubleshoot issues with your cluster:
56+
ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
57+
** Configure audit logs for Metrics Server.
58+
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
59+
** Set log levels for monitoring.
60+
** Enable query logging for Prometheus and Thanos Querier.
61+
62+
[role="_additional-resources"]
63+
.Additional resources
64+
65+
ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
66+
* xref:../configuring-core-platform-monitoring/storing-and-recording-data.adoc#storing-and-recording-data[Storing and recording data for core platform monitoring]
67+
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
68+
* xref:../configuring-user-workload-monitoring/storing-and-recording-data-uwm.adoc#storing-and-recording-data-uwm[Storing and recording data for user workload monitoring]
69+
70+
include::modules/monitoring-retention-time-and-size-for-prometheus-metrics-data.adoc[leveloffset=+2]
71+
72+
// Understanding metrics
73+
include::modules/monitoring-understanding-metrics.adoc[leveloffset=+1]
74+
75+
[role="_additional-resources"]
76+
.Additional resources
77+
78+
ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
79+
* xref:../configuring-core-platform-monitoring/configuring-metrics.adoc#configuring-metrics[Configuring metrics for core platform monitoring]
80+
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
81+
* xref:../configuring-user-workload-monitoring/configuring-metrics-uwm.adoc#configuring-metrics-uwm[Configuring metrics for user workload monitoring]
82+
* xref:../accessing-metrics/accessing-metrics-as-an-administrator.adoc#accessing-metrics-as-an-administrator[Accessing metrics as an administrator]
83+
* xref:../accessing-metrics/accessing-metrics-as-a-developer.adoc#accessing-metrics-as-a-developer[Accessing metrics as a developer]
84+
85+
include::modules/monitoring-controlling-the-impact-of-unbound-attributes-in-user-defined-projects.adoc[leveloffset=+2]
86+
87+
include::modules/monitoring-adding-cluster-id-labels-to-metrics.adoc[leveloffset=+2]
88+
89+
//About monitoring dashboards
90+
91+
include::modules/monitoring-about-monitoring-dashboards.adoc[leveloffset=+1]
92+
93+
[role="_additional-resources"]
94+
.Additional resources
95+
96+
* xref:../accessing-metrics/accessing-metrics-as-an-administrator.adoc#reviewing-monitoring-dashboards-admin_accessing-metrics-as-an-administrator[Reviewing monitoring dashboards as a cluster administrator]
97+
* xref:../accessing-metrics/accessing-metrics-as-a-developer.adoc#reviewing-monitoring-dashboards-developer_accessing-metrics-as-a-developer[Reviewing monitoring dashboards as a developer]
98+
99+
//Managing alerts
100+
include::modules/monitoring-about-managing-alerts.adoc[leveloffset=+1]
101+
102+
[role="_additional-resources"]
103+
.Additional resources
104+
105+
ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
106+
* xref:../configuring-core-platform-monitoring/configuring-alerts-and-notifications.adoc#configuring-alerts-and-notifications[Configuring alerts and notifications for core platform monitoring]
107+
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
108+
* xref:../configuring-user-workload-monitoring/configuring-alerts-and-notifications-uwm.adoc#configuring-alerts-and-notifications-uwm[Configuring alerts and notifications for user workload monitoring]
109+
* xref:../managing-alerts/managing-alerts-as-an-administrator.adoc#managing-alerts-as-an-administrator[Managing alerts as an administrator]
110+
* xref:../managing-alerts/managing-alerts-as-a-developer.adoc#managing-alerts-as-a-developer[Managing alerts as a developer]
111+
112+
include::modules/monitoring-managing-silences.adoc[leveloffset=+2]
113+
114+
ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
115+
include::modules/monitoring-managing-core-platform-alerting-rules.adoc[leveloffset=+2]
116+
117+
include::modules/monitoring-tips-for-optimizing-alerting-rules-for-core-platform-monitoring.adoc[leveloffset=+2]
118+
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
119+
120+
include::modules/monitoring-about-creating-alerting-rules-for-user-defined-projects.adoc[leveloffset=+2]
121+
122+
include::modules/monitoring-managing-alerting-rules-for-user-defined-projects.adoc[leveloffset=+2]
123+
124+
include::modules/monitoring-optimizing-alerting-for-user-defined-projects.adoc[leveloffset=+2]
125+
126+
include::modules/monitoring-searching-alerts-silences-and-alerting-rules.adoc[leveloffset=+2]
127+
128+
// Overview of setting up alert routing for user-defined projects
129+
include::modules/monitoring-understanding-alert-routing-for-user-defined-projects.adoc[leveloffset=+1]
130+
131+
[role="_additional-resources"]
132+
.Additional resources
133+
134+
* xref:../configuring-user-workload-monitoring/preparing-to-configure-the-monitoring-stack-uwm.adoc#enabling-alert-routing-for-user-defined-projects_preparing-to-configure-the-monitoring-stack-uwm[Enabling alert routing for user-defined projects]
135+
136+
// Sending notifications to external systems
137+
include::modules/monitoring-sending-notifications-to-external-systems.adoc[leveloffset=+1]
138+
139+
[role="_additional-resources"]
140+
.Additional resources
141+
142+
ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
143+
* xref:../configuring-core-platform-monitoring/configuring-alerts-and-notifications.adoc#configuring-alert-notifications_configuring-alerts-and-notifications[Configuring alert notifications for core platform monitoring]
144+
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
145+
* xref:../configuring-user-workload-monitoring/configuring-alerts-and-notifications-uwm.adoc#configuring-alert-notifications_configuring-alerts-and-notifications-uwm[Configuring alert notifications for user workload monitoring]
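
Reviewer note: the "About performance and scalability" and "About storing and recording data" sections above list node selectors, tolerations, resource limits, retention settings, and persistent storage as configurable actions. The sketch below shows how those options might fit together for the core platform Prometheus instance. It assumes the `cluster-monitoring-config` config map and the `prometheusK8s` fields shown. Every value (the label, taint key, sizes, and storage class name) is hypothetical and chosen only for illustration.

```yaml
# Sketch only, not part of this commit. All values are illustrative assumptions.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      # Placement: schedule on nodes carrying a hypothetical label.
      nodeSelector:
        nodetype: infra
      # Placement: allow scheduling on nodes with a hypothetical taint.
      tolerations:
      - key: "monitoring"
        operator: "Exists"
        effect: "NoSchedule"
      # CPU and memory requests and limits for the Prometheus container.
      resources:
        requests:
          cpu: 200m
          memory: 1Gi
        limits:
          cpu: 500m
          memory: 3Gi
      # Retention time and size for metrics data.
      retention: 24h
      retentionSize: 10GB
      # Persistent storage so that data survives pod restarts.
      volumeClaimTemplate:
        spec:
          storageClassName: example-storage-class  # hypothetical
          resources:
            requests:
              storage: 40Gi
```

The same kinds of settings for user workload monitoring would normally go in a separate config map covered by the "-uwm" assemblies linked above, so treat this as a shape to compare against the procedures rather than a drop-in configuration.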

about-ocp-monitoring/modules

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
../modules/
Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
1+
:_mod-docs-content-type: ASSEMBLY
2+
include::_attributes/common-attributes.adoc[]
3+
[id="monitoring-stack-architecture"]
4+
= Monitoring stack architecture
5+
:context: monitoring-stack-architecture
6+
7+
toc::[]
8+
9+
ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
10+
The {product-title} monitoring stack is based on the link:https://prometheus.io/[Prometheus] open source project and its wider ecosystem.
11+
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
12+
The monitoring stack includes default monitoring components and components for monitoring user-defined projects.
13+
14+
// Understanding the monitoring stack
15+
include::modules/monitoring-understanding-the-monitoring-stack.adoc[leveloffset=+1]
16+
17+
ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
18+
//Default monitoring components
19+
include::modules/monitoring-default-monitoring-components.adoc[leveloffset=+1]
20+
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
21+
22+
include::modules/monitoring-default-monitoring-targets.adoc[leveloffset=+2]
23+
24+
[role="_additional-resources"]
25+
.Additional resources
26+
* xref:../accessing-metrics/accessing-metrics-as-an-administrator.adoc#getting-detailed-information-about-a-target_accessing-metrics-as-an-administrator[Getting detailed information about a metrics target]
27+
28+
//Components for monitoring user-defined projects
29+
include::modules/monitoring-components-for-monitoring-user-defined-projects.adoc[leveloffset=+1]
30+
31+
include::modules/monitoring-targets-for-user-defined-projects.adoc[leveloffset=+2]
32+
33+
//The monitoring stack in high-availability clusters
34+
include::modules/monitoring-monitoring-stack-in-ha-clusters.adoc[leveloffset=+1]
35+
36+
[role="_additional-resources"]
37+
.Additional resources
38+
39+
ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
40+
* xref:../configuring-core-platform-monitoring/storing-and-recording-data.adoc#configuring-persistent-storage_storing-and-recording-data[Configuring persistent storage]
41+
* xref:../configuring-core-platform-monitoring/configuring-performance-and-scalability.adoc#configuring-performance-and-scalability[Configuring performance and scalability]
42+
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
43+
44+
ifdef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
45+
* xref:../configuring-user-workload-monitoring/storing-and-recording-data-uwm.adoc#configuring-persistent-storage_storing-and-recording-data-uwm[Configuring persistent storage]
46+
* xref:../configuring-user-workload-monitoring/configuring-performance-and-scalability-uwm.adoc#configuring-performance-and-scalability-uwm[Configuring performance and scalability]
47+
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
48+
49+
//Glossary of common terms for OCP monitoring
50+
include::modules/monitoring-common-terms.adoc[leveloffset=+1]
51+
52+
ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
53+
[role="_additional-resources"]
54+
[id="additional-resources_{context}"]
55+
== Additional resources
56+
* link:https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html-single/support/index#about-remote-health-monitoring[About remote health monitoring]
57+
* xref:../configuring-user-workload-monitoring/preparing-to-configure-the-monitoring-stack-uwm.adoc#granting-users-permission-to-monitor-user-defined-projects_preparing-to-configure-the-monitoring-stack-uwm[Granting users permissions for monitoring for user-defined projects]
58+
* link:https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html-single/security_and_compliance/index#tls-security-profiles[Configuring TLS security profiles]
59+
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]

about-ocp-monitoring/snippets

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
../snippets

accessing-metrics/_attributes

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
../_attributes/
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
1+
:_mod-docs-content-type: ASSEMBLY
2+
include::_attributes/common-attributes.adoc[]
3+
[id="accessing-metrics-as-a-developer"]
4+
= Accessing metrics as a developer
5+
:context: accessing-metrics-as-a-developer
6+
7+
toc::[]
8+
9+
You can access metrics to monitor the performance of your cluster workloads.
10+
11+
[role="_additional-resources"]
12+
.Additional resources
13+
14+
* xref:../about-ocp-monitoring/key-concepts.adoc#understanding-metrics_key-concepts[Understanding metrics]
15+
16+
ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
17+
//Viewing a list of available metrics
18+
include::modules/monitoring-viewing-a-list-of-available-metrics.adoc[leveloffset=+1]
19+
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
20+
21+
//Querying metrics for user-defined projects with the OCP web console
22+
include::modules/monitoring-querying-metrics-for-user-defined-projects-with-mon-dashboard.adoc[leveloffset=+1]
23+
24+
[role="_additional-resources"]
25+
.Additional resources
26+
27+
* link:https://prometheus.io/docs/prometheus/latest/querying/basics/[Querying Prometheus (Prometheus documentation)]
28+
29+
//Reviewing monitoring dashboards as a developer
30+
include::modules/monitoring-reviewing-monitoring-dashboards-developer.adoc[leveloffset=+1]
31+
32+
[role="_additional-resources"]
33+
.Additional resources
34+
35+
* xref:../about-ocp-monitoring/key-concepts.adoc#about-monitoring-dashboards_key-concepts[About monitoring dashboards]
36+
* link:https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html-single/building_applications/index#monitoring-project-and-application-metrics-using-developer-perspective[Monitoring project and application metrics using the Developer perspective]
