Commit 2cdf972

Merge pull request #65341 from abrennan89/OBSDOCS-248
OBSDOCS-556: Updating logging alerts docs
2 parents cb1758b + e17923b commit 2cdf972

28 files changed, +816 -676 lines changed

_topic_maps/_topic_map.yml

Lines changed: 9 additions & 4 deletions
@@ -2552,6 +2552,15 @@ Topics:
   File: cluster-logging-upgrading
 - Name: Viewing cluster dashboards
   File: cluster-logging-dashboards
+- Name: Logging alerts
+  Dir: logging_alerts
+  Topics:
+  - Name: Default logging alerts
+    File: default-logging-alerts
+  - Name: Custom logging alerts
+    File: custom-logging-alerts
+  - Name: Troubleshooting logging alerts
+    File: troubleshooting-logging-alerts
 - Name: Troubleshooting Logging
   Dir: troubleshooting
   Distros: openshift-enterprise,openshift-origin
@@ -2560,10 +2569,6 @@ Topics:
     File: cluster-logging-cluster-status
   - Name: Viewing the status of the log store
     File: cluster-logging-log-store-status
-  - Name: Understanding Logging alerts
-    File: cluster-logging-alerts
-  - Name: Troubleshooting for Critical Alerts
-    File: cluster-logging-troubleshooting-for-critical-alerts
 - Name: Uninstalling Logging
   File: cluster-logging-uninstall
 - Name: Exported fields

_topic_maps/_topic_map_osd.yml

Lines changed: 9 additions & 4 deletions
@@ -700,17 +700,22 @@ Topics:
   File: cluster-logging-upgrading
 - Name: Viewing cluster dashboards
   File: cluster-logging-dashboards
+- Name: Logging alerts
+  Dir: logging_alerts
+  Topics:
+  - Name: Default logging alerts
+    File: default-logging-alerts
+  - Name: Custom logging alerts
+    File: custom-logging-alerts
+  - Name: Troubleshooting logging alerts
+    File: troubleshooting-logging-alerts
 - Name: Troubleshooting Logging
   Dir: troubleshooting
   Topics:
   - Name: Viewing Logging status
     File: cluster-logging-cluster-status
   - Name: Viewing the status of the log store
     File: cluster-logging-log-store-status
-  - Name: Understanding Logging alerts
-    File: cluster-logging-alerts
-  - Name: Troubleshooting for Critical Alerts
-    File: cluster-logging-troubleshooting-for-critical-alerts
 - Name: Uninstalling Logging
   File: cluster-logging-uninstall
 - Name: Exported fields

_topic_maps/_topic_map_rosa.yml

Lines changed: 9 additions & 4 deletions
@@ -869,17 +869,22 @@ Topics:
   File: cluster-logging-upgrading
 - Name: Viewing cluster dashboards
   File: cluster-logging-dashboards
+- Name: Logging alerts
+  Dir: logging_alerts
+  Topics:
+  - Name: Default logging alerts
+    File: default-logging-alerts
+  - Name: Custom logging alerts
+    File: custom-logging-alerts
+  - Name: Troubleshooting logging alerts
+    File: troubleshooting-logging-alerts
 - Name: Troubleshooting Logging
   Dir: troubleshooting
   Topics:
   - Name: Viewing Logging status
     File: cluster-logging-cluster-status
   - Name: Viewing the status of the log store
     File: cluster-logging-log-store-status
-  - Name: Understanding Logging alerts
-    File: cluster-logging-alerts
-  - Name: Troubleshooting for Critical Alerts
-    File: cluster-logging-troubleshooting-for-critical-alerts
 - Name: Uninstalling Logging
   File: cluster-logging-uninstall
 - Name: Exported fields

logging/logging_alerts/_attributes

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+../../_attributes/
Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+:_content-type: ASSEMBLY
+[id="custom-logging-alerts"]
+include::_attributes/common-attributes.adoc[]
+= Custom logging alerts
+:context: custom-logging-alerts
+
+toc::[]
+
+In logging 5.7 and later versions, users can configure the LokiStack deployment to produce customized alerts and recorded metrics. If you want to use customized link:https://grafana.com/docs/loki/latest/alert/[alerting and recording rules], you must enable the LokiStack ruler component.
+
+LokiStack log-based alerts and recorded metrics are triggered by providing link:https://grafana.com/docs/loki/latest/query/[LogQL] expressions to the ruler component. The Loki Operator manages a ruler that is optimized for the selected LokiStack size, which can be `1x.extra-small`, `1x.small`, or `1x.medium`.
+
+[NOTE]
+====
+The `1x.extra-small` size is not supported. It is for demonstration purposes only.
+====
+
+To provide these expressions, you must create an `AlertingRule` custom resource (CR) containing Prometheus-compatible link:https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/[alerting rules], or a `RecordingRule` CR containing Prometheus-compatible link:https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/[recording rules].
+
+Administrators can configure log-based alerts or recorded metrics for `application`, `audit`, or `infrastructure` tenants. Users without administrator permissions can configure log-based alerts or recorded metrics for `application` tenants of the applications that they have access to.
+
+Application, audit, and infrastructure alerts are sent by default to the {product-title} monitoring stack Alertmanager in the `openshift-monitoring` namespace, unless you have disabled the local Alertmanager instance. If the Alertmanager that is used to monitor user-defined projects in the `openshift-user-workload-monitoring` namespace is enabled, application alerts are sent to the Alertmanager in this namespace by default.
+
+include::modules/configuring-logging-loki-ruler.adoc[leveloffset=+1]
+include::modules/loki-rbac-permissions.adoc[leveloffset=+1]
+
+[role="_additional-resources"]
+.Additional resources
+* xref:../../authentication/using-rbac.adoc#using-rbac[Using RBAC to define and apply permissions]
+
+include::modules/logging-enabling-loki-alerts.adoc[leveloffset=+1]
+
+[role="_additional-resources"]
+[id="additional-resources_custom-logging-alerts"]
+== Additional resources
+* xref:../../monitoring/monitoring-overview.adoc#about-openshift-monitoring_monitoring-overview[About {product-title} monitoring]
+ifdef::openshift-enterprise[]
+* xref:../../post_installation_configuration/configuring-alert-notifications.adoc#configuring-alert-notifications[Configuring alert notifications]
+endif::[]
+// maybe need an update to https://docs.openshift.com/container-platform/4.13/monitoring/monitoring-overview.html#default-monitoring-targets_monitoring-overview to talk about Loki and Vector now? Are these part of default monitoring?
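
For illustration only, the ruler and `AlertingRule` setup described in the `custom-logging-alerts` assembly above might look roughly like the following two-document YAML sketch. It is not part of this commit. The resource names, the `example-app` project, the `example.com/logging-alerts` label, and the LogQL threshold are all hypothetical. The first document is an excerpt of a `LokiStack` CR and omits the other fields that a working `LokiStack` needs.

```yaml
# Sketch 1 (excerpt): enable the ruler on a LokiStack CR.
# Other required LokiStack fields (size, storage, tenants, and so on) are omitted.
apiVersion: loki.grafana.com/v1
kind: LokiStack
metadata:
  name: logging-loki                 # assumed name
  namespace: openshift-logging       # assumed namespace
spec:
  rules:
    enabled: true                    # turns on the ruler component
    selector:
      matchLabels:
        example.com/logging-alerts: "true"   # hypothetical label on rule CRs
    namespaceSelector:
      matchLabels:
        example.com/logging-alerts: "true"   # hypothetical label on namespaces
---
# Sketch 2: a log-based alert for the application tenant.
apiVersion: loki.grafana.com/v1
kind: AlertingRule
metadata:
  name: example-app-errors           # hypothetical
  namespace: example-app             # hypothetical project that carries the namespace label
  labels:
    example.com/logging-alerts: "true"
spec:
  tenantID: application
  groups:
    - name: ExampleAppErrors
      interval: 1m
      rules:
        - alert: ExampleAppHighErrorRate
          # LogQL: per-second rate of log lines containing "error" over 5 minutes
          expr: |
            sum(rate({kubernetes_namespace_name="example-app"} |= "error" [5m])) > 1
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: Error log rate is high in example-app
```

The selectors in the first document decide which `AlertingRule` and `RecordingRule` CRs the ruler picks up, so the labels in the second document are chosen to match them. The exact label keys a cluster should use are covered by the included modules, not by this sketch.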
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+:_content-type: ASSEMBLY
+[id="default-logging-alerts"]
+include::_attributes/common-attributes.adoc[]
+= Default logging alerts
+:context: default-logging-alerts
+
+toc::[]
+
+Logging alerts are installed as part of the Cluster Logging Operator installation. Alerts depend on metrics exported by the log collection and log storage backends. These metrics are enabled if you selected the option to *Enable operator recommended cluster monitoring on this namespace* when installing the Cluster Logging Operator. For more information about installing logging Operators, see xref:../../logging/cluster-logging-deploying.adoc#cluster-logging-deploy-console_cluster-logging-deploying[Installing the {logging-title} using the web console].
+
+Default logging alerts are sent to the {product-title} monitoring stack Alertmanager in the `openshift-monitoring` namespace, unless you have disabled the local Alertmanager instance.
+
+include::modules/monitoring-accessing-the-alerting-ui.adoc[leveloffset=+1]
+include::modules/logging-vector-collector-alerts.adoc[leveloffset=+1]
+include::modules/logging-fluentd-collector-alerts.adoc[leveloffset=+1]
+include::modules/cluster-logging-elasticsearch-rules.adoc[leveloffset=+1]
+
+[role="_additional-resources"]
+[id="additional-resources_default-logging-alerts"]
+== Additional resources
+* xref:../../monitoring/managing-alerts.adoc#modifying-core-platform-alerting-rules_managing-alerts[Modifying core platform alerting rules]

logging/logging_alerts/images

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+../../images/

logging/logging_alerts/modules

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+../../modules/

logging/logging_alerts/snippets

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+../../snippets/
Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+:_content-type: ASSEMBLY
+[id="troubleshooting-logging-alerts"]
+include::_attributes/common-attributes.adoc[]
+= Troubleshooting logging alerts
+:context: troubleshooting-logging-alerts
+
+toc::[]
+
+You can use the following procedures to troubleshoot logging alerts on your cluster.
+
+include::modules/es-cluster-health-is-red.adoc[leveloffset=+1]
+
+[role="_additional-resources"]
+.Additional resources
+* xref:../../monitoring/reviewing-monitoring-dashboards.adoc#reviewing-monitoring-dashboards[Reviewing monitoring dashboards]
+* link:https://www.elastic.co/guide/en/elasticsearch/reference/7.13/fix-common-cluster-issues.html#fix-red-yellow-cluster-status[Fix a red or yellow cluster status]
+
+[id="elasticsearch-cluster-health-is-yellow"]
+== Elasticsearch cluster health status is yellow
+
+Replica shards for at least one primary shard are not allocated to nodes. Increase the node count by adjusting the `nodeCount` value in the `ClusterLogging` custom resource (CR).
+
+[role="_additional-resources"]
+.Additional resources
+* link:https://www.elastic.co/guide/en/elasticsearch/reference/7.13/fix-common-cluster-issues.html#fix-red-yellow-cluster-status[Fix a red or yellow cluster status]
+
+include::modules/es-node-disk-low-watermark-reached.adoc[leveloffset=+1]
+include::modules/es-node-disk-high-watermark-reached.adoc[leveloffset=+1]
+include::modules/es-node-disk-flood-watermark-reached.adoc[leveloffset=+1]
+
+[id="troubleshooting-logging-alerts-es-jvm-heap-use-is-high"]
+== Elasticsearch JVM heap usage is high
+
+The Elasticsearch node Java virtual machine (JVM) heap memory used is above 75%. Consider https://www.elastic.co/guide/en/elasticsearch/reference/current/advanced-configuration.html#set-jvm-heap-size[increasing the heap size].
+
+[id="troubleshooting-logging-alerts-aggregated-logging-system-cpu-is-high"]
+== Aggregated logging system CPU is high
+
+System CPU usage on the node is high. Check the CPU of the cluster node. Consider allocating more CPU resources to the node.
+
+[id="troubleshooting-logging-alerts-es-process-cpu-is-high"]
+== Elasticsearch process CPU is high
+
+Elasticsearch process CPU usage on the node is high. Check the CPU of the cluster node. Consider allocating more CPU resources to the node.
+
+include::modules/es-disk-space-low.adoc[leveloffset=+1]
+
+[role="_additional-resources"]
+.Additional resources
+* link:https://www.elastic.co/guide/en/elasticsearch/reference/7.13/fix-common-cluster-issues.html#fix-red-yellow-cluster-status[Fix a red or yellow cluster status]
+
+[id="troubleshooting-logging-alerts-es-filedescriptor-usage-is-high"]
+== Elasticsearch FileDescriptor usage is high
+
+Based on current usage trends, the predicted number of file descriptors on the node is insufficient. Check the value of `max_file_descriptors` for each node as described in the Elasticsearch link:https://www.elastic.co/guide/en/elasticsearch/reference/6.8/file-descriptors.html[File Descriptors] documentation.
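
For the yellow cluster health case in the `troubleshooting-logging-alerts` assembly above, the `nodeCount` adjustment might be sketched as the following excerpt of a `ClusterLogging` CR. It is not part of this commit. The namespace and the value `3` are assumptions chosen for illustration, and all other fields of the CR are omitted.

```yaml
# Excerpt of a ClusterLogging CR: only the Elasticsearch node count is shown.
apiVersion: logging.openshift.io/v1
kind: ClusterLogging
metadata:
  name: instance                   # assumed: conventional name of the CR
  namespace: openshift-logging     # assumed namespace
spec:
  logStore:
    type: elasticsearch
    elasticsearch:
      nodeCount: 3                 # illustrative value; raise it so replica shards can be allocated
```

Whether a higher node count is appropriate depends on the capacity available in the cluster, so treat the value here as a placeholder.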
