Skip to content

Commit 62ce828

Browse files
Merge pull request #29668 from lbarbeevargas/OSDOCS-1818-upgrades-PVC-backed-Prometheus
OSDOCS-1818 for MON-1520 Resource usage of PVC backed Prometheus during upgrades
2 parents 6685c29 + 19973f4 commit 62ce828

File tree

7 files changed

+36
-4
lines changed

7 files changed

+36
-4
lines changed

monitoring/configuring-the-monitoring-stack.adoc

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -71,6 +71,11 @@ Running cluster monitoring with persistent storage means that your metrics are s
7171

7272
[IMPORTANT]
7373
====
74+
If you are running cluster monitoring with an attached PVC for Prometheus, you might experience OOM kills during cluster upgrade. When persistent storage is in use for Prometheus, Prometheus memory usage doubles during cluster upgrade and for several hours after upgrade is complete. To avoid the OOM kill issue, allow worker nodes with double the size of memory that was available prior to the upgrade. For example, if you are running monitoring on the minimum recommended nodes, which is 2 cores with 8 GB of RAM, increase memory to 16 GB. For more information, see link:https://bugzilla.redhat.com/show_bug.cgi?id=1925061[BZ#1925061].
75+
====
76+
77+
[NOTE]
78+
====
7479
See xref:../scalability_and_performance/optimizing-storage.adoc#recommended-configurable-storage-technology_persistent-storage[Recommended configurable storage technology].
7580
====
7681

scalability_and_performance/scaling-cluster-monitoring-operator.adoc

Lines changed: 6 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -5,10 +5,12 @@ include::modules/common-attributes.adoc[]
55

66
toc::[]
77

8-
{product-title} exposes metrics that the Cluster Monitoring Operator collects
9-
and stores in the Prometheus-based monitoring stack. As an
10-
administrator, you can view system resources, containers and components metrics
11-
in one dashboard interface, Grafana.
8+
{product-title} exposes metrics that the Cluster Monitoring Operator collects and stores in the Prometheus-based monitoring stack. As an administrator, you can view system resources, containers and components metrics in one dashboard interface, Grafana.
9+
10+
[IMPORTANT]
11+
====
12+
If you are running cluster monitoring with an attached PVC for Prometheus, you might experience OOM kills during cluster upgrade. When persistent storage is in use for Prometheus, Prometheus memory usage doubles during cluster upgrade and for several hours after upgrade is complete. To avoid the OOM kill issue, allow worker nodes with double the size of memory that was available prior to the upgrade. For example, if you are running monitoring on the minimum recommended nodes, which is 2 cores with 8 GB of RAM, increase memory to 16 GB. For more information, see link:https://bugzilla.redhat.com/show_bug.cgi?id=1925061[BZ#1925061].
13+
====
1214

1315
include::modules/prometheus-database-storage-requirements.adoc[leveloffset=+1]
1416

updating/updating-cluster-between-minor.adoc

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -25,6 +25,11 @@ See xref:../authentication/using-rbac.adoc[Using RBAC to define and apply permis
2525
Using the `unsupportedConfigOverrides` section to modify the configuration of an Operator is unsupported and might block cluster upgrades. You must remove this setting before you can upgrade your cluster.
2626
====
2727

28+
[IMPORTANT]
29+
====
30+
If you are running cluster monitoring with an attached PVC for Prometheus, you might experience OOM kills during cluster upgrade. When persistent storage is in use for Prometheus, Prometheus memory usage doubles during cluster upgrade and for several hours after upgrade is complete. To avoid the OOM kill issue, allow worker nodes with double the size of memory that was available prior to the upgrade. For example, if you are running monitoring on the minimum recommended nodes, which is 2 cores with 8 GB of RAM, increase memory to 16 GB. For more information, see link:https://bugzilla.redhat.com/show_bug.cgi?id=1925061[BZ#1925061].
31+
====
32+
2833
include::modules/update-service-overview.adoc[leveloffset=+1]
2934
.Additional resources
3035

updating/updating-cluster-cli.adoc

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -20,6 +20,11 @@ See xref:../authentication/using-rbac.adoc[Using RBAC to define and apply permis
2020
Using the `unsupportedConfigOverrides` section to modify the configuration of an Operator is unsupported and might block cluster upgrades. You must remove this setting before you can upgrade your cluster.
2121
====
2222

23+
[IMPORTANT]
24+
====
25+
If you are running cluster monitoring with an attached PVC for Prometheus, you might experience OOM kills during cluster upgrade. When persistent storage is in use for Prometheus, Prometheus memory usage doubles during cluster upgrade and for several hours after upgrade is complete. To avoid the OOM kill issue, allow worker nodes with double the size of memory that was available prior to the upgrade. For example, if you are running monitoring on the minimum recommended nodes, which is 2 cores with 8 GB of RAM, increase memory to 16 GB. For more information, see link:https://bugzilla.redhat.com/show_bug.cgi?id=1925061[BZ#1925061].
26+
====
27+
2328
include::modules/update-service-overview.adoc[leveloffset=+1]
2429
.Additional resources
2530

updating/updating-cluster-rhel-compute.adoc

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,11 @@ See xref:../authentication/using-rbac.adoc[Using RBAC to define and apply permis
1616
* Have a recent xref:../backup_and_restore/backing-up-etcd.adoc#backup-etcd[etcd backup] in case your upgrade fails and you must xref:../backup_and_restore/disaster_recovery/scenario-2-restoring-cluster-state.adoc#dr-restoring-cluster-state[restore your cluster to a previous state].
1717
* If your cluster uses manually maintained credentials, ensure that the Cloud Credential Operator (CCO) is in an upgradeable state. For more information, see _Upgrading clusters with manually maintained credentials_ for xref:../installing/installing_aws/manually-creating-iam.adoc#manually-maintained-credentials-upgrade_manually-creating-iam-aws[AWS], xref:../installing/installing_azure/manually-creating-iam-azure.adoc#manually-maintained-credentials-upgrade_manually-creating-iam-azure[Azure], or xref:../installing/installing_gcp/manually-creating-iam-gcp.adoc#manually-maintained-credentials-upgrade_manually-creating-iam-gcp[GCP].
1818

19+
[IMPORTANT]
20+
====
21+
If you are running cluster monitoring with an attached PVC for Prometheus, you might experience OOM kills during cluster upgrade. When persistent storage is in use for Prometheus, Prometheus memory usage doubles during cluster upgrade and for several hours after upgrade is complete. To avoid the OOM kill issue, allow worker nodes with double the size of memory that was available prior to the upgrade. For example, if you are running monitoring on the minimum recommended nodes, which is 2 cores with 8 GB of RAM, increase memory to 16 GB. For more information, see link:https://bugzilla.redhat.com/show_bug.cgi?id=1925061[BZ#1925061].
22+
====
23+
1924
include::modules/update-service-overview.adoc[leveloffset=+1]
2025
.Additional resources
2126

updating/updating-cluster.adoc

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,11 @@ See xref:../authentication/using-rbac.adoc[Using RBAC to define and apply permis
1414
* Have a recent xref:../backup_and_restore/backing-up-etcd.adoc#backup-etcd[etcd backup] in case your upgrade fails and you must xref:../backup_and_restore/disaster_recovery/scenario-2-restoring-cluster-state.adoc#dr-restoring-cluster-state[restore your cluster to a previous state].
1515
* If your cluster uses manually maintained credentials, ensure that the Cloud Credential Operator (CCO) is in an upgradeable state. For more information, see _Upgrading clusters with manually maintained credentials_ for xref:../installing/installing_aws/manually-creating-iam.adoc#manually-maintained-credentials-upgrade_manually-creating-iam-aws[AWS], xref:../installing/installing_azure/manually-creating-iam-azure.adoc#manually-maintained-credentials-upgrade_manually-creating-iam-azure[Azure], or xref:../installing/installing_gcp/manually-creating-iam-gcp.adoc#manually-maintained-credentials-upgrade_manually-creating-iam-gcp[GCP].
1616

17+
[IMPORTANT]
18+
====
19+
If you are running cluster monitoring with an attached PVC for Prometheus, you might experience OOM kills during cluster upgrade. When persistent storage is in use for Prometheus, Prometheus memory usage doubles during cluster upgrade and for several hours after upgrade is complete. To avoid the OOM kill issue, allow worker nodes with double the size of memory that was available prior to the upgrade. For example, if you are running monitoring on the minimum recommended nodes, which is 2 cores with 8 GB of RAM, increase memory to 16 GB. For more information, see link:https://bugzilla.redhat.com/show_bug.cgi?id=1925061[BZ#1925061].
20+
====
21+
1722
include::modules/update-service-overview.adoc[leveloffset=+1]
1823
.Additional resources
1924

updating/updating-restricted-network-cluster.adoc

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -21,6 +21,11 @@ See xref:../authentication/using-rbac.adoc[Using RBAC to define and apply permis
2121
* Have a recent xref:../backup_and_restore/backing-up-etcd.adoc#backup-etcd[etcd backup] in case your upgrade fails and you must xref:../backup_and_restore/disaster_recovery/scenario-2-restoring-cluster-state.adoc#dr-restoring-cluster-state[restore your cluster to a previous state].
2222
* If your cluster uses manually maintained credentials, ensure that the Cloud Credential Operator (CCO) is in an upgradeable state. For more information, see _Upgrading clusters with manually maintained credentials_ for xref:../installing/installing_aws/manually-creating-iam.adoc#manually-maintained-credentials-upgrade_manually-creating-iam-aws[AWS], xref:../installing/installing_azure/manually-creating-iam-azure.adoc#manually-maintained-credentials-upgrade_manually-creating-iam-azure[Azure], or xref:../installing/installing_gcp/manually-creating-iam-gcp.adoc#manually-maintained-credentials-upgrade_manually-creating-iam-gcp[GCP].
2323

24+
[IMPORTANT]
25+
====
26+
If you are running cluster monitoring with an attached PVC for Prometheus, you might experience OOM kills during cluster upgrade. When persistent storage is in use for Prometheus, Prometheus memory usage doubles during cluster upgrade and for several hours after upgrade is complete. To avoid the OOM kill issue, allow worker nodes with double the size of memory that was available prior to the upgrade. For example, if you are running monitoring on the minimum recommended nodes, which is 2 cores with 8 GB of RAM, increase memory to 16 GB. For more information, see link:https://bugzilla.redhat.com/show_bug.cgi?id=1925061[BZ#1925061].
27+
====
28+
2429
[id="updating-restricted-network-mirror-host"]
2530
== Preparing your mirror host
2631

0 commit comments

Comments
 (0)