|
1 | | -# Getting started with genestack monitoring |
| 1 | +# Getting Started with Genestack Monitoring |
2 | 2 |
|
3 | | -In order to begin monitoring your genestack deployment we first need to deploy the core prometheus components |
| 3 | +This guide walks you through setting up a complete monitoring stack for your Genestack deployment. The monitoring system consists of three main layers: metrics collection, visualization, and alerting. |
4 | 4 |
|
5 | | -## Install the Prometheus stack |
| 5 | +## Overview |
6 | 6 |
|
7 | | -Install [Prometheus](prometheus.md) which is part of the kube-prometheus-stack and includes: |
| 7 | +The Genestack monitoring stack includes: |
8 | 8 |
|
9 | | -* Prometheus and the Prometheus operator to manage the Prometheus cluster deployment |
10 | | -* AlertManager which allows for alerting configurations to be set in order to notify various services like email or PagerDuty for specified alerting thresholds |
| 9 | +- **Prometheus** - Time-series database and metrics collection engine |
| 10 | +- **Grafana** - Visualization and dashboards |
| 11 | +- **AlertManager** - Alert routing and notification management |
| 12 | +- **Metric Exporters** - Service-specific metrics collection for OpenStack components |
11 | 13 |
|
12 | | -The [Prometheus](prometheus.md) kube-prometheus-stack will also deploy a couple core metric exporters as part of the stack, those include: |
| 14 | +## Prerequisites |
13 | 15 |
|
14 | | -* Node Exporter(Hardware metrics) |
15 | | -* Kube State Exporter(Kubernetes cluster metrics) |
| 16 | +Before proceeding, ensure you have: |
16 | 17 |
|
17 | | -## Install Grafana |
| 18 | +- A running Genestack deployment |
| 19 | +- Helm 3.x installed |
| 20 | +- Access to your Kubernetes cluster with appropriate permissions |
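
As a rough sketch of checking these prerequisites, the snippet below reports whether `helm` and `kubectl` are on your `PATH`. The helper name `check_tool` is hypothetical and not part of Genestack. It does not verify the Helm version or your cluster permissions.

```shell
# Hypothetical helper: report whether a command-line tool is on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Check the two tools this guide relies on; keep going if one is missing.
check_tool helm || true
check_tool kubectl || true
```

If either tool is reported as missing, install it before continuing with the steps below.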
18 | 21 |
|
19 | | -We can then deploy our visualization dashboard Grafana |
| 22 | +## Step 1: Install the Prometheus Stack |
20 | 23 |
|
21 | | -* [Install Grafana](grafana.md) |
| 24 | +The kube-prometheus-stack is the foundation of your monitoring infrastructure. It deploys and manages the core monitoring components. |
22 | 25 |
|
23 | | -Grafana is used to visualize various metrics provided by the monitoring system as well as alerts and logs, take a look at the [Grafana](https://grafana.com/) documentation for more information |
| 26 | +Install the kube-prometheus-stack, which includes:
24 | 27 |
|
25 | | -## Install the metric exporters and pushgateway |
| 28 | +- **Prometheus Operator** - Manages the Prometheus cluster deployment lifecycle |
| 29 | +- **Prometheus Server** - Collects and stores metrics from configured targets |
| 30 | +- **AlertManager** - Handles alerts sent by Prometheus and routes them to notification channels (email, PagerDuty, Slack, etc.) |
| 31 | +- **Node Exporter** - Collects hardware and OS-level metrics from cluster nodes |
| 32 | +- **Kube State Metrics** - Exposes Kubernetes cluster state metrics |
26 | 33 |
|
27 | | -Now let's deploy our exporters and pushgateway! |
| 34 | +See the [Prometheus installation guide](prometheus.md) for detailed setup instructions. |
28 | 35 |
|
29 | | -* [Mysql Exporter](prometheus-mysql-exporter.md) |
30 | | -* [RabbitMQ Exporter](prometheus-rabbitmq-exporter.md) |
31 | | -* [Postgres Exporter](prometheus-postgres-exporter.md) |
32 | | -* [Memcached Exporter](prometheus-memcached-exporter.md) |
33 | | -* [Openstack Exporter](prometheus-openstack-metrics-exporter.md) |
34 | | -* [Pushgateway](prometheus-pushgateway.md) |
| 36 | +## Step 2: Install Grafana |
35 | 37 |
|
36 | | -## Next steps |
| 38 | +Grafana provides visualization dashboards for your metrics, alerts, and logs. |
37 | 39 |
|
38 | | -### Configure alert manager |
| 40 | +Install Grafana to: |
39 | 41 |
|
40 | | -Configure the alert manager to send the specified alerts to slack as an example, see: [Slack Alerts](alertmanager-slack.md) |
| 42 | +- Create custom dashboards for monitoring OpenStack services |
| 43 | +- Visualize metrics collected by Prometheus |
| 44 | +- Set up alert notifications and integrations |
| 45 | +- Analyze logs and trace data |
41 | 46 |
|
42 | | -... and more ... |
| 47 | +See the [Grafana installation guide](grafana.md) for setup instructions. For more information about Grafana's capabilities, visit the [Grafana documentation](https://grafana.com/).
43 | 48 |
|
44 | | -### Update alerting rules |
| 49 | +## Step 3: Deploy Service-Specific Metric Exporters |
45 | 50 |
|
46 | | -Within the genestack repo we can update our custom alerting rules via the alerting_rules.yaml to fit our needs |
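| 51 | +With the core monitoring stack in place, deploy exporters to collect metrics from your OpenStack services and infrastructure components. Available exporters include [MySQL](prometheus-mysql-exporter.md), [RabbitMQ](prometheus-rabbitmq-exporter.md), [Postgres](prometheus-postgres-exporter.md), [Memcached](prometheus-memcached-exporter.md), and [OpenStack](prometheus-openstack-metrics-exporter.md), along with the [Pushgateway](prometheus-pushgateway.md).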
47 | 52 |
|
48 | | -View alerting_rules.yaml in: |
| 53 | +## Step 4: Configure AlertManager |
49 | 54 |
|
50 | | -``` shell |
| 55 | +Configure AlertManager to send notifications when alerts are triggered. Available integrations include: |
| 56 | + |
| 57 | +- [Slack Alerts](alertmanager-slack.md) - Send alerts to Slack channels |
| 58 | +- Email notifications |
| 59 | +- PagerDuty integration |
| 60 | +- Webhook receivers |
| 61 | + |
| 62 | +## Step 5: Customize Alerting Rules |
| 63 | + |
| 64 | +### Custom Alerting Rules |
| 65 | + |
| 66 | +Genestack includes default alerting rules that can be customized for your environment. To view the custom rules:
| 67 | + |
| 68 | +```shell |
51 | 69 | less /etc/genestack/helm-configs/prometheus/alerting_rules.yaml |
52 | 70 | ``` |
53 | 71 |
|
54 | | -However, many opreators comes with ServiceMonitor and PodMonitor services. These services expose, scrape endpoints |
55 | | -out of the box. These operators will also provide alerting rules curated for the specific service. See specific |
56 | | -service install for any monitoring rules. Example: [RabbitMQ Operator Monitoring](infrastructure-rabbitmq.md#rabbitmq-operator-monitoring) |
| 72 | +Edit this file to add, modify, or remove alerting rules based on your operational requirements. |
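
Before editing, it can help to keep a timestamped copy of the rules file so a bad change is easy to revert. The following is a minimal sketch, not a Genestack-provided tool: the `RULES_FILE` variable and the `backup_rules` helper are hypothetical names introduced here for illustration.

```shell
# Path from this guide; override RULES_FILE if your layout differs.
RULES_FILE="${RULES_FILE:-/etc/genestack/helm-configs/prometheus/alerting_rules.yaml}"

# Hypothetical helper: copy a file to <name>.<timestamp>.bak and print the new path.
backup_rules() {
  src="$1"
  dest="${src}.$(date +%Y%m%d%H%M%S).bak"
  cp -p "$src" "$dest" && echo "$dest"
}

if [ -f "$RULES_FILE" ]; then
  backup_rules "$RULES_FILE"
else
  echo "not found: $RULES_FILE"
fi
```

To roll back, copy the `.bak` file over the edited one.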
| 73 | + |
| 74 | +### Operator-Provided Alerting Rules |
| 75 | + |
| 76 | +Many Genestack operators come with built-in ServiceMonitor and PodMonitor resources that automatically: |
| 77 | + |
| 78 | +- Expose scrape endpoints for metrics collection |
| 79 | +- Provide pre-configured alerting rules tailored to the specific service |
| 80 | + |
| 81 | +These operator-managed rules are curated for best practices and don't require manual configuration. For service-specific monitoring details, refer to the individual service documentation. For example: [RabbitMQ Operator Monitoring](infrastructure-rabbitmq.md#rabbitmq-operator-monitoring). |
| 82 | + |
| 83 | +## Next Steps |
| 84 | + |
| 85 | +Once your monitoring stack is deployed: |
| 86 | + |
| 87 | +1. **Access Grafana** - Log in to Grafana and explore the pre-built dashboards |
| 88 | +2. **Verify Metrics Collection** - Check that Prometheus is successfully scraping all targets |
| 89 | +3. **Test Alerting** - Trigger a test alert to verify AlertManager configuration |
| 90 | +4. **Create Custom Dashboards** - Build dashboards specific to your operational needs |
| 91 | +5. **Tune Alert Thresholds** - Adjust alerting rules based on your environment's baseline behavior |
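
For the metrics verification step, one option is to query the Prometheus HTTP API endpoint `/api/v1/targets` and count targets reported as down. The sketch below assumes Prometheus is reachable at `http://localhost:9090` (for example through a port-forward you have set up yourself; the service name and namespace depend on your deployment). `PROM_URL` and `count_down_targets` are hypothetical names, and the text match is a rough check rather than a full JSON parse.

```shell
# Assumed address; override PROM_URL to match how you expose Prometheus.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Hypothetical helper: read the targets API response on stdin and
# print how many targets report "health":"down".
count_down_targets() {
  { grep -o '"health":"down"' || true; } | wc -l | tr -d ' '
}

if resp="$(curl -fsS --max-time 5 "$PROM_URL/api/v1/targets" 2>/dev/null)"; then
  echo "targets down: $(printf '%s' "$resp" | count_down_targets)"
else
  echo "could not reach Prometheus at $PROM_URL"
fi
```

A non-zero count suggests looking at the Targets page in the Prometheus UI to see which scrape jobs are failing.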