Commit cacf3ab

Merge pull request #392 from stackhpc/xena-backports
Xena: backport some bits from Yoga
2 parents cd566b4 + be5e64d commit cacf3ab

6 files changed

+380
-44
lines changed

doc/source/configuration/monitoring.rst

Lines changed: 52 additions & 3 deletions
@@ -7,9 +7,20 @@ Monitoring Configuration
77

88
StackHPC kayobe config includes a reference monitoring and alerting stack based
99
on Prometheus, Alertmanager, Grafana, Fluentd, Elasticsearch & Kibana. These
10-
services by default come enabled and configured. Central Elasticsearch cluster
11-
collects OpenStack logs, with an option to receive operating system logs too.
12-
In order to enable this, execute custom playbook after deployment:
10+
services by default come enabled and configured.
11+
12+
Monitoring hosts, usually the controllers, should be added to the monitoring
13+
group. The group definition can be applied in several places. For
14+
example, this configuration could be added to etc/kayobe/inventory/groups:
15+
16+
.. code-block:: ini
17+
18+
[monitoring:children]
19+
controllers
20+
21+
The central Elasticsearch cluster collects OpenStack logs, with an option to
22+
receive operating system logs too. To enable this, run the following custom
23+
playbook after deployment:
1324

1425
.. code-block:: console
1526
@@ -78,3 +89,41 @@ on the overcloud hosts:
7889
7990
SMART reporting should now be enabled along with a Prometheus alert for
8091
unhealthy disks and a Grafana dashboard called ``Hardware Overview``.
92+
93+
Alertmanager and Slack
94+
======================
95+
96+
StackHPC Kayobe configuration comes bundled with an array of alerts but does not
97+
enable any receivers for notifications by default. Various receivers can be
98+
configured for Alertmanager. Slack is currently the most common.
99+
100+
To set up a receiver, create a ``prometheus-alertmanager.yml`` file under
101+
``etc/kayobe/kolla/config/prometheus/``. An example config is stored in this
102+
directory. The example configuration uses two Slack channels. One channel
103+
receives all alerts while the other only receives alerts tagged as critical. It
104+
also adds a silence button to temporarily mute alerts. To use the example in a
105+
deployment, you will need to generate two webhook URLs, one for each channel.
106+
107+
To generate a Slack webhook, `create a new app
108+
<https://api.slack.com/apps/new>`__ in the workspace you want to add alerts to.
109+
From the Features page, toggle Activate incoming webhooks on. Click Add new
110+
webhook to workspace. Pick a channel that the app will post to, then click
111+
Authorise. You only need one app to generate both webhooks.
112+
113+
Both URLs should be encrypted using Ansible Vault, as anyone who has them can
114+
post to your Slack channels. The standard practice is to store them in
115+
``kayobe/secrets.yml`` as:
116+
117+
.. code-block:: yaml
118+
119+
secrets_slack_notification_channel_url: <some_webhook_url>
120+
secrets_slack_critical_notification_channel_url: <some_other_webhook_url>
121+
122+
These should then be set as the ``slack_api_url`` and ``api_url`` for the
123+
regular and critical alerts channels respectively. Both Slack channel names will
124+
need to be set, and the proxy URL should be set or removed.
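
As a rough sketch of how these settings could fit together (the bundled example
file remains the reference), a minimal ``prometheus-alertmanager.yml`` might
look like the following. The receiver names, channel names, ``severity`` label
and proxy address are placeholders assumed for illustration. The sketch also
assumes the file is templated at deploy time so that the secrets variables are
substituted.

.. code-block:: yaml

   # Illustrative sketch only, not the bundled example.
   global:
     # Webhook for the channel that receives all alerts.
     slack_api_url: "{{ secrets_slack_notification_channel_url }}"

   route:
     receiver: slack-all              # hypothetical receiver name
     routes:
       # Critical alerts go to the critical channel first...
       - receiver: slack-critical     # hypothetical receiver name
         match:
           severity: critical         # assumes alerts carry this label
         continue: true               # ...then fall through to the next route
       # Catch-all route so that every alert reaches the general channel.
       - receiver: slack-all

   receivers:
     - name: slack-all
       slack_configs:
         - channel: "#alerts"         # placeholder channel name
           send_resolved: true
     - name: slack-critical
       slack_configs:
         # Overrides the global webhook for the critical channel.
         - api_url: "{{ secrets_slack_critical_notification_channel_url }}"
           channel: "#alerts-critical"  # placeholder channel name
           send_resolved: true
           http_config:
             # Placeholder proxy; remove this key if no proxy is needed.
             proxy_url: "http://proxy.example.com:3128"

The ``continue: true`` setting is what lets a critical alert reach both
channels. Without it, routing would stop at the first matching route.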
125+
126+
If you want to add an alerting rule, many good examples of alerts are
127+
available `here <https://awesome-prometheus-alerts.grep.to/>`__. They simply
128+
need to be added to one of the ``*.rules`` files in the Prometheus configuration
129+
directory.
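
For orientation, a rule added to one of these files could look roughly like the
sketch below. The group name, alert name, threshold and wording are assumptions
chosen for illustration. The expression assumes that node exporter memory
metrics are being scraped.

.. code-block:: yaml

   # Illustrative sketch only; adapt names and thresholds to the deployment.
   groups:
     - name: example-host-alerts      # hypothetical group name
       rules:
         - alert: HostLowMemory       # hypothetical alert name
           # Fires when less than 10% of memory has been available for 5 minutes.
           expr: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes * 100 < 10
           for: 5m
           labels:
             severity: warning
           annotations:
             summary: "Host is running low on available memory"

The ``severity`` label is what a route in the Alertmanager configuration can
match on, so a rule labelled ``critical`` would be the kind delivered to the
critical channel.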
