@@ -7,12 +7,12 @@ Operations and Monitoring
Access to Kibana
================

- OpenStack control plane logs are aggregated from all servers by Monasca and
+ OpenStack control plane logs are aggregated from all servers by Fluentd and
stored in ElasticSearch. The control plane logs can be accessed from
ElasticSearch using Kibana, which is available at the following URL:
|kibana_url|

- To login, use the ``kibana`` user. The password is auto-generated by
+ To log in, use the ``kibana`` user. The password is auto-generated by
Kolla-Ansible and can be extracted from the encrypted passwords file
(|kolla_passwords|):

@@ -24,19 +24,32 @@ Kolla-Ansible and can be extracted from the encrypted passwords file
Access to Grafana
=================

- Monasca metrics can be visualised in Grafana dashboards. Monasca Grafana can be
+ Control plane metrics can be visualised in Grafana dashboards. Grafana can be
found at the following address: |grafana_url|

- Grafana uses Keystone authentication. To login, use valid OpenStack user
- credentials.
+ To log in, use the |grafana_username| user. The password is auto-generated by
+ Kolla-Ansible and can be extracted from the encrypted passwords file
+ (|kolla_passwords|):
+
+ .. code-block:: console
+    :substitutions:
+
+    kayobe# ansible-vault view ${KAYOBE_CONFIG_PATH}/kolla/passwords.yml --vault-password-file |vault_password_file_path| | grep ^grafana_admin_password
+
+ Access to Prometheus Alertmanager
+ =================================

- To visualise control plane metrics, you will need one of the following roles in
- the ``monasca_control_plane`` project:
+ Control plane alerts can be visualised and managed in Alertmanager, which can
+ be found at the following address: |alertmanager_url|

- * ``admin``
- * ``monasca-user``
- * ``monasca-read-only-user``
- * ``monasca-editor``
+ To log in, use the ``admin`` user. The password is auto-generated by
+ Kolla-Ansible and can be extracted from the encrypted passwords file
+ (|kolla_passwords|):
+
+ .. code-block:: console
+    :substitutions:
+
+    kayobe# ansible-vault view ${KAYOBE_CONFIG_PATH}/kolla/passwords.yml --vault-password-file |vault_password_file_path| | grep ^prometheus_alertmanager_password


Migrating virtual machines
==========================
@@ -246,6 +259,7 @@ Monitoring

* `Back up InfluxDB <https://docs.influxdata.com/influxdb/v1.8/administration/backup_and_restore/>`__
* `Back up ElasticSearch <https://www.elastic.co/guide/en/elasticsearch/reference/current/backup-cluster-data.html>`__
+ * `Back up Prometheus <https://prometheus.io/docs/prometheus/latest/querying/api/#snapshot>`__

Seed
----
@@ -260,137 +274,21 @@ Ansible control host
Control Plane Monitoring
========================

- Monasca has been configured to collect logs and metrics across the control
- plane. It provides a single point where control plane monitoring and telemetry
- data can be analysed and correlated.
-
- Metrics are collected per server via the `Monasca Agent
- <https://opendev.org/openstack/monasca-agent>`__. The Monasca Agent is deployed
- and configured by Kolla Ansible.
-
- Logging to Monasca is done via a `Fluentd output plugin
- <https://github.com/monasca/fluentd-monasca>`__.
-
- Configuring Monasca Alerts
- --------------------------
-
- Generating Metrics from Specific Log Messages
- +++++++++++++++++++++++++++++++++++++++++++++
-
- If you wish to generate alerts for specific log messages, you must first
- generate metrics from those log messages. Metrics are generated from the
- transformed logs queue in Kafka. The Monasca log metrics service reads log
- messages from this queue, transforms them into metrics and then writes them to
- the metrics queue.
-
- The rules which govern this transformation are defined in the logstash config
- file. This file can be configured via kayobe. To do this, edit
- ``etc/kayobe/kolla/config/monasca/log-metrics.conf``, for example:
-
- .. code-block:: text
-
-    # Create events from specific log signatures
-    filter {
-      if "Another thread already created a resource provider" in [log][message] {
-        mutate {
-          add_field => { "[log][dimensions][event]" => "hat" }
-        }
-      } else if "My string here" in [log][message] {
-        mutate {
-          add_field => { "[log][dimensions][event]" => "my_new_alert" }
-        }
-      }
-
- Reconfigure Monasca:
-
- .. code-block:: text
-
-    kayobe# kayobe overcloud service reconfigure --kolla-tags monasca
-
- Verify that logstash doesn't complain about your modification. On each node
- running the ``monasca-log-metrics`` service, the logs can be inspected in the
- Kolla logs directory, under the ``logstash`` folder:
- ``/var/log/kolla/logstash``.
-
- Metrics will now be generated from the configured log messages. To generate
- alerts/notifications from your new metric, follow the next section.
-
- Generating Monasca Alerts from Metrics
- ++++++++++++++++++++++++++++++++++++++
+ The control plane has been configured to collect logs centrally using the EFK
+ stack (Elasticsearch, Fluentd and Kibana).

- Firstly, we will configure alarms and notifications. This should be done via
- the Monasca client. More detailed documentation is available in the `Monasca
- API specification
- <https://github.com/openstack/monasca-api/blob/master/docs/monasca-api-spec.md#alarm-definitions-and-alarms>`__.
- This document provides an overview of common use-cases.
+ Telemetry monitoring of the control plane is performed by Prometheus. Metrics
+ are collected by Prometheus exporters, which run either on all hosts (e.g. the
+ node exporter) or on specific hosts (e.g. controllers for the memcached
+ exporter, or monitoring hosts for the OpenStack exporter). These exporters are
+ scraped by the Prometheus server.
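+
+ As an illustration (the actual configuration is generated by Kolla Ansible;
+ the job names, hostnames and ports below are assumptions based on common
+ exporter defaults), a Prometheus scrape configuration for such exporters
+ might look like:
+
+ .. code-block:: yaml
+
+    scrape_configs:
+      # The node exporter runs on every host.
+      - job_name: node
+        static_configs:
+          - targets: ['ctrl0:9100', 'cmp0:9100']
+      # The memcached exporter runs on controllers only.
+      - job_name: memcached
+        static_configs:
+          - targets: ['ctrl0:9150']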
326
285
327
- To create a Slack notification, first obtain the URL for the notification hook
328
- from Slack, and configure the notification as follows:
286
+ Configuring Prometheus Alerts
287
+ -----------------------------
329
288
330
- .. code-block :: console
331
-
332
- monasca# monasca notification-create stackhpc_slack SLACK https://hooks.slack.com/services/UUID
333
-
334
- You can view notifications at any time by invoking:
335
-
336
- .. code-block :: console
337
-
338
- monasca# monasca notification-list
339
-
340
- To create an alarm with an associated notification:
341
-
342
- .. code-block :: console
343
-
344
- monasca# monasca alarm-definition-create multiple_nova_compute \
345
- '(count(log.event.multiple_nova_compute{}, deterministic)>0)' \
346
- --description "Multiple nova compute instances detected" \
347
- --severity HIGH --alarm-actions $NOTIFICATION_ID
348
-
349
- By default one alarm will be created for all hosts. This is typically useful
350
- when you are looking at the overall state of some hosts. For example in the
351
- screenshot below the ``db_mon_log_high_mem_usage `` alarm has previously
352
- triggered on a number of hosts, but is currently below threshold.
353
-
354
- If you wish to have an alarm created per host you can use the ``--match-by ``
355
- option and specify the hostname dimension. For example:
356
-
357
- .. code-block :: console
358
-
359
- monasca# monasca alarm-definition-create multiple_nova_compute \
360
- '(count(log.event.multiple_nova_compute{}, deterministic)>0)' \
361
- --description "Multiple nova compute instances detected" \
362
- --severity HIGH --alarm-actions $NOTIFICATION_ID
363
- --match-by hostname
364
-
365
- Creating an alarm per host can be useful when alerting on one off events such
366
- as log messages which need to be actioned individually. Once the issue has been
367
- investigated and fixed, the alarm can be deleted on a per host basis.
368
-
369
- For example, in the case of monitoring for file system corruption one might
370
- define a metric from the system logs alerting on XFS file system corruption, or
371
- ECC memory errors. These metrics may only be generated once, but it is
372
- important that they are not ignored. Therefore, in the example below, the last
373
- operator is used so that the alarm is evaluated against the last metric
374
- associated with the log message. Since for log metrics the value of this metric
375
- is always greater than 0, this alarm can only be reset by deleting it (which
376
- can be accomplished by clicking on the dustbin icon in Monasca Grafana). By
377
- ensuring that the alarm has to be manually deleted and will not reset to the OK
378
- status, important errors can be tracked.
379
-
380
- .. code-block :: console
381
-
382
- monasca# monasca alarm-definition-create xfs_errors \
383
- '(last(log.event.xfs_errors_detected{}, deterministic)>0)' \
384
- --description "XFS errors detected on host" \
385
- --severity HIGH --alarm-actions $NOTIFICATION_ID \
386
- --match-by hostname
387
-
388
- It is also possible to update existing alarms. For example, to update, or add
389
- multiple notifications to an alarm:
390
-
391
- .. code-block :: console
392
-
393
- monasca# monasca alarm-definition-patch $ALARM_ID --alarm-actions $NOTIFICATION_ID --alarm-actions $NOTIFICATION_ID_2
289
+ Alerts are defined in code and stored in Kayobe configuration. See ``*.rules ``
290
+ files in ``${KAYOBE_CONFIG_PATH}/kolla/config/prometheus `` as a model to add
291
+ custom rules.
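+
+ As a sketch of what such a rules file can contain (the group and alert names,
+ threshold and labels below are illustrative assumptions, not shipped
+ defaults):
+
+ .. code-block:: yaml
+
+    groups:
+      - name: custom
+        rules:
+          # Fire when a scrape target has been unreachable for five minutes.
+          - alert: InstanceDown
+            expr: up == 0
+            for: 5m
+            labels:
+              severity: critical
+            annotations:
+              summary: "Instance {{ $labels.instance }} is down"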

Control Plane Shutdown Procedure
================================
@@ -683,21 +581,26 @@ perform the following cleanup procedure regularly:

Elasticsearch indexes retention
===============================
- To enable and alter default rotation values for Elasticsearch Curator edit ``${KAYOBE_CONFIG_PATH}/kolla/globals.yml`` - This applies both to Monasca and Central Logging configurations.
+
+ To enable and alter default rotation values for Elasticsearch Curator, edit
+ ``${KAYOBE_CONFIG_PATH}/kolla/globals.yml``:

.. code-block:: console

   # Allow Elasticsearch Curator to apply a retention policy to logs
   enable_elasticsearch_curator: true
+
   # Duration after which index is closed
   elasticsearch_curator_soft_retention_period_days: 90
+
   # Duration after which index is deleted
   elasticsearch_curator_hard_retention_period_days: 180

- Reconfigure elasticsearch with new values:
+ Reconfigure Elasticsearch with new values:

.. code-block:: console

-    kayobe overcloud service reconfigure --kolla-tags elasticsearch --kolla-skip-tags common --skip-precheck
+    kayobe overcloud service reconfigure --kolla-tags elasticsearch

- For more information see `upstream documentation <https://docs.openstack.org/kolla-ansible/ussuri/reference/logging-and-monitoring/central-logging-guide.html#curator>`__
+ For more information see the `upstream documentation
+ <https://docs.openstack.org/kolla-ansible/latest/reference/logging-and-monitoring/central-logging-guide.html#curator>`__.