Alarm Logging
The alarm logger forwards status and configuration updates from one or more alarm servers to Elasticsearch. Kibana is a web-based tool for viewing and analyzing data in Elasticsearch. It can, for example, list recent alarms or show the top 10 alarms of the last 24 hours.
For basic setup of the alarm server see https://github.com/ControlSystemStudio/phoebus/tree/master/app/alarm
For the alarm logger see https://github.com/ControlSystemStudio/phoebus/tree/master/services/alarm-logger
When Kibana is simply downloaded and started, it connects to Elasticsearch on localhost and allows web access to its GUI only from localhost. If that is insufficient and remote access is required, edit kibana/config/kibana.yml:
# Default value "localhost" will only allow local access.
# Open to remote access:
server.host: "0.0.0.0"
Open a web browser at http://localhost:5601 (or at the Kibana host's name if remote access was enabled as shown above).
The raw elastic data is in indices named xxx_alarms_state_2021-11-01
with new indices created each month or week depending on logger settings.
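Because the index names follow a fixed layout, a few shell helpers can split a name into its parts. This is only a sketch: it assumes the `<setup>_alarms_<kind>_<date>` layout of the example name above, and the function names are made up for illustration.

```shell
# Hypothetical helpers, assuming names like xxx_alarms_state_2021-11-01

# Print the setup prefix: everything before "_alarms_"
index_setup() {
    printf '%s\n' "${1%%_alarms_*}"
}

# Print the date suffix: everything after the last "_"
index_date() {
    printf '%s\n' "${1##*_}"
}

# Print the kind (for example "state"): the part between "_alarms_" and the date
index_kind() {
    index_rest="${1#*_alarms_}"        # e.g. state_2021-11-01
    printf '%s\n' "${index_rest%_*}"   # e.g. state
}
```

For example, `index_kind xxx_alarms_state_2021-11-01` prints `state`.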
Kibana uses index patterns to combine data from several indices, for example all *_alarms_state_* indices.
If you have multiple alarm setups, you can combine data from all alarm systems that way.
After a restart with empty Elasticsearch data, the first use of the "Discover" link auto-creates an index pattern with an auto-generated ID. Visualizations created on top of it cannot be exported/imported without adjusting those IDs.
To avoid problems, create index patterns with known IDs:
- Kibana Management, Index Patterns, Create index pattern:
- Index Pattern: *alarms_state*, Next
- Time filter field: Select message_time.
- Under "show advanced settings", enter ID "alarms_state"
If you have multiple alarm setups, you can decide to either combine
all indices via a *alarms_state* pattern, or use a pattern xxx_alarms_state*
to combine only messages for setup xxx.
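To preview which indices a given pattern would pick up, the index names can be filtered in the shell. The sketch below uses shell `case` globbing as a stand-in for Kibana's `*` wildcard, which is an approximation, and `match_indices` is a made-up helper name.

```shell
# Print each index name read from stdin that matches the wildcard pattern
# given as the first argument.
match_indices() {
    while IFS= read -r index_name; do
        case "$index_name" in
            $1) printf '%s\n' "$index_name" ;;
        esac
    done
}

# Example call, listing only the index column of all indices:
# curl -s 'http://localhost:9200/_cat/indices?h=index' | match_indices 'xxx_alarms_state*'
```

With `*alarms_state*` the state indices of every setup are printed, while `xxx_alarms_state*` keeps only those of setup xxx.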
In the following we assume that the alarm logger, elastic and kibana are all running as Linux systemd services which can be checked like this:
systemctl status elasticsearch
systemctl status alarm-logger
systemctl status kibana
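For a quick pass/fail overview instead of the full status output, the three checks could be wrapped in a small function. This is a sketch that assumes the unit names used above; `check_alarm_services` is a made-up name.

```shell
# Report whether each of the three services is active.
# "systemctl is-active --quiet" exits with 0 only for an active unit.
check_alarm_services() {
    services_failed=0
    for service in elasticsearch alarm-logger kibana; do
        if systemctl is-active --quiet "$service"; then
            echo "$service: active"
        else
            echo "$service: NOT active"
            services_failed=1
        fi
    done
    # Non-zero exit status if any service is not active
    return "$services_failed"
}
```

The exit status makes it usable in scripts, for example `check_alarm_services || echo "alarm logging stack is incomplete"`.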
To start over, stop all services:
sudo systemctl stop kibana
sudo systemctl stop alarm-logger
sudo systemctl stop elasticsearch
Start Elasticsearch back up and verify that it's running:
sudo systemctl start elasticsearch
lynx -dump http://localhost:9200
lynx -dump 'http://localhost:9200/_cat/indices?v'
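Elasticsearch may need some time after the start before it answers on port 9200. A possible way to wait for it in a script is a small polling loop; the helper name `wait_for_url` and its default values are assumptions, not part of any of the tools.

```shell
# Poll a URL until it answers or the attempts are used up.
# Arguments: URL, number of attempts (default 30), delay in seconds (default 2)
wait_for_url() {
    wait_url="$1"
    wait_attempts="${2:-30}"
    wait_delay="${3:-2}"
    while [ "$wait_attempts" -gt 0 ]; do
        # -s: silent, -f: fail on HTTP errors, -o /dev/null: discard the body
        if curl -sf -o /dev/null "$wait_url"; then
            return 0
        fi
        wait_attempts=$((wait_attempts - 1))
        sleep "$wait_delay"
    done
    return 1
}

# Example call:
# wait_for_url http://localhost:9200 && sudo systemctl start alarm-logger
```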
Start logger:
sudo systemctl start alarm-logger
After a short while, check that it's running and that Elasticsearch now has the alarm templates:
netstat -an | fgrep 9200
curl http://localhost:9200
curl 'http://localhost:9200/_template/*alarm*?pretty'
If the alarm logger has seen any alarm traffic, it should have created some alarm indices:
curl 'http://localhost:9200/_cat/indices?v'
Pick a specific index and dump its data:
curl 'http://localhost:9200/XXX_alarms_config_2019-04-26/_search?format=json&pretty'
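Dumping a whole index can be a lot of output. To look at just one PV, a search request body can be sent instead. The sketch below assumes that the state indices have the `pv` and `message_time` fields used in the Kibana steps of this page and that `pv` is mapped as a keyword (check the template output to confirm); `pv_query` is a made-up helper name.

```shell
# Print a search request body for the most recent messages of one PV.
# Arguments: PV name, number of hits (default 10)
# Assumes the PV name contains no double quotes or backslashes.
pv_query() {
    printf '{ "size": %s, "sort": [ { "message_time": "desc" } ], "query": { "term": { "pv": "%s" } } }\n' \
        "${2:-10}" "$1"
}

# Example call (index pattern and PV name are placeholders):
# curl -s -H 'Content-Type: application/json' \
#      'http://localhost:9200/xxx_alarms_state_*/_search?pretty' \
#      -d "$(pv_query 'SomePV' 5)"
```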
To check log messages, for example to see if kibana complains about an incompatible version of elasticsearch:
sudo journalctl -u kibana
Open a web browser at http://localhost:5601 (or at the Kibana host's name if remote access was enabled as shown above).
Basic list of recent alarms:
- From top-left menu, select Kibana, Discover
- Select the alarms_state index pattern
- From available fields, select config, current_severity, severity
- Select desired time range in upper right corner
To then narrow the alarm listing to just one PV:
- "Add filter"
- Select "pv" field
- Select "is" operator
- Select one of the suggested PVs for a value
Plot of Top Alarm Trigger PVs:
- Visualize, add a "Vertical Bar" graph
- Select the *alarms_state* data source
- Add filter: current_severity is one of MINOR, MAJOR, INVALID, UNDEFINED
- Data Metrics: Y-Axis Aggregation "Count", label "Alarm Count"
- Data Buckets: X-Axis Aggregation "Terms", field "pv", Order by "Metric: Alarm Count", order "Descending", size "10", label "Alarm PV"
- Under "Metrics & axes", change X-axis "Align" to "Angled"
- Save as "Top Alarm Trigger PVs"
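The same top-10 count can also be obtained directly from Elasticsearch, which helps when checking what the bar graph should show. The following is a sketch of an equivalent request body, not what Kibana sends internally. It assumes the field names used in the steps above and that `pv` can be aggregated, which the "Terms" bucket implies; `top_alarms_query` is a made-up helper name.

```shell
# Print a request body that counts alarm-state messages per PV.
# Argument: start of the time range in Elasticsearch date math (default now-24h)
top_alarms_query() {
    cat <<EOF
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "terms": { "current_severity": ["MINOR", "MAJOR", "INVALID", "UNDEFINED"] } },
        { "range": { "message_time": { "gte": "${1:-now-24h}" } } }
      ]
    }
  },
  "aggs": {
    "top_pvs": {
      "terms": { "field": "pv", "size": 10 }
    }
  }
}
EOF
}

# Example call:
# curl -s -H 'Content-Type: application/json' \
#      'http://localhost:9200/*alarms_state*/_search?pretty' \
#      -d "$(top_alarms_query)"
```

A terms aggregation orders its buckets by descending document count by default, which corresponds to ordering by "Alarm Count", descending, with size 10.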
Plot of Top Latched Alarms:
- Same as Top Alarm Trigger PVs, but add a filter where "latch" is "true"
Dashboard:
- Create dashboard, add/position visualizations
- Select time range 24 hours
- Save with "Store time with dashboard" selected
To export/import, use Kibana Management, Saved Objects, select all visualizations and dashboards to export, or import a previously exported file.
Older indices need to be deleted to improve Kibana response time and to save disk space. This can be done via Kibana, Management, Stack Management, Data, Index Management, or via the REST interface:
curl -s 'http://localhost:9200/_cat/indices?v' | fgrep xxx_alarms | sort
curl -X DELETE "localhost:9200/xxx_alarms_state_2019-07-*"
curl -X DELETE "localhost:9200/xxx_alarms_cmd_2019-07-*"
curl -X DELETE "localhost:9200/xxx_alarms_config_2019-07-*"
lynx -dump 'http://localhost:9200/_cat/indices?v' | fgrep xxx_alarms | sort
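Instead of typing the month into each DELETE command, the candidates can be selected by a cutoff date. This sketch assumes index names that end in a YYYY-MM-DD date as in the examples above; `indices_older_than` is a made-up helper name, and it only prints names so they can be reviewed before anything is deleted.

```shell
# Read index names on stdin and print those whose date suffix
# sorts before the cutoff date (YYYY-MM-DD) given as first argument.
indices_older_than() {
    # Appending "" forces a string comparison; ISO dates sort chronologically
    awk -v cutoff="$1" '{
        n = split($0, parts, "_")
        if ((parts[n] "") < (cutoff "")) print
    }'
}

# Example call, listing only the index column:
# curl -s 'http://localhost:9200/_cat/indices?h=index' | fgrep xxx_alarms | indices_older_than 2019-08-01
```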