Merged
88 changes: 83 additions & 5 deletions README.md
@@ -2,8 +2,86 @@

This repository collects Grafana dashboards created to support DAQ operations and testing.

| folder | description |
| -- | --- |
| overiew | DAQ system overview dashboards |
| timing | Timing system, HSI and timing in general |
| data collection | Event building, data streaming, ... |
| folder | description |
| -- | --- |
| dashboard | DUNE DAQ system overview dashboards |
| external-dashboards | Dashboards to monitor infrastructure tools |

# Info for developers
The most important documentation for developing DUNE DAQ dashboards is:
- [Grafana documentation](https://grafana.com/docs/)
- [InfluxDB documentation](https://docs.influxdata.com/influxdb/v1/)

They cover, respectively, the graphical tools and the database that contains the data.
The rest of this document assumes familiarity with both.

In addition, you need to know how the database is structured in order to write the desired queries.
This documentation covers the particulars of the system we are monitoring.

## Common variables of the dashboards
Each dashboard has a number of standard variables that are used everywhere:
- influxdb
- session

The first is an automatic way to pick up the right data source.
The second is an easy way to select data from the right session only.

### Data source variable
This variable is always hidden and the query can be the same for every dashboard.

### Session variable
Most dashboards describe content that only makes sense within a session.
Hence, these dashboards need a variable called `session`.
The variable name must be identical (it is case sensitive) to the one used in the other dashboards, so that the selected value is carried over automatically when a dashboard is opened from the main entry one.
An example query is
```
SELECT "session" FROM (SELECT "state","session" FROM "dunedaq.appfwk.opmon.AppInfo" WHERE $timeFilter)
```
However, it's good practice to change the `FROM` block so that the session is taken from a measurement that is actually used in the dashboard.
Please note that, since the session is stored in InfluxDB as a tag, a nested query is needed to extract `session` correctly.
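
As a minimal sketch of swapping the `FROM` block, the nested query can be generated from the measurement name and one of its fields. The helper name `session_variable_query` is hypothetical and not part of any DUNE DAQ tool. The inner `SELECT` carries a field next to the tag because, to our understanding, InfluxQL returns nothing when only tags are selected; `state` is assumed to be a field of `AppInfo`, as the example query above suggests.

```python
def session_variable_query(measurement: str, field: str) -> str:
    """Build the nested InfluxQL query for the Grafana `session` variable.

    `field` should be a field (not a tag) of `measurement`, so that the
    inner SELECT returns rows from which the `session` tag can be read.
    """
    inner = f'SELECT "{field}","session" FROM "{measurement}" WHERE $timeFilter'
    return f'SELECT "session" FROM ({inner})'


# Reproduces the example query above; $timeFilter is expanded by Grafana.
print(session_variable_query("dunedaq.appfwk.opmon.AppInfo", "state"))
```

For a dashboard based on another measurement, pass that measurement's name and one of its fields instead.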

### Other common practices for variables
For measurements that are published by more than one object in a session, we tend to use a common pattern:
- define multi-value variables that allow a selection of the sources
- filter the data in the queries via the variables defined in the previous step.

Examples of this are application and DLH in the readout dashboard.
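
The pattern can be sketched as a query with one regex condition per multi-value variable, which is the form the Grafana documentation suggests for InfluxQL (`=~ /^$var$/`). This is an illustration only: the helper `filtered_query`, the field `some_field` and the variable name `module` are hypothetical, and the queries in the actual dashboards may be written differently.

```python
def filtered_query(measurement, field, multi_value_tags):
    """Return an InfluxQL query filtered by session and multi-value variables.

    `multi_value_tags` maps a tag key to the name of the Grafana variable
    that holds the selection for that tag.
    """
    conditions = ['"session" = \'$session\'']
    for tag, variable in multi_value_tags.items():
        # A multi-value variable expands to a regex alternation in Grafana,
        # so the comparison uses =~ with anchors instead of a plain =.
        conditions.append(f'"{tag}" =~ /^${variable}$/')
    conditions.append("$timeFilter")
    return f'SELECT "{field}" FROM "{measurement}" WHERE ' + " AND ".join(conditions)


# "some_field" and the "module" variable are placeholders for illustration.
print(filtered_query(
    "dunedaq.dfmodules.opmon.DataWriterInfo",
    "some_field",
    {"application": "application", "element": "module"},
))
```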

## Data structure
`opmonlib` describes the way we publish data.
Behind the scenes, `opmonlib` turns protobuf objects into `OpMonEntry` objects, and a [microservice](https://github.com/DUNE-DAQ/microservices/tree/develop/opmon-protobuf-dbwriter) transforms them into InfluxDB measurements.
Understanding this mapping is the key to writing effective queries.

### Measurement
The name of the measurement, i.e. the content of the `FROM` part of the query, is the name of the protobuf message including its namespace, which is defined in the `package` line of the protobuf file.

Example:
[DataWriterInfo](https://github.com/DUNE-DAQ/dfmodules/blob/52ec3406ad8957cc4d921844cb68e3729e03b1ac/schema/dfmodules/opmon/DataWriter.proto#L5) objects are turned into `dunedaq.dfmodules.opmon.DataWriterInfo` measurements.
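
The naming rule can be shown explicitly with a short sketch. The `PROTO` text below is a trimmed stand-in written for this example, not a copy of the real schema file, and `measurement_name` is a hypothetical helper: it only joins the `package` value and the message name, and does not check that the message is actually declared.

```python
import re


def measurement_name(proto_text: str, message: str) -> str:
    """Return the InfluxDB measurement name for `message` in a .proto file."""
    match = re.search(r"^\s*package\s+([\w.]+)\s*;", proto_text, re.MULTILINE)
    if match is None:
        raise ValueError("no package line found")
    # Measurement name = <package>.<message name>
    return f"{match.group(1)}.{message}"


# Stand-in for a .proto file; only the package line matters here.
PROTO = """
syntax = "proto3";
package dunedaq.dfmodules.opmon;
message DataWriterInfo {}
"""

print(measurement_name(PROTO, "DataWriterInfo"))
```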
Contributor

This link gets me to:

Image

Contributor Author

I cannot reproduce this.

Contributor

And again, I would much prefer explicitly showing the example here than linking to pieces of code that are dynamic and subject to change.

Contributor Author

These links are permanent


### Fields
The fields of a measurement are created from the content of the message.
They are the quantities you can `SELECT` from the database.

### Tags
Tags are important features of InfluxDB that allow us to organise the data in the database.
Through the way DUNE DAQ publishes data, we define a set of tags: some are common to every measurement and some are specific to the measurement.

#### Default tags
Every measurement in the database has a set of standard tags.
Two are mandatory: `session` and `application`.
Then we have `element`, `subelement`, `subsubelement`, etc. which reflect the nested structure that monitoring objects can have in a C++ environment.
Notable examples:
- DAQModule names are stored in `element`
- Queues and connections are stored in `subelement`
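
As a rough illustration of the nesting, the sketch below maps the position of a monitoring object to the default tag set. It assumes that each extra level simply adds one more `sub` prefix, as the list `element`, `subelement`, `subsubelement` suggests. The helper `default_tags` and all the names passed to it are hypothetical.

```python
def default_tags(session, application, path=()):
    """Map a monitoring object's position to the default tag set.

    `path` lists the nested names below the application, outermost first.
    """
    tags = {"session": session, "application": application}
    for depth, name in enumerate(path):
        # depth 0 -> "element", 1 -> "subelement", 2 -> "subsubelement", ...
        tags["sub" * depth + "element"] = name
    return tags


# A queue nested inside a DAQModule; all names are made up for the example.
print(default_tags("test-session", "df-01", ["datawriter", "output_queue"]))
```

An object published directly by the application would carry only `session` and `application`.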

#### Custom tags
Lines like this
```C++
publish(std::move(td_info), { { "type", name } });
```
define custom tags.
In this case the measurement has an additional tag called `type`, whose value is the content of `name`.
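
On the dashboard side, a custom tag can be used like any other tag, for example to get one series per tag value with `GROUP BY`. The sketch below only builds such a query. The measurement `dunedaq.example.opmon.ExampleInfo`, the field `some_field` and the helper `group_by_tag_query` are placeholders, since the message type published above is not shown here.

```python
def group_by_tag_query(measurement, field, tag):
    """Return an InfluxQL query that yields one series per value of `tag`."""
    return (
        f'SELECT "{field}" FROM "{measurement}" '
        f'WHERE "session" = \'$session\' AND $timeFilter GROUP BY "{tag}"'
    )


# Placeholder measurement and field; "type" is the custom tag defined above.
print(group_by_tag_query("dunedaq.example.opmon.ExampleInfo", "some_field", "type"))
```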

### Entries from python environment
Metrics from drunc are unlikely to have a substructure that goes beyond `session` and `application`, as there are no DAQModules defined in Python.
However, it's common to have metrics with custom tags.