README.md: 33 lines changed (33 additions & 0 deletions)
@@ -57,6 +57,10 @@ Job optional cli parameters:

 `--ams.verify` : optional turn on/off ssl verify

+### Restart strategy
+Job has a fixed delay restart strategy. If it fails, it will try to restart up to 10 times, with a retry interval of 2 minutes
+between each attempt.
+
 ### Metric data hbase schema

 Metric data are stored in hbase tables using different namespaces for different tenants (e.g. hbase table name = '{TENANT_name}:metric_data')
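The fixed delay restart behaviour added in the hunk above can be illustrated with a minimal sketch in plain Python. This is not Flink code and not the job's actual configuration: the function name `run_with_fixed_delay_restart` and its parameters are hypothetical, and reading "a maximum of 10" as ten restarts after the initial run is an assumption.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_fixed_delay_restart(
    job: Callable[[], T],
    max_restarts: int = 10,          # from the README: at most 10 restart attempts
    delay_seconds: float = 120.0,    # from the README: 2 minutes between attempts
    sleep: Callable[[float], None] = time.sleep,  # injectable so tests need not wait
) -> T:
    """Run `job`; on failure, wait a fixed delay and restart it.

    Gives up and re-raises the last error once `max_restarts` restarts
    have been used (assumption: the initial run does not count as a restart).
    """
    restarts = 0
    while True:
        try:
            return job()
        except Exception:
            if restarts >= max_restarts:
                raise  # restart budget exhausted: the job fails for good
            restarts += 1
            sleep(delay_seconds)  # the delay is the same before every restart
```

Passing a no-op `sleep` (for example `lambda seconds: None`) lets the retry logic be exercised without waiting two minutes per attempt.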
@@ -127,6 +131,9 @@ Job required cli parameters:

 `--ams.verify` : optional turn on/off ssl verify

+### Restart strategy
+Job has a fixed delay restart strategy. If it fails, it will try to restart up to 10 times, with a retry interval of 2 minutes
+between each attempt.

 ### Stream Status

@@ -210,6 +217,9 @@ Other optional cli parameters

 `--ams.verify` : optional turn on/off ssl verify

+### Restart strategy
+Job has a fixed delay restart strategy. If it fails, it will try to restart up to 10 times, with a retry interval of 2 minutes
+between each attempt.


 ### Status events schema
@@ -233,6 +243,25 @@ Status events are generated as JSON messages that are defined by the following c
 A metric data message can produce zero, one or more status metric events. The system analyzes the new status introduced by the metric and then aggregates on top levels to see if any other status changes are produced.
 If a status of an item actually changes an appropriate status event is produced based on the item type (endpoint_group,service,endpoint,metric).

+## Threshold rule files
+Each report can be accompanied by a threshold rules file, which contains rules on the low-level metric data that may accompany a monitoring message in the field 'actual_data'.
+The rule file is in JSON format and has the following schema:
+Each rule has multiple thresholds separated by whitespace. Each threshold has the following format:
+`firstlabel=10s;30;50:60;0;100` which corresponds to `{{label}}={{value}}{{uom}};{{warning-range}};{{critical-range}};{{min}};{{max}}`. Each range is in the form of `{{floor}}:{{ceiling}}`, but some shortcuts can be taken in declarations.
+
+
 ## Batch Status

 Flink batch job that calculates status results for a specific date
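The threshold format introduced in the hunk above (`{{label}}={{value}}{{uom}};{{warning-range}};{{critical-range}};{{min}};{{max}}`) can be parsed roughly as follows. This is a hedged sketch, not the project's parser: the names `Threshold`, `parse_range`, `parse_threshold` and `parse_rule` are hypothetical, and because the README does not spell out the range shortcuts, treating a bare number such as `30` as a ceiling with no floor is an assumption.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

# (floor, ceiling); either side may be missing when a shortcut is used
Range = Tuple[Optional[float], Optional[float]]


class Threshold(NamedTuple):
    label: str
    value: float
    uom: str                    # unit of measurement, e.g. "s"; may be empty
    warning: Optional[Range]
    critical: Optional[Range]
    minimum: Optional[float]
    maximum: Optional[float]


# A number followed by an optional unit suffix, e.g. "10s" -> ("10", "s")
_VALUE_RE = re.compile(r"^(-?\d+(?:\.\d+)?)(.*)$")


def _num(text: str) -> Optional[float]:
    return float(text) if text else None


def parse_range(text: str) -> Optional[Range]:
    """Parse '{{floor}}:{{ceiling}}'; an empty field yields None."""
    if not text:
        return None
    if ":" in text:
        floor, ceiling = text.split(":", 1)
        return (_num(floor), _num(ceiling))
    # Assumption: a bare number is a ceiling with no explicit floor.
    return (None, float(text))


def parse_threshold(text: str) -> Threshold:
    """Parse one threshold such as 'firstlabel=10s;30;50:60;0;100'."""
    label, _, rest = text.partition("=")
    fields = rest.split(";")
    fields += [""] * (5 - len(fields))  # trailing fields may be omitted
    match = _VALUE_RE.match(fields[0])
    if not label or match is None:
        raise ValueError(f"malformed threshold: {text!r}")
    return Threshold(
        label=label,
        value=float(match.group(1)),
        uom=match.group(2),
        warning=parse_range(fields[1]),
        critical=parse_range(fields[2]),
        minimum=_num(fields[3]),
        maximum=_num(fields[4]),
    )


def parse_rule(text: str) -> List[Threshold]:
    """A rule holds several thresholds separated by whitespace."""
    return [parse_threshold(part) for part in text.split()]
```

For the README's own example, `parse_threshold("firstlabel=10s;30;50:60;0;100")` yields the label `firstlabel`, value `10.0`, unit `s`, a warning range with ceiling `30.0`, a critical range of `50.0` to `60.0`, and min/max of `0.0` and `100.0`.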
@@ -273,6 +302,8 @@ Job required cli parameters:

 `--mongo.method` : MongoDB method to be used when storing the results ~ either: `insert` or `upsert`

+`--thr` : (optional) file location of threshold rules
+

 ## Batch AR

@@ -318,6 +349,8 @@ Job required cli parameters:

 `--mongo.method` : MongoDB method to be used when storing the results ~ either: `insert` or `upsert`

+`--thr` : (optional) file location of threshold rules
+

 ## Flink job names
 Running flink jobs can be listed either in flink dashboard by visiting `http://{{flink.webui.host}}:{{flink.webui.port}}`