### AMS ingest metric data (and store them to HDFS and/or Hbase)
Flink job that connects as a subscriber to an ARGO Messaging Service, pulls messages from a specific project/subscription and stores them to a remote hdfs and/or hbase cluster.

Job required cli parameters:

`--ams.endpoint` : ARGO messaging api endpoint to connect to (e.g. msg.example.com)

`--ams.port` : ARGO messaging api port

`--ams.token` : ARGO messaging api token

`--ams.project` : ARGO messaging api project to connect to

`--ams.sub` : ARGO messaging subscription to pull from

Job optional cli parameters:

`--hbase.master` : hbase endpoint

`--hbase.master.port` : hbase master port

`--hbase.zk.quorum` : comma separated list of hbase zookeeper servers

`--hbase.zk.port` : port used by hbase zookeeper servers

`--hbase.namespace` : table namespace used (usually tenant name)

`--hbase.table` : table name (usually metric_data)

`--hdfs.path` : base path for storing metric data on hdfs

`--check.path` : path to store flink checkpoints

`--check.interval` : interval for checkpointing (in ms)

`--ams.batch` : number of messages to be retrieved per request to AMS service

`--ams.interval` : interval (in ms) between AMS service requests

`--ams.proxy` : optional http proxy url to be used for AMS requests

`--ams.verify` : optionally turn ssl verification on/off
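
A minimal submission sketch, assuming the Flink CLI is on the PATH; the jar name, entry class and all endpoint/path values below are placeholders, not the project's actual artifact names:

```bash
# Hypothetical submission of the ingest-metric job; jar name, entry class
# and all host/path values are placeholders -- adjust to your deployment.
flink run -c argo.streaming.AmsIngestMetric argo-ams-ingest-metric.jar \
  --ams.endpoint msg.example.com \
  --ams.port 443 \
  --ams.token "$AMS_TOKEN" \
  --ams.project TENANT_PROJECT \
  --ams.sub ingest_metric \
  --hdfs.path hdfs://namenode:9000/user/argo/tenants/TENANT/mdata \
  --check.path hdfs://namenode:9000/user/argo/checkpoints \
  --check.interval 30000 \
  --ams.batch 100 \
  --ams.interval 3000
```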
### Metric data hbase schema

Each hbase table has a column family named 'data' and the following columns:

`tags` : json list of tags used to add metadata to the metric event
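
For reference, a sketch of preparing such a table from the hbase shell; `TENANT` is a placeholder namespace, while the `metric_data` table name and `data` column family follow the schema described above:

```bash
# Create the tenant namespace and the metric_data table with its
# 'data' column family (TENANT is a placeholder namespace)
echo "create_namespace 'TENANT'
create 'TENANT:metric_data', 'data'" | hbase shell
```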

### AMS ingest connector (sync) data to HDFS

Flink job that connects as a subscriber to an ARGO Messaging Service, pulls messages that contain connector (sync) data (metric profiles, topology, weights etc.) from a specific project/subscription and stores them to an hdfs destination. Each message should have the following attributes:

- report: name of the report that the connector data belong to
- type: type of the connector data (metric_profile, group_endpoints, group_groups, weights, downtimes)
- partition_date: date (in YYYY-MM-DD format) that the connector data relates to
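
As an illustration, publishing a sync-data message carrying these attributes might look as follows; the call mirrors the service's Pub/Sub-style REST API, and the host, project, topic, token and payload file are all placeholders:

```bash
# Hedged example: publish one connector-data message with the three
# attributes the ingestion job expects (all names/values are placeholders;
# base64 -w0 assumes GNU coreutils)
curl -X POST \
  "https://msg.example.com/v1/projects/TENANT_PROJECT/topics/sync_data:publish?key=$AMS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [{
          "attributes": {
            "report": "Critical",
            "type": "metric_profile",
            "partition_date": "2018-06-01"
          },
          "data": "'"$(base64 -w0 metric_profile.avro)"'"
        }]
      }'
```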
`--downtimes` : file location of downtimes file (local or hdfs)

`--conf` : file location of report configuration json file (local or hdfs)

`--run.date` : target date in DD-MM-YYYY format

`--mongo.uri` : MongoDB uri for outputting the results to (e.g. mongodb://localhost:27017/example_db)

`--mongo.method` : MongoDB method to be used when storing the results, either `insert` or `upsert`
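
A submission sketch using the parameters above; the jar and entry class names are placeholders, and only the parameters documented here are shown:

```bash
# Hypothetical batch submission; jar, class and all paths are placeholders
flink run -c argo.batch.ArgoArBatch argo-batch.jar \
  --conf hdfs://namenode:9000/user/argo/tenants/TENANT/conf/report.json \
  --downtimes hdfs://namenode:9000/user/argo/tenants/TENANT/sync/downtimes.avro \
  --run.date 01-06-2018 \
  --mongo.uri mongodb://localhost:27017/argo_TENANT \
  --mongo.method insert
```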

## Flink job names

Running flink jobs can be listed either in the flink dashboard by visiting `http://{{flink.webui.host}}:{{flink.webui.port}}` or by querying the jobmanager api at `http://{{flink.webui.host}}:{{flink.webui.port}}/joboverview/running`
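
For example, assuming the web UI host/port below, the running jobs can be fetched with a plain HTTP request:

```bash
# List currently running jobs via the jobmanager REST endpoint
# (host and port are placeholders for {{flink.webui.host}}/{{flink.webui.port}})
curl -s "http://flink.example.com:8081/joboverview/running"
```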

Each submitted job has a discerning job name based on a specific template. Job names are also used by the submission wrapper scripts (`/bin` folder) to check if an identical job is already running (to avoid duplicate submissions).

Job Name schemes:

Job Type | Job Name scheme
---------|----------------
Ingest Metric | Ingesting metric data from `{{ams-endpoint}}`/v1/projects/`{{project}}`/subscriptions/`{{subscription}}`
Ingest Sync | Ingesting sync data from `{{ams-endpoint}}`/v1/projects/`{{project}}`/subscriptions/`{{subscription}}`
Batch AR | Ar Batch job for tenant:`{{tenant}}` on day:`{{day}}` using report:`{{report}}`
Batch Status | Status Batch job for tenant:`{{tenant}}` on day:`{{day}}` using report:`{{report}}`
Streaming Status | Streaming status using data from `{{ams-endpoint}}`/v1/projects/`{{project}}`/subscriptions/[`{{metric_subscription}}`, `{{sync_subscription}}`]
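
A sketch of the duplicate-submission check the wrapper scripts could perform against these job names; the host, port and concrete job name are placeholders:

```bash
# Abort if a job with the expected name is already listed as running
JOB_NAME="Ingesting metric data from msg.example.com/v1/projects/TENANT_PROJECT/subscriptions/ingest_metric"
if curl -s "http://flink.example.com:8081/joboverview/running" | grep -qF "$JOB_NAME"; then
  echo "Job already running: $JOB_NAME" >&2
  exit 1
fi
```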