Commit 260c315

Add development documents for metrics (#2955)
* Add metrics docs
* Add metrics Chinese docs
1 parent 9cfcc94 commit 260c315

3 files changed: 205 additions, 0 deletions

docs/source/development/index.rst

Lines changed: 1 addition & 0 deletions
@@ -10,3 +10,4 @@ Development
    operand
    oscar/index
    services/index
+   metrics

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
.. _metrics:

Metrics
====================

Mars has a unified metrics API and three different backends.

A Unified Metrics API
---------------------

The Mars metrics API is defined in ``mars/metrics/api.py`` and provides four metric types:

* ``Counter`` is a cumulative metric that represents a monotonically increasing number.
* ``Gauge`` is a single numerical value.
* ``Meter`` is the rate at which a set of events occurs. It can be used to measure QPS or TPS.
* ``Histogram`` is a statistic that records the average value of the data in a window.
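
To make the four definitions above concrete, here is a toy model of their semantics. It is only an illustrative sketch: the ``Toy*`` classes, their attributes, and the ``rate`` and ``average`` helpers are hypothetical names invented for this example, and they are not the classes defined in ``mars/metrics/api.py``.

```python
from collections import deque


class ToyCounter:
    """Cumulative value that only ever grows (hypothetical stand-in)."""

    def __init__(self):
        self.value = 0

    def record(self, value=1):
        if value < 0:
            raise ValueError("a counter cannot decrease")
        self.value += value


class ToyGauge:
    """Single numerical value; each record overwrites the previous one."""

    def __init__(self):
        self.value = 0

    def record(self, value=1):
        self.value = value


class ToyMeter:
    """Rate of events: total recorded amount divided by elapsed seconds."""

    def __init__(self):
        self.total = 0

    def record(self, value=1):
        self.total += value

    def rate(self, elapsed_seconds):
        return self.total / elapsed_seconds


class ToyHistogram:
    """Average of the most recent `window_size` recorded values."""

    def __init__(self, window_size):
        self.window = deque(maxlen=window_size)

    def record(self, value=1):
        self.window.append(value)

    def average(self):
        return sum(self.window) / len(self.window)


counter, gauge = ToyCounter(), ToyGauge()
for v in (2, 3):
    counter.record(v)  # accumulates: 2 + 3
    gauge.record(v)    # keeps only the latest value

meter = ToyMeter()
for _ in range(10):
    meter.record()     # ten events in total

histogram = ToyHistogram(window_size=3)
for v in (1, 2, 3, 4):
    histogram.record(v)  # the window keeps only 2, 3, 4
```

In this toy model the counter ends at 5, the gauge at 3, the meter reports 2.0 events per second over a 5-second interval, and the histogram averages its last three values to 3.0.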

These types can be used as follows:

.. code-block:: python

    from mars.metrics.api import Metrics

    # All four types are declared alike: Metrics.counter(name: str, description: str = "", tag_keys: Optional[Tuple[str]] = None)
    # and recorded alike: record(value=1, tags: Optional[Dict[str, str]] = None)
    c1 = Metrics.counter('counter1', 'A counter')
    c1.record(1)

    c2 = Metrics.counter('counter2', 'A counter', ('service', 'tenant'))
    c2.record(1, {'service': 'mars', 'tenant': 'test'})

    g1 = Metrics.gauge('gauge1')
    g1.record(1)

    g2 = Metrics.gauge('gauge2', 'A gauge', ('service', 'tenant'))
    g2.record(1, {'service': 'mars', 'tenant': 'test'})

    m1 = Metrics.meter('meter1')
    m1.record(1)

    m2 = Metrics.meter('meter2', 'A meter', ('service', 'tenant'))
    m2.record(1, {'service': 'mars', 'tenant': 'test'})

    h1 = Metrics.histogram('histogram1')
    h1.record(1)

    h2 = Metrics.histogram('histogram2', 'A histogram', ('service', 'tenant'))
    h2.record(1, {'service': 'mars', 'tenant': 'test'})

**Note**: If ``tag_keys`` is declared, ``tags`` must be passed when calling the
``record`` method, and the keys of ``tags`` must match ``tag_keys``.
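
The rule in the note can be pictured as a small check that runs before a value is recorded. The helper below is a hypothetical sketch written for this document: ``check_tags`` is not a Mars function, and how Mars itself reports a mismatch is not stated here. It only shows what "keys must match" means in practice.

```python
from typing import Dict, Optional, Tuple


def check_tags(tag_keys: Optional[Tuple[str, ...]],
               tags: Optional[Dict[str, str]]) -> Dict[str, str]:
    """Return the tags to record, or raise if they do not match tag_keys."""
    if not tag_keys:
        # The note only covers metrics that declare tag_keys, so this
        # sketch passes the tags through unchanged in the other case.
        return dict(tags or {})
    expected = set(tag_keys)
    actual = set(tags or {})
    if expected != actual:
        raise ValueError(
            f"tags keys {sorted(actual)} do not match tag_keys {sorted(expected)}"
        )
    return dict(tags)


# Matching keys are accepted.
ok = check_tags(("service", "tenant"), {"service": "mars", "tenant": "test"})

# A missing key is rejected.
try:
    check_tags(("service", "tenant"), {"service": "mars"})
    rejected = False
except ValueError:
    rejected = True
```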

Three Different Backends
------------------------

Mars metrics supports three different backends:

* ``console`` is used for debugging and just prints the recorded value.
* ``prometheus`` is an open-source systems monitoring and alerting toolkit.
* ``ray`` is a metrics backend that runs only on the Ray engine.

The metrics backend is selected by setting ``metrics.backend`` in
``mars/deploy/oscar/base_config.yml`` or in a file that inherits from it.
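
As a rough illustration of backend selection, the sketch below reads ``metrics.backend`` from an already-parsed configuration and checks it against the three backend names listed above. It assumes that ``metrics.backend`` denotes a ``backend`` key nested under ``metrics``. The ``resolve_backend`` function and the ``example_config`` dictionary are hypothetical and are not part of Mars.

```python
SUPPORTED_BACKENDS = ("console", "prometheus", "ray")


def resolve_backend(config: dict) -> str:
    """Read metrics.backend from a nested config mapping and validate it."""
    backend = config.get("metrics", {}).get("backend")
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(f"unsupported metrics backend: {backend!r}")
    return backend


# Stand-in for the parsed contents of a configuration file (assumed layout).
example_config = {"metrics": {"backend": "prometheus"}}
selected = resolve_backend(example_config)
```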

Metrics Naming Convention
-------------------------

We propose the following naming convention for metrics:

``namespace.[component].metric_name[_units]``

* ``namespace`` could be ``mars``.
* ``component`` could be ``supervisor``, ``worker``, ``band``, etc., and can be omitted.
* ``units`` is the unit of the metric, for example ``seconds`` when recording time. If there
  is no suitable unit, use ``_count`` when the metric type is ``Counter`` and ``_number``
  when the metric type is ``Gauge``.
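
The convention above can be expressed as a small name builder. This is a sketch only: ``metric_name`` is a hypothetical helper, and the metric names in the usage lines are made up for illustration rather than taken from Mars.

```python
from typing import Optional


def metric_name(namespace: str, name: str,
                component: Optional[str] = None,
                units: Optional[str] = None) -> str:
    """Build ``namespace.[component].metric_name[_units]``."""
    parts = [namespace]
    if component:
        parts.append(component)
    if units:
        # Accept both "seconds" and "_seconds" for the unit suffix.
        name = f"{name}_{units.lstrip('_')}"
    parts.append(name)
    return ".".join(parts)


# With a component and a time unit.
worker_time = metric_name("mars", "execution_time",
                          component="worker", units="seconds")
# -> "mars.worker.execution_time_seconds"

# Component omitted; a Counter with no natural unit uses "_count".
task_count = metric_name("mars", "submitted_task", units="_count")
# -> "mars.submitted_task_count"
```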
Lines changed: 128 additions & 0 deletions
@@ -0,0 +1,128 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 1999-2020, The Alibaba Group Holding Ltd.
# This file is distributed under the same license as the mars package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2022.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: mars 0.9.0rc2+18.g21929ced5\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2022-04-24 12:19+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.10.1\n"

#: ../../source/development/metrics.rst:4
msgid "Metrics"
msgstr ""

#: ../../source/development/metrics.rst:6
msgid "Mars has a unified metrics API and three different backends."
msgstr "Mars 有一个统一的 metrics API 和三个不同的后端。"

#: ../../source/development/metrics.rst:9
msgid "A Unified Metrics API"
msgstr "统一的 Metrics API"

#: ../../source/development/metrics.rst:11
msgid ""
"The Mars metrics API is defined in ``mars/metrics/api.py`` and provides "
"four metric types:"
msgstr "Mars metrics API 在 ``mars/metrics/api.py``,有四种 metric 类型:"

#: ../../source/development/metrics.rst:13
msgid ""
"``Counter`` is a cumulative metric that represents a monotonically "
"increasing number."
msgstr "``Counter`` 是一种累积类型的数据,代表一个单调递增的数字。"

#: ../../source/development/metrics.rst:14
msgid "``Gauge`` is a single numerical value."
msgstr "``Gauge`` 是一个单一的数值。"

#: ../../source/development/metrics.rst:15
msgid ""
"``Meter`` is the rate at which a set of events occurs. It can be used to "
"measure QPS or TPS."
msgstr "``Meter`` 是一组事件发生的速率。 我们可以将其用作 qps 或 tps。"

#: ../../source/development/metrics.rst:16
msgid ""
"``Histogram`` is a statistic that records the average value of the data "
"in a window."
msgstr "``Histogram`` 是一种统计类型,它记录窗口数据的平均值。"

#: ../../source/development/metrics.rst:18
msgid "These types can be used as follows:"
msgstr "我们可以如下使用这几种 metrics:"

#: ../../source/development/metrics.rst:50
msgid ""
"**Note**: If ``tag_keys`` is declared, ``tags`` must be passed when "
"calling the ``record`` method, and the keys of ``tags`` must match "
"``tag_keys``."
msgstr "**注意**:如果声明了 ``tag_keys``,调用 ``record`` 方法时必须指定 ``tags`` "
"参数,并且 ``tags`` 的 keys 必须跟 ``tag_keys`` 保持一致。"

#: ../../source/development/metrics.rst:54
msgid "Three Different Backends"
msgstr "三种不同的后端"

#: ../../source/development/metrics.rst:56
msgid "Mars metrics supports three different backends:"
msgstr "Mars metrics 支持3种不同的后端:"

#: ../../source/development/metrics.rst:58
msgid "``console`` is used for debugging and just prints the recorded value."
msgstr "``console`` 是用来调试的,只打印出 metric 值。"

#: ../../source/development/metrics.rst:59
msgid "``prometheus`` is an open-source systems monitoring and alerting toolkit."
msgstr "``prometheus`` 一个开源系统监控和报警工具包。"

#: ../../source/development/metrics.rst:60
msgid "``ray`` is a metrics backend that runs only on the Ray engine."
msgstr "``ray`` 是一种运行在 ray 引擎上的 metric 后端。"

#: ../../source/development/metrics.rst:62
msgid ""
"The metrics backend is selected by setting ``metrics.backend`` in "
"``mars/deploy/oscar/base_config.yml`` or in a file that inherits from it."
msgstr "我们可以通过配置 ``mars/deploy/oscar/base_config.yml`` 或它的继承文件中的 "
"``metrics.backend`` 来选择一种 metric 后端。"

#: ../../source/development/metrics.rst:66
msgid "Metrics Naming Convention"
msgstr "Metrics 命名约定"

#: ../../source/development/metrics.rst:68
msgid "We propose the following naming convention for metrics:"
msgstr "我们提出一种如下的 metrics 命名约定:"

#: ../../source/development/metrics.rst:70
msgid "``namespace.[component].metric_name[_units]``"
msgstr ""

#: ../../source/development/metrics.rst:72
msgid "``namespace`` could be ``mars``."
msgstr "``namespace`` 可以是 ``mars``。"

#: ../../source/development/metrics.rst:73
msgid "``component`` could be ``supervisor``, ``worker``, ``band``, etc., and "
"can be omitted."
msgstr "``component`` 可以是 ``supervisor``,``worker`` 或 ``band`` 等等,也可以省略这个参数。"

#: ../../source/development/metrics.rst:74
msgid ""
"``units`` is the unit of the metric, for example ``seconds`` when "
"recording time. If there is no suitable unit, use ``_count`` when the "
"metric type is ``Counter`` and ``_number`` when the metric type is ``Gauge``."
msgstr "``units`` 是 metric 的单位,当记录的是时间时,可以用 seconds,当没有合适的单位"
"时,``Counter`` 类型的 metric 可以用 ``_count``,``Gauge`` 类型的 metric 可以用 "
"``_number``。"

