Commit b184027

sjp38 authored and torvalds committed
Docs/admin-guide/mm/damon/usage: document DAMON sysfs interface
This commit adds detailed usage of DAMON sysfs interface in the admin-guide document for DAMON.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: SeongJae Park <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Xin Hao <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
1 parent 40184e4 commit b184027

1 file changed

+344
-6
lines changed

Documentation/admin-guide/mm/damon/usage.rst

Lines changed: 344 additions & 6 deletions
@@ -4,7 +4,7 @@
44
Detailed Usages
55
===============
66

7-
DAMON provides below three interfaces for different users.
7+
DAMON provides below interfaces for different users.
88

99
- *DAMON user space tool.*
1010
`This <https://github.com/awslabs/damo>`_ is for privileged people such as
@@ -14,24 +14,362 @@ DAMON provides below three interfaces for different users.
1414
virtual and physical address spaces monitoring. For more detail, please
1515
refer to its `usage document
1616
<https://github.com/awslabs/damo/blob/next/USAGE.md>`_.
17-
- *debugfs interface.*
18-
:ref:`This <debugfs_interface>` is for privileged user space programmers who
17+
- *sysfs interface.*
18+
:ref:`This <sysfs_interface>` is for privileged user space programmers who
1919
want more optimized use of DAMON. Using this, users can use DAMON’s major
20-
features by reading from and writing to special debugfs files. Therefore,
21-
you can write and use your personalized DAMON debugfs wrapper programs that
22-
reads/writes the debugfs files instead of you. The `DAMON user space tool
20+
features by reading from and writing to special sysfs files. Therefore,
21+
you can write and use your personalized DAMON sysfs wrapper programs that
22+
reads/writes the sysfs files instead of you. The `DAMON user space tool
2323
<https://github.com/awslabs/damo>`_ is one example of such programs. It
2424
supports both virtual and physical address spaces monitoring. Note that this
2525
interface provides only simple :ref:`statistics <damos_stats>` for the
2626
monitoring results. For detailed monitoring results, DAMON provides a
2727
:ref:`tracepoint <tracepoint>`.
28+
- *debugfs interface.*
29+
:ref:`This <debugfs_interface>` is almost identical to :ref:`sysfs interface
30+
<sysfs_interface>`. This will be removed after the next LTS kernel is released,
31+
so users should move to the :ref:`sysfs interface <sysfs_interface>`.
2832
- *Kernel Space Programming Interface.*
2933
:doc:`This </vm/damon/api>` is for kernel space programmers. Using this,
3034
users can utilize every feature of DAMON most flexibly and efficiently by
3135
writing kernel space DAMON application programs for you. You can even extend
3236
DAMON for various address spaces. For detail, please refer to the interface
3337
:doc:`document </vm/damon/api>`.
3438

39+
.. _sysfs_interface:
40+
41+
sysfs Interface
42+
===============
43+
44+
DAMON sysfs interface is built when ``CONFIG_DAMON_SYSFS`` is defined. It
45+
creates multiple directories and files under its sysfs directory,
46+
``<sysfs>/kernel/mm/damon/``. You can control DAMON by writing to and reading
47+
from the files under the directory.
48+
49+
For a short example, users can monitor the virtual address space of a given
50+
workload as below. ::
51+
52+
# cd /sys/kernel/mm/damon/admin/
53+
# echo 1 > kdamonds/nr_kdamonds && echo 1 > kdamonds/0/contexts/nr_contexts
54+
# echo vaddr > kdamonds/0/contexts/0/operations
55+
# echo 1 > kdamonds/0/contexts/0/targets/nr_targets
56+
# echo $(pidof <workload>) > kdamonds/0/contexts/0/targets/0/pid_target
57+
# echo on > kdamonds/0/state
58+
59+
Files Hierarchy
60+
---------------
61+
62+
The files hierarchy of DAMON sysfs interface is shown below. In the below
63+
figure, parents-children relations are represented with indentations, each
64+
directory has a ``/`` suffix, and files in each directory are separated by
65+
comma (","). ::
66+
67+
/sys/kernel/mm/damon/admin
68+
│ kdamonds/nr_kdamonds
69+
│ │ 0/state,pid
70+
│ │ │ contexts/nr_contexts
71+
│ │ │ │ 0/operations
72+
│ │ │ │ │ monitoring_attrs/
73+
│ │ │ │ │ │ intervals/sample_us,aggr_us,update_us
74+
│ │ │ │ │ │ nr_regions/min,max
75+
│ │ │ │ │ targets/nr_targets
76+
│ │ │ │ │ │ 0/pid_target
77+
│ │ │ │ │ │ │ regions/nr_regions
78+
│ │ │ │ │ │ │ │ 0/start,end
79+
│ │ │ │ │ │ │ │ ...
80+
│ │ │ │ │ │ ...
81+
│ │ │ │ │ schemes/nr_schemes
82+
│ │ │ │ │ │ 0/action
83+
│ │ │ │ │ │ │ access_pattern/
84+
│ │ │ │ │ │ │ │ sz/min,max
85+
│ │ │ │ │ │ │ │ nr_accesses/min,max
86+
│ │ │ │ │ │ │ │ age/min,max
87+
│ │ │ │ │ │ │ quotas/ms,bytes,reset_interval_ms
88+
│ │ │ │ │ │ │ │ weights/sz_permil,nr_accesses_permil,age_permil
89+
│ │ │ │ │ │ │ watermarks/metric,interval_us,high,mid,low
90+
│ │ │ │ │ │ │ stats/nr_tried,sz_tried,nr_applied,sz_applied,qt_exceeds
91+
│ │ │ │ │ │ ...
92+
│ │ │ │ ...
93+
│ │ ...
94+
95+
Root
96+
----
97+
98+
The root of the DAMON sysfs interface is ``<sysfs>/kernel/mm/damon/``, and it
99+
has one directory named ``admin``. The directory contains the files for
100+
privileged user space programs' control of DAMON. User space tools or daemons
101+
having the root permission could use this directory.
102+
103+
kdamonds/
104+
---------
105+
106+
The monitoring-related information including request specifications and results
107+
is called a DAMON context. DAMON executes each context with a kernel thread
108+
called kdamond, and multiple kdamonds could run in parallel.
109+
110+
Under the ``admin`` directory, one directory, ``kdamonds``, which has files for
111+
controlling the kdamonds exists. In the beginning, this directory has only one
112+
file, ``nr_kdamonds``. Writing a number (``N``) to the file creates the number
113+
of child directories named ``0`` to ``N-1``. Each directory represents each
114+
kdamond.
115+
116+
kdamonds/<N>/
117+
-------------
118+
119+
In each kdamond directory, two files (``state`` and ``pid``) and one directory
120+
(``contexts``) exist.
121+
122+
Reading ``state`` returns ``on`` if the kdamond is currently running, or
123+
``off`` if it is not running. Writing ``on`` or ``off`` puts the kdamond
124+
in the state. Writing ``update_schemes_stats`` to ``state`` file updates the
125+
contents of stats files for each DAMON-based operation scheme of the kdamond.
126+
For details of the stats, please refer to :ref:`stats section
127+
<sysfs_schemes_stats>`.
128+
129+
If the state is ``on``, reading ``pid`` shows the pid of the kdamond thread.
130+
131+
``contexts`` directory contains files for controlling the monitoring contexts
132+
that this kdamond will execute.
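
As a rough illustration of how ``state`` and ``pid`` fit together, a wrapper
script could check the state before trusting the pid. The helper name
``kdamond_pid`` and the ``DAMON_ADMIN`` variable are assumptions made for this
sketch and are not part of the interface.

```shell
#!/bin/sh
# Sketch only: DAMON_ADMIN and kdamond_pid are hypothetical names.
DAMON_ADMIN="${DAMON_ADMIN:-/sys/kernel/mm/damon/admin}"

# Print the pid of kdamond $1 when its state file reads "on".
# Return 1 without output when the kdamond is not running.
kdamond_pid() {
    kdamond_dir="$DAMON_ADMIN/kdamonds/$1"
    [ "$(cat "$kdamond_dir/state")" = "on" ] || return 1
    cat "$kdamond_dir/pid"
}
```

For example, ``kdamond_pid 0`` would print the pid of the first kdamond only
after ``on`` has been written to its ``state`` file.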
133+
134+
kdamonds/<N>/contexts/
135+
----------------------
136+
137+
In the beginning, this directory has only one file, ``nr_contexts``. Writing a
138+
number (``N``) to the file creates the number of child directories named as
139+
``0`` to ``N-1``. Each directory represents each monitoring context. At the
140+
moment, only one context per kdamond is supported, so only ``0`` or ``1`` can
141+
be written to the file.
142+
143+
contexts/<N>/
144+
-------------
145+
146+
In each context directory, one file (``operations``) and three directories
147+
(``monitoring_attrs``, ``targets``, and ``schemes``) exist.
148+
149+
DAMON supports multiple types of monitoring operations, including those for
150+
virtual address space and the physical address space. You can set and get what
151+
type of monitoring operations DAMON will use for the context by writing one of
152+
below keywords to, and reading from the file.
153+
154+
- vaddr: Monitor virtual address spaces of specific processes
155+
- paddr: Monitor the physical address space of the system
156+
157+
contexts/<N>/monitoring_attrs/
158+
------------------------------
159+
160+
Files for specifying attributes of the monitoring including required quality
161+
and efficiency of the monitoring are in ``monitoring_attrs`` directory.
162+
Specifically, two directories, ``intervals`` and ``nr_regions`` exist in this
163+
directory.
164+
165+
Under ``intervals`` directory, three files for DAMON's sampling interval
166+
(``sample_us``), aggregation interval (``aggr_us``), and update interval
167+
(``update_us``) exist. You can set and get the values in micro-seconds by
168+
writing to and reading from the files.
169+
170+
Under ``nr_regions`` directory, two files for the lower-bound and upper-bound
171+
of the number of DAMON's monitoring regions (``min`` and ``max``, respectively), which
172+
control the monitoring overhead, exist. You can set and get the values by
173+
writing to and reading from the files.
174+
175+
For more details about the intervals and monitoring regions range, please refer
176+
to the Design document (:doc:`/vm/damon/design`).
177+
178+
contexts/<N>/targets/
179+
---------------------
180+
181+
In the beginning, this directory has only one file, ``nr_targets``. Writing a
182+
number (``N``) to the file creates the number of child directories named ``0``
183+
to ``N-1``. Each directory represents each monitoring target.
184+
185+
targets/<N>/
186+
------------
187+
188+
In each target directory, one file (``pid_target``) and one directory
189+
(``regions``) exist.
190+
191+
If you wrote ``vaddr`` to the ``contexts/<N>/operations``, each target should
192+
be a process. You can specify the process to DAMON by writing the pid of the
193+
process to the ``pid_target`` file.
194+
195+
targets/<N>/regions
196+
-------------------
197+
198+
When ``vaddr`` monitoring operations set is being used (``vaddr`` is written to
199+
the ``contexts/<N>/operations`` file), DAMON automatically sets and updates the
200+
monitoring target regions so that entire memory mappings of target processes
201+
can be covered. However, users may want to set the initial monitoring region
202+
to specific address ranges.
203+
204+
In contrast, DAMON does not automatically set and update the monitoring target
205+
regions when ``paddr`` monitoring operations set is being used (``paddr`` is
206+
written to the ``contexts/<N>/operations``). Therefore, users should set the
207+
monitoring target regions by themselves in that case.
208+
209+
For such cases, users can explicitly set the initial monitoring target regions
210+
as they want, by writing proper values to the files under this directory.
211+
212+
In the beginning, this directory has only one file, ``nr_regions``. Writing a
213+
number (``N``) to the file creates the number of child directories named ``0``
214+
to ``N-1``. Each directory represents each initial monitoring target region.
215+
216+
regions/<N>/
217+
------------
218+
219+
In each region directory, you will find two files (``start`` and ``end``). You
220+
can set and get the start and end addresses of the initial monitoring target
221+
region by writing to and reading from the files, respectively.
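
The steps for setting initial regions could be wrapped roughly as below. This
is a sketch under assumptions: ``set_initial_regions`` and ``DAMON_ADMIN`` are
hypothetical names, and the ``<start>-<end>`` argument format is only a
convention chosen for this example.

```shell
#!/bin/sh
# Sketch only: DAMON_ADMIN and set_initial_regions are hypothetical names.
DAMON_ADMIN="${DAMON_ADMIN:-/sys/kernel/mm/damon/admin}"

# Usage: set_initial_regions <kdamond> <context> <target> <start>-<end>...
# Writes the region count to nr_regions, then each start/end pair.
set_initial_regions() {
    regions_dir="$DAMON_ADMIN/kdamonds/$1/contexts/$2/targets/$3/regions"
    shift 3
    echo "$#" > "$regions_dir/nr_regions"
    idx=0
    for range in "$@"; do
        # Split "<start>-<end>" at the dash.
        echo "${range%-*}" > "$regions_dir/$idx/start"
        echo "${range#*-}" > "$regions_dir/$idx/end"
        idx=$((idx + 1))
    done
}
```

A call such as ``set_initial_regions 0 0 0 4096-8192 16384-32768`` would
request two regions for the first target. The addresses here are arbitrary
placeholders, not recommended values.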
222+
223+
contexts/<N>/schemes/
224+
---------------------
225+
226+
For usual DAMON-based data access aware memory management optimizations, users
227+
would normally want the system to apply a memory management action to a memory
228+
region of a specific access pattern. DAMON receives such formalized operation
229+
schemes from the user and applies those to the target memory regions. Users
230+
can get and set the schemes by reading from and writing to files under this
231+
directory.
232+
233+
In the beginning, this directory has only one file, ``nr_schemes``. Writing a
234+
number (``N``) to the file creates the number of child directories named ``0``
235+
to ``N-1``. Each directory represents each DAMON-based operation scheme.
236+
237+
schemes/<N>/
238+
------------
239+
240+
In each scheme directory, four directories (``access_pattern``, ``quotas``,
241+
``watermarks``, and ``stats``) and one file (``action``) exist.
242+
243+
The ``action`` file is for setting and getting what action you want to apply to
244+
memory regions having the access pattern of interest. The keywords
245+
that can be written to and read from the file and their meaning are as below.
246+
247+
- ``willneed``: Call ``madvise()`` for the region with ``MADV_WILLNEED``
248+
- ``cold``: Call ``madvise()`` for the region with ``MADV_COLD``
249+
- ``pageout``: Call ``madvise()`` for the region with ``MADV_PAGEOUT``
250+
- ``hugepage``: Call ``madvise()`` for the region with ``MADV_HUGEPAGE``
251+
- ``nohugepage``: Call ``madvise()`` for the region with ``MADV_NOHUGEPAGE``
252+
- ``stat``: Do nothing but count the statistics
253+
254+
schemes/<N>/access_pattern/
255+
---------------------------
256+
257+
The target access pattern of each DAMON-based operation scheme is constructed
258+
with three ranges including the size of the region in bytes, number of
259+
monitored accesses per aggregate interval, and number of aggregated intervals
260+
for the age of the region.
261+
262+
Under the ``access_pattern`` directory, three directories (``sz``,
263+
``nr_accesses``, and ``age``) each having two files (``min`` and ``max``)
264+
exist. You can set and get the access pattern for the given scheme by writing
265+
to and reading from the ``min`` and ``max`` files under ``sz``,
266+
``nr_accesses``, and ``age`` directories, respectively.
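
To make the three ranges concrete, the check a scheme conceptually performs on
a region can be sketched as below. ``in_range`` and ``region_matches`` are
hypothetical helpers, and treating both bounds of each range as inclusive is
an assumption of this sketch.

```shell
#!/bin/sh
# Sketch only: in_range and region_matches are hypothetical helpers, and
# treating both bounds as inclusive is an assumption.

# Succeed when min <= value <= max.
# Usage: in_range <value> <min> <max>
in_range() {
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Usage: region_matches <sz> <nr_accesses> <age> \
#            <sz_min> <sz_max> <acc_min> <acc_max> <age_min> <age_max>
# Succeed only when all three properties fall within their ranges.
region_matches() {
    in_range "$1" "$4" "$5" &&
        in_range "$2" "$6" "$7" &&
        in_range "$3" "$8" "$9"
}
```

With ranges of [4096, 8192] bytes, [0, 5] accesses, and [10, 20] aggregation
intervals, ``region_matches 4096 3 15 4096 8192 0 5 10 20`` succeeds, while a
16384-byte region with the same access count and age does not.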
267+
268+
schemes/<N>/quotas/
269+
-------------------
270+
271+
Optimal ``target access pattern`` for each ``action`` is workload dependent, so
272+
not easy to find. Worse yet, setting a scheme of some action too aggressively
273+
can cause severe overhead. To avoid such overhead, users can limit time and
274+
size quota for each scheme. In detail, users can ask DAMON to try to use only
275+
up to specific time (``time quota``) for applying the action, and to apply the
276+
action to only up to specific amount (``size quota``) of memory regions having
277+
the target access pattern within a given time interval (``reset interval``).
278+
279+
When the quota limit is expected to be exceeded, DAMON prioritizes found memory
280+
regions of the ``target access pattern`` based on their size, access frequency,
281+
and age. For personalized prioritization, users can set the weights for the
282+
three properties.
283+
284+
Under ``quotas`` directory, three files (``ms``, ``bytes``,
285+
``reset_interval_ms``) and one directory (``weights``) having three files
286+
(``sz_permil``, ``nr_accesses_permil``, and ``age_permil``) in it exist.
287+
288+
You can set the ``time quota`` in milliseconds, ``size quota`` in bytes, and
289+
``reset interval`` in milliseconds by writing the values to the three files,
290+
respectively. You can also set the prioritization weights for size, access
291+
frequency, and age in per-thousand unit by writing the values to the three
292+
files under the ``weights`` directory.
293+
294+
schemes/<N>/watermarks/
295+
-----------------------
296+
297+
To allow easy activation and deactivation of each scheme based on system
298+
status, DAMON provides a feature called watermarks. The feature receives five
299+
values called ``metric``, ``interval``, ``high``, ``mid``, and ``low``. The
300+
``metric`` is the system metric such as free memory ratio that can be measured.
301+
If the metric value of the system is higher than the value in ``high`` or lower
302+
than ``low`` at the moment, the scheme is deactivated. If the value is lower
303+
than ``mid``, the scheme is activated.
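
The activation rule above can be written out as a small decision function.
This is only a sketch: ``wmark_next_state`` is a hypothetical helper, and
keeping the previous state when the metric lies between ``mid`` and ``high``
is an assumption, since the text above does not spell out that case.

```shell
#!/bin/sh
# Sketch only: wmark_next_state is a hypothetical helper. Keeping the
# previous state between mid and high is an assumption.

# Usage: wmark_next_state <metric> <high> <mid> <low> <current>
# <current> is "active" or "inactive"; prints the resulting state.
wmark_next_state() {
    if [ "$1" -gt "$2" ] || [ "$1" -lt "$4" ]; then
        # Above high or below low: deactivate.
        echo inactive
    elif [ "$1" -lt "$3" ]; then
        # Between low and mid: activate.
        echo active
    else
        # Between mid and high: keep whatever state the scheme had.
        echo "$5"
    fi
}
```

With ``high``, ``mid``, and ``low`` of 600, 500, and 300, a metric of 450
activates an inactive scheme, while 650 or 250 deactivates an active one.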
304+
305+
Under the watermarks directory, five files (``metric``, ``interval_us``,
306+
``high``, ``mid``, and ``low``) for setting each value exist. You can set and
307+
get the five values by writing to and reading from the files, respectively.
308+
309+
Keywords and meanings of those that can be written to the ``metric`` file are
310+
as below.
311+
312+
- none: Ignore the watermarks
313+
- free_mem_rate: System's free memory rate (per thousand)
314+
315+
The ``interval`` should be written in microseconds.
316+
317+
.. _sysfs_schemes_stats:
318+
319+
schemes/<N>/stats/
320+
------------------
321+
322+
DAMON counts the total number and bytes of regions that each scheme is tried to
323+
be applied, the two numbers for the regions that each scheme is successfully
324+
applied, and the total number of times the quota limit was exceeded. These statistics can
325+
be used for online analysis or tuning of the schemes.
326+
327+
The statistics can be retrieved by reading the files under ``stats`` directory
328+
(``nr_tried``, ``sz_tried``, ``nr_applied``, ``sz_applied``, and
329+
``qt_exceeds``), respectively. The files are not updated in real time, so you
330+
should ask DAMON sysfs interface to update the content of the files for the
331+
stats by writing a special keyword, ``update_schemes_stats`` to the relevant
332+
``kdamonds/<N>/state`` file.
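
Because the stats files must be refreshed before reading, a wrapper could
combine the two steps as sketched below. ``show_scheme_stats`` and
``DAMON_ADMIN`` are hypothetical names introduced for this example only.

```shell
#!/bin/sh
# Sketch only: DAMON_ADMIN and show_scheme_stats are hypothetical names.
DAMON_ADMIN="${DAMON_ADMIN:-/sys/kernel/mm/damon/admin}"

# Usage: show_scheme_stats <kdamond> <context> <scheme>
# Asks the kdamond to refresh the stats files, then prints name=value lines.
show_scheme_stats() {
    kdamond_dir="$DAMON_ADMIN/kdamonds/$1"
    stats_dir="$kdamond_dir/contexts/$2/schemes/$3/stats"
    echo update_schemes_stats > "$kdamond_dir/state"
    for stat in nr_tried sz_tried nr_applied sz_applied qt_exceeds; do
        printf '%s=%s\n' "$stat" "$(cat "$stats_dir/$stat")"
    done
}
```

Calling ``show_scheme_stats 0 0 0`` would print the five counters of the first
scheme, one per line.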
333+
334+
Example
335+
~~~~~~~
336+
337+
The commands below apply a scheme saying "If a memory region of size in [4KiB,
338+
8KiB] is showing accesses per aggregate interval in [0, 5] for aggregate
339+
interval in [10, 20], page out the region. For the paging out, use only up to
340+
10ms per second, and also don't page out more than 1GiB per second. Under the
341+
limitation, page out memory regions having longer age first. Also, check the
342+
free memory rate of the system every 5 seconds, start the monitoring and paging
343+
out when the free memory rate becomes lower than 50%, but stop it if the free
344+
memory rate becomes larger than 60%, or lower than 30%". ::
345+
346+
# cd <sysfs>/kernel/mm/damon/admin
347+
# # populate directories
348+
# echo 1 > kdamonds/nr_kdamonds; echo 1 > kdamonds/0/contexts/nr_contexts;
349+
# echo 1 > kdamonds/0/contexts/0/schemes/nr_schemes
350+
# cd kdamonds/0/contexts/0/schemes/0
351+
# # set the basic access pattern and the action
352+
# echo 4096 > access_pattern/sz/min
353+
# echo 8192 > access_pattern/sz/max
354+
# echo 0 > access_pattern/nr_accesses/min
355+
# echo 5 > access_pattern/nr_accesses/max
356+
# echo 10 > access_pattern/age/min
357+
# echo 20 > access_pattern/age/max
358+
# echo pageout > action
359+
# # set quotas
360+
# echo 10 > quotas/ms
361+
# echo $((1024*1024*1024)) > quotas/bytes
362+
# echo 1000 > quotas/reset_interval_ms
363+
# # set watermark
364+
# echo free_mem_rate > watermarks/metric
365+
# echo 5000000 > watermarks/interval_us
366+
# echo 600 > watermarks/high
367+
# echo 500 > watermarks/mid
368+
# echo 300 > watermarks/low
369+
370+
Please note that it's highly recommended to use user space tools like `damo
371+
<https://github.com/awslabs/damo>`_ rather than manually reading and writing
372+
the files as above. The above is only an example.
35373

36374
.. _debugfs_interface:
37375
