Commit f747ab3

Merge pull request ceph#53908 from zdover23/wip-doc-2023-10-10-troubleshooting-troubleshooting-memory-profiling
doc/rados: edit memory-profiling.rst Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2 parents e6f7f80 + 3be9324 commit f747ab3

1 file changed

+94
-62
lines changed

doc/rados/troubleshooting/memory-profiling.rst

Lines changed: 94 additions & 62 deletions
@@ -2,16 +2,23 @@
22
Memory Profiling
33
==================
44

5-
Ceph MON, OSD and MDS can generate heap profiles using
6-
``tcmalloc``. To generate heap profiles, ensure you have
7-
``google-perftools`` installed::
5+
Ceph Monitor, OSD, and MDS can report ``TCMalloc`` heap profiles. Install
6+
``google-perftools`` if you want to generate these. Your OS distribution might
7+
package this under a different name (for example, ``gperftools``), and your OS
8+
distribution might use a different package manager. Run a command similar to
9+
this one to install ``google-perftools``:
810

9-
sudo apt-get install google-perftools
11+
.. prompt:: bash
1012

11-
The profiler dumps output to your ``log file`` directory (i.e.,
12-
``/var/log/ceph``). See `Logging and Debugging`_ for details.
13-
To view the profiler logs with Google's performance tools, execute the
14-
following::
13+
sudo apt-get install google-perftools
14+
15+
The profiler dumps output to your ``log file`` directory (``/var/log/ceph``).
16+
See `Logging and Debugging`_ for details.
17+
18+
To view the profiler logs with Google's performance tools, run the following
19+
command:
20+
21+
.. prompt:: bash
1522

1623
google-pprof --text {path-to-daemon} {log-path/filename}
1724

@@ -48,9 +55,9 @@ For example::
4855
0.0 0.4% 99.2% 0.0 0.6% decode_message
4956
...
5057

51-
Another heap dump on the same daemon will add another file. It is
52-
convenient to compare to a previous heap dump to show what has grown
53-
in the interval. For instance::
58+
Performing another heap dump on the same daemon creates another file. It is
59+
convenient to compare the new file to a file created by a previous heap dump to
60+
show what has grown in the interval. For example::
5461

5562
$ google-pprof --text --base out/osd.0.profile.0001.heap \
5663
ceph-osd out/osd.0.profile.0003.heap
@@ -60,112 +67,137 @@ in the interval. For instance::
6067
0.0 0.9% 97.7% 0.0 26.1% ReplicatedPG::do_op
6168
0.0 0.8% 98.5% 0.0 0.8% __gnu_cxx::new_allocator::allocate
6269

63-
Refer to `Google Heap Profiler`_ for additional details.
70+
See `Google Heap Profiler`_ for additional details.
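
The comparison shown above can be sketched as a small shell helper that picks the two most recent heap dumps of a daemon and prints the matching ``google-pprof --base`` command. This is only an illustration: the function name ``pprof_compare_cmd`` is hypothetical, and the sketch assumes that dump files follow the ``osd.0.profile.0001.heap`` naming pattern seen in the example.

```shell
# Hypothetical helper (not part of Ceph or gperftools): print a google-pprof
# command that compares the two most recent heap dumps of one daemon.
# Usage: pprof_compare_cmd {path-to-daemon} {dump-dir} {daemon-name}
pprof_compare_cmd() {
    binary="$1"
    dir="$2"
    daemon="$3"

    # Assumption: dump files are named like osd.0.profile.0001.heap, so the
    # sorted glob expansion lists them in the order they were written.
    set -- "$dir"/"$daemon".profile.*.heap
    if [ "$#" -lt 2 ] || [ ! -e "$1" ]; then
        echo "need at least two heap dumps in $dir" >&2
        return 1
    fi

    # Walk the list, remembering the last two entries.
    prev=""
    last=""
    for f in "$@"; do
        prev="$last"
        last="$f"
    done

    # Print the command rather than running it, so it can be reviewed first.
    echo "google-pprof --text --base $prev $binary $last"
}
```

For example, ``pprof_compare_cmd ceph-osd out osd.0`` prints a command that uses the second-newest dump in ``out`` as the base and the newest dump as the profile to report on.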
71+
72+
After you have installed the heap profiler, start your cluster and begin using
73+
the heap profiler. You can enable or disable the heap profiler at runtime, or
74+
ensure that it runs continuously. When running commands based on the examples
75+
that follow, do the following:
6476

65-
Once you have the heap profiler installed, start your cluster and
66-
begin using the heap profiler. You may enable or disable the heap
67-
profiler at runtime, or ensure that it runs continuously. For the
68-
following commandline usage, replace ``{daemon-type}`` with ``mon``,
69-
``osd`` or ``mds``, and replace ``{daemon-id}`` with the OSD number or
70-
the MON or MDS id.
77+
#. replace ``{daemon-type}`` with ``mon``, ``osd`` or ``mds``
78+
#. replace ``{daemon-id}`` with the OSD number or the MON ID or the MDS ID
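
The substitution described in the list above can be sketched as a small POSIX shell helper. This is an illustration only: the function name ``heap_cmd`` and the ``DRY_RUN`` variable are hypothetical and are not part of Ceph. By default the helper only prints the command it would run, so it can be tried without a cluster.

```shell
# Hypothetical helper (not part of Ceph): build a "ceph tell ... heap" command.
# Usage: heap_cmd {daemon-type} {daemon-id} {action}
heap_cmd() {
    daemon_type="$1"
    daemon_id="$2"
    action="$3"

    # Only the daemon types named in this document are accepted.
    case "$daemon_type" in
        mon|osd|mds) ;;
        *) echo "unsupported daemon type: $daemon_type" >&2; return 2 ;;
    esac

    # Only the heap actions shown in this document are accepted.
    case "$action" in
        start_profiler|stats|dump|release|stop_profiler) ;;
        *) echo "unsupported heap action: $action" >&2; return 2 ;;
    esac

    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: print the command instead of contacting a cluster.
        echo "ceph tell ${daemon_type}.${daemon_id} heap ${action}"
    else
        ceph tell "${daemon_type}.${daemon_id}" heap "${action}"
    fi
}

heap_cmd osd 1 start_profiler
```

With ``DRY_RUN`` unset, the last line prints ``ceph tell osd.1 heap start_profiler``. Setting ``DRY_RUN=0`` makes the helper run the command instead of printing it.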
7179

7280

7381
Starting the Profiler
7482
---------------------
7583

76-
To start the heap profiler, execute the following::
84+
To start the heap profiler, run a command of the following form:
7785

78-
ceph tell {daemon-type}.{daemon-id} heap start_profiler
86+
.. prompt:: bash
7987

80-
For example::
88+
ceph tell {daemon-type}.{daemon-id} heap start_profiler
8189

82-
ceph tell osd.1 heap start_profiler
90+
For example:
8391

84-
Alternatively the profile can be started when the daemon starts
85-
running if the ``CEPH_HEAP_PROFILER_INIT=true`` variable is found in
86-
the environment.
92+
.. prompt:: bash
93+
94+
ceph tell osd.1 heap start_profiler
95+
96+
Alternatively, if the ``CEPH_HEAP_PROFILER_INIT=true`` variable is found in the
97+
environment, the profiler will be started when the daemon starts running.
8798

8899
Printing Stats
89100
--------------
90101

91-
To print out statistics, execute the following::
102+
To print out statistics, run a command of the following form:
103+
104+
.. prompt:: bash
105+
106+
ceph tell {daemon-type}.{daemon-id} heap stats
92107

93-
ceph tell {daemon-type}.{daemon-id} heap stats
108+
For example:
94109

95-
For example::
110+
.. prompt:: bash
96111

97-
ceph tell osd.0 heap stats
112+
ceph tell osd.0 heap stats
98113

99-
.. note:: Printing stats does not require the profiler to be running and does
100-
not dump the heap allocation information to a file.
114+
.. note:: The reporting of stats with this command does not require the
115+
profiler to be running and does not dump the heap allocation information to
116+
a file.
101117

102118

103119
Dumping Heap Information
104120
------------------------
105121

106-
To dump heap information, execute the following::
122+
To dump heap information, run a command of the following form:
107123

108-
ceph tell {daemon-type}.{daemon-id} heap dump
124+
.. prompt:: bash
109125

110-
For example::
126+
ceph tell {daemon-type}.{daemon-id} heap dump
111127

112-
ceph tell mds.a heap dump
128+
For example:
113129

114-
.. note:: Dumping heap information only works when the profiler is running.
130+
.. prompt:: bash
131+
132+
ceph tell mds.a heap dump
133+
134+
.. note:: Dumping heap information works only when the profiler is running.
115135

116136

117137
Releasing Memory
118138
----------------
119139

120-
To release memory that ``tcmalloc`` has allocated but which is not being used by
121-
the Ceph daemon itself, execute the following::
140+
To release memory that ``tcmalloc`` has allocated but which is not being used
141+
by the Ceph daemon itself, run a command of the following form:
142+
143+
.. prompt:: bash
144+
145+
ceph tell {daemon-type}.{daemon-id} heap release
122146

123-
ceph tell {daemon-type}{daemon-id} heap release
147+
For example:
124148

125-
For example::
149+
.. prompt:: bash
126150

127-
ceph tell osd.2 heap release
151+
ceph tell osd.2 heap release
128152

129153

130154
Stopping the Profiler
131155
---------------------
132156

133-
To stop the heap profiler, execute the following::
157+
To stop the heap profiler, run a command of the following form:
134158

135-
ceph tell {daemon-type}.{daemon-id} heap stop_profiler
159+
.. prompt:: bash
136160

137-
For example::
161+
ceph tell {daemon-type}.{daemon-id} heap stop_profiler
138162

139-
ceph tell osd.0 heap stop_profiler
163+
For example:
164+
165+
.. prompt:: bash
166+
167+
ceph tell osd.0 heap stop_profiler
140168

141169
.. _Logging and Debugging: ../log-and-debug
142170
.. _Google Heap Profiler: http://goog-perftools.sourceforge.net/doc/heap_profiler.html
143171

144-
Alternative ways for memory profiling
145-
-------------------------------------
172+
Alternative Methods of Memory Profiling
173+
----------------------------------------
146174

147175
Running Massif heap profiler with Valgrind
148176
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
149177

150-
The Massif heap profiler tool can be used with Valgrind to
151-
measure how much heap memory is used and is good for
152-
troubleshooting for example Ceph RadosGW.
178+
The Massif heap profiler tool can be used with Valgrind to measure how much
179+
heap memory is used. This method is well-suited to troubleshooting RadosGW.
180+
181+
See the `Massif documentation
182+
<https://valgrind.org/docs/manual/ms-manual.html>`_ for more information.
183+
184+
Install Valgrind from the package manager for your distribution, then start the
185+
Ceph daemon you want to troubleshoot:
153186

154-
See the `Massif documentation <https://valgrind.org/docs/manual/ms-manual.html>`_ for
155-
more information.
187+
.. prompt:: bash
156188

157-
Install Valgrind from the package manager for your distribution
158-
then start the Ceph daemon you want to troubleshoot::
189+
sudo -u ceph valgrind --max-threads=1024 --tool=massif /usr/bin/radosgw -f --cluster ceph --name NAME --setuser ceph --setgroup ceph
159190

160-
sudo -u ceph valgrind --max-threads=1024 --tool=massif /usr/bin/radosgw -f --cluster ceph --name NAME --setuser ceph --setgroup ceph
191+
When this command has completed its run, a file with a name of the form
192+
``massif.out.<pid>`` will be saved in your current working directory. To run
193+
the command above, the user who runs it must have write permissions in the
194+
current directory.
161195

162-
A file similar to ``massif.out.<pid>`` will be saved when it exits
163-
in your current working directory. The user running the process above
164-
must have write permissions in the current directory.
196+
Run the ``ms_print`` command to get a graph and statistics from the collected
197+
data in the ``massif.out.<pid>`` file:
165198

166-
You can then run the ``ms_print`` command to get a graph and statistics
167-
from the collected data in the ``massif.out.<pid>`` file::
199+
.. prompt:: bash
168200

169-
ms_print massif.out.12345
201+
ms_print massif.out.12345
170202

171-
This output is great for inclusion in a bug report.
203+
The output of this command is helpful when submitting a bug report.
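
Because the ``<pid>`` suffix differs on every run, it can be convenient to locate the newest ``massif.out.<pid>`` file automatically before calling ``ms_print``. The following is a minimal sketch: the function name ``latest_massif_file`` is hypothetical, and it relies on the ``-nt`` file test, which common shells such as bash and dash provide but which is not required by POSIX.

```shell
# Hypothetical helper (not part of Valgrind): print the most recently
# modified massif.out.* file in a directory (default: current directory).
latest_massif_file() {
    dir="${1:-.}"
    newest=""
    for f in "$dir"/massif.out.*; do
        # Skip the literal pattern when no file matches.
        [ -e "$f" ] || continue
        # -nt is true when $f was modified more recently than $newest.
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest="$f"
        fi
    done
    # Fail if the directory holds no Massif output at all.
    [ -n "$newest" ] || return 1
    echo "$newest"
}
```

A typical use would be ``ms_print "$(latest_massif_file)"``, run from the directory in which Valgrind was started.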
