@@ -12,7 +12,7 @@ A quick note about valgrind
 
 Valgrind is a well-known tool used under many Unix environments to debug many common memory problems in
 any software written in C/C++.
-Valgrind is a multi-tool frontend about memory debugging. The most used provided tool is called
+Valgrind is a multi-tool frontend for memory debugging. The most used underlying tool is called
 `"memcheck" <http://valgrind.org/docs/manual/mc-manual.html>`_. It works by
 replacing every libc heap allocation function with its own, and tracking what you do with the allocated memory.
 You may also be interested in `"massif" <http://valgrind.org/docs/manual/ms-manual.html>`_: it is a
@@ -21,15 +21,16 @@ memory tracker that can be useful to understand the general heap memory usage of
 .. note:: You should read `the Valgrind documentation <http://www.valgrind.org>`_ to go further. It is well written,
           with tiny representative examples.
 
-For the memory allocation replacement to take place, you need to run the program you want to debug (PHP here) through
+For the memory allocation replacement to take place, you need to run the program you want to analyze (PHP here) through
 valgrind: the binary you actually launch is valgrind.
 
-As valgrind replaces and tracks all libc's heap allocations, it tends to slow down debugged programs a lot. You will
+As valgrind replaces and tracks all libc's heap allocations, it tends to slow-down debugged programs a lot. You will
 notice it in the case of PHP. Although the slow-down is not that dramatic with PHP, it can still be clearly
 felt; don't worry if you notice it, this is normal.
 
-Valgrind is not the only tool you may use, but the most common one. Dr.Memory, LeakSanitizer, Electric Fence,
-AddressSanitizer are other common tools.
+Valgrind is not the only tool you may use, but it is the most common one. `Dr. Memory <http://www.drmemory.org/>`_,
+`LeakSanitizer <https://clang.llvm.org/docs/LeakSanitizer.html>`_, `Electric Fence <http://elinux.org/Electric_Fence>`_ and
+`AddressSanitizer <https://clang.llvm.org/docs/AddressSanitizer.html>`_ are other common tools.
 
 Before starting
 ***************
@@ -58,6 +59,10 @@ debugging times:
 bad seems to show on the surface, valgrind is the tool to point out hidden flaws ready to blow up in your face sooner or later. Use
 it, even if you think everything seems all right with your code: you could be surprised.
 
+.. warning:: You **must** use valgrind (or another memory debugger) on your program. As with every non-trivial C
+             program, it is impossible to feel 100% confident without debugging memory. Memory bugs lead to harmful
+             security issues and program crashes, often randomly, depending on many parameters.
+
 Memory leak detection example
 *****************************
 
@@ -67,15 +72,15 @@ Starter
 Valgrind is a full heap memory debugger. It can also debug process memory maps and function stacks. Please find more
 information in its documentation.
 
-Let's go to detect a memory leak, and try with an easy one, the most-common ones you'll meet::
+Let's detect a dynamic-memory leak, starting with an easy one, the most common kind you'll meet::
 
     PHP_RINIT_FUNCTION(pib)
     {
         void *foo = emalloc(128);
     }
 
 The code above leaks 128 bytes at each request, because it has no matching ``efree()`` call for that buffer.
-As it is a call to emalloc(), and thus goes through :doc:`Zend Memory Manager <zend_memory_manager>`,
+As it is a call to ``emalloc()``, it goes through :doc:`Zend Memory Manager <zend_memory_manager>`,
 which will later warn us about this leak, as we saw in the ZendMM chapter. Let's see as well if valgrind can notice the
 leak::
 
@@ -106,6 +111,9 @@ At our level, "definitely lost" is what we must look at.
 .. note:: For details about the different fields output by memcheck, please
           `have a look <http://valgrind.org/docs/manual/mc-manual.html#mc-manual.leaks>`_ at its documentation.
 
+.. note:: We used ``USE_ZEND_ALLOC=0`` to disable and fully bypass Zend Memory Manager. Every call to its API
+          (e.g. ``emalloc()``) then leads directly to a libc call, as we can see in the valgrind output stack frames.
+
 Valgrind caught our leak.
 
 Easy enough. Now we can generate a leak using a persistent allocation, that is, a dynamic memory allocation bypassing
@@ -134,7 +142,7 @@ Caught as well.
 More complex use-case
 ---------------------
 
-Here is a more complex setup. Can you spot the leaks in the code below? ::
+Here is a more complex setup. Can you spot the leaks in the code below ? ::
 
     static zend_array ar;
 
@@ -190,26 +198,27 @@ Let's fix them now::
     }
 
 We destroy the persistent array at the end of the PHP process, in :doc:`MSHUTDOWN <../extensions_design/php_lifecycle>`.
-As when we created it, we passed it ZVAL_PTR_DTOR as a destructor, it will run that callback on any items we inserted.
-This is the :doc:`zval<../internal_types/zvals>` destructor which will destroy zvals anaylizing their content. For
-``IS_STRING`` types, the destructor will free the ``zend_string``. Done.
+Since we passed ``ZVAL_PTR_DTOR`` as the destructor when we created it, that callback will run on every item we
+inserted. This is the :doc:`zval<../internal_types/zvals>` destructor, which destroys zvals by analyzing their content.
+For ``IS_STRING`` types, the destructor will release the ``zend_string`` and free it if necessary. Done.
 
-.. note:: As you can see, PHP- like any C program- is full of nested pointers. The ``zend_string`` is encapsulated into
-          a zval, itself being part as a ``zend_array``. Leaking the array will abviously leak both the ``zval`` and the
-          ``zend_string``, but ``zvals`` are not heap allocated (we allocated on stack), and thus there is no leak to
-          report about it. You should get used you the fact that forgetting one little ``free()`` leads to tons of
-          leaks, as often, structures embeds structures embedind structures, etc...
+.. note:: As you can see, PHP, like any non-trivial C program, is full of nested pointers. The ``zend_string`` is
+          encapsulated into a ``zval``, itself part of a ``zend_array``. Leaking the array will obviously leak
+          both the ``zval`` and the ``zend_string``, but the ``zval`` is not heap allocated (we allocated it on the
+          stack), so there is no leak to report about it. You should get used to the fact that forgetting to release/free a
+          compound structure such as a ``zend_array`` leads to tons of leaks, as structures often embed structures
+          embedding structures, etc.
 
 Buffer overflow/underflow detection
 ***********************************
 
 Leaking memory is bad. It will lead your program to trigger OOM sooner or later, and it will slow down the host machine
 dramatically as the latter gets less and less memory available as time passes. These are the symptoms of memory leaks.
 
-But there is worse: buffer out of bound access. Accessing a pointer outside the allocation limits is the root of so
+But there is worse: buffer out-of-bounds access. Accessing a pointer outside the allocation limits is the root of so
 many evil operations (like getting a root shell on the machine) that you should absolutely prevent it.
-Lighter, out of bounds access also lead to program crash by memory corruption. However, this all depends on the
-hardware target machine, the compiler used and options, the OS memory layout, the libc used, etc..
+Less severely, out-of-bounds access also frequently leads to program crashes through memory corruption. However, this all
+depends on the target machine's hardware, the compiler used and its options, the OS memory layout, the libc used, etc. Many factors.
 
 Thus, out-of-bounds accesses are very nasty: they are **bombs** that may or may not blow up, now, or in a minute, or if you
 get excessively lucky they'll never blow up.
@@ -230,6 +239,9 @@ This code allocates a buffer, and on purpose writes one byte beyond and one byte
 a code, you have something like one chance out of two for it to crash immediately, and then randomly. You may also have
 created a security hole in PHP, but it may not be remotely exploitable (such a behavior stays uncommon).
 
+.. warning:: Out-of-bounds access leads to undefined behavior. What is going to happen is not predictable, but be
+             sure that it's bad (immediate crash) or terrifying (security issue). Remember.
+
 Let's ask valgrind, using the exact same command line as before; nothing changes except the output::
 
     ==12802== Invalid write of size 1
@@ -290,8 +302,6 @@ in such a scenario that could lead to an immediate crash, or later, or never? Do
 
 Here is a second example about string concatenations::
 
-    PHP_MINIT_FUNCTION(pib)
-    {
         char *foo = strdup("foo");
         char *bar = strdup("bar");
 
@@ -305,10 +315,8 @@ Here is a second example about string concatenations::
         free(foo);
         free(bar);
         free(foobar);
-    }
 
-That tiny code should not be part of MINIT() as it does nothing useful and writes to *stderr*, which could not be a very
-cool thing to do so far. But let's assume, can you spot the problem?
+Can you spot the problem?
 
 Let's ask valgrind::
 
@@ -345,10 +353,14 @@ know where the next ``\0`` will be in memory, that is random.
 
 Solution::
 
-    /* note the +1 for \0 */
-    char *foobar = malloc(strlen("foo") + strlen("bar") + 1);
+    size_t len = strlen("foo") + strlen("bar") + 1; /* note the +1 for \0 */
+    char *foobar = malloc(len);
+
+    /* ... ... same code ... ... */
+
+    foobar[len - 1] = '\0'; /* terminate the string properly */
 
-.. note:: The error described above is one of the most common on in C. They are called 'off-by-one' mistakes: you
+.. note:: The error described above is one of the most common ones in C. It is called an **off-by-one mistake**: you
           forget to allocate just one byte, but you will create tons of problems in the code just because of that.
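For readers who want the fix in one piece, here is a minimal sketch of an off-by-one-free concatenation in plain C. The helper name ``concat_strings`` is hypothetical (it is not part of PHP, the Zend API or libc), and the code uses ``malloc()`` directly rather than the Zend Memory Manager; it only illustrates the ``+ 1`` for the terminating ``\0`` and the explicit termination.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper for illustration only: returns a newly allocated
 * string holding a followed by b, or NULL if the allocation fails.
 * The caller owns the result and must free() it. */
static char *concat_strings(const char *a, const char *b)
{
    size_t a_len = strlen(a);
    size_t b_len = strlen(b);

    /* a_len + b_len bytes for the characters, + 1 for the trailing \0 */
    char *out = malloc(a_len + b_len + 1);

    if (out == NULL) {
        return NULL;
    }

    memcpy(out, a, a_len);           /* copy a without its terminator */
    memcpy(out + a_len, b, b_len);   /* append b without its terminator */
    out[a_len + b_len] = '\0';       /* terminate the string explicitly */

    return out;
}
```

Calling ``concat_strings("foo", "bar")`` and later passing the result to ``free()`` should give memcheck nothing to report for this buffer, since every byte written lies inside the allocation.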
 
 Finally, here is a last example to show a use-after-free scenario. This is also a very common mistake in C programming,