|
1 | 1 | Zend Memory Manager
|
2 | 2 | ===================
|
3 | 3 |
|
| 4 | +Zend Memory Manager, often abbreviated as ZendMM or ZMM, is a C layer that allocates and releases dynamic |
| 5 | +**request-bound** memory. |
| 6 | + |
| 7 | +Note the "request-bound" in the above sentence. |
| 8 | + |
| 9 | +ZendMM is not just a classical layer over libc's dynamic memory allocator, mainly represented by the pair of API calls |
| 10 | +``malloc()/free()``. ZendMM is about request-bound memory that PHP must allocate while handling a request. |
| 11 | + |
| 12 | +The two main kinds of dynamic memory pools in PHP |
| 13 | +************************************************** |
| 14 | + |
| 15 | +PHP is a share-nothing architecture. Well, not entirely. Let us explain. |
| 16 | + |
| 17 | +.. note:: You may want to read :doc:`the PHP lifecycle chapter <../extensions_design/php_lifecycle>` before continuing |
| 18 | + here. It gives additional information about the different steps and cycles that make up the PHP |
| 19 | + lifetime. |
| 20 | + |
| 21 | +PHP can handle many requests (dozens, thousands?) in the same process. By default, PHP forgets everything it |
| 22 | +knows about the current request when that request finishes. |
| 23 | + |
| 24 | +"Forgetting" things translates to freeing any dynamic buffer that got allocated while handling a request. That means |
| 25 | +that while handling a request, you should not allocate dynamic memory using traditional libc calls. |
| 26 | +Doing so is perfectly valid C, but it leaves you a chance to forget to free such a buffer. |
| 27 | + |
| 28 | +ZendMM comes with an API that substitutes for libc's dynamic allocator by mirroring its API. While handling |
| 29 | +a request, the programmer must use that API instead of libc's allocator. |
| 30 | + |
| 31 | +For example, when PHP handles a request, it parses PHP files. Those files lead to function and class |
| 32 | +declarations. When the compiler compiles the PHP files, it allocates some dynamic memory to |
| 33 | +store the classes and functions it discovers. But at the end of the request, PHP forgets about them. By |
| 34 | +default, PHP forgets *a huge amount* of information from one request to the next. |
| 35 | + |
| 36 | +There is, however, some information that you need to persist across several requests, but that is uncommon. |
| 37 | + |
| 38 | +What can be kept unchanged across requests? What we call **persistent** objects. Once more, let us insist: those |
| 39 | +are rare cases. For example, the current PHP executable path won't change from request to request. That |
| 40 | +information is allocated permanently, which means it is allocated with a traditional libc ``malloc()`` call. |
| 41 | + |
| 42 | +What else? Some strings. For example, the "_SERVER" string is reused from request to request, as every request |
| 43 | +creates the ``$_SERVER`` PHP array. So the "_SERVER" string itself can be permanently allocated, because it only needs to be |
| 44 | +allocated once. |
| 45 | + |
| 46 | +What you must remember: |
| 47 | + |
| 48 | +* There are two kinds of dynamic memory allocations while programming PHP (or extensions): |
| 49 | + * Request-bound dynamic allocations |
| 50 | + * Permanent dynamic allocations |
| 51 | + |
| 52 | +* Request-bound dynamic memory allocations |
| 53 | + * Must only be performed while PHP is handling a request (not before, nor after) |
| 54 | + * Can only be performed using the ZendMM dynamic memory allocation API |
| 55 | + * Are very common, basically 95% of your dynamic allocations will be request-bound |
| 56 | + * Are tracked by ZendMM, and you'll be informed about bad usage of the memory area, or if you leak |
| 57 | + |
| 58 | +* Permanent dynamic memory allocations |
| 59 | + * Should not be performed while PHP is handling a request (not forbidden, but a bad idea) |
| 60 | + * Are not tracked by ZendMM, and you won't be informed about bad usage of the memory area, or if you leak |
| 61 | + * Should be pretty rare in an extension |
| 62 | + |
| 63 | +Also, keep in mind that the whole PHP source code is built on top of this memory layer. Thus, many internal structures get |
| 64 | +allocated using the Zend Memory Manager. Most of them have a "persistent" API variant which, when used, leads to |
| 65 | +a traditional libc allocation. |
| 66 | + |
| 67 | +Here is a request-bound allocated :doc:`zend_string <../internal_types/strings/zend_strings>`:: |
| 68 | + |
| 69 | + zend_string *foo = zend_string_init("foo", strlen("foo"), 0); |
| 70 | + |
| 71 | +And here is the persistent allocated one:: |
| 72 | + |
| 73 | + zend_string *foo = zend_string_init("foo", strlen("foo"), 1); |
| 74 | + |
| 75 | +Same for :doc:`HashTable <../internal_types/hashtables>`. Request-bound allocated one:: |
| 76 | + |
| 77 | + zend_array ar; |
| 78 | + zend_hash_init(&ar, 8, NULL, NULL, 0); |
| 79 | + |
| 80 | +Persistent allocated one:: |
| 81 | + |
| 82 | + zend_array ar; |
| 83 | + zend_hash_init(&ar, 8, NULL, NULL, 1); |
| 84 | + |
| 85 | +It is always the same across the different Zend APIs. Usually, you pass either "0" as the last parameter to mean |
| 86 | +"I want this structure to be allocated using ZendMM, so request-bound", or "1" to mean "I want this structure to get |
| 87 | +allocated bypassing ZendMM and using a traditional libc ``malloc()`` call". |
| 88 | + |
| 89 | +Obviously, those structures remember how they were allocated, so that their API can use the right |
| 90 | +deallocation function when they are destroyed. Hence, in code such as:: |
| 91 | + |
| 92 | + zend_string_release(foo); |
| 93 | + zend_hash_destroy(&ar); |
| 94 | + |
| 95 | +The API knows whether those structures were allocated using a request-bound allocation or a permanent one. In the |
| 96 | +first case it uses ``efree()`` to release them, and in the second case libc's ``free()``. |
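The "structure remembers how it was allocated" idea can be sketched in plain C. This is only an illustration under stated assumptions: `demo_string`, `request_alloc()`, `request_free()` and `persistent_free()` are hypothetical stand-ins invented for this sketch (both allocators simply wrap libc here), not the real `zend_string` or ZendMM code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical counters, only there to make the chosen deallocator visible. */
int request_frees = 0;
int persistent_frees = 0;

/* Hypothetical stand-ins: in this sketch both allocators wrap libc. */
void *request_alloc(size_t size) { return malloc(size); }
void request_free(void *ptr) { request_frees++; free(ptr); }
void persistent_free(void *ptr) { persistent_frees++; free(ptr); }

/* Hypothetical structure that records how it was allocated. */
typedef struct {
    int persistent;   /* remembered at creation time */
    size_t len;
    char val[];       /* string bytes follow the header */
} demo_string;

demo_string *demo_string_init(const char *src, size_t len, int persistent)
{
    size_t size = sizeof(demo_string) + len + 1;
    demo_string *s = persistent ? malloc(size) : request_alloc(size);

    if (s == NULL) {
        return NULL;
    }
    s->persistent = persistent;
    s->len = len;
    memcpy(s->val, src, len);
    s->val[len] = '\0';
    return s;
}

/* The caller does not say which allocator was used: the flag decides. */
void demo_string_release(demo_string *s)
{
    if (s->persistent) {
        persistent_free(s);
    } else {
        request_free(s);
    }
}
```

Calling `demo_string_release()` on a string created with `persistent = 0` goes through `request_free()`, while one created with `persistent = 1` goes through `persistent_free()`, without the caller repeating the flag.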
| 97 | + |
| 98 | +Zend Memory Manager API |
| 99 | +*********************** |
| 100 | + |
| 101 | +The API is located in |
| 102 | +`Zend/zend_alloc.h <https://github.com/php/php-src/blob/c3b910370c5c92007c3e3579024490345cb7f9a7/Zend/zend_alloc.h>`_ |
| 103 | + |
| 104 | +The API calls are mainly C macros and not functions, so be prepared if you debug them and want to look at how they |
| 105 | +work. Those calls mirror libc's calls, usually adding an "e" to the function name, so you should not be lost, and there |
| 106 | +is not much to detail about the API. |
| 107 | + |
| 108 | +Basically what you'll use most are ``emalloc(size_t)`` and ``efree(void *)``. |
| 109 | + |
| 110 | +You are also provided with ``ecalloc(size_t nmemb, size_t size)``, which allocates ``nmemb`` elements of individual size ``size`` |
| 111 | +and zeroes the area. If you are an experienced C programmer, you should know that whenever possible, it is |
| 112 | +better to use ``ecalloc()`` over ``emalloc()`` as ``ecalloc()`` will zero out the memory area which could help a lot in |
| 113 | +pointer bug detection. Remember that ``emalloc()`` works basically like the libc ``malloc()``: it will look for a big |
| 114 | +enough area in different pools, and return you the best fit. So you may be given a recycled pointer which points to |
| 115 | +garbage. |
| 116 | + |
| 117 | +Then comes ``safe_emalloc(size_t nmemb, size_t size, size_t offset)``, which is an ``emalloc(size * nmemb + offset)`` |
| 118 | +that also checks for overflows for you. You should use this API call if the numbers you must provide come from an |
| 119 | +untrusted source, like userland. |
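To show what "checking for overflows" means for a `size * nmemb + offset` computation, here is a minimal standalone sketch. `checked_total()` and `checked_malloc()` are hypothetical helpers written for this illustration and are not part of ZendMM. The sketch returns `NULL` on overflow, which is an assumption made to keep it testable; as far as I know, the real `safe_emalloc()` raises a fatal error instead of returning.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: computes nmemb * size + offset into *total.
 * Returns 1 on success, 0 if the computation would wrap around. */
int checked_total(size_t nmemb, size_t size, size_t offset, size_t *total)
{
    size_t product;

    if (size != 0 && nmemb > SIZE_MAX / size) {
        return 0; /* nmemb * size would overflow */
    }
    product = nmemb * size;
    if (product > SIZE_MAX - offset) {
        return 0; /* adding offset would overflow */
    }
    *total = product + offset;
    return 1;
}

/* Hypothetical wrapper around libc's malloc(): refuses to allocate
 * when the requested size cannot be represented. */
void *checked_malloc(size_t nmemb, size_t size, size_t offset)
{
    size_t total;

    if (!checked_total(nmemb, size, offset, &total)) {
        return NULL;
    }
    return malloc(total);
}
```

Without such a check, a huge `nmemb` coming from userland could wrap the multiplication around to a small number, and the caller would then write past a buffer that is much smaller than expected.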
| 120 | + |
| 121 | +As for string facilities, ``estrdup(char *)`` and ``estrndup(char *, size_t len)`` let you duplicate strings or binary |
| 122 | +strings. |
| 123 | + |
| 124 | +Whatever happens, pointers returned by ZendMM must be freed using ZendMM, that is with an ``efree()`` call, and |
| 125 | +**not libc's free()**. |
| 126 | + |
| 127 | +Zend Memory Manager debugging shields |
| 128 | +************************************* |
| 129 | + |
| 130 | +ZendMM provides the following abilities: |
| 131 | + |
| 132 | +* Memory consumption management. |
| 133 | +* Memory leak tracking. |
| 134 | +* Detection of buffer overflows or underflows. |
| 135 | + |
| 136 | +Memory consumption management |
| 137 | +----------------------------- |
| 138 | + |
| 139 | +ZendMM is the layer behind the PHP userland "memory_limit" feature. Every single byte allocated using the ZendMM layer |
| 140 | +is counted. When the INI's *memory_limit* is reached, the engine raises a fatal memory limit error. |
| 141 | +That also means that any allocation you perform via ZendMM is reflected in the ``memory_get_usage()`` call from PHP |
| 142 | +userland. |
| 143 | + |
| 144 | +As an extension developer, this is a good thing, because it helps keep the PHP process' heap size under control. |
| 145 | + |
| 146 | +If a memory limit error is raised, the engine will bail out from the current code position to a catch block, and will |
| 147 | +terminate smoothly. But there is no chance that it goes back to the location in your code where the limit was hit. |
| 148 | +You must be prepared for that. |
| 149 | + |
| 150 | +That means that in theory, ZendMM cannot return a NULL pointer to you. If the allocation fails from the OS, or if the |
| 151 | +allocation generates a memory limit error, the code will run into a catch block and won't return to your allocation call. |
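The "never returns NULL, jumps to a catch block instead" behaviour can be illustrated with plain `setjmp()`/`longjmp()`. This is a simplified sketch and not the engine's code: `limited_alloc()`, `run_request()`, `byte_limit` and the other names are hypothetical, and the 1024-byte limit is an arbitrary value chosen for the example.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* All names below are hypothetical stand-ins, not engine symbols. */
jmp_buf bailout_target;        /* where a failed allocation jumps to */
size_t bytes_in_use = 0;       /* running total of counted bytes */
size_t byte_limit = 1024;      /* arbitrary stand-in for memory_limit */

/* Never returns NULL for a non-zero size: either the allocation succeeds,
 * or control jumps to the catch point and never comes back here. */
void *limited_alloc(size_t size)
{
    void *ptr;

    if (size > byte_limit - bytes_in_use) {
        longjmp(bailout_target, 1);   /* limit reached: bail out */
    }
    ptr = malloc(size);
    if (ptr == NULL && size != 0) {
        longjmp(bailout_target, 1);   /* the OS refused: bail out too */
    }
    bytes_in_use += size;
    return ptr;
}

/* Runs a fake "request" made of two allocations.
 * Returns 0 on normal completion, 1 after a bailout. */
int run_request(size_t first, size_t second)
{
    /* volatile: these are modified between setjmp() and longjmp() */
    void *volatile a = NULL;
    void *volatile b = NULL;
    int bailed_out = 0;

    bytes_in_use = 0;
    if (setjmp(bailout_target) == 0) {
        a = limited_alloc(first);
        b = limited_alloc(second);    /* may never return here */
    } else {
        bailed_out = 1;               /* the "catch block" */
    }
    free(a);
    free(b);
    return bailed_out;
}
```

With the limit set to 1024 bytes, `run_request(100, 200)` completes normally, whereas `run_request(1000, 100)` ends in the catch branch: the code following the second `limited_alloc()` call is simply skipped, which is why cleanup that lives after an allocation cannot be relied upon.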
| 152 | + |
| 153 | +If for any reason you need to bypass that protection, you must then use a traditional libc call, like ``malloc()``. |
| 154 | +Take care, however, and know what you are doing. You may need to allocate lots of memory that would blow up the PHP |
| 155 | +*memory_limit* if you used ZendMM. In that case use another allocator (like libc), but take care: your extension will grow the |
| 156 | +current process heap size. That cannot be seen using ``memory_get_usage()`` in PHP, only by analyzing the current heap |
| 157 | +with OS facilities (like */proc/{pid}/maps*). |
| 158 | + |
| 159 | +.. note:: If you need to fully disable ZendMM, you can launch PHP with the ``USE_ZEND_ALLOC=0`` env var. This way, every |
| 160 | + call to the ZendMM API (like ``emalloc()``) will be directed to a libc call, and ZendMM will be disabled. |
| 161 | + This is especially useful when :doc:`debugging memory <./memory_debugging>`. |
| 162 | + |
| 163 | +Memory leak tracking |
| 164 | +-------------------- |
| 165 | + |
| 166 | +Remember the main ZendMM rules: it starts when a request starts, and it then expects you to call its API whenever you need |
| 167 | +dynamic memory while handling that request. When the current request ends, ZendMM shuts down. |
| 168 | + |
| 169 | +While shutting down, it walks through every one of its active pointers and, if you are using |
| 170 | +:doc:`a debug build<../build_system/building_php>` of PHP, it warns you about memory leaks. |
| 171 | + |
| 172 | +Let's be clear here: if at the end of the current request ZendMM finds some active memory blocks, that means those are |
| 173 | +leaking. There should not be any active memory block left on the ZendMM heap at the end of the request, as whoever |
| 174 | +allocated some should have freed them. |
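The principle of "any block still active at shutdown is a leak" can be sketched with a small tracking allocator. This is an illustration of the idea only and makes no claim about how ZendMM implements it: `tracked_alloc()`, `tracked_free()`, `tracked_shutdown()` and the `block_header` layout are hypothetical names invented for this example.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical header stored in front of every tracked block. */
typedef struct block_header {
    /* max alignment keeps the user area after the header well aligned */
    _Alignas(max_align_t) struct block_header *prev;
    struct block_header *next;
    size_t size;
    const char *file;   /* where the block was allocated */
    int line;
} block_header;

block_header *active_blocks = NULL;   /* list of blocks not yet freed */

void *tracked_alloc_at(size_t size, const char *file, int line)
{
    block_header *h = malloc(sizeof(*h) + size);

    if (h == NULL) {
        return NULL;
    }
    h->size = size;
    h->file = file;
    h->line = line;
    h->prev = NULL;
    h->next = active_blocks;
    if (active_blocks != NULL) {
        active_blocks->prev = h;
    }
    active_blocks = h;
    return h + 1;   /* the caller only sees the area after the header */
}

/* Records the call site, much like a debug allocator would. */
#define tracked_alloc(size) tracked_alloc_at((size), __FILE__, __LINE__)

void tracked_free(void *ptr)
{
    block_header *h;

    if (ptr == NULL) {
        return;
    }
    h = (block_header *)ptr - 1;
    if (h->prev != NULL) {
        h->prev->next = h->next;
    } else {
        active_blocks = h->next;
    }
    if (h->next != NULL) {
        h->next->prev = h->prev;
    }
    free(h);
}

/* End of "request": report and release whatever is still active.
 * Returns the number of leaked blocks. */
size_t tracked_shutdown(void)
{
    size_t leaks = 0;

    while (active_blocks != NULL) {
        block_header *h = active_blocks;

        active_blocks = h->next;
        fprintf(stderr, "%s(%d) : Freeing %p (%zu bytes)\n",
                h->file, h->line, (void *)(h + 1), h->size);
        free(h);
        leaks++;
    }
    return leaks;
}
```

If a caller allocates two blocks with `tracked_alloc()` and frees only one, `tracked_shutdown()` prints one line on stderr for the remaining block and returns 1. Blocks obtained directly from libc's `malloc()` never enter the list, so they can never be reported.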
| 175 | + |
| 176 | +If you forget to free blocks, they will all get displayed on *stderr*. This memory leak reporting only works |
| 177 | +under the following conditions: |
| 178 | + |
| 179 | +* You are using :doc:`a debug build<../build_system/building_php>` of PHP |
| 180 | +* You have report_memleaks=On in php.ini (default) |
| 181 | + |
| 182 | +Here is an example of a simple leak in an extension:: |
| 183 | + |
| 184 | + PHP_RINIT_FUNCTION(example) |
| 185 | + { |
| 186 | + void *foo = emalloc(128); |
| 187 | + } |
| 188 | + |
| 189 | +When launching PHP with that extension activated, on a debug build, the following is written to stderr:: |
| 190 | + |
| 191 | + [Fri Jun 9 16:04:59 2017] Script: '/tmp/foobar.php' |
| 192 | + /path/to/extension/file.c(123) : Freeing 0x00007fffeee65000 (128 bytes), script=/tmp/foobar.php |
| 193 | + === Total 1 memory leaks detected === |
| 194 | + |
| 195 | +Those lines are generated when the Zend Memory Manager shuts down, that is at the end of each handled request. |
| 196 | + |
| 197 | +Beware however: |
| 198 | + |
| 199 | +* Obviously ZendMM doesn't know anything about persistent allocations, or allocations that were performed in another way |
| 200 | + than using it. Hence, ZendMM can only warn you about allocations it is aware of; traditional libc allocations |
| 201 | + won't be reported here. |
| 202 | +* If PHP shuts down in an incorrect manner (what we call an unclean shutdown), ZendMM will report tons of leaks. This is |
| 203 | + because on an unclean shutdown, the engine uses a longjmp() call to a catch block, which prevents the code that cleans up |
| 204 | + memory from running. Thus, many leaks get reported. This happens especially after a call to PHP's exit()/die(), or if a |
| 205 | + fatal error gets triggered in some critical parts of PHP. |
| 206 | +* If you use a non-debug build of PHP, nothing is reported: ZendMM stays silent. |
| 207 | + |
| 208 | +What you must remember is that ZendMM leak tracking is a nice bonus tool to have, but it does not replace a |
| 209 | +:doc:`true C memory debugger <./memory_debugging>`. |
| 210 | + |
| 211 | + |
| 212 | +Buffer overflows or underflows |
| 213 | +------------------------------ |
| 214 | + |
| 215 | +Zend Memory Manager engine |
| 216 | +************************** |
| 217 | + |
| 218 | +ZendMM substitutes for libc's API by providing a very similar one. That API should only be used while handling requests. |
| 219 | + |
| 220 | +ZendMM encapsulates libc's allocator and, like the latter, it asks for memory, arranges the memory areas, attaches header |
| 221 | +and canary blocks to them, and gives you back the buffer you asked for. |