|
| 1 | +.. _zms_api: |
| 2 | + |
| 3 | +Zephyr Memory Storage (ZMS) |
| 4 | +########################### |
| 5 | +Zephyr Memory Storage is a new key/value storage system that is designed to work with all types |
| 6 | +of non-volatile technologies. It supports classical on-chip NOR flash as well as new technologies |
| 7 | +like RRAM and MRAM that do not require a separate erase operation at all. |
| 8 | +Data for these devices can be overwritten directly at any time. |
| 9 | + |
| 10 | +General behavior |
| 11 | +**************** |
| 12 | +ZMS divides the memory space into sectors (minimum 2), and each sector is filled with key/value |
| 13 | +pairs until it is full. |
| 14 | + |
| 15 | +Header entries and ID entries are stored at the bottom of each sector and are called ATEs |
| 16 | +(Allocation Table Entries). |
| 17 | +When the current sector N is full, ZMS first verifies that the following sector (N+1) is empty. |
| 18 | +It then garbage collects sector N+2 by moving its valid ATEs to the empty sector N+1, erases |
| 19 | +the garbage-collected sector, and closes the current sector by writing a |
| 20 | +garbage_collect_done ATE and the close ATE (one of the header entries). |
| 21 | +Afterwards it moves forward to the next sector and starts writing entries again. |
| 22 | + |
| 23 | +This behavior is repeated until it reaches the end of the partition. Then it starts again from |
| 24 | +the first sector after garbage collecting it and erasing its content. |
| 25 | + |
| 26 | +Composition of a sector |
| 27 | +======================= |
| 28 | +A sector is organized in this form (example with 3 sectors): |
| 29 | + |
| 30 | +.. list-table:: |
| 31 | + :widths: 25 25 25 |
| 32 | + :header-rows: 1 |
| 33 | + |
| 34 | + * - Sector 0 (closed) |
| 35 | + - Sector 1 (open) |
| 36 | + - Sector 2 (empty) |
| 37 | + * - Data_a0 |
| 38 | + - Data_b0 |
| 39 | + - Data_c0 |
| 40 | + * - Data_a1 |
| 41 | + - Data_b1 |
| 42 | + - Data_c1 |
| 43 | + * - Data_a2 |
| 44 | + - Data_b2 |
| 45 | + - Data_c2 |
| 46 | + * - GC_done |
| 47 | + - . |
| 48 | + - . |
| 49 | + * - . |
| 50 | + - . |
| 51 | + - . |
| 52 | + * - . |
| 53 | + - . |
| 54 | + - . |
| 55 | + * - . |
| 56 | + - ATE_b2 |
| 57 | + - ATE_c2 |
| 58 | + * - ATE_a2 |
| 59 | + - ATE_b1 |
| 60 | + - ATE_c1 |
| 61 | + * - ATE_a1 |
| 62 | + - ATE_b0 |
| 63 | + - ATE_c0 |
| 64 | + * - ATE_a0 |
| 65 | + - GC_done |
| 66 | + - GC_done |
| 67 | + * - Close (cyc=1) |
| 68 | + - Close (cyc=1) |
| 69 | + - Close (cyc=1) |
| 70 | + * - Empty (cyc=1) |
| 71 | + - Empty (cyc=2) |
| 72 | + - Empty (cyc=2) |
| 73 | + |
| 74 | +Definition of each element in the sector |
| 75 | +======================================== |
| 76 | + |
| 77 | +``Empty ATE:`` is written when erasing a sector (last position of the sector). |
| 78 | + |
| 79 | +``Close ATE:`` is written when closing a sector (second to last position of the sector). |
| 80 | + |
| 81 | +``GC_done ATE:`` is written to indicate that the next sector has already been garbage |
| 82 | +collected. This ATE could be in any position of the sector. |
| 83 | + |
| 84 | +``ATE:`` are entries that contain an ID and describe where the data is stored, its size and |
| 85 | +its crc32. |
| 86 | + |
| 87 | +``Data:`` is the actual value associated with the ATE ID. |
| 88 | + |
| 89 | +How does ZMS work? |
| 90 | +****************** |
| 91 | + |
| 92 | +Mounting the Storage system |
| 93 | +=========================== |
| 94 | + |
| 95 | +Mounting the storage starts by getting the flash parameters, checking that the file system |
| 96 | +properties are correct (sector_size, sector_count ...) then calling the zms_init function to |
| 97 | +make the storage ready. |
| 98 | + |
| 99 | +To mount the ZMS filesystem, some elements in the ``zms_fs`` structure must be initialized. |
| 100 | + |
| 101 | +.. code-block:: c |
| 102 | +
|
| 103 | + struct zms_fs { |
| 104 | + /** File system offset in flash **/ |
| 105 | + off_t offset; |
| 106 | +
|
| 107 | + /** Storage system is split into sectors, each sector size must be multiple of |
| 108 | + * erase-blocks if the device has erase capabilities |
| 109 | + */ |
| 110 | + uint32_t sector_size; |
| 111 | + /** Number of sectors in the file system */ |
| 112 | + uint32_t sector_count; |
| 113 | +
|
| 114 | + /** Flash device runtime structure */ |
| 115 | + const struct device *flash_device; |
| 116 | + }; |
| 117 | +
|
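
The two geometry constraints mentioned so far (at least 2 sectors, and a sector size that is a multiple of the erase block on devices that erase) can be checked before mounting. The following is only a sketch: ``demo_zms_geometry`` and ``demo_zms_geometry_ok`` are hypothetical names that mirror two fields of ``zms_fs``, they are not part of the ZMS API, and passing 0 as the erase block size to mean "no erase capability" is an assumption of this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the geometry fields of struct zms_fs. */
struct demo_zms_geometry {
	uint32_t sector_size;
	uint32_t sector_count;
};

/*
 * Returns true when the geometry satisfies the constraints described above.
 * erase_block_size == 0 means "device has no erase capability" (assumption
 * made by this sketch, not a ZMS convention).
 */
bool demo_zms_geometry_ok(const struct demo_zms_geometry *g, uint32_t erase_block_size)
{
	/* ZMS needs at least two sectors, and a sector cannot be empty. */
	if (g->sector_count < 2U || g->sector_size == 0U) {
		return false;
	}
	/* On erasable devices the sector must cover whole erase blocks. */
	if (erase_block_size != 0U && (g->sector_size % erase_block_size) != 0U) {
		return false;
	}
	return true;
}
```

A caller would fill the real ``zms_fs`` fields only after a check like this one passes.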
| 118 | +Initialization of ZMS |
| 119 | +===================== |
| 120 | + |
| 121 | +Because ZMS has a fast-forward write mechanism, initialization must find the last sector and |
| 122 | +the last entry pointer where writing stopped the previous time. |
| 123 | +It looks for a closed sector followed by an open one, then, within the open sector, it |
| 124 | +recovers the last written ATE (Allocation Table Entry). |
| 125 | +After that, it checks that the sector after this one is empty, and erases it if it is not. |
| 126 | + |
| 127 | +ZMS ID/data write |
| 128 | +=================== |
| 129 | + |
| 130 | +To avoid rewriting the same data with the same ID, ZMS first searches all the sectors for the |
| 131 | +same ID and compares its data; if the data is identical, no write is performed. |
| 132 | +If a write must be performed, an ATE and the data (if not a delete) are written in the sector. |
| 133 | +If the sector is full (cannot hold the current data + ATE), ZMS moves to the next sector, |
| 134 | +garbage collects the sector after the newly opened one, then erases it. |
| 135 | +Data that is smaller than or equal to 8 bytes is written within the ATE. |
| 136 | + |
| 137 | +ZMS ID/data read (with history) |
| 138 | +=============================== |
| 139 | + |
| 140 | +By default ZMS looks for the last data with the same ID and retrieves it. It also returns |
| 141 | +the number of bytes that were read. |
| 142 | +If a history count other than 0 is provided, older data with the same ID is retrieved. |
| 143 | + |
| 144 | +ZMS get data length |
| 145 | +=================== |
| 146 | + |
| 147 | +Given an ID, ZMS returns the size of the last data that was found with the same ID. |
| 148 | + |
| 149 | +ZMS free space calculation |
| 150 | +========================== |
| 151 | + |
| 152 | +ZMS can also return the free space remaining in the partition. |
| 153 | +However, this operation is very time consuming: it needs to browse all valid ATEs in all sectors |
| 154 | +of the partition and, for each valid ATE, check whether an older one exists. |
| 155 | +Applications should avoid calling this function often at runtime, as it could significantly |
| 156 | +slow down the calling thread. |
| 157 | + |
| 158 | +How does the ZMS cycle counter work? |
| 159 | +==================================== |
| 160 | + |
| 161 | +Each sector has a lead cycle counter, a uint8_t that is used to validate all the other |
| 162 | +ATEs. |
| 163 | +The lead cycle counter is stored in the empty ATE. |
| 164 | +To be valid, an ATE must have the same cycle counter as the one stored in the empty ATE. |
| 165 | +Each time an ATE is moved from a sector to another it must get the cycle counter of the |
| 166 | +destination sector. |
| 167 | +To erase a sector, the cycle counter of the empty ATE is incremented and a single write of the |
| 168 | +empty ATE is done. |
| 169 | +All the ATEs in that sector become invalid. |
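
The cycle counter rule can be illustrated with a reduced model. This is a sketch, not the ZMS implementation: ``demo_sector``, ``demo_ate_is_valid`` and ``demo_sector_erase`` are hypothetical names, the sector is reduced to its counters, and any other validity check an ATE may be subject to (such as its crc8) is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced view of a sector: only the cycle counters matter here. */
struct demo_sector {
	uint8_t lead_cycle_cnt;   /* cycle counter stored in the empty ATE */
	uint8_t ate_cycle_cnt[4]; /* cycle counters of four ordinary ATEs */
};

/* An ATE is valid only if its counter matches the sector's lead counter. */
bool demo_ate_is_valid(const struct demo_sector *s, unsigned int idx)
{
	return s->ate_cycle_cnt[idx] == s->lead_cycle_cnt;
}

/*
 * "Erase" the sector with a single write: increment the lead counter.
 * Being a uint8_t, the counter wraps from 255 to 0 in this sketch.
 */
void demo_sector_erase(struct demo_sector *s)
{
	s->lead_cycle_cnt = (uint8_t)(s->lead_cycle_cnt + 1U);
}
```

After ``demo_sector_erase()`` every ATE that carried the previous counter value fails ``demo_ate_is_valid()``, which is the effect described above: one write invalidates the whole sector.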
| 170 | + |
| 171 | +ZMS how to close a sector |
| 172 | +========================= |
| 173 | + |
| 174 | +To close a sector a close ATE is added at the end of the sector and it must have the same cycle |
| 175 | +counter as the empty ATE. |
| 176 | +When closing a sector, all the remaining space that has not been used is filled with garbage data |
| 177 | +to avoid having old ATEs with a valid cycle counter. |
| 178 | + |
| 179 | +ZMS triggering Garbage collector |
| 180 | +================================ |
| 181 | + |
| 182 | +Some applications need to make sure that storage writes have a maximum defined latency. |
| 183 | +When a ZMS write is called, the current sector could be almost full, in which case the GC must |
| 184 | +be triggered to switch to the next sector. |
| 185 | +This operation is time consuming and can cause some applications to miss their real-time |
| 186 | +constraints. |
| 187 | +ZMS adds an API for the application to get the current remaining free space in a sector. |
| 188 | +The application can then decide, when needed, to switch to the next sector if the current one is |
| 189 | +almost full, which triggers the garbage collection on the next sector. |
| 190 | +This guarantees to the application that the next write won't trigger the garbage collection. |
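
The decision an application makes with that free-space value could look like the sketch below. The helper names are hypothetical and no real ZMS call is used. It assumes the reported free space is a raw byte count, that every write costs one 16-byte ATE, that data larger than 8 bytes is stored outside the ATE, and it ignores any write-block alignment padding.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_ATE_SIZE   16U /* size of one ATE, as stated in this document */
#define DEMO_INLINE_MAX 8U  /* data up to this size is stored inside the ATE */

/* Bytes one write consumes: always one ATE, plus the data when it is not inline. */
size_t demo_write_footprint(size_t data_len)
{
	return DEMO_ATE_SIZE + (data_len > DEMO_INLINE_MAX ? data_len : 0U);
}

/* True when the next write would not fit and the sector should be switched early. */
bool demo_should_switch_sector(size_t sector_free_bytes, size_t next_write_len)
{
	return demo_write_footprint(next_write_len) > sector_free_bytes;
}
```

An application with latency constraints would run such a check at a non-critical moment and switch sectors then, so that the garbage collection cost is paid outside the time-critical write.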
| 191 | + |
| 192 | +ZMS structure of ATE (Allocation Table Entries) |
| 193 | +=============================================== |
| 194 | + |
| 195 | +An entry is 16 bytes long, divided between these fields: |
| 196 | + |
| 197 | +.. code-block:: c |
| 198 | +
|
| 199 | + struct zms_ate { |
| 200 | + uint8_t crc8; /* crc8 check of the entry */ |
| 201 | + uint8_t cycle_cnt; /* cycle counter for non erasable devices */ |
| 202 | + uint32_t id; /* data id */ |
| 203 | + uint16_t len; /* data len within sector */ |
| 204 | + union { |
| 205 | + uint8_t data[8]; /* used to store small size data */ |
| 206 | + struct { |
| 207 | + uint32_t offset; /* data offset within sector */ |
| 208 | + union { |
| 209 | + uint32_t data_crc; /* crc for data */ |
| 210 | + uint32_t metadata; /* Used to store metadata information |
| 211 | + * such as storage version. |
| 212 | + */ |
| 213 | + }; |
| 214 | + }; |
| 215 | + }; |
| 216 | + } __packed; |
| 217 | +
|
| 218 | +.. note:: The data CRC is checked only when the whole data of the element is read. |
| 219 | + The data CRC is not checked for a partial read, as it is computed for the complete set of data. |
| 220 | + |
| 221 | +.. note:: Enabling the data CRC feature on a previously existing ZMS content without |
| 222 | + data CRC will make all existing data invalid. |
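
The 16-byte size can be verified at compile time. The sketch below uses a local copy of the layout under the hypothetical name ``demo_zms_ate`` and spells out the packing with the GCC/Clang ``__attribute__((packed))`` attribute, on the assumption that the ``__packed`` macro above maps to it on those compilers. It relies on C11 for anonymous members and ``_Static_assert``.

```c
#include <stddef.h>
#include <stdint.h>

/* Local copy of the ATE layout shown above (hypothetical name). */
struct demo_zms_ate {
	uint8_t crc8;      /* crc8 check of the entry */
	uint8_t cycle_cnt; /* cycle counter */
	uint32_t id;       /* data id */
	uint16_t len;      /* data length */
	union {
		uint8_t data[8]; /* small data stored inline */
		struct {
			uint32_t offset; /* data offset within sector */
			union {
				uint32_t data_crc;
				uint32_t metadata;
			};
		};
	};
} __attribute__((packed));

/* 1 + 1 + 4 + 2 bytes of header, then an 8-byte payload union. */
_Static_assert(sizeof(struct demo_zms_ate) == 16, "ATE must be 16 bytes");
_Static_assert(offsetof(struct demo_zms_ate, data) == 8, "payload starts at byte 8");
```

Without the packed attribute most ABIs would insert padding before ``id`` and after ``len``, and the entry would no longer be 16 bytes.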
| 223 | + |
| 224 | +.. _free-space: |
| 225 | + |
| 226 | +How much space is available for Key/value pairs |
| 227 | +*********************************************** |
| 228 | + |
| 229 | +In both scenarios ZMS must always keep an empty sector to be able to perform the garbage |
| 230 | +collection. |
| 231 | +So if we suppose that 4 sectors exist in a partition, ZMS will only use 3 sectors to store |
| 232 | +key/value pairs and always keep one (rotating) sector empty to be able to launch GC. |
| 233 | + |
| 234 | +.. note:: The maximum single data length that could be written at once in a sector is 64K |
| 235 | + (This could change in future versions of ZMS) |
| 236 | + |
| 237 | +Data <= 8 bytes |
| 238 | +=============== |
| 239 | + |
| 240 | +For small values (<= 8 bytes), the data is stored within the entry (ATE) itself and no data |
| 241 | +is written at the top of the sector. |
| 242 | +ZMS has an entry size of 16 bytes which means that the free space in a partition to store data |
| 243 | +is computed in this scenario as :: |
| 244 | + |
| 245 | +(NUM_SECTORS - 1) * (SECTOR_SIZE - (5 * ATE_SIZE)) / 2 |
| 246 | + |
| 247 | +Where: |
| 248 | + |
| 249 | +``NUM_SECTORS:`` Total number of sectors |
| 250 | + |
| 251 | +``SECTOR_SIZE:`` Size of the sector |
| 252 | + |
| 253 | +``ATE_SIZE:`` 16 bytes |
| 254 | + |
| 255 | +``(5 * ATE_SIZE):`` Reserved ATEs for header and delete items |
| 256 | + |
| 257 | +For example, for 4 sectors of 1024 bytes, the free space for data is 3 * 944 / 2 = 1416 bytes. |
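
The same computation written as a small helper, as a sketch only: ``demo_small_data_capacity`` is a hypothetical name, and the function simply evaluates the formula above with the 16-byte ATE size.

```c
#include <stdint.h>

#define DEMO_ATE_SIZE 16U /* ATE size stated in this document */

/*
 * Free space, in bytes of data, when every value fits inside its ATE
 * (data <= 8 bytes). One sector is kept empty for garbage collection and
 * 5 ATEs per sector are reserved for header and delete items.
 */
uint32_t demo_small_data_capacity(uint32_t num_sectors, uint32_t sector_size)
{
	uint32_t usable_per_sector = sector_size - 5U * DEMO_ATE_SIZE;

	/* Each 16-byte ATE carries at most 8 bytes of data, hence the division by 2. */
	return (num_sectors - 1U) * usable_per_sector / 2U;
}
```

For the example above, ``demo_small_data_capacity(4, 1024)`` evaluates to 1416.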
| 258 | + |
| 259 | +Data > 8 bytes |
| 260 | +============== |
| 261 | + |
| 262 | +Data is stored separately at the top of the sector. |
| 263 | +In this case it is hard to estimate the free available space as this depends on the size of |
| 264 | +the data. But we can take into account that for N bytes of data (N > 8 bytes) an additional |
| 265 | +16 bytes of ATE must be added at the bottom of the sector. |
| 266 | + |
| 267 | +Let's take an example: |
| 268 | + |
| 269 | +Consider a partition that has 4 sectors of 1024 bytes and a data size of 64 bytes. |
| 270 | +Only 3 sectors are available for writes, with a capacity of 944 bytes each. |
| 271 | +Each key/value pair needs an extra 16 bytes for its ATE, which makes it possible to store 11 pairs |
| 272 | +in each sector (944 / 80). |
| 273 | +The total data that could be stored in this partition for this case is 11 * 3 * 64 = 2112 bytes. |
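
The estimate above generalizes to any fixed data size larger than 8 bytes. The helper below is a sketch with a hypothetical name. It assumes all values have the same length and ignores any padding the device's write block size may add.

```c
#include <stdint.h>

#define DEMO_ATE_SIZE 16U /* ATE size stated in this document */

/*
 * Bytes of data that fit in the partition when every value is data_len
 * bytes long (data_len > 8), following the worked example above.
 */
uint32_t demo_large_data_capacity(uint32_t num_sectors, uint32_t sector_size,
				  uint32_t data_len)
{
	uint32_t usable_per_sector = sector_size - 5U * DEMO_ATE_SIZE;
	/* Each pair costs its data plus one ATE; partial pairs do not fit. */
	uint32_t pairs_per_sector = usable_per_sector / (data_len + DEMO_ATE_SIZE);

	/* One sector always stays empty for garbage collection. */
	return pairs_per_sector * (num_sectors - 1U) * data_len;
}
```

With 4 sectors of 1024 bytes and 64-byte values this gives 11 * 3 * 64 = 2112, matching the example.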
| 274 | + |
| 275 | +ZMS wear leveling feature |
| 276 | +************************* |
| 277 | + |
| 278 | +This storage system is optimized for devices that do not require an erase. |
| 279 | +Using storage systems that rely on an erase-value (NVS as an example) will need to emulate the |
| 280 | +erase with write operations. This will cause a significant decrease in the life expectancy of |
| 281 | +these devices and will cause more delays for write operations and for initialization. |
| 282 | +ZMS introduces a cycle count mechanism that avoids emulating erase operation for these devices. |
| 283 | +It also guarantees that every memory location is written only once for each cycle of sector write. |
| 284 | + |
| 285 | +As an example, to erase a 4096 bytes sector on a non erasable device using NVS, 256 flash writes |
| 286 | +must be performed (supposing that write-block-size=16 bytes), while using ZMS only 1 write of |
| 287 | +16 bytes is needed. This operation is 256 times faster in this case. |
| 288 | + |
| 289 | +The garbage collection operation also adds some writes that count against the memory cell life |
| 290 | +expectancy, as it moves some blocks from one sector to another. |
| 291 | +To keep the garbage collector from affecting the life expectancy of the device, it is recommended |
| 292 | +to dimension the partition size correctly. Its size should be double the maximum size of |
| 293 | +data (including extra headers) that could be written in the storage. |
| 294 | + |
| 295 | +See :ref:`free-space`. |
| 296 | + |
| 297 | +How to compute device lifetime |
| 298 | +============================== |
| 299 | + |
| 300 | +Storage devices, whether they are classical flash or new technologies like RRAM/MRAM, have a limited |
| 301 | +life expectancy which is determined by the number of times memory cells can be erased/written. |
| 302 | +Flash devices are erased one page at a time as part of their functional behavior (otherwise |
| 303 | +memory cells cannot be overwritten) and for non erasable storage devices memory cells can be |
| 304 | +overwritten directly. |
| 305 | + |
| 306 | +A typical scenario is shown here to calculate the life expectancy of a device. |
| 307 | +Let's suppose that we store an 8-byte variable using the same ID but its content changes every |
| 308 | +minute. The partition has 4 sectors of 1024 bytes each. |
| 309 | +Each write of the variable requires 16 bytes of storage. |
| 310 | +As we have 944 bytes available for ATEs for each sector, and because ZMS is a fast-forward |
| 311 | +storage system, we are going to rewrite the first location of the first sector after |
| 312 | +(944 * 4) / 16 = 236 minutes. |
| 313 | + |
| 314 | +In addition to the normal writes, garbage collector will move the still valid data from old |
| 315 | +sectors to new ones. |
| 316 | +As we are using the same ID and a big partition size, no data will be moved by the garbage |
| 317 | +collector in this case. |
| 318 | +For storage devices that could be written 20000 times, the storage will last about |
| 319 | +4,720,000 minutes (~9 years). |
| 320 | + |
| 321 | +To make a more general formula we must first compute the effective size used in ZMS by our |
| 322 | +typical set of data. |
| 323 | +For an id/data pair with data <= 8 bytes, effective_size is 16 bytes. |
| 324 | +For an id/data pair with data > 8 bytes, effective_size is 16 bytes + sizeof(data). |
| 325 | +Let's suppose that total_effective_size is the total size of the set of data that is written in |
| 326 | +the storage and that the partition is well dimensioned (double of the effective size) to avoid |
| 327 | +having the garbage collector moving blocks all the time. |
| 328 | + |
| 329 | +The expected life of the device in minutes is computed as :: |
| 330 | + |
| 331 | +(SECTOR_EFFECTIVE_SIZE * SECTOR_NUMBER * MAX_NUM_WRITES) / (TOTAL_EFFECTIVE_SIZE * WR_MIN) |
| 332 | + |
| 333 | +Where: |
| 334 | + |
| 335 | +``SECTOR_EFFECTIVE_SIZE``: is the sector size minus the header size (80 bytes) |
| 336 | + |
| 337 | +``SECTOR_NUMBER``: is the number of sectors |
| 338 | + |
| 339 | +``MAX_NUM_WRITES``: is the life expectancy of the storage device in number of writes |
| 340 | + |
| 341 | +``TOTAL_EFFECTIVE_SIZE``: Total effective size of the set of written data |
| 342 | + |
| 343 | +``WR_MIN``: Number of writes of the set of data per minute |
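
The formula can be evaluated with a short helper. This is a sketch under the same assumption as the text (a well-dimensioned partition, so garbage collection moves little or no data). The function names are hypothetical, and 64-bit arithmetic is used so the intermediate product cannot overflow.

```c
#include <stdint.h>

/* Effective size of one id/data pair: the ATE alone, or the ATE plus the data. */
uint32_t demo_effective_size(uint32_t data_len)
{
	return data_len <= 8U ? 16U : 16U + data_len;
}

/* Expected device lifetime in minutes, following the formula above. */
uint64_t demo_lifetime_minutes(uint32_t sector_effective_size, uint32_t sector_number,
			       uint64_t max_num_writes, uint32_t total_effective_size,
			       uint32_t writes_per_minute)
{
	uint64_t numerator = (uint64_t)sector_effective_size * sector_number * max_num_writes;
	uint64_t denominator = (uint64_t)total_effective_size * writes_per_minute;

	return numerator / denominator;
}
```

For the earlier scenario (944 effective bytes per sector, 4 sectors, 20000 write cycles, one 8-byte variable written once per minute), ``demo_lifetime_minutes(944, 4, 20000, demo_effective_size(8), 1)`` gives 4720000 minutes.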
| 344 | + |
| 345 | +Sample |
| 346 | +****** |
| 347 | + |
| 348 | +A sample of how ZMS can be used is supplied in ``samples/subsys/fs/zms``. |
| 349 | + |
| 350 | +API Reference |
| 351 | +************* |
| 352 | + |
| 353 | +The ZMS subsystem APIs are provided by ``zms.h``: |
| 354 | + |
| 355 | +.. doxygengroup:: zms_data_structures |
| 356 | + |
| 357 | +.. doxygengroup:: zms_high_level_api |
| 358 | + |
| 359 | +.. comment |
| 360 | + not documenting .. doxygengroup:: zms |