Commit 26ade62

Merge commit '0171b57a04c3eb6444fdf1163e0e21993445bfd8'
2 parents adbd029 + 0171b57 commit 26ade62

3 files changed

+111
-30
lines changed

littlefs/DESIGN.md

Lines changed: 86 additions & 10 deletions
@@ -290,39 +290,40 @@ The path to data block 0 is even more quick, requiring only two jumps:
290290

291291
We can find the runtime complexity by looking at the path to any block from
292292
the block containing the most pointers. Every step along the path divides
293-
the search space for the block in half. This gives us a runtime of O(log n).
293+
the search space for the block in half. This gives us a runtime of O(logn).
294294
To get to the block with the most pointers, we can perform the same steps
295-
backwards, which keeps the asymptotic runtime at O(log n). The interesting
295+
backwards, which puts the runtime at O(2logn) = O(logn). The interesting
296296
part about this data structure is that this optimal path occurs naturally
297297
if we greedily choose the pointer that covers the most distance without passing
298298
our target block.
299299

300300
So now we have a representation of files that can be appended trivially with
301-
a runtime of O(1), and can be read with a worst case runtime of O(n logn).
301+
a runtime of O(1), and can be read with a worst case runtime of O(nlogn).
302302
Given that the runtime is also divided by the amount of data we can store
303303
in a block, this is pretty reasonable.
304304

305305
Unfortunately, the CTZ skip-list comes with a few questions that aren't
306306
straightforward to answer. What is the overhead? How do we handle more
307-
pointers than we can store in a block?
307+
pointers than we can store in a block? How do we store the skip-list in
308+
a directory entry?
308309

309310
One way to find the overhead per block is to look at the data structure as
310311
multiple layers of linked-lists. Each linked-list skips twice as many blocks
311-
as the previous linked-list. Or another way of looking at it is that each
312+
as the previous linked-list. Another way of looking at it is that each
312313
linked-list uses half as much storage per block as the previous linked-list.
313314
As we approach infinity, the number of pointers per block forms a geometric
314315
series. Solving this geometric series gives us an average of only 2 pointers
315316
per block.
316317

317-
![overhead per block](https://latex.codecogs.com/gif.latex?%5Clim_%7Bn%5Cto%5Cinfty%7D%5Cfrac%7B1%7D%7Bn%7D%5Csum_%7Bi%3D0%7D%5E%7Bn%7D%5Cleft%28%5Ctext%7Bctz%7D%28i%29+1%5Cright%29%20%3D%20%5Csum_%7Bi%3D0%7D%5E%7B%5Cinfty%7D%5Cfrac%7B1%7D%7B2%5Ei%7D%20%3D%202)
318+
![overhead_per_block](https://latex.codecogs.com/svg.latex?%5Clim_%7Bn%5Cto%5Cinfty%7D%5Cfrac%7B1%7D%7Bn%7D%5Csum_%7Bi%3D0%7D%5E%7Bn%7D%5Cleft%28%5Ctext%7Bctz%7D%28i%29+1%5Cright%29%20%3D%20%5Csum_%7Bi%3D0%7D%5Cfrac%7B1%7D%7B2%5Ei%7D%20%3D%202)
318319

319320
Finding the maximum number of pointers in a block is a bit more complicated,
320321
but since our file size is limited by the integer width we use to store the
321322
size, we can solve for it. Setting the overhead of the maximum pointers equal
322323
to the block size we get the following equation. Note that a smaller block size
323324
results in more pointers, and a larger word width results in larger pointers.
324325

325-
![maximum overhead](https://latex.codecogs.com/gif.latex?B%20%3D%20%5Cfrac%7Bw%7D%7B8%7D%5Cleft%5Clceil%5Clog_2%5Cleft%28%5Cfrac%7B2%5Ew%7D%7BB-2%5Cfrac%7Bw%7D%7B8%7D%7D%5Cright%29%5Cright%5Crceil)
326+
![maximum overhead](https://latex.codecogs.com/svg.latex?B%20%3D%20%5Cfrac%7Bw%7D%7B8%7D%5Cleft%5Clceil%5Clog_2%5Cleft%28%5Cfrac%7B2%5Ew%7D%7BB-2%5Cfrac%7Bw%7D%7B8%7D%7D%5Cright%29%5Cright%5Crceil)
326327

327328
where:
328329
B = block size in bytes
@@ -335,8 +336,83 @@ widths:
335336

336337
Since littlefs uses a 32 bit word size, we are limited to a minimum block
337338
size of 104 bytes. This is a perfectly reasonable minimum block size, with most
338-
block sizes starting around 512 bytes. So we can avoid the additional logic
339-
needed to avoid overflowing our block's capacity in the CTZ skip-list.
339+
block sizes starting around 512 bytes. So we can skip the additional logic
340+
needed to avoid overflowing our block's capacity in the CTZ skip-list.
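
As a rough check of the 104 byte figure, the maximum-overhead equation can be solved by brute force. This is only a sketch under stated assumptions: the helper names `ceil_log2_ratio` and `min_block_size` are made up for illustration, the search bound of 4096 bytes is arbitrary, and the integer arithmetic only works for word widths of at most 32 bits.

```c
#include <assert.h>
#include <stdint.h>

// Smallest k such that 2^k * d >= 2^w, which equals ceil(log2(2^w / d)).
// Assumes w <= 32 and 0 < d <= UINT32_MAX so the shift fits in 64 bits.
static unsigned ceil_log2_ratio(unsigned w, uint64_t d) {
    assert(w <= 32 && d > 0 && d <= UINT32_MAX);
    uint64_t target = (uint64_t)1 << w;
    unsigned k = 0;
    while ((d << k) < target) {
        k += 1;
    }
    return k;
}

// Smallest block size B (in bytes) that satisfies
//   B = (w/8) * ceil(log2(2^w / (B - 2*(w/8))))
// Returns 0 if no solution is found below the arbitrary 4096 byte bound.
static uint32_t min_block_size(unsigned w) {
    uint32_t ptr = w / 8; // pointer size in bytes
    for (uint32_t B = 2*ptr + 1; B <= 4096; B++) {
        if (B == ptr * ceil_log2_ratio(w, B - 2*ptr)) {
            return B;
        }
    }
    return 0;
}
```

With w = 32 this search returns 104, which agrees with the minimum block size quoted above.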
341+
342+
So, how do we store the skip-list in a directory entry? A naive approach would
343+
be to store a pointer to the head of the skip-list, the length of the file
344+
in bytes, the index of the head block in the skip-list, and the offset in the
345+
head block in bytes. However this is a lot of information, and we can observe
346+
that a file size maps to only one block index + offset pair. So it should be
347+
sufficient to store only the pointer and file size.
348+
349+
But there is one problem: calculating the block index + offset pair from a
350+
file size doesn't have an obvious implementation.
351+
352+
We can start by just writing down an equation. The first idea that comes to
353+
mind is to just use a for loop to sum together blocks until we reach our
354+
file size. We can write this equation as a summation:
355+
356+
![summation1](https://latex.codecogs.com/svg.latex?N%20%3D%20%5Csum_i%5En%5Cleft%5BB-%5Cfrac%7Bw%7D%7B8%7D%5Cleft%28%5Ctext%7Bctz%7D%28i%29+1%5Cright%29%5Cright%5D)
357+
358+
where:
359+
B = block size in bytes
360+
w = word width in bits
361+
n = block index in skip-list
362+
N = file size in bytes
363+
364+
And this works quite well, but is not trivial to calculate. This equation
365+
requires O(n) to compute, which brings the entire runtime of reading a file
366+
to O(n^2logn). Fortunately, the additional O(n) does not need to touch disk,
367+
so it is not completely unreasonable. But if we could solve this equation into
368+
a form that is easily computable, we can avoid a big slowdown.
369+
370+
Unfortunately, the summation of the CTZ instruction presents a big challenge.
371+
How would you even begin to reason about integrating a bitwise instruction?
372+
Fortunately, there is a powerful tool I've found useful in these situations:
373+
The [On-Line Encyclopedia of Integer Sequences (OEIS)](https://oeis.org/).
374+
If we work out the first couple of values in our summation, we find that CTZ
375+
maps to [A001511](https://oeis.org/A001511), and its partial summation maps
376+
to [A005187](https://oeis.org/A005187), and surprisingly, both of these
377+
sequences have relatively trivial equations! This leads us to the completely
378+
unintuitive property:
379+
380+
![mindblown](https://latex.codecogs.com/svg.latex?%5Csum_i%5En%5Cleft%28%5Ctext%7Bctz%7D%28i%29+1%5Cright%29%20%3D%202n-%5Ctext%7Bpopcount%7D%28n%29)
381+
382+
where:
383+
ctz(i) = the number of trailing bits that are 0 in i
384+
popcount(i) = the number of bits that are 1 in i
385+
386+
I find it bewildering that these two seemingly unrelated bitwise instructions
387+
are related by this property. But if we start to dissect this equation we can
388+
see that it does hold. As n approaches infinity, we do end up with an average
389+
overhead of 2 pointers as we found earlier. And popcount seems to handle the
390+
error from this average as it accumulates in the CTZ skip-list.
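
The identity can also be checked numerically. The following is a minimal sketch, not code from this commit: the helper names are hypothetical, the loops are written portably instead of using compiler builtins, and the sum is assumed to start at i = 1 because ctz(0) is undefined.

```c
#include <assert.h>
#include <stdint.h>

// Number of trailing zero bits in a nonzero value.
static unsigned ctz32(uint32_t x) {
    assert(x != 0);
    unsigned n = 0;
    while ((x & 1) == 0) {
        x >>= 1;
        n += 1;
    }
    return n;
}

// Number of bits set to 1 (clears the lowest set bit each iteration).
static unsigned popcount32(uint32_t x) {
    unsigned n = 0;
    while (x != 0) {
        x &= x - 1;
        n += 1;
    }
    return n;
}

// Left-hand side of the identity: sum of ctz(i)+1 for i = 1..n.
static uint32_t pointer_sum(uint32_t n) {
    uint32_t sum = 0;
    for (uint32_t i = 1; i <= n; i++) {
        sum += ctz32(i) + 1;
    }
    return sum;
}

// Returns 1 if the sum equals 2n - popcount(n) for every n in 1..limit.
static int identity_holds(uint32_t limit) {
    uint32_t sum = 0;
    for (uint32_t n = 1; n <= limit; n++) {
        sum += ctz32(n) + 1;
        if (sum != 2*n - popcount32(n)) {
            return 0;
        }
    }
    return 1;
}
```

For example, the first eight blocks hold 1, 2, 1, 3, 1, 2, 1 and 4 pointers, which sums to 15 = 2*8 - popcount(8).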
391+
392+
Now we can substitute into the original equation to get a trivial equation
393+
for a file size:
394+
395+
![summation2](https://latex.codecogs.com/svg.latex?N%20%3D%20Bn%20-%20%5Cfrac%7Bw%7D%7B8%7D%5Cleft%282n-%5Ctext%7Bpopcount%7D%28n%29%5Cright%29)
396+
397+
Unfortunately, we're not quite done. The popcount function is non-injective,
398+
so we can only find the file size from the block index, not the other way
399+
around. However, we can solve for an n' block index that is greater than n
400+
with an error bounded by the range of the popcount function. We can then
401+
repeatedly substitute this n' into the original equation until the error
402+
is smaller than the integer division. As it turns out, we only need to
403+
perform this substitution once. Now we directly calculate our block index:
404+
405+
![formulaforn](https://latex.codecogs.com/svg.latex?n%20%3D%20%5Cleft%5Clfloor%5Cfrac%7BN-%5Cfrac%7Bw%7D%7B8%7D%5Cleft%28%5Ctext%7Bpopcount%7D%5Cleft%28%5Cfrac%7BN%7D%7BB-2%5Cfrac%7Bw%7D%7B8%7D%7D-1%5Cright%29+2%5Cright%29%7D%7BB-2%5Cfrac%7Bw%7D%7B8%7D%7D%5Cright%5Crfloor)
406+
407+
Now that we have our block index n, we can just plug it back into the above
408+
equation to find the offset. However, we do need to rearrange the equation
409+
a bit to avoid integer overflow:
410+
411+
![formulaforoff](https://latex.codecogs.com/svg.latex?%5Cmathit%7Boff%7D%20%3D%20N%20-%20%5Cleft%28B-2%5Cfrac%7Bw%7D%7B8%7D%5Cright%29n%20-%20%5Cfrac%7Bw%7D%7B8%7D%5Ctext%7Bpopcount%7D%28n%29)
412+
413+
The solution involves quite a bit of math, but computers are very good at math.
414+
We can now solve for the block index + offset while only needing to store the
415+
file size in O(1).
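
To build some confidence in the closed form, it can be compared against the summation loop it replaces. The sketch below mirrors the two versions of the index function changed in lfs.c by this commit, but it is a standalone approximation: the function names and plain `uint32_t` types are illustrative, the pointer size is hard-coded to 4 bytes (w = 32), and agreement is only checked empirically for one block size.

```c
#include <assert.h>
#include <stdint.h>

static unsigned ctz32(uint32_t x) {
    assert(x != 0);
    unsigned n = 0;
    while ((x & 1) == 0) { x >>= 1; n += 1; }
    return n;
}

static unsigned popcount32(uint32_t x) {
    unsigned n = 0;
    while (x != 0) { x &= x - 1; n += 1; }
    return n;
}

// O(n) version: walk block by block, adding back each block's pointer area.
static uint32_t index_by_loop(uint32_t block_size, uint32_t *off) {
    uint32_t i = 0;
    while (*off >= block_size) {
        i += 1;
        *off -= block_size;
        *off += 4*(ctz32(i) + 1);
    }
    return i;
}

// O(1) version: block index and in-block offset from the popcount formulas.
static uint32_t index_closed_form(uint32_t block_size, uint32_t *off) {
    assert(block_size > 2*4);
    uint32_t size = *off;
    uint32_t b = block_size - 2*4;
    uint32_t i = size / b;
    if (i == 0) {
        return 0;
    }
    i = (size - 4*(popcount32(i-1) + 2)) / b;
    *off = size - b*i - 4*popcount32(i);
    return i;
}

// Returns 1 if both versions agree for every position below limit.
static int index_forms_agree(uint32_t block_size, uint32_t limit) {
    for (uint32_t pos = 0; pos < limit; pos++) {
        uint32_t off_a = pos;
        uint32_t off_b = pos;
        uint32_t ia = index_by_loop(block_size, &off_a);
        uint32_t ib = index_closed_form(block_size, &off_b);
        if (ia != ib || off_a != off_b) {
            return 0;
        }
    }
    return 1;
}
```

With 512 byte blocks, position 1020 is the first data byte of block 2: both versions return index 2 with an offset of 8, just past that block's two 4 byte pointers.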
340416

341417
Here is what it might look like to update a file stored with a CTZ skip-list:
342418
```
@@ -1129,7 +1205,7 @@ So, to summarize:
11291205
metadata block is active
11301206
4. Directory blocks contain either references to other directories or files
11311207
5. Files are represented by copy-on-write CTZ skip-lists which support O(1)
1132-
append and O(n logn) reading
1208+
append and O(nlogn) reading
11331209
6. Blocks are allocated by scanning the filesystem for used blocks in a
11341210
fixed-size lookahead region that is stored in a bit-vector
11351211
7. To facilitate scanning the filesystem, all directories are part of a

littlefs/lfs.c

Lines changed: 21 additions & 20 deletions
@@ -1004,19 +1004,20 @@ int lfs_dir_rewind(lfs_t *lfs, lfs_dir_t *dir) {
10041004

10051005

10061006
/// File index list operations ///
1007-
static int lfs_index(lfs_t *lfs, lfs_off_t *off) {
1008-
lfs_off_t i = 0;
1009-
1010-
while (*off >= lfs->cfg->block_size) {
1011-
i += 1;
1012-
*off -= lfs->cfg->block_size;
1013-
*off += 4*(lfs_ctz(i) + 1);
1007+
static int lfs_ctz_index(lfs_t *lfs, lfs_off_t *off) {
1008+
lfs_off_t size = *off;
1009+
lfs_off_t b = lfs->cfg->block_size - 2*4;
1010+
lfs_off_t i = size / b;
1011+
if (i == 0) {
1012+
return 0;
10141013
}
10151014

1015+
i = (size - 4*(lfs_popc(i-1)+2)) / b;
1016+
*off = size - b*i - 4*lfs_popc(i);
10161017
return i;
10171018
}
10181019

1019-
static int lfs_index_find(lfs_t *lfs,
1020+
static int lfs_ctz_find(lfs_t *lfs,
10201021
lfs_cache_t *rcache, const lfs_cache_t *pcache,
10211022
lfs_block_t head, lfs_size_t size,
10221023
lfs_size_t pos, lfs_block_t *block, lfs_off_t *off) {
@@ -1026,8 +1027,8 @@ static int lfs_index_find(lfs_t *lfs,
10261027
return 0;
10271028
}
10281029

1029-
lfs_off_t current = lfs_index(lfs, &(lfs_off_t){size-1});
1030-
lfs_off_t target = lfs_index(lfs, &pos);
1030+
lfs_off_t current = lfs_ctz_index(lfs, &(lfs_off_t){size-1});
1031+
lfs_off_t target = lfs_ctz_index(lfs, &pos);
10311032

10321033
while (current > target) {
10331034
lfs_size_t skip = lfs_min(
@@ -1048,7 +1049,7 @@ static int lfs_index_find(lfs_t *lfs,
10481049
return 0;
10491050
}
10501051

1051-
static int lfs_index_extend(lfs_t *lfs,
1052+
static int lfs_ctz_extend(lfs_t *lfs,
10521053
lfs_cache_t *rcache, lfs_cache_t *pcache,
10531054
lfs_block_t head, lfs_size_t size,
10541055
lfs_off_t *block, lfs_block_t *off) {
@@ -1075,7 +1076,7 @@ static int lfs_index_extend(lfs_t *lfs,
10751076
}
10761077

10771078
size -= 1;
1078-
lfs_off_t index = lfs_index(lfs, &size);
1079+
lfs_off_t index = lfs_ctz_index(lfs, &size);
10791080
size += 1;
10801081

10811082
// just copy out the last block if it is incomplete
@@ -1139,15 +1140,15 @@ static int lfs_index_extend(lfs_t *lfs,
11391140
}
11401141
}
11411142

1142-
static int lfs_index_traverse(lfs_t *lfs,
1143+
static int lfs_ctz_traverse(lfs_t *lfs,
11431144
lfs_cache_t *rcache, const lfs_cache_t *pcache,
11441145
lfs_block_t head, lfs_size_t size,
11451146
int (*cb)(void*, lfs_block_t), void *data) {
11461147
if (size == 0) {
11471148
return 0;
11481149
}
11491150

1150-
lfs_off_t index = lfs_index(lfs, &(lfs_off_t){size-1});
1151+
lfs_off_t index = lfs_ctz_index(lfs, &(lfs_off_t){size-1});
11511152

11521153
while (true) {
11531154
int err = cb(data, head);
@@ -1459,7 +1460,7 @@ lfs_ssize_t lfs_file_read(lfs_t *lfs, lfs_file_t *file,
14591460
// check if we need a new block
14601461
if (!(file->flags & LFS_F_READING) ||
14611462
file->off == lfs->cfg->block_size) {
1462-
int err = lfs_index_find(lfs, &file->cache, NULL,
1463+
int err = lfs_ctz_find(lfs, &file->cache, NULL,
14631464
file->head, file->size,
14641465
file->pos, &file->block, &file->off);
14651466
if (err) {
@@ -1526,7 +1527,7 @@ lfs_ssize_t lfs_file_write(lfs_t *lfs, lfs_file_t *file,
15261527
file->off == lfs->cfg->block_size) {
15271528
if (!(file->flags & LFS_F_WRITING) && file->pos > 0) {
15281529
// find out which block we're extending from
1529-
int err = lfs_index_find(lfs, &file->cache, NULL,
1530+
int err = lfs_ctz_find(lfs, &file->cache, NULL,
15301531
file->head, file->size,
15311532
file->pos-1, &file->block, &file->off);
15321533
if (err) {
@@ -1539,7 +1540,7 @@ lfs_ssize_t lfs_file_write(lfs_t *lfs, lfs_file_t *file,
15391540

15401541
// extend file with new blocks
15411542
lfs_alloc_ack(lfs);
1542-
int err = lfs_index_extend(lfs, &lfs->rcache, &file->cache,
1543+
int err = lfs_ctz_extend(lfs, &lfs->rcache, &file->cache,
15431544
file->block, file->pos,
15441545
&file->block, &file->off);
15451546
if (err) {
@@ -2074,7 +2075,7 @@ int lfs_traverse(lfs_t *lfs, int (*cb)(void*, lfs_block_t), void *data) {
20742075

20752076
dir.off += lfs_entry_size(&entry);
20762077
if ((0x70 & entry.d.type) == (0x70 & LFS_TYPE_REG)) {
2077-
int err = lfs_index_traverse(lfs, &lfs->rcache, NULL,
2078+
int err = lfs_ctz_traverse(lfs, &lfs->rcache, NULL,
20782079
entry.d.u.file.head, entry.d.u.file.size, cb, data);
20792080
if (err) {
20802081
return err;
@@ -2093,15 +2094,15 @@ int lfs_traverse(lfs_t *lfs, int (*cb)(void*, lfs_block_t), void *data) {
20932094
// iterate over any open files
20942095
for (lfs_file_t *f = lfs->files; f; f = f->next) {
20952096
if (f->flags & LFS_F_DIRTY) {
2096-
int err = lfs_index_traverse(lfs, &lfs->rcache, &f->cache,
2097+
int err = lfs_ctz_traverse(lfs, &lfs->rcache, &f->cache,
20972098
f->head, f->size, cb, data);
20982099
if (err) {
20992100
return err;
21002101
}
21012102
}
21022103

21032104
if (f->flags & LFS_F_WRITING) {
2104-
int err = lfs_index_traverse(lfs, &lfs->rcache, &f->cache,
2105+
int err = lfs_ctz_traverse(lfs, &lfs->rcache, &f->cache,
21052106
f->block, f->pos, cb, data);
21062107
if (err) {
21072108
return err;

littlefs/lfs_util.h

Lines changed: 4 additions & 0 deletions
@@ -52,6 +52,10 @@ static inline uint32_t lfs_npw2(uint32_t a) {
5252
#endif
5353
}
5454

55+
static inline uint32_t lfs_popc(uint32_t a) {
56+
return __builtin_popcount(a);
57+
}
58+
5559
static inline int lfs_scmp(uint32_t a, uint32_t b) {
5660
return (int)(unsigned)(a - b);
5761
}
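
Note that `__builtin_popcount` is a GCC/Clang builtin, so a toolchain without it would need a fallback. One possible portable version is sketched below using the well-known bit-parallel counting trick. The names `popc_portable` and `popc_selfcheck` are hypothetical and not part of this commit.

```c
#include <assert.h>
#include <stdint.h>

// Count set bits without compiler builtins by summing adjacent bit
// fields of width 1, 2 and 4, then adding the four byte counts together.
static inline uint32_t popc_portable(uint32_t a) {
    a = a - ((a >> 1) & 0x55555555);
    a = (a & 0x33333333) + ((a >> 2) & 0x33333333);
    return (((a + (a >> 4)) & 0x0f0f0f0f) * 0x01010101) >> 24;
}

// A few spot checks; could be called once at startup in debug builds.
static inline void popc_selfcheck(void) {
    assert(popc_portable(0x00000000) == 0);
    assert(popc_portable(0x000000f0) == 4);
    assert(popc_portable(0xffffffff) == 32);
}
```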
