libext: optimize data block reads and writes #1377

@wkozaczuk

Commit 76ad946 greatly improved the speed of reading and writing metadata blocks, which store, for example, the i-node tables. It did so by enabling the write-back cache, bcache, implemented by the lwext4 library.

Reading and writing regular data blocks, that is, files' content, still involves direct access to the block device without any cache and can be dramatically slow in some workflows. For example, one of the Java unit tests, io.osv.TestDomainPermissions, takes 5-6 times longer to run on the ext image than on the zfs one. (This can only be observed when running on qemu with cache=none,aio=native; unit tests run with test.py normally execute with cache=unsafe,aio=thread.) Capturing the block device strategy tracepoints shows almost 7K of them on ext compared to ~500 on zfs.

A closer look reveals many 4K reads (bcount=1000 in hexadecimal), most likely triggered by page faults when loading memory-mapped ELF files:

0x0000400001f50040 >/usr/lib/jvm/j  1          3.781554322   0.009 virtio_blk_strategy  cmd=1, offset=110c9c00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.782283354   0.009 virtio_blk_strategy  cmd=1, offset=110cac00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.783093732   0.011 virtio_blk_strategy  cmd=1, offset=110cbc00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.783909252   0.011 virtio_blk_strategy  cmd=1, offset=110ccc00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.784708002   0.009 virtio_blk_strategy  cmd=1, offset=111edc00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.785320493   0.009 virtio_blk_strategy  cmd=1, offset=110c3c00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.786140638   0.010 virtio_blk_strategy  cmd=1, offset=111e6c00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.786949947   0.010 virtio_blk_strategy  cmd=1, offset=110c4c00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.787837132   0.011 virtio_blk_strategy  cmd=1, offset=111e7c00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.788530242   0.009 virtio_blk_strategy  cmd=1, offset=111e8c00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.789066929   0.009 virtio_blk_strategy  cmd=1, offset=110c5c00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.789898191   0.011 virtio_blk_strategy  cmd=1, offset=111e9c00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.790595935   0.009 virtio_blk_strategy  cmd=1, offset=111eac00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.791300200   0.008 virtio_blk_strategy  cmd=1, offset=110c6c00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.791955380   0.008 virtio_blk_strategy  cmd=1, offset=110c7c00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.792695495   0.008 virtio_blk_strategy  cmd=1, offset=110bcc00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.793365419   0.008 virtio_blk_strategy  cmd=1, offset=110bdc00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.794072225   0.008 virtio_blk_strategy  cmd=1, offset=110b4c00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.794635442   0.008 virtio_blk_strategy  cmd=1, offset=110bac00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.795311067   0.008 virtio_blk_strategy  cmd=1, offset=110b9c00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.796014469   0.008 virtio_blk_strategy  cmd=1, offset=110b7c00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.796838057   0.010 virtio_blk_strategy  cmd=1, offset=110bfc00, bcount=1000
0x0000400001f50040 >/usr/lib/jvm/j  1          3.797541816   0.009 virtio_blk_strategy  cmd=1, offset=110b6c00, bcount=1000
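
To get a rough feel for how much a read-ahead could save, the trace can be post-processed offline. The following is only a sketch: the helper names are made up for illustration, the line format is copied from the trace above, and it assumes that both offset and bcount are printed in hexadecimal. It merges 4K requests that are adjacent on disk into larger extents and compares the counts:

```python
import re

# Matches the tail of a virtio_blk_strategy trace line as shown above.
TRACE_RE = re.compile(r"cmd=(\d+), offset=([0-9a-f]+), bcount=([0-9a-f]+)")


def parse_requests(lines):
    """Return (offset, length) pairs; assumes both fields are hexadecimal."""
    requests = []
    for line in lines:
        match = TRACE_RE.search(line)
        if match:
            requests.append((int(match.group(2), 16), int(match.group(3), 16)))
    return requests


def coalesce(requests):
    """Merge requests that touch or overlap once sorted by offset."""
    extents = []
    for offset, length in sorted(requests):
        if extents and offset <= extents[-1][0] + extents[-1][1]:
            start, old_length = extents[-1]
            extents[-1] = (start, max(old_length, offset + length - start))
        else:
            extents.append((offset, length))
    return extents


# First five requests from the trace above.
sample = [
    "virtio_blk_strategy  cmd=1, offset=110c9c00, bcount=1000",
    "virtio_blk_strategy  cmd=1, offset=110cac00, bcount=1000",
    "virtio_blk_strategy  cmd=1, offset=110cbc00, bcount=1000",
    "virtio_blk_strategy  cmd=1, offset=110ccc00, bcount=1000",
    "virtio_blk_strategy  cmd=1, offset=111edc00, bcount=1000",
]
requests = parse_requests(sample)
extents = coalesce(requests)
print(len(requests), "requests ->", len(extents), "extents")
```

For these five lines the first four requests are contiguous, so they collapse into a single 16K extent. Sorting by offset ignores the order in which the faults arrived, so this gives an upper bound on the possible saving, not a prediction.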

Clearly, libext would benefit from a read-ahead cache, and maybe a write-back (or write-through) one, ideally integrated with core/pagecache.cc.
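
To illustrate the read-ahead idea only, here is a toy model in Python. It is not the lwext4 or OSv page cache API: the class, its parameters, and the fake device are all hypothetical, writes are not modelled, and the window and capacity values are arbitrary. On a miss it issues one larger device request and keeps the extra blocks for later reads:

```python
from collections import OrderedDict

BLOCK_SIZE = 4096


class ReadAheadCache:
    """Toy LRU block cache that fetches `window` blocks on every miss."""

    def __init__(self, read_fn, window=8, capacity=64):
        assert capacity >= window
        self.read_fn = read_fn        # read_fn(offset, length) -> bytes (stand-in for the block device)
        self.window = window          # number of blocks fetched per device request
        self.capacity = capacity      # maximum number of cached blocks
        self.blocks = OrderedDict()   # block number -> block data, in LRU order
        self.device_requests = 0

    def read_block(self, block_no):
        if block_no in self.blocks:
            self.blocks.move_to_end(block_no)
            return self.blocks[block_no]
        # Miss: one larger request covering the requested block and its successors.
        data = self.read_fn(block_no * BLOCK_SIZE, self.window * BLOCK_SIZE)
        self.device_requests += 1
        result = data[:BLOCK_SIZE]
        for i in range(self.window):
            chunk = data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
            if len(chunk) < BLOCK_SIZE:
                break                 # short read at the end of the device
            self.blocks[block_no + i] = chunk
            self.blocks.move_to_end(block_no + i)
        while len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)   # evict the least recently used block
        return result


# Fake 64-block device where block n is filled with the byte value n.
device = b"".join(bytes([n]) * BLOCK_SIZE for n in range(64))


def fake_read(offset, length):
    return device[offset:offset + length]


cache = ReadAheadCache(fake_read, window=8)
for n in range(16):
    assert cache.read_block(n)[0] == n
print(cache.device_requests)
```

Reading 16 consecutive blocks this way costs 2 device requests instead of 16. A real implementation would also have to handle invalidation on writes and decide how to share memory with the existing page cache, which this model leaves out.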

The zfs filesystem is a much better alternative in this case, but it has some drawbacks (it is large, uses many threads, is slow to boot, and is not as user-friendly as ext on most Linux distributions; read more in this Wiki page). So it would be nice to improve libext in this respect.
