
Problematic mount order of ZRAM #2060

@hundertvolt1

While investigating SD card writes and how ZRAM is mounted relative to other mounts and services (for details, see this forum discussion), I found two pending, closely linked issues with the way it is currently mounted:

<Issue 1> Wrong order configured in /etc/systemd/system/'srv-openhab\x2d<dirname>.mount'.
Current state:

Before=smbd.service
After=network.target zram-config.service

This order first mounts the ZRAM overlays and only then bind-mounts the /srv/openhab-x directories. As a result, the bind mount shows the contents of the underlying physical directory on the SD card instead of the live contents of the ZRAM overlay. ZRAM-mounted folders inside /srv/openhab-x then show potentially very old data from the last graceful shutdown of the ZRAM service.
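To confirm which ordering is in effect on a given system, the After= property of the mount unit can be inspected. The following is only a minimal sketch: the helper name after_contains is hypothetical, and it assumes nothing more than that `systemctl show -p After <unit>` prints one line of the form "After=unit1 unit2 ...".

```shell
# Hypothetical helper: succeeds if unit $2 is listed in the "After=..." line $1.
# The line is expected in the format printed by `systemctl show -p After <unit>`.
after_contains() {
    case " ${1#After=} " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example with the ordering quoted in this issue:
if after_contains "After=network.target zram-config.service" zram-config.service; then
    echo "bind mount is ordered after zram-config.service"
fi
```

On a live system, the first argument could be taken from `systemctl show -p After 'srv-openhab\x2duserdata.mount'` instead of the literal string.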

How to reproduce:
The easiest and cleanest way is to enforce the mount order manually (during boot-up the order is subject to race conditions, see <Issue 2> below).
Stop all services:

sudo systemctl stop openhab  # just to be sure
sudo systemctl stop zram-config.service
sudo systemctl stop "srv-openhab\\x2duserdata.mount"

Start the mounts in the order "First ZRAM, then bind-mount":

sudo systemctl start zram-config.service
sudo systemctl start "srv-openhab\\x2duserdata.mount"
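The escaped unit name used in these commands follows systemd's path escaping, where "/" becomes "-" and a literal "-" becomes "\x2d". On a system with systemd, `systemd-escape -p --suffix=mount /srv/openhab-userdata` performs this conversion. The sketch below is a simplified stand-in for illustration only: the helper name path_to_mount_unit is hypothetical, and it assumes the path contains nothing but letters, digits, "_", "-" and "/".

```shell
# Hypothetical, simplified stand-in for `systemd-escape -p --suffix=mount`.
# Assumption: the path contains only letters, digits, "_", "-" and "/".
path_to_mount_unit() {
    p=${1#/}      # drop the leading slash
    p=${p%/}      # drop a trailing slash, if any
    # Escape "-" first, then turn the remaining "/" separators into "-".
    printf '%s.mount\n' "$(printf '%s\n' "$p" | sed -e 's/-/\\x2d/g' -e 's|/|-|g')"
}

path_to_mount_unit /srv/openhab-userdata
# prints: srv-openhab\x2duserdata.mount
```

For anything beyond such simple paths, systemd-escape itself is the safer choice.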

Then, look at the mount table - you'll get a single ZRAM overlay mount for each folder:

mount | grep overlay
overlay1 on /var/lib/openhab/foo type overlay (rw,relatime,lowerdir=/opt/zram/foo.bind,upperdir=/opt/zram/zram1/upper,workdir=/opt/zram/zram1/workdir,redirect_dir=on,uuid=on)
overlay2 on /var/lib/openhab/persistence type overlay (rw,relatime,lowerdir=/opt/zram/persistence.bind,upperdir=/opt/zram/zram2/upper,workdir=/opt/zram/zram2/workdir,redirect_dir=on,uuid=on)

You can also compare the contents of the folders in the bind mount and at the origin. The bind mount will be older than the origin, and df -a will report an "mmcblk..." device for it, while the actual ZRAM mount will be recent and will report an "overlay...".
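This check can also be scripted. The sketch below reads mount-table lines in the format printed by mount (as in the listing above) and reports whether a given directory is itself an overlay mount point. The helper name is_overlay_mountpoint is hypothetical, and paths containing spaces are not handled.

```shell
# Hypothetical helper: reads `mount` output on stdin and succeeds if path $1
# is listed as a mount point of type "overlay".
# Line format assumed: <source> on <mountpoint> type <fstype> (<options>)
is_overlay_mountpoint() {
    awk -v dir="$1" '$3 == dir && $5 == "overlay" { found = 1 } END { exit !found }'
}
```

For example, `mount | is_overlay_mountpoint /srv/openhab-userdata/persistence` would be expected to fail with the current ordering, since only the folder under /var/lib/openhab carries an overlay mount.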

Desired state:
The /srv/openhab-x directories should contain live data and be exactly in the same state as their targets (e.g. /var/lib/openhab/persistence).

Suggested solution:
Start the bind mount first, then start ZRAM. zram-config.service is able to follow the bind mount and additionally mount the overlay there as well. Modify /etc/systemd/system/'srv-openhab\x2d<dirname>.mount' as follows:

Before=smbd.service zram-config.service
After=network.target 

This is also easy to reproduce manually: follow the steps in "How to reproduce", but swap the order of the two sudo systemctl start commands. You'll see two ZRAM mounts for each folder:

overlay1 on /var/lib/openhab/foo type overlay (rw,relatime,lowerdir=/opt/zram/foo.bind,upperdir=/opt/zram/zram1/upper,workdir=/opt/zram/zram1/workdir,redirect_dir=on,uuid=on)
overlay1 on /srv/openhab-userdata/foo type overlay (rw,relatime,lowerdir=/opt/zram/foo.bind,upperdir=/opt/zram/zram1/upper,workdir=/opt/zram/zram1/workdir,redirect_dir=on,uuid=on)

overlay2 on /var/lib/openhab/persistence type overlay (rw,relatime,lowerdir=/opt/zram/persistence.bind,upperdir=/opt/zram/zram2/upper,workdir=/opt/zram/zram2/workdir,redirect_dir=on,uuid=on)
overlay2 on /srv/openhab-userdata/persistence type overlay (rw,relatime,lowerdir=/opt/zram/persistence.bind,upperdir=/opt/zram/zram2/upper,workdir=/opt/zram/zram2/workdir,redirect_dir=on,uuid=on)

One is for the target, the other for the bind mount. Listing the contents of the ZRAM'ed folders gives equal and up-to-date content in both places, and df -a reports an overlay filesystem for them.
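To check this state for all ZRAM'ed folders at once, the mount table can be grouped by overlay source. The sketch below (helper name count_overlay_mounts is hypothetical) counts how many mount points each overlay source has. With the suggested ordering, each source should be reported with a count of 2.

```shell
# Hypothetical helper: reads `mount` output on stdin and prints, for every
# overlay source, the number of places it is mounted ("<source> <count>").
# Line format assumed: <source> on <mountpoint> type <fstype> (<options>)
count_overlay_mounts() {
    awk '$5 == "overlay" { n[$1]++ } END { for (s in n) print s, n[s] }' | sort
}
```

It would typically be called as `mount | count_overlay_mounts`; a count of 1 for any source would point to the single-mount situation described under "How to reproduce".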

At that point, everything worked well on my Raspberry Pi 4 / 8GB OpenHABian system; the Samba shares on the /srv/ folders also delivered the most recent, ZRAM-contained data to the remote host.

But this good state is not guaranteed, because...

<Issue 2> Ordering services to start after zram-config.service in systemd is highly unreliable.
Some background:
When service starts depend on a simple mount like /etc/systemd/system/'srv-openhab\x2d<dirname>.mount', which is known to systemd from the very beginning of boot-up, it is certain that they will not start before the mount is really usable.
So making zram-config.service wait until the bind mount is in place, as suggested in my solution above, should be perfectly safe and stable.
This is not the case when waiting for zram-config.service itself. When started, it spawns several dynamic mounts - three per ZRAM'ed folder, as it looked to me - and then declares itself started. This raises two issues:

  1. Triggering further services once zram-config.service has finished starting does not mean that the spawned mounts are ready. In fact, it starts a race (for details, see the forum post mentioned at the top) between the following services and the mounts spawned by ZRAM.
  2. As these mounts are spawned dynamically, systemd has no clue at boot-up that they may even exist. Their naming follows a canonical scheme (like var-lib-openhab-persistence.mount), but although I tried many combinations to wait on them, none of them solved the race condition. I tried adding all of these mounts to After=, additionally adding them to Requires=, creating a zram-mounts.target file to contain all of them, and putting them in RequiresMountsFor=, which is said to be able to catch dynamic mounts. None of this worked for me - I could still see actions and other services run while the ZRAM mounts were in mid-air on my system.
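What "really ready" would mean in practice can be sketched as a polling check against the kernel's mount table instead of relying on the service state. This is only an untested illustration, not a fix proposed or verified in this issue: the helper name wait_for_overlay, the MOUNTS_FILE variable and the default timeout are assumptions. It reads /proc/self/mounts, where each line starts with the source, the mount point and the filesystem type.

```shell
# Hypothetical helper: wait until directory $1 is an overlay mount point,
# checking once per second for at most $2 attempts (default 30).
# MOUNTS_FILE can be overridden for testing; it defaults to /proc/self/mounts.
wait_for_overlay() {
    dir=$1
    tries=${2:-30}
    while :; do
        # Fields in /proc/self/mounts: <source> <mountpoint> <fstype> ...
        if awk -v dir="$dir" '$2 == dir && $3 == "overlay" { ok = 1 } END { exit !ok }' \
            "${MOUNTS_FILE:-/proc/self/mounts}"; then
            return 0
        fi
        tries=$((tries - 1))
        [ "$tries" -le 0 ] && return 1
        sleep 1
    done
}
```

A service that needs the persistence folder could, for instance, run `wait_for_overlay /var/lib/openhab/persistence 60` from a pre-start script. Whether that closes the race on a real system is an open question.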

Conclusion:
I tried many possible solutions and have not been lucky so far. Although the solution suggested for <Issue 1> relieves the problem for the /srv/ bind mount by putting it before ZRAM is even mounted, and my system appears to work at that point, it leaves me with a very uncertain feeling.

Many services still wait only until zram-config.service is done, but should instead wait until all ZRAM mounts are really in place. The most prominent ones I found in /etc/systemd/system/zram.service.d/override.conf are:
Before=openhab.service exim4.service nmbd.service smbd.service
They, and others I have not seen yet, are prone to hit a ZRAM mount which is not yet, partially, or fully in place - depending on who wins the race.

So at this time I have no ready solution, but I strongly suggest that we find a way to make systemd wait stably and reliably for a really-ready ZRAM before starting services.
