ghaf and ghaf-installer images for lenovo-x1 don't build
Now ghaf-installer built successfully and I was able to give this a try.
Defaulting to Nixos Yarara 26.05. There was some error in the boot logs, but after waiting some time the boot continued. A dim Ghaf splash screen stayed on the screen for a long time (~1 min). Eventually the Ghaf User Provisioning menu appeared with only the "Join Active Directory domain" and "Exit provisioning" options available. (I had created a local user before the A/B update test.) It seems that the update has changed the user configuration to ghaf/modules/reference/profiles/mvp-user-trial.nix. Connecting to the AD server does not work without changing the DNS IP (known issue). I exited the provisioning menu, but the Ghaf login screen didn't appear (because there are no users), only a black screen; after a while, the ssh connection to net-vm still worked. App VMs failed to boot.
Requested outputs for debugging:
Note: the branch is 22 commits behind main
After reboot and selecting plain NixOS at the boot menu it boots fine to the original version and
3d87b94 to 522aa5c
522aa5c to 5dd49cc
Tested again on lenovo-x1. Summary
Also, I am wondering whether ids-vm is intentionally included in the update? It is normally disabled in ghaf by default.
More detailed notes of this test run
Checked also with encrypted installation (
5dd49cc to 656376f
  # Ghaf Inter VM communication and control library
  givc = {
-   url = "github:tiiuae/ghaf-givc";
+   url = "github:avnik/ghaf-givc?ref=avnik/ab-update";
Can we merge ghaf-givc dependencies first?
That's the plan. It's 99% ready.
sed -i \
  "0,/${roothashPlaceholder}/ s/${roothashPlaceholder}/$verityRoothash/" \
  ${kernelImage}
Using sed -i on a .raw image can be inefficient because it creates a full temporary copy of the file.
Recommendation: use a Python script with mmap or a dd-based approach to perform an in-place replacement of the 64-character hash string.
Not sure about this statement without a benchmark. The kernel image isn't that big to begin with, and starting Python also has a cost; Python is also on the order of 40x slower than native code.
If Python isn't an option, we can explore using the dd command. If the file is not that big, using sed is also fine.
Well, in reality it is pretty fast.
But I didn't like this approach anyway; I'd prefer to switch to calling ukify/sbsign directly.
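For reference, the in-place alternative discussed in this thread could look roughly like the sketch below. It assumes the placeholder and the root hash have the same length (64 hex characters, as mentioned above), so the file size never changes. The function name `replace_first_in_place` and its parameters are hypothetical and not part of this PR. Like the `0,/.../` address in the sed call, it rewrites only the first occurrence, and it refuses an empty or malformed hash.

```python
import mmap
import re

# Assumption: the root hash is a 64-character lowercase hex string.
HEX64 = re.compile(rb"[0-9a-f]{64}")


def replace_first_in_place(path: str, placeholder: bytes, roothash: bytes) -> bool:
    """Overwrite the first occurrence of placeholder with roothash, in place.

    Returns True if a replacement was made, False if the placeholder
    was not found. The file is never copied or resized.
    """
    if not HEX64.fullmatch(roothash):
        raise ValueError("root hash must be 64 lowercase hex characters")
    if len(placeholder) != len(roothash):
        raise ValueError("placeholder and root hash must have equal length")
    # Map the whole file read/write; this fails on an empty file.
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as mm:
        offset = mm.find(placeholder)
        if offset < 0:
            return False
        mm[offset:offset + len(roothash)] = roothash
        mm.flush()
    return True
```

Whether this is actually faster than sed for an image of this size would still need the benchmark asked for above.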
cp ${config.system.build.uki}/${config.system.boot.loader.ukiFile} ${kernelImage}

# Replace the placeholder with the real roothash in the target .raw file
verityRoothash=$(cat $out/dm-verity-root-hash)
Do we need to verify whether verityRoothash can be empty?
verity_0 = {
  size = "6G";
};
A standard dm-verity hash tree is generally small, typically requiring approximately 0.8% to 1% of the total size of the protected partition.
Protecting a 10GB partition often requires only about 81MB of additional space for the hash tree.
Is there a specific need for a size of 6G?
Agreed. This is quite large.
My readings/measurements show 8-10%
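To make the two estimates in this thread easier to compare, here is a back-of-the-envelope sketch of the hash tree size. It assumes SHA-256 digests (32 bytes) and 4096-byte data and hash blocks, so each hash block holds 128 digests. The function name `verity_hash_tree_bytes` is hypothetical. It does not account for the superblock, FEC data, or any filesystem-level overhead, so it is not a substitute for measuring the actual image.

```python
def verity_hash_tree_bytes(
    data_bytes: int, block_size: int = 4096, digest_size: int = 32
) -> int:
    """Estimate the dm-verity hash tree size for a data device of data_bytes."""
    if data_bytes <= 0:
        raise ValueError("data_bytes must be positive")
    hashes_per_block = block_size // digest_size  # 128 under the stated assumptions
    blocks = -(-data_bytes // block_size)  # ceiling division: number of data blocks
    total_hash_blocks = 0
    while True:
        # Hash blocks needed to cover the level below.
        blocks = -(-blocks // hashes_per_block)
        total_hash_blocks += blocks
        if blocks == 1:  # reached the top level of the tree
            break
    return total_hash_blocks * block_size


ten_gib = 10 * 1024**3
tree = verity_hash_tree_bytes(ten_gib)
print(f"{tree / 1024**2:.1f} MiB, {100 * tree / ten_gib:.2f}% of the data size")
# prints: 80.6 MiB, 0.79% of the data size
```

Under these assumptions the tree alone stays below 1% of the data size; a measured 8-10% would have to come from something this estimate leaves out.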
Signed-off-by: Alexander Nikolaev <alexander.nikolaev@tii.ae>
Host Configuration: Added an entry to systemd.tmpfiles.rules in modules/microvm/host/microvm-host.nix to ensure the /persist/sysupdate directory is created on the host with 0755 permissions owned by root.
modules/microvm/host/microvm-host.nix
"d /persist/sysupdate 0755 root root -"
NetVM Configuration: Added a share configuration in modules/microvm/sysvms/netvm.nix to mount the host's /persist/sysupdate to /persist/sysupdate inside the netvm using virtiofs.
modules/microvm/sysvms/netvm.nix
{
  tag = "sysupdate";
  source = "/persist/sysupdate";
  mountPoint = "/persist/sysupdate";
  proto = "virtiofs";
}
Signed-off-by: vadik likholetov <vadikas@gmail.com>
Signed-off-by: Alexander Nikolaev <alexander.nikolaev@tii.ae>
…jection Signed-off-by: Alexander Nikolaev <alexander.nikolaev@tii.ae>
import os


def fixname(filename, version, fragment):
Adding types to all functions here would be useful for future refactorings.
def sha256_file(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
Nitpick: for multi-gigabyte files it might be worth using sha256sum instead of Python. But we should probably quickly measure how long creating the manifest takes to see if this optimization is worthwhile.
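A minimal sketch of how that measurement could be taken from the Python side. The quoted diff is cut off after the `for` line, so the loop body and return value below are an assumption about what the function does, and `timed_sha256` is a hypothetical helper that is not part of this PR.

```python
import hashlib
import time


def sha256_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA-256 of a file, reading it in fixed-size chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)  # assumed loop body; not visible in the quoted diff
    return h.hexdigest()


def timed_sha256(path: str) -> tuple[str, float]:
    """Hash a file and report how many seconds the hashing took."""
    start = time.perf_counter()
    digest = sha256_file(path)
    return digest, time.perf_counter() - start
```

Running `timed_sha256` on a real image and comparing it with `time sha256sum` on the same file would give the comparison. The hashing itself runs in native code inside hashlib, so the gap may turn out to be small; only a measurement can tell.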
656376f to 78ce267
Tested again on Lenovo-X1
This PR caused builds of multiple 128GB images in automated pre-merge testing; the prod agent got stuck with a full disk yesterday because of this.
Maybe we need to switch to runInLinuxVM? One option that came to my mind is using qcow2 or some other sparse format. We also don't need to create such large partitions upfront.
Signed-off-by: Alexander Nikolaev <alexander.nikolaev@tii.ae>
I now create the b partitions / swap / persist on first boot: Mic92@cf31f6a. This saves a lot of memory. Also, the root partition is now compressed with lz4: Mic92@cabfbbb
Description of Changes
What is in this PR:
- ota-update tool (branch in GIVC, 100% done, TODO: finish/debug/improve UX)
- ota-update and generator finally merged
NOTE: flake.lock is at the moment locked on avnik/ghaf and avnik/ab-update for generator and givc
Known issues: (checked is fixed)
What is out of scope of this PR:
ghaf@netvm:/persist/sysupdate)
Type of Change
Related Issues / Tickets
Checklist
make-checks and it passes
Testing Instructions
- nix build -L ".#lenovo-x1-gen11-sysupdate-debug" --show-trace
- scp ./result/* ghaf@carbon:/persist/sysupdate
- sudo ota-update image status, first slot should be used, legacy and active, second slot -- empty and legacy
- sudo ota-update image --dry-run install --manifest /persist/sysupdate/....manifest (exact manifest name could vary)
- sudo ota-update image install --manifest /persist/sysupdate/....manifest
- sudo ota-update image status
- sudo ota-update image status -- second slot should be marked as both used and active (and not legacy)
- .version file in ghaf source tree, add ".0" to it, repeat steps 2-8 with it.
N. All other behavior should be unchanged
If problems occur from step 4 onward, please collect output from:
- sudo bootctl list --json=pretty
- sudo -E LC_ALL=C lvs --all --report-format json --units B --no-suffix
- sudo ota-update image status (if it works of course)
Applicable Targets
- aarch64
- aarch64
- x86_64
- x86_64
- x86_64
Installation Method
nixos-rebuild ... switch
Test Steps To Verify: