jetson-orin: A/B verity boot with Secure Boot and OTA updates#1811

Open
Mic92 wants to merge 6 commits into main from ab-update-jetson-agx

Mic92 commented Mar 9, 2026

Description

A/B verity boot end-to-end on Jetson AGX Orin, with UEFI Secure Boot and OTA update support.

Verity boot: UKI-based boot with dm-verity root filesystem on erofs (lz4hc compressed). First-boot service creates swap/persist/B-slot placeholder on the device.

Secure Boot: Cherry-picks #1713 for UEFI key enrollment, then adds flash-time EFI signing with sbsign. Private keys stay out of the Nix store — the flash script reads them from SECURE_BOOT_SIGNING_KEY_DIR at runtime (falls back to embedded dev-keys path). OTA update UKIs are signed at build time in debug builds (asserts on debugEnable); in production the OTA server signs images before distribution.
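The key lookup described above could be sketched roughly as follows. This is not the flash script itself, only a minimal illustration: the function name and the relative fallback path are assumptions, while the environment variable and the db.key / db.crt file names come from this PR. The sketch assumes the fallback is used only when the variable is unset.

```python
import os
from pathlib import Path

# Assumption: a relative stand-in for the dev-keys path embedded in the flash script.
DEV_KEYS_FALLBACK = Path("modules/secureboot/dev-keys")
REQUIRED_FILES = ("db.key", "db.crt")


def resolve_signing_key_dir(env=None, fallback=DEV_KEYS_FALLBACK):
    """Return the directory holding db.key and db.crt, or raise FileNotFoundError."""
    env = os.environ if env is None else env
    override = env.get("SECURE_BOOT_SIGNING_KEY_DIR")
    # Prefer the runtime override; otherwise use the embedded dev-keys path.
    candidate = Path(override) if override else Path(fallback)
    missing = [name for name in REQUIRED_FILES if not (candidate / name).is_file()]
    if missing:
        raise FileNotFoundError(
            f"no signing keys in {candidate}: missing {', '.join(missing)}"
        )
    return candidate
```

Resolving the directory at flash time, rather than at build time, is what keeps the private key out of the Nix store.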

OTA updates: Enable givc on Orin with upstream ghaf-givc (PR tiiuae/ghaf-givc#372 merged). The manifest includes unpacked_size so ota-update can create and resize LVM slots on demand.
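To illustrate why unpacked_size matters, here is a hedged sketch of how a slot size could be derived from it. The JSON shape (a `files` list with `name` keys) is invented for the example and is not the real manifest format; only the unpacked_size field is taken from this PR. The 4 MiB extent is LVM's default and may differ on the device.

```python
import json

EXTENT_BYTES = 4 * 1024 * 1024  # LVM's default physical extent size (assumed here)


def lv_size_for(unpacked_size, extent=EXTENT_BYTES):
    """Round an image size in bytes up to a whole number of LVM extents."""
    if unpacked_size <= 0:
        raise ValueError("unpacked_size must be positive")
    return -(-unpacked_size // extent) * extent  # ceiling division


def slot_sizes(manifest_text):
    """Map each (hypothetical) manifest entry name to the LV size it needs."""
    manifest = json.loads(manifest_text)
    return {
        entry["name"]: lv_size_for(entry["unpacked_size"])
        for entry in manifest["files"]
    }
```

Without the unpacked size, the updater would only know the compressed transfer size and could not allocate or resize the target LV before writing.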

Depends on tiiuae/ghaf-givc#372 (merged)

Verified on Jetson AGX Orin devkit

Tested on physical Jetson AGX Orin devkit, cross-compiled from x86_64-linux.

  • ✅ Flash with signed EFIs — boots with Secure Boot: enabled (user)
  • ✅ First-boot creates swap (4G), persist (~39G), B-slot placeholder
  • ota-update image install writes signed UKI + root + verity to auto-created LVs
  • bootctl set-oneshot → reboot into slot B (26.03.2) — accepted by Secure Boot
  • ✅ Slot B shows active: true after reboot
  • ota-update image remove recycles old slot
  • ✅ Unsigned UKI rejected by UEFI firmware (tested with previous unsigned build)

Known limitation

OTA update UKI signing uses uki-signing-key-dir which places dev private keys in the Nix store. This is gated by an assertion to debug builds only. In production, the OTA server must sign UKIs before distribution — this integration is not yet implemented. See secureboot.mdx for details.

Testing

Build and flash

nix build .#nvidia-jetson-orin-agx-verity-debug-from-x86_64-flash-script

# Put Jetson in recovery mode (hold recovery button + press reset)
# Verify: lsusb | grep "0955:7023"
sudo ./result/bin/flash-ghaf-host

Verify boot (serial console or SSH)

bootctl status
# Expected: Secure Boot: enabled (user)

ota-update image status
# Expected: Slot A active, root ~6.2G + verity ~0.1G
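For automated checks, the Secure Boot line can be extracted from the bootctl status output. The helper names below are hypothetical; the sketch only assumes the "Secure Boot: enabled (user)" line format shown above.

```python
import re


def secure_boot_state(bootctl_output):
    """Return the text after 'Secure Boot:' in `bootctl status` output, or None."""
    match = re.search(r"^\s*Secure Boot:\s*(.+?)\s*$", bootctl_output, re.MULTILINE)
    return match.group(1) if match else None


def secure_boot_enforced(bootctl_output):
    """True when the reported state starts with 'enabled'."""
    state = secure_boot_state(bootctl_output)
    return state is not None and state.startswith("enabled")
```

On the device, the input could come from `subprocess.run(["bootctl", "status"], capture_output=True, text=True).stdout`.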

Test OTA update

# Build update image (bump version first)
echo "26.03.2" > .version
nix build .#nvidia-jetson-orin-agx-verity-debug-from-x86_64-ghafImage

# Copy to device and install
scp result/* root@<jetson-ip>:/tmp/
ssh root@<jetson-ip> ota-update image install --manifest /tmp/ghaf_26.03.2_*.manifest

# Switch to slot B and reboot
ssh root@<jetson-ip> "bootctl set-oneshot ghaf-26.03.2-*.efi && reboot"

# After reboot, verify slot B is active
ota-update image status
# Expected: slot B (26.03.2) active: true


Mic92 commented Mar 9, 2026

Also opened tiiuae/ghaf-givc#372


Mic92 commented Mar 10, 2026

Depends on #1678

Mic92 force-pushed the ab-update-jetson-agx branch from bed36d2 to 665641e on March 10, 2026 10:05
Mic92 force-pushed the ab-update-jetson-agx branch from 665641e to 6880b3b on March 10, 2026 14:43
Mic92 force-pushed the ab-update-jetson-agx branch from 6880b3b to bc6e3c8 on March 23, 2026 09:51
Mic92 marked this pull request as ready for review on March 23, 2026 09:51
Mic92 force-pushed the ab-update-jetson-agx branch from bc6e3c8 to eb1077b on March 23, 2026 09:55
Mic92 force-pushed the ab-update-jetson-agx branch from eb1077b to bb81111 on March 23, 2026 10:07

Mic92 commented Mar 24, 2026

Have to investigate this device:

12:45:13  + nix run .#ghaf-robot -- -v DEVICE:OrinNX1 -v DEVICE_TYPE:orin-nx -v BUILD_ID:20854 -v COMMIT_HASH:0e46b83ec5bd06b54c970e51fbb5b7a55dd080db -i orin-nxANDrelayboot .

Mic92 added 4 commits March 24, 2026 23:25
Add splice-flash-xml.py, a Python script that replaces the
<device type="sdmmc_user"> partitions in NVIDIA's flash XML with a
custom layout defined in JSON. This replaces fragile line-count based
head/tail splicing that breaks when the upstream BSP XML changes.

Convert partition-template.nix to use the new script. The partition
layout is now defined as structured Nix data serialized to JSON.
Partition sizes are injected at build time from sdImage metadata
via --set instead of flash-time sed substitution.

Features:
  --set PARTITION.FIELD=VALUE  override partition child element values
  --remove-device              remove sdmmc_user device (QSPI-only)
Signed-off-by: Jörg Thalheim <joerg@thalheim.io>
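The splicing approach from the commit message above can be sketched with the standard library. This is not splice-flash-xml.py itself: the function name, the JSON schema (`name`, `type`, `fields`) and the override handling are assumptions made for illustration, and only the `<device type="sdmmc_user">` selector and the PARTITION.FIELD=VALUE form come from the commit message.

```python
import json
import xml.etree.ElementTree as ET


def splice_partitions(flash_xml, layout_json, overrides=None, device_type="sdmmc_user"):
    """Replace the partitions of one <device> with a layout given as JSON.

    layout_json: list of {"name": ..., "type": ..., "fields": {child_tag: text}}
    overrides:   {"PARTITION.FIELD": "VALUE"}, mirroring the --set option
    """
    root = ET.fromstring(flash_xml)
    device = root.find(f".//device[@type='{device_type}']")
    if device is None:
        raise ValueError(f"no <device type={device_type!r}> in flash XML")

    # Drop the stock partitions but keep the device element and its attributes.
    for old in list(device.findall("partition")):
        device.remove(old)

    # Rebuild the partition list from the structured layout.
    for spec in json.loads(layout_json):
        part = ET.SubElement(device, "partition", name=spec["name"], type=spec["type"])
        for tag, text in spec["fields"].items():
            ET.SubElement(part, tag).text = str(text)

    # Apply PARTITION.FIELD=VALUE overrides to child elements.
    for key, value in (overrides or {}).items():
        part_name, field = key.split(".", 1)
        target = device.find(f"partition[@name='{part_name}']/{field}")
        if target is None:
            raise KeyError(f"unknown partition field: {key}")
        target.text = str(value)

    return ET.tostring(root, encoding="unicode")
```

Selecting the device by element and attribute, instead of by line count, means the splice keeps working when unrelated parts of the upstream BSP XML gain or lose lines.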

Add verity-image.nix that builds ESP and LVM partition images for A/B
verity boot on Jetson AGX Orin.

ESP contains systemd-boot + a UKI (Unified Kernel Image) that embeds
kernel, initrd, and cmdline with the dm-verity roothash. The UKI is
built WITHOUT a .dtb section because NVIDIA's EFI_DT_FIXUP_PROTOCOL
corrupts device trees loaded from memory by sd-stub. Without .dtb,
sd-stub skips devicetree_fixup() and the kernel uses the firmware's
DTB already in the EFI Configuration Table (installed by DtPlatformDxe
during UEFI boot).

LVM image contains A/B erofs root + verity hash tree slots, swap, and
a btrfs persist volume. Slot sizes are computed dynamically from actual
image sizes at build time. Images are converted to NVIDIA sparse format
via mksparse for efficient USB flashing with tegradevflash_v2.

Signed-off-by: Jörg Thalheim <joerg@thalheim.io>

Use lz4hc level 9 for erofs compression, which reduces the nix store
image from 9.37 GiB to 5.30 GiB (43% savings). With A/B partitioning
this saves ~8 GiB of flash storage. Decompression throughput for lz4
is ~1.0 GB/s on x86_64 (streaming 9.4 GiB image).

zstd would provide slightly better compression (5.09 GiB) and ~60%
faster decompression (~1.6 GB/s), but EROFS_FS_ZIP_ZSTD requires
kernel >= 6.10 (commit 7c35de4df105) while the Jetson BSP ships
kernel 6.6.

Signed-off-by: Jörg Thalheim <joerg@thalheim.io>
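The figures in the commit message above can be rechecked with a few lines of arithmetic. All inputs are the numbers quoted in the message; the variable names are only for this sketch.

```python
GIB_BASELINE = 9.37  # image size before lz4hc, as quoted
GIB_LZ4HC = 5.30     # image size with lz4hc level 9
GIB_ZSTD = 5.09      # image size zstd would give
LZ4_GBPS = 1.0       # quoted lz4 decompression throughput
ZSTD_GBPS = 1.6      # quoted zstd decompression throughput


def savings_fraction(before, after):
    """Fraction of the original size removed by compression."""
    return (before - after) / before


lz4hc_savings = savings_fraction(GIB_BASELINE, GIB_LZ4HC)
ab_savings_gib = 2 * (GIB_BASELINE - GIB_LZ4HC)  # two slots, A and B
zstd_extra_gib = GIB_LZ4HC - GIB_ZSTD            # extra gain per slot with zstd
zstd_speedup = ZSTD_GBPS / LZ4_GBPS - 1

print(f"lz4hc savings: {lz4hc_savings:.1%}")
print(f"A/B flash saved: {ab_savings_gib:.2f} GiB")
print(f"zstd extra per slot: {zstd_extra_gib:.2f} GiB")
print(f"zstd decompression speedup: {zstd_speedup:.0%}")
```

The results agree with the quoted 43% savings, roughly 8 GiB saved across both slots, and the ~60% decompression advantage of zstd.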

Move swap, persist and B-slot placeholder LV creation from the
pre-built LVM image to a first-boot systemd service. This shrinks
the flash image from ~16 GiB to ~6 GiB (only A-slot root + verity),
eliminating the sparse image conversion step.

On first boot, firstboot-persist.service:
1. Resizes the GPT/APP partition to fill the eMMC
2. Expands the LVM PV to match
3. Creates swap (4G) and persist (btrfs) LVs
4. Reserves 1.5x the A-slot size for a future B-slot

Each step is idempotent so the service can safely re-run after
a power loss. Swap uses randomEncryption for security.

Systemd ordering is handled naturally: persist.mount waits for
/dev/pool/persist to appear (device unit dependency), and the
NixOS-generated mkswap service is explicitly ordered after
firstboot-persist.

Signed-off-by: Jörg Thalheim <joerg@thalheim.io>
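The idempotency described in the commit message above can be illustrated with a planner that only emits commands for volumes that do not exist yet. This is a sketch, not firstboot-persist.service: the placeholder LV name, giving persist all remaining space, and the 4 MiB extent are assumptions; the VG name pool, the 4G swap, and the 1.5x reserve come from the commit message.

```python
EXTENT = 4 * 1024 ** 2  # LVM default extent size; assumed, not taken from the PR


def round_up(n, step=EXTENT):
    """Round n up to the next multiple of step."""
    return -(-n // step) * step


def plan_first_boot(existing_lvs, a_slot_bytes, vg="pool"):
    """Return the lvcreate commands still needed; empty once everything exists."""
    wanted = [
        ("swap", "4G"),
        # B-slot placeholder: 1.5x the A-slot size, rounded up to whole extents.
        # The LV name is hypothetical.
        ("root_b_placeholder", f"{round_up(a_slot_bytes * 3 // 2)}b"),
    ]
    cmds = [
        ["lvcreate", "--yes", "--size", size, "--name", name, vg]
        for name, size in wanted
        if name not in existing_lvs
    ]
    # Assumption: persist takes whatever space is left after the fixed-size LVs.
    if "persist" not in existing_lvs:
        cmds.append(
            ["lvcreate", "--yes", "--extents", "100%FREE", "--name", "persist", vg]
        )
    return cmds
```

Because each command is derived from the current state, re-running the plan after an interrupted first boot yields only the steps that have not completed.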

Mic92 added 2 commits March 25, 2026 10:29
Enable givc on Orin (was disabled) and bump ghaf-givc to include
auto-create and resize of LVM slots for OTA updates (PR #372).

Signed-off-by: Jörg Thalheim <joerg@thalheim.io>

The secureboot PR (#1713) enrolls PK/KEK/db keys into the Jetson Orin
firmware, but nothing was signing the UKI or systemd-boot. Once keys
are enrolled and the UEFI leaves Setup Mode, it rejects unsigned
binaries with 'Access denied', bricking the device.

Move ESP image construction from a Nix derivation into the flash
script so we can sign EFI binaries with sbsign just before writing
them to the FAT partition. The private key is read at flash time
from SECURE_BOOT_SIGNING_KEY_DIR (or the signingKeyDir option),
keeping it out of the Nix store.

Add self-signed development keys under modules/secureboot/dev-keys/
for testing. These are explicitly not secret and must not be used in
production.

Tested on Jetson AGX Orin: device boots with Secure Boot enabled
(user mode), unsigned UKI is rejected with 'Access denied'.

Signed-off-by: Jörg Thalheim <joerg@thalheim.io>

leivos-unikie commented

I ran the steps defined in "Testing" (with small modifications) on Orin AGX. The update works as expected, but I noticed that nix-gc.service fails at boot after the update.
PR1811_tests_orin-agx.txt

leivos-unikie commented

  • Flashed Orin AGX with nix build .#nvidia-jetson-orin-agx-debug-from-x86_64-flash-qspi
  • Booted non-verity image (nix build .#nvidia-jetson-orin-agx-debug-from-x86_64) from USB SSD
  • Some services are failing:
[ghaf@ghaf-host:~]$ systemctl list-units --state=failed
  UNIT                      LOAD   ACTIVE SUB    DESCRIPTION                                                                                          >
● lvm-activate-pool.service loaded failed failed [systemd-run] /nix/store/fhlw09nl44wadi43fwl6vmzsgj917gjb-lvm2-aarch64-unknown-linux-gnu-2.03.38-bin/
[ghaf@admin-vm:~]$ systemctl list-units --state=failed
  UNIT             LOAD   ACTIVE SUB    DESCRIPTION
● firewall.service loaded failed failed Firewall
[ghaf@ghaf-4082324722:~]$ systemctl list-units --state=failed
  UNIT             LOAD   ACTIVE SUB    DESCRIPTION
● firewall.service loaded failed failed Firewall
  • net-vm does not have passthrough to physical eth device or wifi device
[ghaf@ghaf-4082324722:~]$ ifconfig
ethint0   Link encap:Ethernet  HWaddr 02:AD:00:00:00:01  
          inet addr:192.168.100.1  Bcast:192.168.100.255  Mask:255.255.255.0
          inet6 addr: fe80::ad:ff:fe00:1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:10120 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8041 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1366529 (1.3 MiB)  TX bytes:1533313 (1.4 MiB)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

(There is no physical eth on ghaf-host either.)

Are we supposed to stop using plain nvidia-jetson-orin-agx-debug-from-x86_64 images in automated testing and switch to nvidia-jetson-orin-agx-verity-debug-from-x86_64-ghafImage (booted from USB SSD)?

leivos-unikie added the "bug on Orin AGX Cross" label (Issues found on NVIDIA Jetson AGX Orin cross-compiled while checking this PR) Mar 25, 2026

leivos-unikie commented

Tried
nix build .#nvidia-jetson-orin-agx-verity-debug-from-x86_64-flash-qspi

It fails with

       Output paths:
         /nix/store/ijkqikfd2dm8431axcmjmx73pwqijwif-signed-orin-agx-devkit-36.5.0
       Last 12 log lines:
       > ============================================================
       > Ghaf A/B verity flash script for NVIDIA Jetson
       > ============================================================
       > Version: 26.03.1
       > SoM: orin-agx
       > Carrier board: devkit
       > ============================================================
       >
       > Building ESP image...
       > ERROR: Secure Boot is enabled but no signing keys found.
       >   Set SECURE_BOOT_SIGNING_KEY_DIR or place db.key + db.crt in:
       >   /nix/store/j7p14blvizs4rvfn8nvqjm33ghz7hg3l-source/modules/secureboot/dev-keys
       For full logs, run:
         nix log /nix/store/32wyzjl46g9axb58avqp7a5lp77mdjms-signed-orin-agx-devkit-36.5.0.drv

I assume this is expected at this stage because "uki-signing-key-dir places dev private keys in the Nix store"

leivos-unikie commented Mar 25, 2026

I made my previous checks at this commit:

[leivos@nixos:~/repos/clean_ghaf/ghaf]$ git log
commit 75a8047211fb2b56fc0fa43d2e1f567150dc5c34 (HEAD -> ab-update-jetson-agx)
Author: Jörg Thalheim <joerg@thalheim.io>
Date:   Mon Mar 9 17:12:42 2026 +0100

I just noticed that Jörg has been pushing at the same time I have been testing.

leivos-unikie added the "question" label (Further information is requested) Mar 25, 2026

Labels

  • bug on Orin AGX Cross: Issues found on NVIDIA Jetson AGX Orin cross-compiled while checking this PR
  • Needs Testing: CI Team to pre-verify
  • question: Further information is requested