Various tmt + testing improvements #1790

cgwalters · 2025-11-19T23:43:57Z

See individual patches.

gemini-code-assist

Code Review

This pull request introduces two valuable cleanups. First, it refactors the tmt test execution to directly discover and run tests instead of plans, which simplifies the configuration and removes duplication. This is a great improvement for maintainability. Second, it adds a command to show the tmt report on failure, which will significantly aid in debugging failing tests. I've noted a few minor inconsistencies in the updated logging and error messages where 'plan' was not fully replaced with 'test'. Please see my specific comments for details.

crates/xtask/src/xtask.rs

henrywang · 2025-11-20T10:06:18Z

The failure comes from test-26-examples-build on aarch64. This test had not been included in a tmt plan before, so it only ran on the GitHub runner, not on Packit. I'm working on a fix.

jeckersb · 2025-11-20T15:21:59Z

> xtask: Run tmt report on failure
>
> Why tmt doesn't do this by default, I have no idea. It's also baffling that it takes three levels of verbose i.e. -vvv to get the actual test error.

This tmt behavior also drives me crazy. Glad to see this change 👍

cgwalters · 2025-11-20T15:49:38Z

tmt/tests/examples/bootc-uki/Containerfile.stage2

        --json pretty \
        --output "/boot/$kver.efi"
+    # To support multi arch
+    systemd_boot_file=$(ls /usr/lib/systemd/boot/efi/*.efi | xargs -n 1 basename)


Right, though we're going to need the same fix in Dockerfile.cfsuki, which I think is again a driving use case for having a higher-level wrapper for this.

cgwalters · 2025-11-20T16:00:56Z

Strange, I don't understand why the soft reboot test is failing with testing-farm here but works in previous runs.

henrywang · 2025-11-20T16:19:56Z

> Strange, I don't understand why the soft reboot test is failing with testing-farm here but works in previous runs.

I'm already working on it. I haven't figured out the root cause in TF yet. I'll debug it in pure TF to see what happens.

cgwalters · 2025-11-21T13:40:42Z

Wait a minute...does testing-farm+tmt run all tests from a single plan on the same provisioned environment? It's sure looking like that's the case, see e.g. this log from the run here.

It's quite impressive actually that it's just that test that fails because some of these tests are going to leak state.

Yeah this seems to be the case. OK so this PR needs a rework.

cgwalters · 2025-11-21T13:45:34Z

> because some of these tests are going to leak state.

(Of course, we could optimize this if we relied on a factory reset; I'm sure we could do that for most tests)

cgwalters · 2025-11-21T14:24:54Z

I've come to the reluctant conclusion that there's no sane way to fix the "triple test definition" problem we have other than generating the tmt stuff. Working on that.

cgwalters · 2025-11-21T19:05:34Z

One thing I will say is I am so glad that AI exists to learn things like fmf for me

Right now touching e.g. xtask.rs causes a build phase when it shouldn't, this helps bypassing that.

Signed-off-by: Colin Walters <[email protected]>

We need to run most of our tests in a separate provisioned machine, which means it needs an individual plan. And then we need a test for that plan. And then we need the *actual test code*.

This "triplication" is a huge annoying pain. TMT is soooo complicated, yet as far as I can tell it doesn't offer us any tools to solve this.

So we'll do it here, cut over to generating the TMT stuff from metadata defined in the test file.

Hence adding a test is just:

- Write a new tests/booted/foo.nu
- `cargo xtask update-generated`

Signed-off-by: Colin Walters <[email protected]>
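As a rough illustration of the "generate the TMT stuff from metadata in the test file" idea, here is a minimal standard-library sketch. Everything specific in it is an assumption made for the example: the `# summary:` header key, the `TestMeta` struct, the `parse_meta` and `render_plan` helpers, and the exact plan layout are hypothetical and are not taken from the actual `cargo xtask update-generated` implementation.

```rust
/// Hypothetical metadata extracted from a test script's leading comments.
#[derive(Debug, PartialEq)]
struct TestMeta {
    name: String,
    summary: String,
}

/// Parse a `# summary: ...` line from the leading comment block of a test script.
/// The `# summary:` key is an assumption for illustration only.
fn parse_meta(file_name: &str, source: &str) -> Option<TestMeta> {
    // Only `.nu` scripts are considered tests in this sketch.
    let name = file_name.strip_suffix(".nu")?.to_string();
    let summary = source
        .lines()
        .take_while(|l| l.starts_with('#'))
        .find_map(|l| l.strip_prefix("# summary:"))?
        .trim()
        .to_string();
    Some(TestMeta { name, summary })
}

/// Render one plan per test, so every test gets its own provisioned machine.
/// The fmf layout below is illustrative, not the project's generated output.
fn render_plan(meta: &TestMeta) -> String {
    format!(
        "/plan-{name}:\n  summary: {summary}\n  discover:\n    how: fmf\n    test:\n      - /tmt/tests/test-{name}\n",
        name = meta.name,
        summary = meta.summary
    )
}

fn main() {
    let src = "# summary: Check foo after boot\nuse std assert\n";
    let meta = parse_meta("foo.nu", src).expect("metadata header");
    print!("{}", render_plan(&meta));
}
```

With a generator of this shape, the test script is the single source of truth, and the plan and test definitions become derived files that are regenerated rather than edited by hand.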

…tion

Move TMT test runner code from xtask.rs to tmt module:

- `run_tmt()` and `tmt_provision()` functions
- Helper functions for VM management and SSH connectivity
- Related constants

Also refactor `update_integration()` to use serde_yaml::Value for building YAML structures instead of string concatenation.

Add detailed error reporting for failed TMT tests:

- Assign run IDs using `tmt run --id`
- Display verbose reports with `tmt run -i {id} report -vvv`

Assisted-by: Claude Code (Sonnet 4.5)
Signed-off-by: Colin Walters <[email protected]>
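The error-reporting flow described in this commit message (tag the run with an ID, then ask for a verbose report of that same run on failure) could look roughly like the sketch below. It uses `std::process::Command` instead of the `cmd!` macro used in the PR, omits the provision step, and the helper names `run_args`, `report_args`, and `run_plan_with_report` as well as the sample run ID and plan name are hypothetical.

```rust
use std::process::Command;

/// Arguments for a run tagged with an explicit ID (`tmt run --id`).
fn run_args(run_id: &str, plan: &str) -> Vec<String> {
    ["run", "--id", run_id, "--all", "plan", "--name", plan]
        .iter()
        .map(|s| s.to_string())
        .collect()
}

/// Arguments for the verbose report of an earlier run (`tmt run -i {id} report -vvv`).
fn report_args(run_id: &str) -> Vec<String> {
    ["run", "-i", run_id, "report", "-vvv"]
        .iter()
        .map(|s| s.to_string())
        .collect()
}

/// Run a plan; if it fails, print the verbose report for the same run ID.
/// Returns Ok(true) when the plan passed. Requires tmt to be installed.
fn run_plan_with_report(run_id: &str, plan: &str) -> std::io::Result<bool> {
    let status = Command::new("tmt").args(run_args(run_id, plan)).status()?;
    if !status.success() {
        // Best effort: a failing report should not mask the test failure itself.
        let _ = Command::new("tmt").args(report_args(run_id)).status();
    }
    Ok(status.success())
}

fn main() {
    // Only print the command lines here, so the example runs without tmt.
    let run_id = "bootc-tmt-abc12345-plan-01"; // hypothetical run ID
    println!("tmt {}", run_args(run_id, "/plans/plan-01").join(" "));
    println!("tmt {}", report_args(run_id).join(" "));
}
```

Reusing one ID for both invocations is what lets the report step find the failed run without parsing tmt's output for a generated run directory.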

Otherwise we compile many dependencies twice unnecessarily.

Signed-off-by: Colin Walters <[email protected]>

To make it easier to do upgrade tests.

Signed-off-by: Colin Walters <[email protected]>

This ensures it all can work much more elegantly/naturally with sealed UKI builds - we don't want to do the build-on-target thing.

Signed-off-by: Colin Walters <[email protected]>

Signed-off-by: Colin Walters <[email protected]>

cgwalters · 2025-11-21T20:47:26Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces significant improvements to the testing infrastructure, primarily by refactoring the tmt test execution into a dedicated xtask module. The changes make the test setup more robust and extensible, for example by adding support for synthetic upgrade tests.

My review focuses on a few key areas:

- A critical issue in the newly added tmt.rs file where an outdated rand crate API is used, which will prevent compilation.
- A couple of medium-severity suggestions to improve maintainability and clarity in the new test scripts and xtask code. One is about an inconsistency in a Dockerfile, and the other suggests refactoring a large loop for better readability and error handling.

Overall, the changes are a great step forward for the project's testing capabilities. Addressing the identified issues will make the new testing code even better.

gemini-code-assist · 2025-11-21T20:49:51Z

crates/xtask/src/tmt.rs

+fn generate_random_suffix() -> String {
+    let mut rng = rand::rng();
+    const CHARSET: &[u8] = b"abcdefghijklmnopqrstuvwxyz0123456789";
+    (0..8)
+        .map(|_| {
+            let idx = rng.random_range(0..CHARSET.len());
+            CHARSET[idx] as char
+        })
+        .collect()
+}


The rand::rng() function and random_range method are from an older version of the rand crate. With rand version 0.9+, you should use rand::thread_rng() to get a thread-local random number generator, and the gen_range() method to generate a value in a given range. This change is necessary for the code to compile and aligns with current best practices for the rand crate.

Suggested change

-fn generate_random_suffix() -> String {
-    let mut rng = rand::rng();
-    const CHARSET: &[u8] = b"abcdefghijklmnopqrstuvwxyz0123456789";
-    (0..8)
-        .map(|_| {
-            let idx = rng.random_range(0..CHARSET.len());
-            CHARSET[idx] as char
-        })
-        .collect()
-}
+fn generate_random_suffix() -> String {
+    let mut rng = rand::thread_rng();
+    const CHARSET: &[u8] = b"abcdefghijklmnopqrstuvwxyz0123456789";
+    (0..8)
+        .map(|_| {
+            let idx = rng.gen_range(0..CHARSET.len());
+            CHARSET[idx] as char
+        })
+        .collect()
+}

gemini-code-assist · 2025-11-21T20:49:52Z

crates/xtask/src/tmt.rs

+    for plan in plans {
+        let plan_name = sanitize_plan_name(plan);
+        let vm_name = format!("bootc-tmt-{}-{}", random_suffix, plan_name);
+
+        println!("\n========================================");
+        println!("Running plan: {}", plan);
+        println!("VM name: {}", vm_name);
+        println!("========================================\n");
+
+        // Launch VM with bcvk
+
+        let launch_result = cmd!(
+            sh,
+            "bcvk libvirt run --name {vm_name} --detach {COMMON_INST_ARGS...} {image}"
+        )
+        .run()
+        .context("Launching VM with bcvk");
+
+        if let Err(e) = launch_result {
+            eprintln!("Failed to launch VM for plan {}: {:#}", plan, e);
+            all_passed = false;
+            test_results.push((plan.to_string(), false, None));
+            continue;
+        }
+
+        // Ensure VM cleanup happens even on error (unless --preserve-vm is set)
+        let cleanup_vm = || {
+            if preserve_vm {
+                return;
+            }
+            if let Err(e) = cmd!(sh, "bcvk libvirt rm --stop --force {vm_name}")
+                .ignore_stderr()
+                .ignore_status()
+                .run()
+            {
+                eprintln!("Warning: Failed to cleanup VM {}: {}", vm_name, e);
+            }
+        };
+
+        // Wait for VM to be ready and get SSH info
+        let vm_info = wait_for_vm_ready(sh, &vm_name);
+        let (ssh_port, ssh_key) = match vm_info {
+            Ok((port, key)) => (port, key),
+            Err(e) => {
+                eprintln!("Failed to get VM info for plan {}: {:#}", plan, e);
+                cleanup_vm();
+                all_passed = false;
+                test_results.push((plan.to_string(), false, None));
+                continue;
+            }
+        };
+
+        println!("VM ready, SSH port: {}", ssh_port);
+
+        // Save SSH private key to a temporary file
+        let key_file = tempfile::NamedTempFile::new().context("Creating temporary SSH key file");
+
+        let key_file = match key_file {
+            Ok(f) => f,
+            Err(e) => {
+                eprintln!("Failed to create SSH key file for plan {}: {:#}", plan, e);
+                cleanup_vm();
+                all_passed = false;
+                test_results.push((plan.to_string(), false, None));
+                continue;
+            }
+        };
+
+        let key_path = Utf8PathBuf::try_from(key_file.path().to_path_buf())
+            .context("Converting key path to UTF-8");
+
+        let key_path = match key_path {
+            Ok(p) => p,
+            Err(e) => {
+                eprintln!("Failed to convert key path for plan {}: {:#}", plan, e);
+                cleanup_vm();
+                all_passed = false;
+                test_results.push((plan.to_string(), false, None));
+                continue;
+            }
+        };
+
+        if let Err(e) = std::fs::write(&key_path, ssh_key) {
+            eprintln!("Failed to write SSH key for plan {}: {:#}", plan, e);
+            cleanup_vm();
+            all_passed = false;
+            test_results.push((plan.to_string(), false, None));
+            continue;
+        }
+
+        // Set proper permissions on the key file (SSH requires 0600)
+        {
+            use std::os::unix::fs::PermissionsExt;
+            let perms = std::fs::Permissions::from_mode(0o600);
+            if let Err(e) = std::fs::set_permissions(&key_path, perms) {
+                eprintln!("Failed to set key permissions for plan {}: {:#}", plan, e);
+                cleanup_vm();
+                all_passed = false;
+                test_results.push((plan.to_string(), false, None));
+                continue;
+            }
+        }
+
+        // Verify SSH connectivity
+        println!("Verifying SSH connectivity...");
+        if let Err(e) = verify_ssh_connectivity(sh, ssh_port, &key_path) {
+            eprintln!("SSH verification failed for plan {}: {:#}", plan, e);
+            cleanup_vm();
+            all_passed = false;
+            test_results.push((plan.to_string(), false, None));
+            continue;
+        }
+
+        println!("SSH connectivity verified");
+
+        let ssh_port_str = ssh_port.to_string();
+
+        // Run tmt for this specific plan using connect provisioner
+        println!("Running tmt tests for plan {}...", plan);
+
+        // Generate a unique run ID for this test
+        // Use the VM name which already contains a random suffix for uniqueness
+        let run_id = vm_name.clone();
+
+        // Run tmt for this specific plan
+        // Note: provision must come before plan for connect to work properly
+        let context = context.clone();
+        let how = ["--how=connect", "--guest=localhost", "--user=root"];
+        let env = ["TMT_SCRIPTS_DIR=/var/lib/tmt/scripts", "BCVK_EXPORT=1"]
+            .into_iter()
+            .chain(args.env.iter().map(|v| v.as_str()))
+            .flat_map(|v| ["--environment", v]);
+        let test_result = cmd!(
+            sh,
+            "tmt {context...} run --id {run_id} --all {env...} provision {how...} --port {ssh_port_str} --key {key_path} plan --name {plan}"
+        )
+        .run();
+
+        // Clean up VM regardless of test result (unless --preserve-vm is set)
+        cleanup_vm();
+
+        match test_result {
+            Ok(_) => {
+                println!("Plan {} completed successfully", plan);
+                test_results.push((plan.to_string(), true, Some(run_id)));
+            }
+            Err(e) => {
+                eprintln!("Plan {} failed: {:#}", plan, e);
+                all_passed = false;
+                test_results.push((plan.to_string(), false, Some(run_id)));
+            }
+        }
+
+        // Print VM connection details if preserving
+        if preserve_vm {
+            // Copy SSH key to a persistent location
+            let persistent_key_path = Utf8Path::new("target").join(format!("{}.ssh-key", vm_name));
+            if let Err(e) = std::fs::copy(&key_path, &persistent_key_path) {
+                eprintln!("Warning: Failed to save persistent SSH key: {}", e);
+            } else {
+                println!("\n========================================");
+                println!("VM preserved for debugging:");
+                println!("========================================");
+                println!("VM name: {}", vm_name);
+                println!("SSH port: {}", ssh_port_str);
+                println!("SSH key: {}", persistent_key_path);
+                println!("\nTo connect via SSH:");
+                println!(
+                    "  ssh -i {} -p {} -o IdentitiesOnly=yes root@localhost",
+                    persistent_key_path, ssh_port_str
+                );
+                println!("\nTo cleanup:");
+                println!("  bcvk libvirt rm --stop --force {}", vm_name);
+                println!("========================================\n");
+            }
+        }
+    }


The logic inside this for loop is quite long and contains repetitive error handling boilerplate. This makes it hard to follow the main flow of setting up and running a test plan.

Consider refactoring the body of this loop into a separate function, for example run_plan_in_vm(sh: &Shell, plan: &str, ...) -> Result<String>.

This would allow you to:

- Use the ? operator for cleaner error propagation within the new function.
- Manage VM cleanup automatically using a guard struct that calls cleanup_vm() on Drop. This is a common pattern in Rust for resource management (RAII).

An example of a guard struct:

    struct VmGuard<'a, 'b> {
        sh: &'a Shell,
        vm_name: &'b str,
        preserve: bool,
    }

    impl<'a, 'b> Drop for VmGuard<'a, 'b> {
        fn drop(&mut self) {
            if self.preserve {
                return;
            }
            // cmd! interpolates local variables, so bind the fields first.
            let sh = self.sh;
            let vm_name = self.vm_name;
            if let Err(e) = cmd!(sh, "bcvk libvirt rm --stop --force {vm_name}")
                .ignore_stderr()
                .ignore_status()
                .run()
            {
                eprintln!("Warning: Failed to cleanup VM {}: {}", vm_name, e);
            }
        }
    }

The main loop would then become much simpler, focusing on iterating through plans and reporting results.
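To make the shape of that refactoring concrete, here is a self-contained sketch using only the standard library. It is not the PR's code: the VM commands are replaced by entries pushed into an in-memory log, and `verify_ssh`, the `fail_at_ssh` switch, and the sample plan names are hypothetical stand-ins so the example can run anywhere.

```rust
use std::cell::RefCell;

/// Records a cleanup action on drop unless `preserve` is set.
/// The log stands in for running `bcvk libvirt rm --stop --force`.
struct VmGuard<'a> {
    vm_name: &'a str,
    preserve: bool,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for VmGuard<'_> {
    fn drop(&mut self) {
        if self.preserve {
            return;
        }
        self.log.borrow_mut().push(format!("rm {}", self.vm_name));
    }
}

/// Hypothetical stand-in for the SSH connectivity check.
fn verify_ssh(fail: bool) -> Result<(), String> {
    if fail {
        Err("SSH verification failed".to_string())
    } else {
        Ok(())
    }
}

/// Per-plan body: every step propagates errors with `?`, and the guard
/// performs cleanup on both the success and the error path.
fn run_plan_in_vm(
    plan: &str,
    vm_name: &str,
    preserve: bool,
    fail_at_ssh: bool,
    log: &RefCell<Vec<String>>,
) -> Result<String, String> {
    log.borrow_mut().push(format!("launch {vm_name}"));
    let _guard = VmGuard { vm_name, preserve, log };
    verify_ssh(fail_at_ssh).map_err(|e| format!("{e} for plan {plan}"))?;
    // The run ID reuses the VM name, as in the loop under review.
    Ok(vm_name.to_string())
}

fn main() {
    let log = RefCell::new(Vec::new());
    let mut all_passed = true;
    for (plan, fail) in [("plan-a", false), ("plan-b", true)] {
        let vm_name = format!("bootc-tmt-{plan}");
        match run_plan_in_vm(plan, &vm_name, false, fail, &log) {
            Ok(run_id) => println!("Plan {plan} completed successfully (run {run_id})"),
            Err(e) => {
                eprintln!("Plan {plan} failed: {e}");
                all_passed = false;
            }
        }
    }
    println!("all passed: {all_passed}; actions: {:?}", log.borrow());
}
```

The loop body shrinks to a single `match`, and the repeated `cleanup_vm(); all_passed = false; continue;` sequences disappear because the guard's `Drop` covers every early return.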

gemini-code-assist · 2025-11-21T20:49:52Z

tmt/tests/Dockerfile.upgrade

@@ -0,0 +1,3 @@
+# Just creates a file as a new layer for a synthetic upgrade test
+FROM localhost/bootc-integration


The Justfile's _build-upgrade-image rule overrides this FROM instruction with localhost/bootc-integration-bin using podman build --from. For clarity and to avoid confusion, this FROM instruction should match what is used in the Justfile, or use a generic placeholder like scratch to indicate it's always overridden.

FROM localhost/bootc-integration-bin

cgwalters · 2025-11-21T21:24:02Z

    content: /bin/sh: error while loading shared libraries: /lib64/libc.so.6: cannot apply additional memory protection after relocation: Permission denied

Hmm, I saw that once locally but it went away and I didn't debug. Dang it...I think this is a complicated side effect of us exporting the host container storage via virtiofs - basically the host selinux policy is tripping over the virtiofs_t type. Hum...I have an idea

cgwalters · 2025-11-21T21:46:38Z

> Hum...I have an idea

➡️ bootc-dev/bcvk#159

I haven't reproduced the bug locally yet, but the change at least doesn't hurt.

bootc-bot bot requested a review from jmarrero November 19, 2025 23:44

cgwalters force-pushed the tmt-discover branch from a5d9732 to df71cea Compare November 19, 2025 23:44

gemini-code-assist bot reviewed Nov 19, 2025

henrywang force-pushed the tmt-discover branch 2 times, most recently from 9b18985 to 03ff8a8 Compare November 20, 2025 14:42

cgwalters commented Nov 20, 2025

cgwalters assigned henrywang Nov 20, 2025

cgwalters mentioned this pull request Nov 20, 2025

More cfs testing2 #1794

Closed

cgwalters assigned cgwalters and unassigned henrywang Nov 21, 2025

cgwalters marked this pull request as draft November 21, 2025 14:25

cgwalters force-pushed the tmt-discover branch from 03ff8a8 to 9652e72 Compare November 21, 2025 15:12

github-actions bot added the area/documentation Updates to the documentation label Nov 21, 2025

cgwalters force-pushed the tmt-discover branch from 9652e72 to 5aab4f5 Compare November 21, 2025 18:27

cgwalters force-pushed the tmt-discover branch from baf1550 to 95c8140 Compare November 21, 2025 19:06

cgwalters added 7 commits November 21, 2025 15:23

Justfile: Add test-tmt-nobuild

b1972f0

Right now touching e.g. xtask.rs causes a build phase when it shouldn't, this helps bypassing that. Signed-off-by: Colin Walters <[email protected]>

build-sys: Run make manpages in release mode too

ad4f7cc

Otherwise we compile many dependencies twice unnecessarily. Signed-off-by: Colin Walters <[email protected]>

tests: Use --bind-storage-ro with bcvk

c7eeb0a

To make it easier to do upgrade tests. Signed-off-by: Colin Walters <[email protected]>

tests: Change the upgrade test to support fetching from host

96553df

This ensures it all can work much more elegantly/naturally with sealed UKI builds - we don't want to do the build-on-target thing. Signed-off-by: Colin Walters <[email protected]>

ci: Expand composefs testing to include upgrade

ec438ac

Signed-off-by: Colin Walters <[email protected]>

cgwalters force-pushed the tmt-discover branch from 95c8140 to ec438ac Compare November 21, 2025 20:29

cgwalters changed the title ~~Two tmt cleanups~~ Various tmt + testing improvements Nov 21, 2025

cgwalters marked this pull request as ready for review November 21, 2025 20:43

gemini-code-assist bot reviewed Nov 21, 2025
