@ckyrouac
Collaborator

@ckyrouac ckyrouac commented Sep 5, 2025

This is a rebased version of #1389 with some fixes on top to make `bootc install reset --experimental` work with the latest soft reboot code. I'd like to get this in hidden behind `--experimental`, then iterate on it. I plan to add some e2e tests as a follow-up, but I can include them in this PR if you prefer.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new experimental bootc install reset command, which appears to function as a factory reset. The implementation is well-structured, refactoring existing deployment logic to use a new MergeState enum, which is a clean way to handle different deployment sources. The changes also include new utilities for managing stateroots. I've found a couple of issues that need to be addressed before merging.

@cgwalters
Collaborator

I think we'll want tests at least before merging, and supporting that with tmt is probably going to be an excellent driver for the feature.

I haven't dug in but I think tmt is relying on injecting ssh keys w/cloud-init, so it might Just Work to do a reset there?

@ckyrouac ckyrouac marked this pull request as draft September 8, 2025 12:52
@ckyrouac ckyrouac force-pushed the factory-reset branch 5 times, most recently from b45e272 to a0cd215 Compare September 24, 2025 13:37
@ckyrouac ckyrouac marked this pull request as ready for review September 24, 2025 15:02
@bootc-bot bootc-bot bot requested a review from jeckersb September 24, 2025 15:02
@ckyrouac
Collaborator Author

> I think we'll want tests at least before merging

Updated with a basic test.

let deployment_dir = ostree.deployment_dirpath(&staged);
let deployment_dir = std::path::Path::new(deployment_dir.as_str());
if deployment_dir.exists() {
    let marker_path = deployment_dir.join(".bootc-factory-reset");
Collaborator

@cgwalters cgwalters Sep 26, 2025

Let's have a clear const for things like this, with an explanation there.

But backing up a second, on the ostree side we already have "deployment backing" directories where we can put metadata.

Though of course we should also be thinking about how this would work with the composefs backend.

Hmmm hmm. Here's an idea, how about we use the deployment's etc directory for state like this instead? I think we could save e.g. system.bootc.merged_from xattr or so?

Basically I lean towards extended attributes over "stamp files" for metadata.

BTW though do note the intersection here with etc.transient - I wonder if we should make that more visible in status output too. If that's enabled each upgrade is already like a reset of /etc.
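
A minimal sketch of the "clear const" suggestion, keeping the stamp-file approach from the hunk under review. The constant and function names are hypothetical and not taken from the PR; only the file name `.bootc-factory-reset` comes from the diff. The xattr alternative is not shown because it needs a crate outside the Rust standard library.

```rust
use std::path::{Path, PathBuf};

/// Name of the stamp file placed in a deployment directory to record that the
/// deployment was created by a factory reset.
/// (Hypothetical constant name; the file name is the one from the diff.)
pub const FACTORY_RESET_MARKER: &str = ".bootc-factory-reset";

/// Build the marker path for a given deployment directory.
pub fn factory_reset_marker_path(deployment_dir: &Path) -> PathBuf {
    deployment_dir.join(FACTORY_RESET_MARKER)
}

/// Report whether the deployment directory carries the marker file.
pub fn is_factory_reset(deployment_dir: &Path) -> bool {
    factory_reset_marker_path(deployment_dir).exists()
}
```

Keeping the name in one documented place means the writer and every reader of the marker cannot drift apart, whichever storage mechanism is eventually chosen.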

if is_factory_reset {
    // Factory reset deployments don't support soft reboots
    // this is primarily because the kargs validation will fail when checking for soft reboot
    // compatibility in the ostree code, which will cause bootc status and upgrade to fail.
Collaborator

I don't understand why we need to special case this. Why is bootc status failing exactly?

If we're changing (usually dropping) kernel arguments, won't the existing karg check work?

Collaborator

As a followup - we talked about this in a live chat and there seems to be a libostree bug here where the staged deployment is missing ostree= and that trips an assertion in the soft reboot checks.

Ideally we root cause and fix that.

But we may be able to work around this by detecting when there's no kargs in the staged deployment?

Collaborator Author

I think this will fix the factory reset issue: ostreedev/ostree#3532. Still investigating why the deployment is missing the ostree arg.

@jlebon
Contributor

jlebon commented Oct 7, 2025

Just saw the demo for this. A few random thoughts.

It'd be nice to be able to have different reset levels. One is just resetting /etc. This doesn't require a new stateroot. Another would be resetting both /etc and /var.

Should this use stateroots at all? This would be another thing to implement in the composefs backend (or maybe that code and concept already exist there?). But also it's quite common for /var or subdirectories of /var to be mounts. I.e. the stateroot var is not used in that case. We should probably detect this and error out.

One alternative though is basically to just move everything in /var into /var/.previous. That would work in both cases, is cheap, and makes access to the previous /var trivial.

(There's still the case of submounts in /var of course. We could apply that rename strategy recursively, or just warn and leave it up to the user -- or require them to be unmounted before proceeding?)
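
A rough sketch of the rename strategy described above, using only the Rust standard library. Only the `.previous` directory name comes from the comment; the function name, the return value, and the choice to fail when `.previous` already exists are assumptions made for illustration. It does not recurse into submounts.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Directory name from the comment above; everything else here is illustrative.
const PREVIOUS_DIR: &str = ".previous";

/// Move every top-level entry of `var` into `var/.previous` and return how many
/// entries were moved. Fails if `.previous` already exists, so an earlier
/// backup is never silently merged into.
pub fn stash_var_contents(var: &Path) -> io::Result<usize> {
    let previous = var.join(PREVIOUS_DIR);
    // List the directory first so the new directory is not part of the listing.
    let entries = fs::read_dir(var)?.collect::<io::Result<Vec<fs::DirEntry>>>()?;
    // create_dir returns an AlreadyExists error if `.previous` is present.
    fs::create_dir(&previous)?;
    let mut moved = 0;
    for entry in entries {
        // rename() stays within one filesystem, which is what makes this cheap.
        // Renaming an entry that is itself a mount point typically fails, so
        // submounts surface as errors instead of being moved.
        fs::rename(entry.path(), previous.join(entry.file_name()))?;
        moved += 1;
    }
    Ok(moved)
}
```

Note that an error partway through leaves some entries moved and others not, so a real implementation would need to decide how to report or roll back that state.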

@github-actions github-actions bot added area/install Issues related to `bootc install` area/ostree Issues related to ostree labels Oct 9, 2025
@ckyrouac ckyrouac force-pushed the factory-reset branch 4 times, most recently from 46e0668 to def648d Compare October 15, 2025 16:35
@cgwalters cgwalters mentioned this pull request Oct 17, 2025
Collaborator

@cgwalters cgwalters left a comment

Looks like there are some CI failures, but otherwise we could try to get this in?

It'd be a bit ugly but maybe we could detect the ostree version and avoid crashing if it's too old?

@ckyrouac
Collaborator Author

> It'd be a bit ugly but maybe we could detect the ostree version and avoid crashing if it's too old?

Just pushed up a commit to check the ostree version in the has_soft_reboot_capability function where the crash happens. I haven't had a chance to dig into the deeper root cause of why the ostree= karg is missing, but the version check should unblock this PR's CI. The result is that on systems with an outdated ostree, soft reboot will be disabled for factory reset deployments. If that works for you, I'll file an issue to track fixing the ostree= karg issue.

Collaborator

@cgwalters cgwalters left a comment

Looks overall sane to me!

One thing that would probably help is to add docs for this in our experimental section.

assert not equal $reset_status.status.otherDeployments.0.ostree.stateroot "default"

# we need tmt in the new stateroot for second_boot
print "Copying tmt into new stateroot"
Collaborator

I guess this one is really asking for enhancing reset to support copying data in a nicer way.

Collaborator Author

working on adding bootc mount

@ckyrouac
Collaborator Author

Ah, so the current CI failures are due to the outdated ostree in the tmt env which is causing the soft reboot tests to fail because softRebootCapable is defaulting to false. Not sure how we want to handle this. With the current code in this PR, soft reboot will be disabled for all systems with ostree < 2025.7. I can add back the code to the Dockerfile to install ostree from the continuous repo so the tests pass. Or, a more complicated workaround would be to add back the code to flag a deployment as a factory reset, then disable soft reboot only for ostree < 2025.7 and isFactoryReset. @cgwalters thoughts?

@cgwalters
Collaborator

cgwalters commented Oct 22, 2025

Oh hmmmm. Yes a bit of a pickle. Can you see if (perhaps with agent assist) we can detect if the staged deployment is missing the ostree= karg instead and our conditional logic skips if that's the case?

Or slightly more elaborated: if (ostree is new enough || !staged_missing_karg()) { we're good } else { skip } or so, right?
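
The condition above could be sketched roughly as follows. This is not the code that landed in the PR: all function names, the string-based inputs, and the handling of unparseable versions are assumptions. Only the `2025.7` threshold and the `ostree=` karg come from the discussion.

```rust
/// Minimum ostree release that has the fix, per the discussion above.
const FIXED_OSTREE: (u32, u32) = (2025, 7);

/// Parse a "YEAR.RELEASE" version string such as "2025.7".
/// Returns None for anything that does not match that shape.
pub fn parse_ostree_version(v: &str) -> Option<(u32, u32)> {
    let (year, release) = v.trim().split_once('.')?;
    Some((year.parse().ok()?, release.parse().ok()?))
}

/// True when no `ostree=` argument appears in the kernel argument string.
pub fn staged_missing_ostree_karg(kargs: &str) -> bool {
    !kargs.split_whitespace().any(|arg| arg.starts_with("ostree="))
}

/// The proposed condition: go ahead with the soft reboot check when ostree is
/// new enough OR the staged deployment still has its `ostree=` karg.
pub fn can_check_soft_reboot(ostree_version: &str, staged_kargs: &str) -> bool {
    let new_enough = parse_ostree_version(ostree_version)
        .map(|v| v >= FIXED_OSTREE)
        .unwrap_or(false); // unparseable version: treat as too old
    new_enough || !staged_missing_ostree_karg(staged_kargs)
}
```

Comparing the version as a pair of integers avoids the string-comparison trap where "2025.10" would sort before "2025.7".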

@ckyrouac ckyrouac force-pushed the factory-reset branch 6 times, most recently from 7567a20 to 68aec61 Compare October 23, 2025 19:17
cgwalters and others added 3 commits October 27, 2025 12:07
This is a nondestructive variant of `to-existing-root`.

Signed-off-by: Colin Walters <[email protected]>

Also cleans up some bugs from rebasing previous commits.

Signed-off-by: ckyrouac <[email protected]>
@ckyrouac ckyrouac force-pushed the factory-reset branch 5 times, most recently from 9f1436d to cc1b380 Compare October 28, 2025 17:50
Add ostree version check to has_soft_reboot_capability() to ensure
soft reboot is disabled when ostree < 2025.7 and ostree= karg is
missing.
This prevents attempting soft reboots on older ostree versions that
have a bug when validating kargs during a factory reset.

Signed-off-by: ckyrouac <[email protected]>

Add logic to copy the /boot mount specification from the existing
/etc/fstab to the newly created stateroot during factory reset. This
ensures that the /boot partition configuration is preserved across
the reset operation.

Includes helper function read_boot_fstab_entry() to parse /etc/fstab
and locate the /boot mount entry, along with comprehensive unit tests.

Assisted-by: Claude Code
Signed-off-by: ckyrouac <[email protected]>
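
For illustration, locating the /boot entry in fstab content might look like the sketch below. It is not the PR's `read_boot_fstab_entry()`, whose signature is not shown in this conversation; the function name and the choice to take the file contents as a string are assumptions.

```rust
/// Return the first fstab line whose mount point (second field) is `/boot`.
/// Takes the file contents as a string so it can be tested without touching
/// the real /etc/fstab.
pub fn find_boot_fstab_line(fstab: &str) -> Option<&str> {
    fstab.lines().map(str::trim).find(|line| {
        // Skip blank lines and comments.
        if line.is_empty() || line.starts_with('#') {
            return false;
        }
        // fstab fields: device, mount point, fs type, options, dump, pass.
        line.split_whitespace().nth(1) == Some("/boot")
    })
}
```

Matching the second field exactly keeps entries such as `/boot/efi` from being picked up by mistake.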
@ckyrouac ckyrouac requested a review from cgwalters October 29, 2025 16:51
@ckyrouac
Collaborator Author

@cgwalters added another commit to automatically preserve the /boot mount

@cgwalters cgwalters merged commit 0444e6a into bootc-dev:main Oct 29, 2025
36 checks passed
@cgwalters
Collaborator

Nice! Hopefully we get some good feedback from this one.
