Description
I am getting this from my installation:
[hkj@kagumine Linux]$ sudo bootc status
error: Status: Getting composefs deployment status: Getting composefs deployment status: Checking soft reboot capability: Setting soft reboot capability for Type1 entries: Entry not found
To reproduce:
- `upgrade` or `switch` to a new image
- after reboot, enter the rollback boot option
- in the rollback boot, `upgrade` or `switch` to a third image
- reboot and the system will be in this state
Context:
On this installation I run a customized image from my local self-hosted registry, with configuration baked in. I misconfigured one revision and completely broke its networking. I built a revision that fixes the error, but without a network connection the host could not pull the upgrade. So I rebooted into the rollback deployment and applied the upgrade from there. However, after rebooting into the new, working revision, bootc is completely broken and prints the above error message for virtually every subcommand I try to run.
Some analysis:
I think the problem is here: when status tries to determine which deployment is the rollback, it goes through all deployments and matches each one against booted and staged; whatever is left over is treated as the rollback. But if multiple stale old deployments are present, this simply yields the last enumerated deployment that is neither booted nor staged. On my system this apparently resolves to the deployment of the broken image, whose BLS entry was already removed by the upgrade run from the rollback boot.
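To illustrate the behavior I am describing, here is a minimal sketch. It is not the actual bootc code: the function name `naive_rollback` and the deployment names are made up, and deployments are reduced to plain strings. It only shows how "last leftover wins" depends on enumeration order.

```rust
/// Hypothetical, simplified model of the logic described above (not bootc's code).
/// Every deployment that is neither booted nor staged is treated as the
/// rollback, so the last such entry in enumeration order wins.
fn naive_rollback<'a>(
    deployments: &[&'a str],
    booted: &str,
    staged: Option<&str>,
) -> Option<&'a str> {
    let mut rollback = None;
    for &d in deployments {
        let is_staged = staged.map_or(false, |s| s == d);
        if d == booted || is_staged {
            continue;
        }
        // Overwritten on every leftover deployment: only the last one survives.
        rollback = Some(d);
    }
    rollback
}
```

With the enumeration order `["a-original", "b-broken", "c-fixed"]` and `c-fixed` booted, this picks `b-broken`, even though in my scenario that is the deployment whose boot entry is already gone.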
Funnily enough, this still works on most installations if you only upgrade normally without rollbacks, since opendir (the call underlying `Dir::read_dir`) just reads the directory's on-disk index and returns entries in the order they are stored there. Most filesystems happen to implement that index as a simple linear data structure that puts new entries at the end of the enumeration. So everything works as long as that order holds, until a rollback reverses the order of the most recent entries and everything goes haywire.
We probably need to fix the rollback determination code so that only a deployment that has a corresponding boot entry is recognized as the rollback.
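A rough sketch of what I mean, under the same simplifying assumptions: deployments and boot entries are plain strings, and `rollback_with_boot_entry` is a hypothetical name, not an existing bootc function. The idea is just to drop any candidate that has no boot entry before choosing.

```rust
use std::collections::HashSet;

/// Hypothetical sketch of the proposed fix (not bootc's code): a deployment
/// can only be the rollback if a boot entry still exists for it.
fn rollback_with_boot_entry<'a>(
    deployments: &[&'a str],
    boot_entries: &HashSet<&str>,
    booted: &str,
    staged: Option<&str>,
) -> Option<&'a str> {
    deployments
        .iter()
        .copied()
        // Skip the booted and staged deployments.
        .filter(|d| *d != booted && staged.map_or(true, |s| s != *d))
        // Stale deployments without a boot entry are never candidates.
        .find(|d| boot_entries.contains(*d))
}
```

As long as only one non-booted, non-staged deployment still has a boot entry, the result no longer depends on the order in which the deployments are enumerated.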