Bug 1744245: recreate of sub should not fail install
Deleting a Subscription object and immediately recreating it causes
the operator install to fail.
How to reproduce:
- Create a CatalogSource object
- Create a Subscription that refers to the CatalogSource above.
- Wait for the operator to install successfully.
- Update the CatalogSource
- Wait for the CatalogSource to become healthy
- Delete the Subscription object (from above).
- Recreate the Subscription object with no delay between delete
and create. The delete and create can be issued one after the other;
they do not need to be concurrent.
The operator install will fail and the Subscription status will have an
error condition `ReferencedInstallPlanNotFound`. The new InstallPlan object
created by OLM gets deleted by GC.
Root cause:
- OLM uses a lister to get the list of Subscription(s) in a given
namespace and sets the relevant Subscription(s) found in the list as
owner of the InstallPlan object(s).
- Because the lister reads from a cache, it can return a deleted
Subscription until the cache is synced.
- The new InstallPlan object may get an owner ref that points to the
deleted Subscription.
- GC finds that the owner (the deleted Subscription) no longer exists
and consequently deletes the new InstallPlan.
- Subscription reconciler reports that the new InstallPlan object is
missing and moves the Subscription to a Failed state.
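
The sequence above can be illustrated with a minimal, self-contained
sketch. It does not use OLM or client-go code: the `Subscription` type and
the `staleOwnerUIDs` helper are hypothetical stand-ins, and the UIDs are
made up. It relies only on the fact that a recreated Kubernetes object gets
a new UID and that owner references are matched by UID.

```go
package main

import "fmt"

// Subscription is a stripped-down, hypothetical stand-in for the real API object.
type Subscription struct {
	Name string
	UID  string
}

// staleOwnerUIDs returns the UIDs that a cached list would hand out as
// owners even though no live object with that UID exists any more.
func staleOwnerUIDs(cached, live []Subscription) []string {
	liveUIDs := make(map[string]bool, len(live))
	for _, s := range live {
		liveUIDs[s.UID] = true
	}
	var stale []string
	for _, s := range cached {
		if !liveUIDs[s.UID] {
			stale = append(stale, s.UID)
		}
	}
	return stale
}

func main() {
	// The cache has not synced yet: it still holds the deleted object.
	cached := []Subscription{{Name: "my-sub", UID: "uid-old"}}
	// The API server already holds the recreated object, with a new UID.
	live := []Subscription{{Name: "my-sub", UID: "uid-new"}}

	// An InstallPlan owned by any of these UIDs has no living owner,
	// so it is eligible for garbage collection.
	fmt.Println(staleOwnerUIDs(cached, live))
}
```

Same name, different UID: the owner ref taken from the cache points at an
object that is already gone.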
The API audit log has entries that confirm that GC is rightfully
"deleting" the new InstallPlan object.
Fix:
- For now, use a direct non-cached client to retrieve the list of
Subscriptions.
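
A rough sketch of the idea behind the fix, assuming the owner lookup can be
pointed at either source. All names here (`SubscriptionSource`,
`cachedSource`, `directSource`, `ownersForInstallPlan`) are hypothetical and
are not taken from the OLM code base; plain maps stand in for the API server
and the lister cache.

```go
package main

import "fmt"

// SubscriptionSource is a hypothetical abstraction over where the
// list of Subscription UIDs in a namespace comes from.
type SubscriptionSource interface {
	List(namespace string) []string
}

// cachedSource mimics a cache-backed lister: it serves a snapshot
// that only changes when the cache is re-synced.
type cachedSource struct{ snapshot map[string][]string }

func (c cachedSource) List(ns string) []string { return c.snapshot[ns] }

// directSource mimics a non-cached client: every call reads the
// current server state.
type directSource struct{ server map[string][]string }

func (d directSource) List(ns string) []string { return d.server[ns] }

// ownersForInstallPlan picks owners from whichever source it is given.
func ownersForInstallPlan(src SubscriptionSource, ns string) []string {
	return src.List(ns)
}

func main() {
	server := map[string][]string{"ns": {"uid-old"}}
	cache := cachedSource{snapshot: map[string][]string{"ns": {"uid-old"}}}
	direct := directSource{server: server}

	// Delete and recreate: the server state changes, the snapshot does not.
	server["ns"] = []string{"uid-new"}

	fmt.Println(ownersForInstallPlan(cache, "ns"))  // still the deleted object
	fmt.Println(ownersForInstallPlan(direct, "ns")) // the recreated object
}
```

The trade-off is an extra API call per lookup in exchange for never
handing out an owner that has already been deleted.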
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1744245
Jira: https://jira.coreos.com/browse/OLM-1245