[ws-daemon] start backup even pod still report the container is running after 5 minutes #20382
/hold
@iQQBot How do I test this change? Could you add a description? What do I have to do to keep the workspace "stuck"?
url:
  type: string
required:
- podRecreated
@iQQBot Should we really make this part of this PR?
I'd love to drop it, and ship it at a later point. Just trying to avoid this complicating operations until we are sure this is rolled out to all installations.
(e.g. gitpod.io)
It's generated code. If you want this to be optional, you need to add annotations,
not "just delete this line"
I'm pretty sure we can just delete the line for now to achieve the desired effect. E.g. it worked the last couple of weeks. 😉
Would be great to add that annotation, though, if you know how to do that!
We haven't made any updates to the workspace cluster on gitpod.io in the past few weeks either.
Yes, but we could have some manual operations going on, considering that we need to renew soon.
And if somebody forgets to update any of the three of 1. CRD, 2. ws-daemon, 3. ws-manager, they will accidentally block workspace starts.
if ws.Status.Phase == workspacev1.WorkspacePhaseStopping && old.Phase != workspacev1.WorkspacePhaseStopping {
	t := metav1.Now()
	ws.Status.PodStoppingTime = &t
I would expect this line to be placed in status.go updateWorkspaceStatus, as this method's responsibility is something else.
Could we move it to the end of function updateWorkspaceStatus maybe...? 🤔
I tested the following:
@iQQBot Is there more to test here? 🤔 Also, the approach itself looks safe to me:
geropl
left a comment
Code LGTM, tested and works! ✔️
Thank you! 🙏
…ng after 5 minutes
/unhold
Description
We encountered some workspaces where the workspace container was still reported as running for several hours after the timeout, until the node was deleted.
This led to data loss. This PR makes the backup start five minutes after the pod is marked for deletion, regardless of whether the container is still reported as running.
[ws-daemon] start backup even pod still report the container is running after 5 minutes
Related Issue(s)
Fixes #
How to test
Documentation
Preview status
Gitpod was successfully deployed to your preview environment.
/hold