Skip to content

Conversation

feloy
Copy link
Contributor

@feloy feloy commented Nov 22, 2024

What does this PR do?

check connectivity with health check, in the context of experimental kubernetes states,
and saves the state in a ContextsConnectivityRegistry.

The next step is to dispatch the state to the frontend.

(The first commit of the PR is a copy of the one in #10074)

Screenshot / video of UI

backend only

What issues does this PR fix or reference?

Part of #9938

How to test this PR?

To test the new version, set this preference in ~/.local/share/containers/podman-desktop/configuration/settings.json:

  "kubernetes.statesExperimental": true

You can have some cluster behind a VPN to provoke some timeout and check by editing the kubeconfig file several times quickly (by changing the current context for example), that previous checks are aborted and only a response from the last (un-aborted) check is received.

Some logs can be added to the ContextsConnectivityRegistry to display the state each time it is changed

  • Tests are covering the bug fix or the new feature

@feloy feloy requested review from benoitf and a team as code owners November 22, 2024 15:39
@feloy feloy requested review from dgolovin and SoniaSandler and removed request for a team November 22, 2024 15:40
@feloy feloy force-pushed the feat-9938/check-connectivity-with-health-check branch from 0b546d1 to 0eabf41 Compare November 25, 2024 07:03
Signed-off-by: Philippe Martin <[email protected]>
Signed-off-by: Philippe Martin <[email protected]>
Signed-off-by: Philippe Martin <[email protected]>
Signed-off-by: Philippe Martin <[email protected]>
Signed-off-by: Philippe Martin <[email protected]>
@feloy feloy requested a review from axel7083 November 25, 2024 10:32
Copy link
Collaborator

@benoitf benoitf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO I don't see the need of the registry, and the event in the health checker as it can just keep the value

ExperimentalChecker:

  • map of healthzChecker
    for each checker there is inside the value if it's reachable or not

many disposables can be removed, and half of the code

@feloy
Copy link
Contributor Author

feloy commented Nov 25, 2024

IMHO I don't see the need of the registry, and the event in the health checker as it can just keep the value

ExperimentalChecker:

  • map of healthzChecker
    for each checker there is inside the value if it's reachable or not

many disposables can be removed, and half of the code

I would like to validate that this pattern is valid with the idea that we want to send to the frontend all these states Map<contextName: string, state: { checking, reachable }> every time one of the state changes. In the current pattern, the registry is used to save this data ready to be sent, and as the object sending the data every time a set is called.

With this new pattern, we would also need some signal from the checkers to know when the state of a checker changes and to collect all the data from the checkers to send it.

@feloy
Copy link
Contributor Author

feloy commented Nov 25, 2024

I agree that the onReadiness(); checkReadiness(); is confusing, and that setting checking and reachable not atomically is confusing.

IMHO, as the checkers have a short life (they are killed as soon as the kubeconfig is modified again), it seems not reasonable to keep the state of the check inside them, as they will be deleted / recreated every time the update is called.

Here is a new proposal by simplfying the onReadiness / checkReadiness part and having an atomic setState (prototype, tests need to be finalized)

The registry is very simple, but I expect to add code to send the data to the frontend in this registry.

@benoitf WDYT?

@benoitf
Copy link
Collaborator

benoitf commented Nov 25, 2024

I think my issue is why having a separate registry. State should be closer to the object being checked.

IMHO, as the checkers have a short life (they are killed as soon as the kubeconfig is modified again), it seems not reasonable to keep the state of the check inside them, as they will be deleted / recreated every time the update is called.

On my side I see no issue having the state part of the object responsible to check the state of a context.
Also why they would be recreated/deleted if their data is the same as before ?
I can update the configuration of another context
They're the object closest to the state of the cluster. (the 'isolated kubeconfig part') If this part is not changing no need to delete/recreate them.

If you don't want to keep it as part of the HealthCheck object, make another layer between the two, but not a side instance
ContextChecker instance would use a HealthCheck instance.

Also the HealthCheck class is dedicated to a specific context as it's using a kubeconfig object which is a context. So I think it's missing in the name that it's for a specific Context. HealthCheckContext.

The registry is very simple, but I expect to add code to send the data to the frontend in this registry.

Yes it was my worries when I read the code. To me we should separate the inner state of the data vs what we want to expose publicly having the state the closest possible of the object maintaining it.

}

// start checking the readiness
public async start(opts?: CheckOptions): Promise<boolean> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wondering if options are not part of the constructor

@feloy
Copy link
Contributor Author

feloy commented Nov 25, 2024

I think my issue is why having a separate registry. State should be closer to the object being checked.

IMHO, as the checkers have a short life (they are killed as soon as the kubeconfig is modified again), it seems not reasonable to keep the state of the check inside them, as they will be deleted / recreated every time the update is called.

On my side I see no issue having the state part of the object responsible to check the state of a context. Also why they would be recreated/deleted if their data is the same as before ? I can update the configuration of another context They're the object closest to the state of the cluster. (the 'isolated kubeconfig part') If this part is not changing no need to delete/recreate them.

OK, I was checking if the whole kubeconfig is changed or not, but we can do the check at the isolated kube config level, and creating a health checker only if its specific context changes. This will make the checkers long-running, and make your pattern possible

I'll make the changes.

Thanks for your feedback

@feloy feloy marked this pull request as draft November 25, 2024 15:25
Signed-off-by: Philippe Martin <[email protected]>
@feloy feloy added the area/ci label Nov 26, 2024
Signed-off-by: Philippe Martin <[email protected]>
Signed-off-by: Philippe Martin <[email protected]>
}

update(): void {
console.log('current health check states', this.manager.getHealthCheckersStates());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we want to merge with this, but it can be useful for testing/reviewing, and I can remove it when LGTM'd

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok for me to keep this trace

@feloy feloy requested review from axel7083 and benoitf November 26, 2024 13:28
@feloy feloy marked this pull request as ready for review November 26, 2024 13:28
@feloy feloy removed the area/ci label Nov 26, 2024
Copy link
Collaborator

@benoitf benoitf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for all the improvements

it looks like easier to understand, and each part is doing and managing its own data, so in case of failure, easy to know which part is failing and it we want to enhance parts, contract being clear, we can easily swap one component by another

Copy link
Contributor

@axel7083 axel7083 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@feloy feloy merged commit fee8ac7 into podman-desktop:main Nov 26, 2024
27 of 30 checks passed
@podman-desktop-bot podman-desktop-bot added this to the 1.15.0 milestone Nov 26, 2024
danivilla9 pushed a commit to danivilla9/podman-desktop that referenced this pull request Nov 27, 2024
* feat: check cluster connectivity with health check
Signed-off-by: Philippe Martin <[email protected]>

* fix: abort health check on dispose
Signed-off-by: Philippe Martin <[email protected]>

* fix: use dispose and EventEmitter
Signed-off-by: Philippe Martin <[email protected]>

* fix: create Health in constructor
Signed-off-by: Philippe Martin <[email protected]>

* fix: remove comment on if
Signed-off-by: Philippe Martin <[email protected]>

* fix: call JSON.stringify only once
Signed-off-by: Philippe Martin <[email protected]>

* fix: do not use onReadiness event + make set check/reachable atomic
Signed-off-by: Philippe Martin <[email protected]>

* feat: rewrite by using events
Signed-off-by: Philippe Martin <[email protected]>

* fix: review
Signed-off-by: Philippe Martin <[email protected]>

* fix: more unit tests
Signed-off-by: Philippe Martin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants