-
Notifications
You must be signed in to change notification settings - Fork 296
CA-422713: XSI-2105: Pool.join failed due to AD status corrupt #6832
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
91c032c to
81c0764
Compare
lindig
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks clean to me. We could use a default for the force parameter and then even fewer changes would be required.
ocaml/xapi/xapi_host.ml
Outdated
| (* 3. CP-703: we always revalidate all sessions after the external authentication has been disabled *) | ||
| (* so that all sessions that were externally authenticated will be destroyed *) | ||
| debug | ||
| "calling revalidate_all_sessions after disabling external auth \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to leave logging to revalidate_all_sessions. It is indeed an unusual action that should be always logged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I did not understand the point here, I did not update any code, just put it under a condition. (master here).
Given the session management is a pool-level stuff, My latest update move it to xapi_pool
| match plugin_disable_failure with | ||
| | None -> | ||
| (* we do not want to stop pool_eject and permit Extauth_is_disabled during force *) | ||
| | Some e when during_pool_eject || (e = Extauth_is_disabled && force) -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the use of when
|
If this is a WIP, maybe worth converting to a draft? |
81c0764 to
ad9fb6d
Compare
|
xenrt test looks good: 232815 (Dev Run) |
ad9fb6d to
a07cb45
Compare
It is ready for review now. |
| let host = Client.Host.get_by_uuid ~rpc ~session_id ~uuid:host_uuid in | ||
| let config = read_map_params "config" params in | ||
| Client.Host.disable_external_auth ~rpc ~session_id ~host ~config | ||
| Client.Host.disable_external_auth ~rpc ~session_id ~host ~config ~force:true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not let the user pass the force value like xe host-disable-external-auth --force=true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is let use use --force=true per my test.
note: force=true|--force is forced to reach here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the command requires --force to make the user aware of it is a recover command only. So the user can't select force:false. It is intended, Right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, This API is hidden, only used for specific purpose with --force for awareness.
| debug | ||
| "The external authentication of all hosts in the pool was disabled \ | ||
| successfully" | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it change the behavior? If the pool disable partially fails, some hosts have been disabled. Does it need to clear the session?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, we should not clear the session.
That is also the problem of the old code.
| (* this call will return an exception if something goes wrong *) | ||
| Xapi_host.disable_external_auth_common ~during_pool_eject:true ~__context | ||
| ~host ~config:[] () ; | ||
| ~host ~config:[] ~force:false () ; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not clear for what cases we need force:false? What's the benefit of it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we do not use force by default, until user want to use it manually.
a07cb45 to
5b72cbb
Compare
ocaml/idl/datamodel_host.ml
Outdated
| param_type= Bool | ||
| ; param_name= "force" | ||
| ; param_doc= "Disable external auth even when not enabled" | ||
| ; param_release= numbered_release "26.1.0-next" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
release number need to be updated before merge. 26.2.0 is tagged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR has been pending for a while 🪐
The target pool has leaved AD, the joining host leave AD as well. However, the AD status is somehow corrupt - external_auth_type is empty, this is expected - external_auth_service_name is a valid domain This confused pool.join as it thinks AD is not enabled, but somehow joined to a domain. - Normal domain leave does not resolve the issue, and it does not join domain - Join domain again(failed) does not resolve it neither, as xapi will restore to the current value before join on failed. This commit introduce force option to host.disable_external_auth API to force clean up to recover host BTW, current code try to keep them consistent already, but not atomic. Signed-off-by: Lin Liu <[email protected]>
5b72cbb to
d73437c
Compare
The target pool has leaved AD, the joining host leave AD as well. However, the AD status is somehow corrupt
This confused pool.join as it thinks AD is not enabled, but somehow joined to a domain.
This commit introduce force option to host.disable_external_auth API
to force clean up to recover host
BTW, current code try to keep them consistent already, but not atomic.