@liulinC
Collaborator

@liulinC liulinC commented Jan 14, 2026

The target pool has left AD, and the joining host has left AD as well. However, the AD status is somehow corrupt:

  • external_auth_type is empty, which is expected
  • external_auth_service_name is still a valid domain

This confuses pool.join: it thinks AD is not enabled, yet the host somehow appears joined to a domain.

  • A normal domain leave does not resolve the issue, and it does not join the domain
  • Joining the domain again (which failed) does not resolve it either, because on failure xapi restores the values that were in place before the join.

This commit introduces a force option to the host.disable_external_auth API
to force a cleanup and recover the host.
BTW, the current code already tries to keep the two fields consistent, but the update is not atomic.
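
To make the corrupt state concrete, here is a minimal sketch of the two fields and of what a forced cleanup is meant to achieve. The record type and the functions `is_consistent`, `disable`, and `force_disable` are hypothetical illustrations, not xapi's real types or API, and the "refuse when already disabled" behaviour is an assumption based on the description above.

```ocaml
(* Hypothetical model of the two host fields discussed above;
   these are not xapi's real types. *)
type host_auth = {
  external_auth_type : string;         (* "" means external auth is disabled *)
  external_auth_service_name : string; (* "" means no domain is recorded *)
}

(* The two fields agree when both are empty or both are set. *)
let is_consistent h =
  (h.external_auth_type = "") = (h.external_auth_service_name = "")

(* Assumed behaviour of a normal disable: refuse when auth is already off,
   which leaves a stale service name in place. *)
let disable h =
  if h.external_auth_type = "" then Error "auth already disabled"
  else Ok { external_auth_type = ""; external_auth_service_name = "" }

(* Assumed behaviour of a forced disable: clear both fields unconditionally. *)
let force_disable _h =
  { external_auth_type = ""; external_auth_service_name = "" }

(* The corrupt state from the report: type is empty, service name is not. *)
let corrupt =
  { external_auth_type = ""; external_auth_service_name = "example.com" }
```

In this model, `disable corrupt` is rejected and the record stays inconsistent, while `force_disable corrupt` returns a consistent, fully cleared record.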

@liulinC liulinC force-pushed the private/linl/XSI-2105 branch 3 times, most recently from 91c032c to 81c0764 Compare January 15, 2026 04:47
@liulinC liulinC changed the title CA-422713: XSI-2105: Pool.join failed due to AD status corrupt (WIP) CA-422713: XSI-2105: Pool.join failed due to AD status corrupt Jan 15, 2026
Contributor

@lindig lindig left a comment

This looks clean to me. We could use a default for the force parameter and then even fewer changes would be required.
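
The suggestion about a default value can be sketched with an OCaml optional labelled argument. The function below is a hypothetical stand-in, not the real `disable_external_auth_common` signature; it only shows why existing call sites would not need to change.

```ocaml
(* Hypothetical stand-in for the real function: [force] is an optional
   labelled argument that defaults to [false]. The trailing unit argument
   lets callers omit the optional one. *)
let disable_external_auth ?(force = false) ~host () =
  if force then Printf.sprintf "force-disable %s" host
  else Printf.sprintf "disable %s" host

let () =
  (* An existing call site compiles unchanged and gets force = false. *)
  print_endline (disable_external_auth ~host:"host-a" ());
  (* Only the recovery path needs to mention the flag. *)
  print_endline (disable_external_auth ~force:true ~host:"host-a" ())
```

The trade-off is that a defaulted parameter is easier to overlook at call sites, whereas an explicit `~force:false` documents the choice everywhere.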

(* 3. CP-703: we always revalidate all sessions after the external authentication has been disabled *)
(* so that all sessions that were externally authenticated will be destroyed *)
debug
"calling revalidate_all_sessions after disabling external auth \
Contributor

I think it would be better to leave logging to revalidate_all_sessions. It is indeed an unusual action that should always be logged.

Collaborator Author

Sorry, I did not understand the point here. I did not change any of this code; I only put it under a condition (master here).
Given that session management is a pool-level concern, my latest update moves it to xapi_pool.

match plugin_disable_failure with
| None ->
(* we do not want to stop pool_eject and permit Extauth_is_disabled during force *)
| Some e when during_pool_eject || (e = Extauth_is_disabled && force) ->
Contributor

I like the use of when
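
For readers unfamiliar with the guard, here is a self-contained sketch of the same `when` pattern. The `failure` type and the `handle_failure` function are hypothetical; the real code matches on the plugin's failure value and does more work in each branch.

```ocaml
(* Hypothetical failure type standing in for the errors the auth plugin
   can report; only [Extauth_is_disabled] is named in the diff. *)
type failure = Extauth_is_disabled | Other of string

(* Decide what to do with an optional failure from the auth plugin.
   Returns [Ok ()] when the failure may be ignored, [Error e] otherwise. *)
let handle_failure ~during_pool_eject ~force plugin_disable_failure =
  match plugin_disable_failure with
  | None ->
      Ok ()
  | Some e when during_pool_eject || (e = Extauth_is_disabled && force) ->
      (* Tolerated: never block a pool eject, and permit
         Extauth_is_disabled during a forced cleanup. *)
      Ok ()
  | Some e ->
      Error e
```

The guard keeps both tolerated cases in one arm, so every other failure falls through to the final arm without nested conditionals.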

@last-genius
Contributor

If this is a WIP, maybe worth converting to a draft?

@liulinC liulinC force-pushed the private/linl/XSI-2105 branch from 81c0764 to ad9fb6d Compare January 20, 2026 07:38
@liulinC
Collaborator Author

liulinC commented Jan 20, 2026

xenrt test looks good: 232815 (Dev Run)

@liulinC liulinC changed the title (WIP) CA-422713: XSI-2105: Pool.join failed due to AD status corrupt CA-422713: XSI-2105: Pool.join failed due to AD status corrupt Jan 20, 2026
@liulinC liulinC force-pushed the private/linl/XSI-2105 branch from ad9fb6d to a07cb45 Compare January 20, 2026 08:00
@liulinC
Collaborator Author

liulinC commented Jan 21, 2026

If this is a WIP, maybe worth converting to a draft?

It is ready for review now.

let host = Client.Host.get_by_uuid ~rpc ~session_id ~uuid:host_uuid in
let config = read_map_params "config" params in
- Client.Host.disable_external_auth ~rpc ~session_id ~host ~config
+ Client.Host.disable_external_auth ~rpc ~session_id ~host ~config ~force:true
Contributor

Why not let the user pass the force value, like xe host-disable-external-auth --force=true?

Collaborator Author

This does let the user pass --force=true, per my test.
Note: force=true (or --force) is required to reach this point.

Contributor

I see the command requires --force to make the user aware that it is a recovery command only, so the user can't select force:false. That is intended, right?

Collaborator Author

Yes. This API is hidden and only used for a specific purpose, with --force required for awareness.

debug
"The external authentication of all hosts in the pool was disabled \
successfully"
)
Contributor

Does it change the behavior? If the pool disable partially fails, some hosts have been disabled. Does it need to clear the session?

Collaborator Author

No, we should not clear the sessions.
That was also a problem in the old code.
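
One way to read this exchange is that session revalidation should run only when every host was disabled successfully. The sketch below illustrates that reading; `disable_pool`, `disable_host`, and `revalidate` are hypothetical names, and the merged xapi_pool code may be structured differently.

```ocaml
(* Hypothetical helper: try to disable external auth on every host, then
   run [revalidate] only if no host reported a failure. Returns the list
   of (host, error message) pairs that failed. *)
let disable_pool ~disable_host ~revalidate hosts =
  let failures =
    List.filter_map
      (fun h ->
        match disable_host h with
        | Ok () -> None
        | Error msg -> Some (h, msg))
      hosts
  in
  (* On a partial failure the sessions are left alone. *)
  if failures = [] then revalidate ();
  failures
```

A caller would pass the per-host disable operation as `disable_host` and the pool-level session cleanup as `revalidate`, then report any returned failures.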

(* this call will return an exception if something goes wrong *)
Xapi_host.disable_external_auth_common ~during_pool_eject:true ~__context
- ~host ~config:[] () ;
+ ~host ~config:[] ~force:false () ;
Contributor

I am not clear in which cases we need force:false. What is the benefit of it?

Collaborator Author

We do not use force by default; it is only used when the user explicitly asks for it.

@liulinC liulinC force-pushed the private/linl/XSI-2105 branch from a07cb45 to 5b72cbb Compare January 21, 2026 07:03
param_type= Bool
; param_name= "force"
; param_doc= "Disable external auth even when not enabled"
; param_release= numbered_release "26.1.0-next"
Contributor

The release number needs to be updated before merge; 26.2.0 is tagged.

Collaborator Author

This PR has been pending for a while 🪐

The target pool has left AD, and the joining host has left AD as well.
However, the AD status is somehow corrupt:
- external_auth_type is empty, which is expected
- external_auth_service_name is still a valid domain

This confuses pool.join: it thinks AD is not enabled, yet the host
somehow appears joined to a domain.

- A normal domain leave does not resolve the issue, and it does not
join the domain
- Joining the domain again (which failed) does not resolve it either,
because on failure xapi restores the values that were in place before
the join.

This commit introduces a force option to the host.disable_external_auth API
to force a cleanup and recover the host.

BTW, the current code already tries to keep the two fields consistent,
but the update is not atomic.

Signed-off-by: Lin Liu <[email protected]>
@liulinC liulinC force-pushed the private/linl/XSI-2105 branch from 5b72cbb to d73437c Compare January 21, 2026 09:36
@liulinC liulinC added this pull request to the merge queue Jan 21, 2026
@liulinC liulinC removed this pull request from the merge queue due to a manual request Jan 21, 2026
@liulinC liulinC added this pull request to the merge queue Jan 21, 2026
Merged via the queue into xapi-project:master with commit 8116cf5 Jan 21, 2026
16 checks passed

4 participants