Get cluster state #553

Merged
arbulu89 merged 3 commits into main from get-cluster-state
Feb 19, 2026

@arbulu89

Description

Get and send the cluster state value in the cluster discovery payload.
I have replaced the usage of cs_clusterstate because this utility does not ship with the pacemaker package. It requires the ClusterTools2 package, which is not always installed by default.
crmadmin comes natively with pacemaker.

cs_clusterstate simply runs crmadmin -D and crmadmin -S {node} in sequence, so we are basically replicating its behavior.

crmadmin -D (and cs_clusterstate) can hang forever if we query it when the DC is not yet selected or the cluster is still booting up. That's why I added a context timeout of 2 seconds, which is more than enough for a regular command execution.
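
For readers who want to see the timeout idea in isolation, here is a minimal sketch. It is not the code merged in this PR: `runWithTimeout`, `getDCOutput`, and `errTimeout` are hypothetical names, and only the 2-second limit and the `crmadmin -D` invocation come from the description above.

```go
package main

import (
	"context"
	"errors"
	"os/exec"
	"strings"
	"time"
)

// errTimeout is a hypothetical sentinel returned when the command
// does not finish before the deadline.
var errTimeout = errors.New("command timed out")

// runWithTimeout runs name with args and kills the process if it is
// still running after timeout. It returns the trimmed standard output.
func runWithTimeout(timeout time.Duration, name string, args ...string) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	out, err := exec.CommandContext(ctx, name, args...).Output()
	if err != nil {
		// Distinguish "killed because the deadline passed" from other failures.
		if errors.Is(ctx.Err(), context.DeadlineExceeded) {
			return "", errTimeout
		}
		return "", err
	}
	return strings.TrimSpace(string(out)), nil
}

// getDCOutput is a hypothetical helper that applies the 2-second limit
// from the description to the crmadmin -D query.
func getDCOutput() (string, error) {
	return runWithTimeout(2*time.Second, "crmadmin", "-D")
}
```

A caller could treat `errTimeout` as "state not available yet" instead of blocking the whole discovery loop.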

You can find all the possible cluster state values here:
https://github.com/ClusterLabs/pacemaker/blob/main/daemons/controld/controld_fsa.h

I'm simply removing the initial S_ (this is just an internal code detail) and lowercasing the rest.
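
As an illustration of that transformation, here is a small sketch. The function name `normalizeState` is hypothetical; the expression itself matches the line shown in the diff under review.

```go
package main

import "strings"

// normalizeState drops a leading "S_" and lowercases the rest,
// for example "S_IDLE" becomes "idle". Values without the prefix
// are only lowercased.
func normalizeState(state string) string {
	return strings.ToLower(strings.TrimPrefix(state, "S_"))
}
```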

How was this tested?

UT and manual testing in real infra

Did you update the documentation?

No documentation needed for this agent change.

@arbulu89 arbulu89 added the enhancement New feature or request label Feb 18, 2026
@arbulu89 arbulu89 marked this pull request as ready for review February 18, 2026 13:39

@antgamdia antgamdia left a comment

Thanks for the changes. Good to see we have extracted it into its own operation instead of running commands deep inside the code.

Maybe there is a reason I don't know about... but I wonder why we need to diverge from what pacemaker yields. Trento users are, for sure, used to seeing S_IDLE et al. Why transform them into a Trento-only format? Besides, for internal usage, I guess it is easier to compare against the "S_XXX" rather than "xxx".
Example: number of occurrences of "S_STARTING" vs "starting".

Later on, any presentation layer could just translate those constants into whatever is more beneficial for the user, whether it be a nice graphic or a verbose explanation for AI consumers.

DC: false,
Provider: "",
Online: true,
State: strings.ToLower(strings.TrimPrefix(state, "S_")),

suggestion: I don't quite like having this logic in the struct or, at least, not without a comment.

A future "me" would like to know where this state is coming from and why we are trimming something 😅 . Maybe something similar to what you already mentioned:

> https://github.com/ClusterLabs/pacemaker/blob/main/daemons/controld/controld_fsa.h
> I'm simply removing the initial S_ (this is just an internal code detail) and downcasing

@arbulu89

> Maybe there is a reason I don't know about... but I wonder why we need to diverge from what pacemaker yields. Trento users are, for sure, used to seeing S_IDLE et al. Why transform them into a Trento-only format? Besides, for internal usage, I guess it is easier to compare against the "S_XXX" rather than "xxx". Example: num of occurrences of "S_STARTING" vs "starting".

That's a fair question. I personally dislike the initial S_ prefix, as it is just an internal constant declaration convention. I think it looks pretty old-style and doesn't provide any real value for the user.

Besides, as I guess this is all about what we provide in the web and in the API, I didn't want to mix standard internal Trento constants like unknown/stopped with S_IDLE/S_TRANSITION_ENGINE, which look totally out of place (not all the values are coming from pacemaker). And putting UNKNOWN/STOPPED looks ugly XD
This last point was the biggest motivation to trim the S_ and downcase.

If we want to keep the values 1:1 from pacemaker and mix them with a couple of other formats, well, I can do that (even though I dislike it hehe).

@abravosuse @jagabomb @antgamdia This is kind of UX/UI decision to take. Opinions?

@abravosuse

I plan to update the Operations section in the User Documentation explaining where the cluster state information is coming from. With that in mind, yes, it might be easier to document it if we leave the values AS-IS, with the "S_" prefix. Yes, it's less "human" friendly but more straightforward. So I am OK with it.

@gagandeepb gagandeepb left a comment

Thanks for this. I have a slight preference for not removing the S_, so that the values received by web can be cross-referenced with the official docs and the state values therein.
One thing that would be nice for users of this code, IMO, would be to document the possible state values somehow, either via a comment or, better yet (if possible), using an enum/const type.
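
A rough sketch of what such a const type could look like follows. All Go identifiers here are hypothetical. The list covers only the values named in this thread (S_IDLE, S_STARTING, S_TRANSITION_ENGINE, plus Trento's own unknown/stopped); the complete set lives in controld_fsa.h.

```go
package main

import "strings"

// ClusterState is a hypothetical named type for the state value
// sent in the cluster discovery payload.
type ClusterState string

const (
	// Values reported by pacemaker (partial list, see controld_fsa.h).
	StateIdle             ClusterState = "S_IDLE"
	StateStarting         ClusterState = "S_STARTING"
	StateTransitionEngine ClusterState = "S_TRANSITION_ENGINE"

	// Trento-internal values that do not come from pacemaker.
	StateUnknown ClusterState = "unknown"
	StateStopped ClusterState = "stopped"
)

// knownStates lists the values this sketch recognizes.
var knownStates = map[ClusterState]struct{}{
	StateIdle:             {},
	StateStarting:         {},
	StateTransitionEngine: {},
	StateUnknown:          {},
	StateStopped:          {},
}

// parseClusterState trims surrounding whitespace and reports whether
// the value is one of the states listed above. The raw value is
// returned unchanged so that unlisted pacemaker states are not lost.
func parseClusterState(raw string) (ClusterState, bool) {
	state := ClusterState(strings.TrimSpace(raw))
	_, ok := knownStates[state]
	return state, ok
}
```

Keeping the raw string as the underlying value means the payload still matches what pacemaker prints, while callers get named constants to compare against.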

@arbulu89

@antgamdia @gagandeepb Alright!
I have removed the trimming of the S_ prefix.
I hope the AI appreciates our effort 🙈

@skrech skrech left a comment

The change looks good to me.

I just wanted to make a general comment about mocks -- I think we might consider wrapping the usage of CommandExecutor into functions/methods of the CmdClient interface. That way, we can use the mock for the CmdClient itself to set expectations on.

Mocks set on command-line strings seem very fragile to me and actually don't test much. For example, if you modify the command crmadmin -qD by adding the -v flag for verbosity, you have to go change all the test fixtures. If you wrapped that command into a getDcNode func, nothing would need to be changed in the tests.

Anyway, this is just a general comment, no need to modify anything right now.
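
To make the suggestion concrete, here is a minimal sketch under stated assumptions. The interface shapes and the names `crmClient`, `executorFunc`, `fakeClient`, and `isDC` are hypothetical and do not reflect the agent's actual types. It also assumes that `crmadmin -qD` prints only the node name on standard output.

```go
package main

import (
	"errors"
	"strings"
)

// CommandExecutor is a hypothetical low-level abstraction over running a command.
type CommandExecutor interface {
	Output(name string, args ...string) ([]byte, error)
}

// executorFunc adapts a plain function to CommandExecutor.
type executorFunc func(name string, args ...string) ([]byte, error)

func (f executorFunc) Output(name string, args ...string) ([]byte, error) {
	return f(name, args...)
}

// CmdClient is a hypothetical higher-level interface. Tests mock this,
// not the command-line strings behind it.
type CmdClient interface {
	GetDCNode() (string, error)
}

// crmClient is the only place that knows the exact crmadmin flags.
type crmClient struct {
	executor CommandExecutor
}

func (c crmClient) GetDCNode() (string, error) {
	out, err := c.executor.Output("crmadmin", "-qD")
	if err != nil {
		return "", err
	}
	node := strings.TrimSpace(string(out))
	if node == "" {
		return "", errors.New("no designated controller reported")
	}
	return node, nil
}

// fakeClient is a hand-written test double for CmdClient.
type fakeClient struct {
	node string
	err  error
}

func (f fakeClient) GetDCNode() (string, error) { return f.node, f.err }

// isDC is example calling code that depends only on CmdClient.
func isDC(client CmdClient, hostname string) (bool, error) {
	node, err := client.GetDCNode()
	if err != nil {
		return false, err
	}
	return node == hostname, nil
}
```

With this split, adding a flag to the crmadmin call touches `crmClient` and its own test only; tests of callers such as `isDC` keep using `fakeClient` unchanged.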

@gagandeepb gagandeepb left a comment

Thanks!

@arbulu89

> The change looks good to me.
>
> I just wanted to make a general comment about mocks -- I think we might consider wrapping the usage of CommandExecutor into functions/methods of the CmdClient interface. That way, we can use the mock for the CmdClient itself to set expectations on.
>
> Mocks set on command-line strings seem very fragile to me and actually don't test much. For example, if you modify the command crmadmin -qD by adding the -v flag for verbosity, you have to go change all the test fixtures. If you wrapped that command into a getDcNode func, nothing would need to be changed in the tests.
>
> Anyway, this is just a general comment, no need to modify anything right now.

Yes, I guess we are trying to move in that direction. I guess there are still some older layers that use the command line outside of the cmd client.

@arbulu89 arbulu89 merged commit 11fc986 into main Feb 19, 2026
7 checks passed
@arbulu89 arbulu89 deleted the get-cluster-state branch February 19, 2026 12:20