Skip to content

Conversation

nmesot
Copy link

@nmesot nmesot commented Sep 12, 2024

This fixes the behavior of the health node not being available as soon as all nodes are marked for shutdown, but still running. A cluster without an operating health node will return "No disk usage data" through its health API.

This introduces a cluster setting to enable this behavior, with the default being the previous behavior.

I also had to modify the part of the code that immediately evicts any running tasks as soon as a node is marked for shutdown, leading to an endless loop of eviction -> reassignment -> eviction -> ...

Fixes #92193

…own (elastic#92193)

This fixes the behavior of the health node not being available as soon as all nodes are marked for shutdown, but still running. A cluster without an operating health node will return "No disk usage data" through its health API.

This introduces a cluster setting to enable this behavior, with the default being the previous behavior.

I also had to modify the part of the code that immediately evicts any running tasks as soon as a node is marked for shutdown, leading to an endless loop of eviction -> reassignment -> eviction -> ...

Fixes elastic#92193
Copy link
Contributor

Documentation preview:

@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label v9.0.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels Sep 12, 2024
@javanna javanna added :Core/Infra/Node Lifecycle Node startup, bootstrapping, and shutdown and removed needs:triage Requires assignment of a team area label labels Sep 13, 2024
@elasticsearchmachine elasticsearchmachine added the Team:Core/Infra Meta label for core/infra team label Sep 13, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Sep 16, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@ldematte
Copy link
Contributor

buildkite test this

@ldematte
Copy link
Contributor

@elasticmachine update branch

@nmesot
Copy link
Author

nmesot commented Sep 29, 2024

@dakrone @andreidan Could you have a look at this? Our client is asking for an update on this feature.

I am relatively new to open source contributions. If there is anything that I still need to do for this to go through, feel free to let me know, I am happy to keep working on it.

@andreidan andreidan requested a review from gmarouli October 2, 2024 12:51
@gmarouli
Copy link
Contributor

gmarouli commented Oct 3, 2024

Hi @nmesot , thank you for picking up this issue. I will start the review.

@gmarouli gmarouli self-assigned this Oct 3, 2024
For certain lightweight, persistent tasks (like the health node), it makes sense to not
have this limitation.

NOTE: For the health node for example, this provides the benefit of having accurate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this note is needed here; better add this information to the changelog.

private final AtomicReference<HealthNode> currentTask = new AtomicReference<>();
private final ClusterStateListener taskStarter;
private final ClusterStateListener shutdownListener;
private Boolean allowLightweightAssignmentsToNodesShuttingDown;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why a Boolean and not a boolean?

}

private static boolean isEveryNodeShuttingDown(ClusterState clusterState, ClusterChangedEvent event) {
return clusterState.nodes().getAllNodes().stream().allMatch(dn -> event.state().metadata().nodeShutdowns().contains(dn.getId()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why using clusterState form clusterService.state() when you already have event.state()? Am I missing something here?

@ldematte
Copy link
Contributor

ldematte commented Oct 4, 2024

Looks like a great contribution overall, thanks!
I'll let @gmarouli do the main review, we discussed offline a couple of issues, I just added some minor comments here and there.

Copy link
Contributor

@gmarouli gmarouli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @nmesot , thank you again for putting this effort and opening a very good PR. I have left some comments that hopefully will relax the complexity a bit but it looks really good.

Comment on lines +60 to +67
public static final Setting<Boolean> CLUSTER_TASKS_ALLOCATION_ALLOW_LIGHTWEIGHT_ASSIGNMENTS_TO_NODES_SHUTTING_DOWN_SETTING = Setting
.boolSetting(
"cluster.persistent_tasks.allocation.allow_lightweight_assignments_to_nodes_shutting_down",
false,
Setting.Property.Dynamic,
Setting.Property.NodeScope
);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change is safe enough to not add a switch. Each task that defines itself as lightweight should ensure that it is safe to run always like that or that it needs the flag. What do you think?

Comment on lines +367 to +369
// nodes are available. An example of a lightweight task is the HealthNodeTask.
// This behavior needs to be enabled with the CLUSTER_TASKS_ALLOCATION_ALLOW_LIGHTWEIGHT_ASSIGNMENTS_TO_NODES_SHUTTING_DOWN_SETTING
// setting.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the comment is clear enough without the example. Someone can always check which tasks implement the isLightweight.

final List<DiscoveryNode> allNodes = currentState.nodes().stream().toList();
final List<DiscoveryNode> candidateNodes = allNodes.stream()
.filter(dn -> currentState.metadata().nodeShutdowns().contains(dn.getId()) == false)
.collect(Collectors.toCollection(ArrayList::new));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use .toList() here too.

.filter(dn -> currentState.metadata().nodeShutdowns().contains(dn.getId()) == false)
.collect(Collectors.toCollection(ArrayList::new));

if (candidateNodes.isEmpty() && allNodes.isEmpty() == false) { // all nodes are shutting down
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think creating a new collection allNodes is not necessary. I think currentState.nodes().getNodes().isEmpty() will not create new objects and therefore be more efficient.

* {@link PersistentTasksClusterService#CLUSTER_TASKS_ALLOCATION_ALLOW_LIGHTWEIGHT_ASSIGNMENTS_TO_NODES_SHUTTING_DOWN_SETTING}
* needs to be set for this behavior to apply.
*/
default boolean isLightweight() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am guessing this term was inspired by the original issue. However, this term is not established and it was more used to describe the health node task. Since we do not have any other usage for this method, I suggest we give a more specific name, something along the lines of keepRunningDuringShutdown or canRunOnShuttingDownNodes. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ for canRunOnShuttingDownNodes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Node Lifecycle Node startup, bootstrapping, and shutdown :Data Management/Health >enhancement external-contributor Pull request authored by a developer outside the Elasticsearch team Team:Core/Infra Meta label for core/infra team Team:Data Management Meta label for data/management team v9.3.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make the health node available even if all nodes are marked for shutdown
6 participants