Merged
Changes from 24 commits
Commits
41 commits
eefbfff
[D&M] Drafts maintenance intro.
szabosteve Feb 12, 2025
3a7999f
Merge branch 'main' into szabosteve/maintenance
szabosteve Feb 12, 2025
4bd7d08
[E&A] Removes Kibana .-related info.
szabosteve Feb 12, 2025
58c8731
Merge branch 'szabosteve/maintenance' of github.com:elastic/docs-cont…
szabosteve Feb 12, 2025
3dc5dfa
[D&M] Refines start and stop ES page.
szabosteve Feb 13, 2025
9e1df49
[D&M] Links.
szabosteve Feb 13, 2025
97ed2bf
Merge branch 'main' into szabosteve/maintenance
szabosteve Feb 13, 2025
718b74c
[D&M] Adds intro for start and stop services.
szabosteve Feb 13, 2025
c828103
[D&M] ECE maintenance.
szabosteve Feb 13, 2025
d2a2738
[D&M] Fixes links.
szabosteve Feb 13, 2025
3dd4b47
Merge branch 'main' into szabosteve/maintenance
szabosteve Feb 13, 2025
bf5b63e
[D&M] Deployments maintenance and request routing.
szabosteve Feb 17, 2025
fd0bfe3
Merge branch 'main' into szabosteve/maintenance
szabosteve Feb 17, 2025
5fa1ae6
[M&D] Fixes links.
szabosteve Feb 17, 2025
d7b964b
[D&M] Adds maintenance activities section.
szabosteve Feb 17, 2025
b9ecbc6
[D&M] Refines host maintenance, scale out installtion.
szabosteve Feb 17, 2025
5eb9d82
[D&M] Refines start and stop services.
szabosteve Feb 17, 2025
3821471
Merge branch 'main' into szabosteve/maintenance
szabosteve Feb 17, 2025
e14ae74
Merge branch 'main' into szabosteve/maintenance
szabosteve Feb 17, 2025
68b37f1
Fixes conflicts.
szabosteve Feb 18, 2025
f7148e9
Merge branch 'szabosteve/maintenance' of github.com:elastic/docs-cont…
szabosteve Feb 18, 2025
ef5d5f1
Merge branch 'main' into szabosteve/maintenance
szabosteve Feb 18, 2025
57e9042
[D&M] Fixes links.
szabosteve Feb 18, 2025
d78328a
Merge branch 'szabosteve/maintenance' of github.com:elastic/docs-cont…
szabosteve Feb 18, 2025
7ab401f
[D&M] Addresses feedback part 1.
szabosteve Feb 19, 2025
47f301c
Apply suggestions from code review
szabosteve Feb 19, 2025
797f02a
Update deploy-manage/maintenance/ece/start-stop-routing-requests.md
szabosteve Feb 19, 2025
bfeb442
Merge branch 'main' into szabosteve/maintenance
szabosteve Feb 19, 2025
b5cd996
[D&M] Addresses feedback part 2.
szabosteve Feb 19, 2025
db5ca27
Merge branch 'main' into szabosteve/maintenance
szabosteve Feb 19, 2025
2091425
[D&M] Fixes errors.
szabosteve Feb 19, 2025
06cffdc
[D&M] Fixes links.
szabosteve Feb 19, 2025
2265940
[D&M] More link fix.
szabosteve Feb 19, 2025
537a698
Apply suggestions from code review
szabosteve Feb 19, 2025
ba9cc10
Merge branch 'main' into szabosteve/maintenance
szabosteve Feb 19, 2025
9675c6f
Update deploy-manage/maintenance/start-stop-services/start-stop-elast…
szabosteve Feb 19, 2025
7120338
Merge branch 'main' into szabosteve/maintenance
szabosteve Feb 19, 2025
7cc4aca
[D&M] Moves stop routing request skript instructions.
szabosteve Feb 19, 2025
dd86595
Merge branch 'szabosteve/maintenance' of github.com:elastic/docs-cont…
szabosteve Feb 19, 2025
514f94f
Merge branch 'main' into szabosteve/maintenance
szabosteve Feb 19, 2025
cf4c789
Update deploy-manage/maintenance/start-stop-services/full-cluster-res…
szabosteve Feb 19, 2025
23 changes: 6 additions & 17 deletions deploy-manage/maintenance.md
Original file line number Diff line number Diff line change
@@ -3,23 +3,12 @@ mapped_pages:
- https://www.elastic.co/guide/en/cloud-enterprise/current/ece-manage-kibana.html
---

# Maintenance [ece-manage-kibana]
# Maintenance [maintenance]

Kibana is an open source analytics and visualization platform designed to work with Elasticsearch, that makes it easy to perform advanced data analysis and to visualize your data in a variety of charts, tables, and maps. Its simple, browser-based interface enables you to quickly create and share dynamic dashboards that display changes to Elasticsearch queries in real time.
This section outlines the key tasks and processes required to maintain a healthy, performant, and secure {{es}} infrastructure and its deployments.

Most deployment templates include a Kibana instance, but if it wasn’t part of the initial deployment you can go to the **Kibana** page and **Enable** Kibana.

The new Kibana instance takes a few moments to provision. After provisioning Kibana is complete, you can use the endpoint URL to access Kibana.

::::{tip}
You can log into Kibana as the `elastic` superuser. The password was provided when you created your deployment or can be [reset](users-roles/cluster-or-deployment-auth/built-in-users.md). On AWS and not able to access Kibana? [Check if you need to update your endpoint URL first](../troubleshoot/deployments/cloud-enterprise/common-issues.md#ece-aws-private-ip).
::::


From the deployment **Kibana** page you can also:

* Terminate your Kibana instance, which stops it. The information is stored in your Elasticsearch cluster, so stopping and restarting should not risk your Kibana information.
* Restart it after stopping.
* Upgrade your Kibana instance version if it is out of sync with your Elasticsearch cluster.
* Delete to fully remove the instance, wipe it from the disk, and stop charges.
The topics covered include:

* **ECE Maintenance**: Explains the procedures for maintaining both the host infrastructure and {{es}} deployments within Elastic Cloud Enterprise (ECE).
* **Start and Stop services**: Provides step-by-step instructions on how to safely start and stop your {{es}} deployment or {{kib}} instance, particularly when performing actions that require a restart.
* **Add and remove {{es}} nodes**: Guides you through the process of enrolling new nodes or safely removing existing ones from an {{es}} cluster to optimize resource utilization and cluster performance.
Original file line number Diff line number Diff line change
@@ -21,7 +21,6 @@ When you add more nodes to a cluster, it automatically allocates replica shards.
:alt: A cluster with three nodes
:::


## Enroll nodes in an existing cluster [_enroll_nodes_in_an_existing_cluster_5]

You can enroll additional nodes on your local machine to experiment with how an {{es}} cluster with multiple nodes behaves.
@@ -31,7 +30,6 @@ To add a node to a cluster running on multiple machines, you must also set [`dis

::::


When {{es}} starts for the first time, the security auto-configuration process binds the HTTP layer to `0.0.0.0`, but only binds the transport layer to localhost. This intended behavior ensures that you can start a single-node cluster with security enabled by default without any additional configuration.

Before enrolling a new node, additional actions such as binding to an address other than `localhost` or satisfying bootstrap checks are typically necessary in production clusters. During that time, an auto-generated enrollment token could expire, which is why enrollment tokens aren’t generated automatically.
@@ -64,21 +62,18 @@ To enroll new nodes in your cluster, create an enrollment token with the `elasti

For more information about discovery and shard allocation, refer to [*Discovery and cluster formation*](../distributed-architecture/discovery-cluster-formation.md) and [Cluster-level shard allocation and routing settings](https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-cluster.html).


## Master-eligible nodes [add-elasticsearch-nodes-master-eligible]

As nodes are added or removed Elasticsearch maintains an optimal level of fault tolerance by automatically updating the cluster’s *voting configuration*, which is the set of [master-eligible nodes](../distributed-architecture/clusters-nodes-shards/node-roles.md#master-node-role) whose responses are counted when making decisions such as electing a new master or committing a new cluster state.

It is recommended to have a small and fixed number of master-eligible nodes in a cluster, and to scale the cluster up and down by adding and removing master-ineligible nodes only. However there are situations in which it may be desirable to add or remove some master-eligible nodes to or from a cluster.


### Adding master-eligible nodes [modules-discovery-adding-nodes]

If you wish to add some nodes to your cluster, simply configure the new nodes to find the existing cluster and start them up. Elasticsearch adds the new nodes to the voting configuration if it is appropriate to do so.

During master election or when joining an existing formed cluster, a node sends a join request to the master in order to be officially added to the cluster.


### Removing master-eligible nodes [modules-discovery-removing-nodes]

When removing master-eligible nodes, it is important not to remove too many all at the same time. For instance, if there are currently seven master-eligible nodes and you wish to reduce this to three, it is not possible simply to stop four of the nodes at once: to do so would leave only three nodes remaining, which is less than half of the voting configuration, which means the cluster cannot take any further actions.
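The arithmetic behind the seven-node example can be sketched as follows. This is a minimal illustration, not Elasticsearch code: the function names are hypothetical, and it assumes the voting configuration does not change while the nodes are being stopped, which is the situation when several nodes are stopped at the same moment.

```python
def has_quorum(voting_config_size: int, nodes_still_running: int) -> bool:
    """Return True if the running nodes form a strict majority of the voting configuration."""
    # Assumption: decisions need more than half of the voting configuration to respond.
    return nodes_still_running > voting_config_size / 2


def can_stop_at_once(voting_config_size: int, nodes_to_stop: int) -> bool:
    """Check whether stopping this many voting nodes simultaneously keeps a majority available."""
    remaining = voting_config_size - nodes_to_stop
    return has_quorum(voting_config_size, remaining)


if __name__ == "__main__":
    # Seven master-eligible nodes, four stopped at once: three remain, fewer than half of seven.
    print(can_stop_at_once(7, 4))
    # Stopping three at once still leaves four of seven, which is a majority.
    print(can_stop_at_once(7, 3))
```

The check only covers simultaneous shutdown against a fixed voting configuration; it does not model how the voting configuration is adjusted when nodes are removed gradually.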
@@ -129,4 +124,3 @@ DELETE /_cluster/voting_config_exclusions
# to return to the voting configuration in the future.
DELETE /_cluster/voting_config_exclusions?wait_for_removal=false
```

29 changes: 26 additions & 3 deletions deploy-manage/maintenance/ece.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,30 @@
# ECE maintenance

% What needs to be done: Write from scratch
Elastic Cloud Enterprise (ECE), being a self-managed Elastic Stack deployment platform, abstracts much of the complexity of running {{es}}, but still requires regular maintenance at both the platform and deployment levels. Maintenance activities range from managing individual deployments to performing infrastructure-level updates on ECE hosts.

% GitHub issue: https://github.com/elastic/docs-projects/issues/353
## Deployment maintenance and host infrastructure maintenance [ece-deployment-host-infra-maintenance]

% Scope notes: Introduction about ECE maintenance and activities / actions. Explain the difference between deployments maintenance and ECE hosts infrastructure maintenance.
Deployment maintenance focuses on managing individual {{es}} and {{kib}} instances within ECE. This includes actions such as pausing instances, stopping request routing to nodes, and moving instances between allocators to optimize resource usage or prepare for maintenance. These tasks help maintain service availability and performance without affecting the underlying infrastructure.

ECE host infrastructure maintenance involves managing virtual machines that host ECE itself. This includes tasks like applying operating system patches, upgrading software, or decommissioning hosts. Infrastructure maintenance often requires more careful planning, as it can impact multiple deployments running on the affected hosts. Methods such as placing allocators into maintenance mode and redistributing workloads provide a smooth transition during maintenance operations.

This section provides guidance on best practices for both types of maintenance, helping you maintain a resilient ECE environment.

## Enabling Kibana [ece-manage-kibana]

{{kib}} is an open source analytics and visualization platform designed to work with {{es}}, that makes it easy to perform advanced data analysis and to visualize your data in a variety of charts, tables, and maps. Its simple, browser-based interface enables you to quickly create and share dynamic dashboards that display changes to {{es}} queries in real time.

Most deployment templates include a {{kib}} instance, but if it wasn’t part of the initial deployment you can go to the **{{kib}}** page and **Enable** {{kib}}.

The new {{kib}} instance takes a few moments to provision. After provisioning {{kib}} is complete, you can use the endpoint URL to access {{kib}}.

::::{tip}
You can log into Kibana as the `elastic` superuser. The password was provided when you created your deployment or can be [reset](../users-roles/cluster-or-deployment-auth/built-in-users.md). On AWS and not able to access Kibana? [Check if you need to update your endpoint URL first](../../troubleshoot/deployments/cloud-enterprise/common-issues.md#ece-aws-private-ip).
::::

From the deployment **{{kib}}** page you can also:

* Terminate your {{kib}} instance, which stops it. The information is stored in your {{es}} cluster, so stopping and restarting should not risk your {{kib}} information.
* Restart it after stopping.
* Upgrade your {{kib}} instance version if it is out of sync with your {{es}} cluster.
* Delete to fully remove the instance, wipe it from the disk, and stop charges.
12 changes: 4 additions & 8 deletions deploy-manage/maintenance/ece/delete-ece-hosts.md
Original file line number Diff line number Diff line change
@@ -16,22 +16,18 @@ To delete hosts:

1. [Log into the Cloud UI](../../deploy/cloud-enterprise/log-into-cloud-ui.md).
2. From the **Platform** menu, select **Hosts**.

Narrow the list by name, ID, or choose from several other filters. To further define the list, use a combination of filters.
Narrow the list by name, ID, or choose from several other filters. To further define the list, use a combination of filters.

3. For hosts that hold the allocator role:

1. [Enable maintenance mode](enable-maintenance-mode.md) on the allocator.
2. [Move all nodes off the allocator](move-nodes-instances-from-allocators.md) and to other allocators in your installation.
1. [Enable maintenance mode](enable-maintenance-mode.md) on the allocator.
2. [Move all nodes off the allocator](move-nodes-instances-from-allocators.md) and to other allocators in your installation.

4. Go to **Hosts** and select a host.
5. Select **Manage roles** from the **Manage host** menu and remove all assigned roles.
6. Select **Demote host** from the **Manage host** menu if present. If the **Delete host** option is already enabled, skip this step.
7. Remove *all running* containers from the host, starting from the container with name `frc-runners-runner`. Then remove the storage directory (the default `/mnt/data/elastic/`). You can use the recommended [cleanup command](../../uninstall/uninstall-elastic-cloud-enterprise.md). Upon doing so, the UI should reflect the host is **Disconnected**, allowing the host to be deleted.
8. Select **Delete host** and confirm.

::::{tip}
::::{tip}
Refresh the page if the button isn’t active.
::::


5 changes: 1 addition & 4 deletions deploy-manage/maintenance/ece/deployments-maintenance.md
Original file line number Diff line number Diff line change
@@ -7,12 +7,9 @@ mapped_pages:

In some circumstances, you might need to temporarily restrict access to a node so you can perform corrective actions that might otherwise be difficult to complete. For example, if your cluster is being overwhelmed by requests because it is undersized for its workload, its nodes might not respond to efforts to resize.

These actions act as a maintenance mode for cluster node. Performing these actions can stop the cluster from becoming completely unresponsive so that you can resolve operational issues much more effectively.
These actions act as a maintenance mode for a cluster node. Performing these actions can stop the cluster from becoming unresponsive so that you can resolve operational issues much more effectively.

* [**Stop routing to the instance**](start-stop-routing-requests.md): Block requests from being routed to the cluster node. This is a less invasive action than pausing the cluster.
* [**Pause an instance**](pause-instance.md): Suspend the node immediately by stopping the container that the node runs on without completing existing requests. This is a more aggressive action to regain control of an unresponsive node.

As an alternative, to quickly add capacity to a deployment if it is unhealthy or at capacity, you can also [override the resource limit for a deployment](../../deploy/cloud-enterprise/resource-overrides.md).



8 changes: 2 additions & 6 deletions deploy-manage/maintenance/ece/enable-maintenance-mode.md
Original file line number Diff line number Diff line change
@@ -12,15 +12,13 @@ To put an allocator into maintenance mode:
1. [Log into the Cloud UI](../../deploy/cloud-enterprise/log-into-cloud-ui.md).
2. From the **Platform** menu, select **Allocators**.
3. Choose the allocator you want to work with and select **Enable Maintenance Mode**. Confirm the action.

Narrow the list by name, ID, or choose from several other filters. To further define the list, use a combination of filters.

Narrow the list by name, ID, or choose from several other filters. To further define the list, use a combination of filters.

After the allocator enters maintenance mode, no new Elasticsearch nodes or Kibana instances will be started on the allocator. Existing nodes will continue to work as expected. You can now safely perform actions like [moving nodes off the allocator](move-nodes-instances-from-allocators.md).

If you want to make the allocator fully active again, select **Disable Maintenance Mode**. Confirm the action.

::::{tip}
::::{tip}
If you need the existing instances to stop routing requests, you can [stop routing requests](deployments-maintenance.md) to disable incoming requests to particular instances. You can also disable routing for all instances on an allocator at once with the [allocator-toggle-routing-requests.sh](https://download.elastic.co/cloud/allocator-toggle-routing-requests.sh) script. The script takes the following parameters in the form of environment variables:

* `API_URL` URL of the administration API.
@@ -41,5 +39,3 @@ AUTH_HEADER="Authorization: ApiKey $(cat ~/api.key)" API_URL="https://adminconso
```

::::


44 changes: 41 additions & 3 deletions deploy-manage/maintenance/ece/maintenance-activities.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,45 @@
# Maintenance activities

% What needs to be done: Write from scratch
Maintenance activities ensure the smooth operation and scalability of your {{es}} installation. This section provides guidelines on performing essential maintenance tasks while minimizing downtime and maintaining high availability.

% GitHub issue: https://github.com/elastic/docs-projects/issues/353
## Available maintenance operations

% Scope notes: summarize the list of activites
### Enable maintenance mode

Before performing maintenance on an allocator, you should enable maintenance mode to prevent new Elasticsearch clusters and Kibana instances from being provisioned. This ensures that existing deployments can be safely moved to other allocators or adjusted without disruption.

### Scale out installation

You can scale out your installation by adding capacity to meet growing demand or improve high availability. This process involves installing ECE on additional hosts, assigning roles to new hosts, and resizing deployments to utilize the expanded resources.

### Move nodes and instances between allocators

Moving {{es}} nodes, {{kib}} instances, and other components between allocators may be necessary to free up space, avoid downtime, or handle allocator failures. The process involves selecting target allocators and ensuring enough capacity to accommodate the migration.

### Perform ECE host maintenance

Maintaining ECE hosts is critical for applying system patches, performing hardware upgrades, and ensuring compliance with security standards. Different maintenance methods are available based on the level of disruption:

* Disabling the Docker daemon (nondestructive): Temporarily disables a host while keeping it in the installation.

* Deleting the host (destructive): Permanently removes a host, requiring reinstallation after maintenance.

* Shutting down the host (less destructive): Temporarily shuts down a host while preserving configurations for planned outages.

### Delete ECE hosts

If a host is no longer required or is faulty, it can be removed from the Elastic Cloud Enterprise installation. Deleting a host only removes it from the installation but does not uninstall the software from the physical machine. Before deletion, allocators should be placed in maintenance mode, and nodes should be migrated to avoid disruption.

## Best practices for maintenance

* Always check available capacity before making changes.

* Use maintenance mode to avoid unexpected disruptions.

* Move nodes strategically to maintain high availability.

* Perform maintenance during off-peak hours when possible.

* Regularly review and optimize resource allocation.

By following these guidelines, you can ensure the stability and efficiency of your environment while carrying out necessary maintenance activities.