elastic · szabosteve · Feb 19, 2025 · Feb 12, 2025 · Feb 12, 2025 · Feb 12, 2025
@@ -3,23 +3,12 @@ mapped_pages:
   - https://www.elastic.co/guide/en/cloud-enterprise/current/ece-manage-kibana.html
 ---
 
-# Maintenance [ece-manage-kibana]
+# Maintenance [maintenance]
 
-Kibana is an open source analytics and visualization platform designed to work with Elasticsearch, that makes it easy to perform advanced data analysis and to visualize your data in a variety of charts, tables, and maps. Its simple, browser-based interface enables you to quickly create and share dynamic dashboards that display changes to Elasticsearch queries in real time.
+This section outlines the key tasks and processes required to maintain a healthy, performant, and secure {{es}} infrastructure and its deployments.
 
-Most deployment templates include a Kibana instance, but if it wasn’t part of the initial deployment you can go to the **Kibana** page and **Enable** Kibana.
-
-The new Kibana instance takes a few moments to provision. After provisioning Kibana is complete, you can use the endpoint URL to access Kibana.
-
-::::{tip} 
-You can log into Kibana as the `elastic` superuser. The password was provided when you created your deployment or can be [reset](users-roles/cluster-or-deployment-auth/built-in-users.md). On AWS and not able to access Kibana? [Check if you need to update your endpoint URL first](../troubleshoot/deployments/cloud-enterprise/common-issues.md#ece-aws-private-ip).
-::::
-
-
-From the deployment **Kibana** page you can also:
-
-* Terminate your Kibana instance, which stops it. The information is stored in your Elasticsearch cluster, so stopping and restarting should not risk your Kibana information.
-* Restart it after stopping.
-* Upgrade your Kibana instance version if it is out of sync with your Elasticsearch cluster.
-* Delete to fully remove the instance, wipe it from the disk, and stop charges.
+The topics covered include:
 
+* **ECE maintenance**: Explains the procedures for maintaining both the host infrastructure and {{es}} deployments within Elastic Cloud Enterprise (ECE).
+* **Start and stop services**: Provides step-by-step instructions on how to safely start and stop your {{es}} deployment or {{kib}} instance, particularly when performing actions that require a restart.
+* **Add and remove {{es}} nodes**: Guides you through the process of enrolling new nodes or safely removing existing ones from an {{es}} cluster to optimize resource utilization and cluster performance.
@@ -21,7 +21,6 @@ When you add more nodes to a cluster, it automatically allocates replica shards.
 :alt: A cluster with three nodes
 :::
 
-
 ## Enroll nodes in an existing cluster [_enroll_nodes_in_an_existing_cluster_5]
 
 You can enroll additional nodes on your local machine to experiment with how an {{es}} cluster with multiple nodes behaves.
@@ -31,7 +30,6 @@ To add a node to a cluster running on multiple machines, you must also set [`dis
 
 ::::
 
-
 When {{es}} starts for the first time, the security auto-configuration process binds the HTTP layer to `0.0.0.0`, but only binds the transport layer to localhost. This intended behavior ensures that you can start a single-node cluster with security enabled by default without any additional configuration.
 
 Before enrolling a new node, additional actions such as binding to an address other than `localhost` or satisfying bootstrap checks are typically necessary in production clusters. During that time, an auto-generated enrollment token could expire, which is why enrollment tokens aren’t generated automatically.
@@ -64,21 +62,18 @@ To enroll new nodes in your cluster, create an enrollment token with the `elasti
 
 For more information about discovery and shard allocation, refer to [*Discovery and cluster formation*](../distributed-architecture/discovery-cluster-formation.md) and [Cluster-level shard allocation and routing settings](https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-cluster.html).
 
-
 ## Master-eligible nodes [add-elasticsearch-nodes-master-eligible]
 
 As nodes are added or removed Elasticsearch maintains an optimal level of fault tolerance by automatically updating the cluster’s *voting configuration*, which is the set of [master-eligible nodes](../distributed-architecture/clusters-nodes-shards/node-roles.md#master-node-role) whose responses are counted when making decisions such as electing a new master or committing a new cluster state.
 
 It is recommended to have a small and fixed number of master-eligible nodes in a cluster, and to scale the cluster up and down by adding and removing master-ineligible nodes only. However there are situations in which it may be desirable to add or remove some master-eligible nodes to or from a cluster.
 
-
 ### Adding master-eligible nodes [modules-discovery-adding-nodes]
 
 If you wish to add some nodes to your cluster, simply configure the new nodes to find the existing cluster and start them up. Elasticsearch adds the new nodes to the voting configuration if it is appropriate to do so.
 
 During master election or when joining an existing formed cluster, a node sends a join request to the master in order to be officially added to the cluster.
 
-
 ### Removing master-eligible nodes [modules-discovery-removing-nodes]
 
 When removing master-eligible nodes, it is important not to remove too many all at the same time. For instance, if there are currently seven master-eligible nodes and you wish to reduce this to three, it is not possible simply to stop four of the nodes at once: to do so would leave only three nodes remaining, which is less than half of the voting configuration, which means the cluster cannot take any further actions.
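
The seven-to-three example above comes down to simple majority arithmetic. The following is a minimal sketch of that arithmetic only. The function names are hypothetical, and the sketch deliberately ignores the automatic voting configuration updates and voting exclusions described elsewhere on this page.

```python
def can_stop_together(voting_config_size: int, nodes_to_stop: int) -> bool:
    """Return True if the nodes left running still form a strict majority
    of the current voting configuration."""
    remaining = voting_config_size - nodes_to_stop
    # A strict majority means more than half: remaining > size / 2.
    return remaining * 2 > voting_config_size


def max_simultaneous_stops(voting_config_size: int) -> int:
    """Largest number of voting nodes that can stop at once while the
    remaining nodes keep a strict majority."""
    return (voting_config_size - 1) // 2


if __name__ == "__main__":
    print(can_stop_together(7, 4))    # False: 3 of 7 is less than half
    print(can_stop_together(7, 3))    # True: 4 of 7 is a majority
    print(max_simultaneous_stops(7))  # 3
```

Treat the result as a static sanity check. It only describes what happens when several nodes stop at the same moment, before the cluster has had a chance to adjust its voting configuration.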
@@ -129,4 +124,3 @@ DELETE /_cluster/voting_config_exclusions
 # to return to the voting configuration in the future.
 DELETE /_cluster/voting_config_exclusions?wait_for_removal=false
 ```
-
@@ -1,7 +1,30 @@
 # ECE maintenance
 
-% What needs to be done: Write from scratch
+Elastic Cloud Enterprise (ECE) is a self-managed Elastic Stack deployment platform that abstracts much of the complexity of running {{es}}, but it still requires regular maintenance at both the platform and deployment levels. Maintenance activities range from managing individual deployments to performing infrastructure-level updates on ECE hosts.
 
-% GitHub issue: https://github.com/elastic/docs-projects/issues/353
+## Deployment maintenance and host infrastructure maintenance [ece-deployment-host-infra-maintenance]
 
-% Scope notes: Introduction about ECE maintenance and activities / actions. Explain the difference between deployments maintenance and ECE hosts infrastructure maintenance.
+Deployment maintenance focuses on managing individual {{es}} and {{kib}} instances within ECE. This includes actions such as pausing instances, stopping request routing to nodes, and moving instances between allocators to optimize resource usage or prepare for maintenance. These tasks help maintain service availability and performance without affecting the underlying infrastructure.
+
+ECE host infrastructure maintenance involves managing virtual machines that host ECE itself. This includes tasks like applying operating system patches, upgrading software, or decommissioning hosts. Infrastructure maintenance often requires more careful planning, as it can impact multiple deployments running on the affected hosts. Methods such as placing allocators into maintenance mode and redistributing workloads provide a smooth transition during maintenance operations.
+
+This section provides guidance on best practices for both types of maintenance, helping you maintain a resilient ECE environment.
+
+## Enabling Kibana [ece-manage-kibana]
+
+{{kib}} is an open source analytics and visualization platform designed to work with {{es}}. It makes it easy to perform advanced data analysis and to visualize your data in a variety of charts, tables, and maps. Its simple, browser-based interface enables you to quickly create and share dynamic dashboards that display changes to {{es}} queries in real time.
+
+Most deployment templates include a {{kib}} instance. If {{kib}} wasn’t part of the initial deployment, go to the **{{kib}}** page and select **Enable**.
+
+The new {{kib}} instance takes a few moments to provision. After provisioning {{kib}} is complete, you can use the endpoint URL to access {{kib}}.
+
+::::{tip}
+You can log in to {{kib}} as the `elastic` superuser. The password was provided when you created your deployment or can be [reset](../users-roles/cluster-or-deployment-auth/built-in-users.md). On AWS and not able to access {{kib}}? [Check if you need to update your endpoint URL first](../../troubleshoot/deployments/cloud-enterprise/common-issues.md#ece-aws-private-ip).
+::::
+
+From the deployment **{{kib}}** page you can also:
+
+* Terminate your {{kib}} instance, which stops it. The information is stored in your {{es}} cluster, so stopping and restarting should not risk your {{kib}} information.
+* Restart it after stopping.
+* Upgrade your {{kib}} instance version if it is out of sync with your {{es}} cluster.
+* Delete to fully remove the instance, wipe it from the disk, and stop charges.
@@ -16,22 +16,18 @@ To delete hosts:
 
 1. [Log into the Cloud UI](../../deploy/cloud-enterprise/log-into-cloud-ui.md).
 2. From the **Platform** menu, select **Hosts**.
-
-    Narrow the list by name, ID, or choose from several other filters. To further define the list, use a combination of filters.
+   Narrow the list by name, ID, or choose from several other filters. To further define the list, use a combination of filters.
 
 3. For hosts that hold the allocator role:
-
-    1. [Enable maintenance mode](enable-maintenance-mode.md) on the allocator.
-    2. [Move all nodes off the allocator](move-nodes-instances-from-allocators.md) and to other allocators in your installation.
+   1. [Enable maintenance mode](enable-maintenance-mode.md) on the allocator.
+   2. [Move all nodes off the allocator](move-nodes-instances-from-allocators.md) and to other allocators in your installation.
 
 4. Go to **Hosts** and select a host.
 5. Select **Manage roles** from the **Manage host** menu and remove all assigned roles.
 6. Select **Demote host** from the **Manage host** menu if present. If the **Delete host** option is already enabled, skip this step.
 7. Remove *all running* containers from the host, starting from the container with name `frc-runners-runner`. Then remove the storage directory (the default `/mnt/data/elastic/`). You can use the recommended [cleanup command](../../uninstall/uninstall-elastic-cloud-enterprise.md).  Upon doing so, the UI should reflect the host is **Disconnected**, allowing the host to be deleted.
 8. Select **Delete host** and confirm.
 
-::::{tip} 
+::::{tip}
 Refresh the page if the button isn’t active.
 ::::
-
-
@@ -7,12 +7,9 @@ mapped_pages:
 
 In some circumstances, you might need to temporarily restrict access to a node so you can perform corrective actions that might otherwise be difficult to complete. For example, if your cluster is being overwhelmed by requests because it is undersized for its workload, its nodes might not respond to efforts to resize.
 
-These actions act as a maintenance mode for cluster node. Performing these actions can stop the cluster from becoming completely unresponsive so that you can resolve operational issues much more effectively.
+These actions act as a maintenance mode for a cluster node. Performing these actions can stop the cluster from becoming unresponsive so that you can resolve operational issues more effectively.
 
 * [**Stop routing to the instance**](start-stop-routing-requests.md): Block requests from being routed to the cluster node. This is a less invasive action than pausing the cluster.
 * [**Pause an instance**](pause-instance.md): Suspend the node immediately by stopping the container that the node runs on without completing existing requests. This is a more aggressive action to regain control of an unresponsive node.
 
 As an alternative, to quickly add capacity to a deployment if it is unhealthy or at capacity, you can also [override the resource limit for a deployment](../../deploy/cloud-enterprise/resource-overrides.md).
-
-
-
@@ -12,15 +12,13 @@ To put an allocator into maintenance mode:
 1. [Log into the Cloud UI](../../deploy/cloud-enterprise/log-into-cloud-ui.md).
 2. From the **Platform** menu, select **Allocators**.
 3. Choose the allocator you want to work with and select **Enable Maintenance Mode**. Confirm the action.
-
-    Narrow the list by name, ID, or choose from several other filters. To further define the list, use a combination of filters.
-
+   Narrow the list by name, ID, or choose from several other filters. To further define the list, use a combination of filters.
 
 After the allocator enters maintenance mode, no new Elasticsearch nodes or Kibana instances will be started on the allocator. Existing nodes will continue to work as expected. You can now safely perform actions like [moving nodes off the allocator](move-nodes-instances-from-allocators.md).
 
 If you want to make the allocator fully active again, select **Disable Maintenance Mode**. Confirm the action.
 
-::::{tip} 
+::::{tip}
 If you need the existing instances to stop routing requests you can [stop routing requests](deployments-maintenance.md) to disable incoming requests to particular instances. You can also massively disable all allocator instances routing with the [allocator-toggle-routing-requests.sh](https://download.elastic.co/cloud/allocator-toggle-routing-requests.sh) script. The script runs with the following parameters in the form environment variables:
 
 * `API_URL` Url of the administration API.
@@ -41,5 +39,3 @@ AUTH_HEADER="Authorization: ApiKey $(cat ~/api.key)" API_URL="https://adminconso
 ```
 
 ::::
-
-
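
The maintenance mode toggle described above can also be scripted. The following sketch only builds the HTTP request and does not send it. It assumes that the ECE RESTful API exposes `POST /api/v1/platform/infrastructure/allocators/{allocator_id}/maintenance-mode/_start` and `_stop`, so verify the path against the API reference for your ECE version. The host name, allocator ID, and function name are hypothetical.

```python
import urllib.request


def build_maintenance_request(api_url: str, allocator_id: str,
                              api_key: str, enable: bool = True) -> urllib.request.Request:
    """Build (but do not send) a request that toggles allocator maintenance mode.

    The endpoint path is an assumption; check it against the ECE API reference.
    """
    action = "_start" if enable else "_stop"
    url = (
        f"{api_url.rstrip('/')}/api/v1/platform/infrastructure/"
        f"allocators/{allocator_id}/maintenance-mode/{action}"
    )
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"ApiKey {api_key}"},
    )


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    req = build_maintenance_request("https://ece.example.com:12443", "192.168.44.10", "MY_KEY")
    print(req.get_method(), req.full_url)
```

To send the request, pass the returned object to `urllib.request.urlopen` with a TLS context that matches your installation's certificates.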
@@ -1,7 +1,45 @@
 # Maintenance activities
 
-% What needs to be done: Write from scratch
+Maintenance activities ensure the smooth operation and scalability of your {{es}} installation. This section provides guidelines on performing essential maintenance tasks while minimizing downtime and maintaining high availability.
 
-% GitHub issue: https://github.com/elastic/docs-projects/issues/353
+## Available maintenance operations
 
-% Scope notes: summarize the list of activites
+### Enable maintenance mode
+
+Before performing maintenance on an allocator, enable maintenance mode to prevent new {{es}} nodes and {{kib}} instances from being started on it. Existing instances keep running, so you can safely move them to other allocators or adjust them without disruption.
+
+### Scale out installation
+
+You can scale out your installation by adding capacity to meet growing demand or improve high availability. This process involves installing ECE on additional hosts, assigning roles to new hosts, and resizing deployments to utilize the expanded resources.
+
+### Move nodes and instances between allocators
+
+Moving {{es}} nodes, {{kib}} instances, and other components between allocators may be necessary to free up space, avoid downtime, or handle allocator failures. The process involves selecting target allocators and ensuring enough capacity to accommodate the migration.
+
+### Perform ECE host maintenance
+
+ECE host maintenance covers tasks such as applying system patches, performing hardware upgrades, and meeting security compliance requirements. Different maintenance methods are available, depending on the level of disruption involved:
+
+* Disabling the Docker daemon (nondestructive): Temporarily disables a host while keeping it in the installation.
+
+* Deleting the host (destructive): Permanently removes a host, requiring reinstallation after maintenance.
+
+* Shutting down the host (less destructive): Temporarily shuts down a host while preserving configurations for planned outages.
+
+### Delete ECE hosts
+
+If a host is no longer required or is faulty, it can be removed from the Elastic Cloud Enterprise installation. Deleting a host only removes it from the installation but does not uninstall the software from the physical machine. Before deletion, allocators should be placed in maintenance mode, and nodes should be migrated to avoid disruption.
+
+## Best practices for maintenance
+
+* Always check available capacity before making changes.
+
+* Use maintenance mode to avoid unexpected disruptions.
+
+* Move nodes strategically to maintain high availability.
+
+* Perform maintenance during off-peak hours when possible.
+
+* Regularly review and optimize resource allocation.
+
+By following these guidelines, you can ensure the stability and efficiency of your environment while carrying out necessary maintenance activities.
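
The advice to check available capacity before moving instances can be sketched as a quick pre-check. The example below is illustrative only: the `Allocator` class, its fields, and `plan_moves` are hypothetical and are not part of ECE, which selects target allocators itself. The sketch uses a first-fit-decreasing heuristic, so a `None` result means the heuristic found no placement, not that no placement exists.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Allocator:
    """Hypothetical view of an allocator's memory, in MB."""
    allocator_id: str
    capacity_mb: int
    used_mb: int

    @property
    def free_mb(self) -> int:
        return self.capacity_mb - self.used_mb


def plan_moves(instance_sizes_mb: List[int],
               targets: List[Allocator]) -> Optional[Dict[int, str]]:
    """Place the largest instances first on the first target with enough room.

    Returns a mapping of instance index to allocator ID, or None if some
    instance could not be placed by this heuristic.
    """
    free = {a.allocator_id: a.free_mb for a in targets}
    plan: Dict[int, str] = {}
    order = sorted(range(len(instance_sizes_mb)),
                   key=lambda i: instance_sizes_mb[i], reverse=True)
    for idx in order:
        size = instance_sizes_mb[idx]
        target = next((aid for aid, mb in free.items() if mb >= size), None)
        if target is None:
            return None
        free[target] -= size
        plan[idx] = target
    return plan


if __name__ == "__main__":
    targets = [Allocator("a1", 8192, 6144), Allocator("a2", 16384, 8192)]
    print(plan_moves([4096, 2048, 1024], targets))
```

A check like this is most useful before you enable maintenance mode on an allocator, because it shows early whether the remaining allocators have room for the instances you plan to move.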