deploy-manage/deploy/elastic-cloud/cloud-hosted.md
Lines changed: 1 addition & 1 deletion
Original file line number
Diff line number
Diff line change
@@ -106,7 +106,7 @@ Of course, you can choose to follow your own path and use Elastic components ava
106
106
107
107
**Adjust the capacity and capabilities of your deployments for production**
108
108
109
-
There are a few things that can help you make sure that your production deployments remain available, healthy, and ready to handle your data in a scalable way over time, with the expected level of performance. Check [](/deploy-manage/production-guidance/plan-for-production-elastic-cloud.md).
109
+
There are a few things that can help you make sure that your production deployments remain available, healthy, and ready to handle your data in a scalable way over time, with the expected level of performance. Check [](/deploy-manage/production-guidance.md).
deploy-manage/deploy/elastic-cloud/create-an-elastic-cloud-hosted-deployment.md
Lines changed: 1 addition & 1 deletion
@@ -65,7 +65,7 @@ You can also create a deployment using the [Elastic Cloud API](https://www.elast
65
65
66
66
To make sure you’re all set for production, consider the following actions:
67
67
68
-
* [Plan for your expected workloads](/deploy-manage/production-guidance/plan-for-production-elastic-cloud.md) and consider how many availability zones you’ll need.
68
+
* [Plan for your expected workloads](/deploy-manage/production-guidance.md) and consider how many availability zones you’ll need.
69
69
* [Create a deployment](/deploy-manage/deploy/elastic-cloud/create-an-elastic-cloud-hosted-deployment.md) in the region you need and with a hardware profile that matches your use case.
70
70
* [Change your configuration](/deploy-manage/deploy/elastic-cloud/ec-customize-deployment-components.md) by turning on autoscaling, adding high availability, or adjusting components of the Elastic Stack.
71
71
* [Add extensions and plugins](/deploy-manage/deploy/elastic-cloud/add-plugins-extensions.md) to use Elastic supported extensions or add your own custom dictionaries and scripts.
deploy-manage/production-guidance.md
Lines changed: 8 additions & 15 deletions
@@ -10,12 +10,6 @@ applies_to:
10
10
eck: all
11
11
self: all
12
12
---
13
-
$$$ec-best-practices-data$$$
14
-
15
-
% start bringing https://www.elastic.co/guide/en/elasticsearch/reference/current/scalability.html here
16
-
17
-
% try to merge https://www.elastic.co/guide/en/cloud/current/ec-planning.html and https://www.elastic.co/guide/en/cloud/current/ec-best-practices-data.html
18
-
% mention all deployment types! what the user needs to be aware for orchestrated deployments.
19
13
20
14
# Production guidance
21
15
@@ -35,24 +29,27 @@ By following this guidance, you can ensure your {{stack}} deployment is robust,
35
29
36
30
## Deployment types
37
31
38
-
Production guidelines described in this section apply to all [deployment types](/deploy-manage/deploy.md#choosing-your-deployment-type)-including {{ech}}, {{ece}}, {{eck}}, and self-managed clusters-**except** {{serverless-full}}. However, certain parts may be relevant only to self-managed clusters, as orchestration systems automate some of the configurations discussed here.
32
+
Production guidelines and concepts described in this section apply to all [deployment types](/deploy-manage/deploy.md#choosing-your-deployment-type)-including {{ech}}, {{ece}}, {{eck}}, and self-managed clusters-**except** {{serverless-full}}.
39
33
40
-
Check the headers of each document or section to confirm whether the content applies to your deployment type.
34
+
However, certain parts may be relevant only to self-managed clusters, as orchestration systems automate some of the configurations discussed here. Check the headers of each document or section to confirm whether the content applies to your deployment type.
41
35
42
36
::::{note}
43
-
**{{serverless-full}}** projects are fully managed and automatically scaled by Elastic. Your project’s performance and general data retention are controlled by the [Search AI Lake settings](/deploy/elastic-cloud/project-settings.md#elasticsearch-manage-project-search-ai-lake-settings).
37
+
**{{serverless-full}}** projects are fully managed and automatically scaled by Elastic. Your project’s performance and general data retention are controlled by the [Search AI Lake settings](/deploy-manage/deploy/elastic-cloud/project-settings.md#elasticsearch-manage-project-search-ai-lake-settings).
44
38
::::
45
39
46
40
## Section overview
47
41
42
+
This section is divided into {{es}} and {{kib}} production-ready concepts and best practices.
Other sections of the documentation provide important guidance for running {{stack}} applications in production.
@@ -85,7 +78,7 @@ Other sections of the documentation provide important guidance for running {{sta
85
78
86
79
### Security and monitoring [security-and-monitoring]
87
80
88
-
As with any enterprise system, you need tools to secure, manage, and monitor your deployments. Security, monitoring, and administrative features that are integrated into {{es}} enable you to use [Kibana](../../get-started/the-stack.md) as a control center for managing a cluster.
81
+
As with any enterprise system, you need tools to secure, manage, and monitor your deployments. Security, monitoring, and administrative features that are integrated into {{es}} enable you to use [Kibana](/get-started/the-stack.md) as a control center for managing a cluster.
89
82
90
83
[Learn about securing an {{es}} cluster](./security.md).
# Availability and resilience [high-availability-cluster-design]
13
+
# Design for resilience [high-availability-cluster-design]
7
14
8
15
Distributed systems like {{es}} are designed to keep working even if some of their components have failed. As long as there are enough well-connected nodes to take over their responsibilities, an {{es}} cluster can continue operating normally if some of its nodes are unavailable or disconnected.
9
16
17
+
{{es}} implements high availability at three key levels:
18
+
19
+
* Node level – Running multiple nodes within the cluster to avoid single points of failure and maintain operational stability.
20
+
* Cluster level – Ensuring redundancy by distributing nodes across availability zones to prevent failures from affecting the entire cluster.
21
+
* Index level – Configuring shard replication to protect against data loss and improve search performance by distributing queries across multiple copies.
22
+
23
+
Each of these HA mechanisms contributes to {{es}}’s resilience and scalability. The appropriate strategy depends on factors such as data criticality, query patterns, and infrastructure constraints. It is up to you to determine the level of resiliency and high availability that best fits your use case. This section provides detailed guidance on designing a production-ready {{es}} deployment that balances availability, performance, and scalability.
24
+
25
+
## Cluster sizes
26
+
10
27
There is a limit to how small a resilient cluster can be. All {{es}} clusters require the following components to function:
11
28
12
29
* One [elected master node](../distributed-architecture/discovery-cluster-formation/modules-discovery-quorums.md)
@@ -31,6 +48,4 @@ Depending on your needs and budget, an {{es}} cluster can consist of a single no
31
48
32
49
* [Resilience in small clusters](availability-and-resilience/resilience-in-small-clusters.md)
33
50
* [Resilience in larger clusters](availability-and-resilience/resilience-in-larger-clusters.md)
34
-
35
-
36
-
51
+
* [Resilience in {{ech}} deployments](./availability-and-resilience/resiliente-in-ech.md)
navigation_title: Resilience in Elastic Cloud Hosted deployments
3
+
applies_to:
4
+
deployment:
5
+
ess: all
6
+
---
7
+
8
+
# Resiliency in Elastic Cloud Hosted deployments [ec-ha]
9
+
10
+
With {{ech}}, your deployment can be spread across up to three separate availability zones, each hosted in an isolated infrastructure domain, such as a separate data center.
11
+
12
+
::::{note}
13
+
While this document focuses on {{ech}}, the concepts also apply to {{ece}} and {{eck}} deployments. In ECK, you will have to manually configure [availability zone distribution and node scheduling](/deploy-manage/deploy/cloud-on-k8s/advanced-elasticsearch-node-scheduling.md) through your Kubernetes platform.
14
+
::::
15
+
16
+
Why this matters:
17
+
18
+
* Data centers can have issues with availability. Internet outages, earthquakes, floods, or other events could affect the availability of a single data center. With a single availability zone, you have a single point of failure that can bring down your deployment.
19
+
* Multiple availability zones help your deployment remain available. This includes your {{es}} cluster, provided that your cluster is sized so that it can sustain your workload on the remaining data centers and that your indices are configured to have at least one replica.
20
+
* Multiple availability zones enable you to perform changes to resize your deployment with zero downtime.
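The sizing condition in the list above can be sketched as simple arithmetic. This is a minimal illustration, assuming every zone provides the same capacity and that load is expressed in the same (hypothetical) units as capacity; the function name `survives_zone_loss` and its parameters are made up for this example and are not part of any Elastic API.

```python
def survives_zone_loss(zones, capacity_per_zone, peak_load):
    """Return True if the remaining zones can carry peak_load after one zone fails.

    Assumes equally sized zones; capacity and load use the same arbitrary unit.
    """
    if zones < 1:
        raise ValueError("a deployment needs at least one availability zone")
    # Capacity left once a single zone becomes unavailable.
    remaining_capacity = (zones - 1) * capacity_per_zone
    return remaining_capacity >= peak_load


# With one zone, nothing is left after a zone failure.
print(survives_zone_loss(zones=1, capacity_per_zone=100, peak_load=10))
# Two zones survive only if a single zone can carry the whole load.
print(survives_zone_loss(zones=2, capacity_per_zone=100, peak_load=120))
# Three zones leave two thirds of the capacity available.
print(survives_zone_loss(zones=3, capacity_per_zone=100, peak_load=120))
```

Real capacity planning also depends on shard allocation and replica placement, so treat this only as a first sanity check.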
21
+
22
+
### Recommendations
23
+
24
+
We recommend that you use at least two availability zones for production and three for mission-critical systems. A single zone might be sufficient if your {{es}} cluster is mainly used for testing or development and downtime is acceptable, but a single-zone deployment should never be used for production.
25
+
26
+
With multiple {{es}} nodes in multiple availability zones, you have the recommended hardware. The next thing to consider is having the recommended index replication. Each index, with the exception of searchable snapshot indices, should have one or more replicas. Use the index settings API to find any indices with no replica:
27
+
28
+
```sh
29
+
GET _all/_settings/index.number_of_replicas
30
+
```
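The response to this request maps each index name to its settings, with setting values returned as strings. A minimal Python sketch, assuming the response has already been parsed into a dictionary, might filter it as follows. The function name `indices_without_replicas` and the sample data are hypothetical, and the sketch does not exclude searchable snapshot indices.

```python
def indices_without_replicas(settings_response):
    """Return the sorted names of indices whose number_of_replicas is 0."""
    missing = []
    for index_name, body in settings_response.items():
        replicas = (
            body.get("settings", {}).get("index", {}).get("number_of_replicas")
        )
        # Setting values arrive as strings, so convert before comparing.
        if replicas is not None and int(replicas) == 0:
            missing.append(index_name)
    return sorted(missing)


# Hypothetical parsed response, shaped like the output of the request above.
sample_response = {
    "logs-app": {"settings": {"index": {"number_of_replicas": "1"}}},
    "scratch": {"settings": {"index": {"number_of_replicas": "0"}}},
}
print(indices_without_replicas(sample_response))
```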
31
+
32
+
Moreover, a high availability (HA) cluster requires at least three master-eligible nodes. For clusters that have fewer than six {{es}} nodes, any data node in the hot tier will also be a master-eligible node. You can achieve this by having hot nodes (serving as both data and master-eligible nodes) in three availability zones, or by having data nodes in two zones and a tiebreaker (which is added automatically if you choose two zones). For clusters that have six or more {{es}} nodes, dedicated master-eligible nodes are introduced. When your cluster grows, consider separating dedicated master-eligible nodes from dedicated data nodes. We recommend using at least 4GB RAM for dedicated master nodes.
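The three-node minimum follows from majority voting among master-eligible nodes. The sketch below is a simplified model, assuming that electing a master requires a strict majority of the master-eligible nodes; {{es}} manages its voting configuration automatically, so real behavior can differ in edge cases. The function name `tolerated_master_failures` is hypothetical.

```python
def tolerated_master_failures(master_eligible_nodes):
    """Return how many master-eligible nodes can fail while a majority remains.

    Simplified model: a quorum is a strict majority of master-eligible nodes.
    """
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    quorum = master_eligible_nodes // 2 + 1
    return master_eligible_nodes - quorum


for nodes in (1, 2, 3, 5):
    print(nodes, "master-eligible nodes tolerate", tolerated_master_failures(nodes), "failure(s)")
```

Under this model, one or two master-eligible nodes tolerate no failures, while three tolerate one, which is why two data zones need a tiebreaker.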
33
+
34
+
The data in your {{es}} clusters is also backed up every 30 minutes, 4 hours, or 24 hours, depending on which snapshot interval you choose. These regular intervals provide an extra level of redundancy. We support [snapshot and restore](../../../deploy-manage/tools/snapshot-and-restore.md) regardless of whether you use one, two, or three availability zones. However, with only a single availability zone and in the event of an outage, it might take a while for your cluster to come back online. Using a single availability zone also leaves your cluster exposed to the risk of data loss if the backups you need are not usable (failed or partial snapshots missing the indices to restore) or are no longer available by the time you realize that you need the data (snapshots have a retention policy).
35
+
36
+
### Important considerations
37
+
38
+
::::{warning}
39
+
* Clusters that use only one availability zone are not highly available and are at risk of data loss. To safeguard against data loss, you must use at least two availability zones.
40
+
* Indices with no replica, except for searchable snapshot indices, are not highly available. You should use replicas to protect against possible data loss.
41
+
* Clusters that only have one master node are not highly available and are at risk of data loss. You must have three master-eligible nodes.
# Getting ready for production (Elasticsearch) [scalability]
13
+
# Getting ready for production [scalability]
7
14
8
-
Many teams rely on {{es}} to run their key services. To keep these services running, you can design your {{es}} deployment to keep {{es}} available, even in case of large-scale outages. To keep it running fast, you also can design your deployment to be responsive to production workloads.
15
+
Many teams rely on {{es}} to run their key services. To keep these services running, you can [design your {{es}} deployment for resilience](./availability-and-resilience.md), to keep {{es}} available even in case of large-scale outages. To keep it running fast, you can also design your deployment with [performance optimizations](./optimize-performance.md) that make it responsive to production workloads.
9
16
10
-
{{es}} is built to be always available and to scale with your needs. It does this using a distributed architecture. By distributing your cluster, you can keep Elastic online and responsive to requests.
17
+
{{es}} is built to be always available and to scale with your needs. It does this using a [distributed architecture](/deploy-manage/distributed-architecture.md). By distributing your cluster, you can keep Elastic online and responsive to requests.
11
18
12
-
In case of failure, {{es}} offers tools for cross-cluster replication and cluster snapshots that can help you fall back or recover quickly. You can also use cross-cluster replication to serve requests based on the geographic location of your users and your resources.
19
+
In case of failure, {{es}} offers tools for [cross-cluster replication](../tools/cross-cluster-replication.md) and [cluster snapshots](../tools/snapshot-and-restore.md) that can help you fall back or recover quickly. You can also use cross-cluster replication to serve requests based on the geographic location of your users and your resources.
13
20
14
21
{{es}} also offers security and monitoring tools to help you keep your cluster highly available.
15
22
16
-
17
23
## Use multiple nodes and shards [use-multiple-nodes-shards]
18
24
19
25
When you move to production, you need to introduce multiple nodes and shards to your cluster. Nodes and shards are what make {{es}} distributed and scalable. The size and number of these nodes and shards depends on your data, your use case, and your budget.
@@ -37,34 +43,21 @@ CCR provides a way to automatically synchronize indices from your primary cluste
37
43
38
44
You can also use CCR to create secondary clusters to serve read requests in geo-proximity to your users.
39
45
40
-
Learn more about [cross-cluster replication](../tools/cross-cluster-replication.md) and about [designing for resilience](availability-and-resilience.md).
46
+
Learn more about [cross-cluster replication](../tools/cross-cluster-replication.md) and about [designing for resilience](./availability-and-resilience.md).
41
47
42
48
::::{tip}
43
49
You can also take [snapshots](../tools/snapshot-and-restore.md) of your cluster that can be restored in case of failure.
44
-
45
50
::::
46
51
47
-
48
-
49
-
## Security and monitoring [security-and-monitoring]
50
-
51
-
As with any enterprise system, you need tools to secure, manage, and monitor your {{es}} clusters. Security, monitoring, and administrative features that are integrated into {{es}} enable you to use [Kibana](../../get-started/the-stack.md) as a control center for managing a cluster.
52
-
53
-
[Learn about securing an {{es}} cluster](../security.md).
54
-
55
-
[Learn about monitoring your cluster](../monitor.md).
56
-
57
-
58
-
## Cluster design [cluster-design]
59
-
% moved to landing page.
52
+
## Cluster design and performance optimizations [cluster-design]
60
53
61
54
{{es}} offers many options that allow you to configure your cluster to meet your organization’s goals, requirements, and restrictions. You can review the following guides to learn how to tune your cluster to meet your needs:
62
55
63
-
* [Designing for resilience](availability-and-resilience.md)
64
-
* [Tune for indexing speed](optimize-performance/indexing-speed.md)
65
-
* [Tune for search speed](optimize-performance/search-speed.md)
66
-
* [Tune for disk usage](optimize-performance/disk-usage.md)
67
-
* [Tune for time series data](../../manage-data/use-case-use-elasticsearch-to-manage-time-series-data.md)
56
+
::::{note}
57
+
In orchestrated deployments, some of the settings mentioned in the referenced documents may not apply. Check the section headers to determine whether a topic is relevant to your deployment type.
58
+
::::
68
59
69
-
Many {{es}} options come with different performance considerations and trade-offs. The best way to determine the optimal configuration for your use case is through [testing with your own data and queries](https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing).
60
+
* [Designing for resilience](./availability-and-resilience.md)
Many {{es}} options come with different performance considerations and trade-offs. The best way to determine the optimal configuration for your use case is through [testing with your own data and queries](https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing).