2 changes: 2 additions & 0 deletions docs/reference/index.asciidoc
@@ -78,6 +78,8 @@ include::snapshot-restore/index.asciidoc[]

// reference

include::reference-architectures/index.asciidoc[]

include::rest-api/index.asciidoc[]

include::commands/index.asciidoc[]
@@ -0,0 +1,171 @@
[[elastic-cloud-architecture]]
== Elastic Cloud Hot-Frozen Architecture for Time Series Data
++++
<titleabbrev>Architecture: Elastic Cloud hot-frozen</titleabbrev>
++++

The Hot-Frozen Elasticsearch cluster architecture is cost optimized for large time-series datasets while keeping all of the data **fully searchable**. There is no need to "rehydrate" archived data. In this architecture, the hot tier is used primarily for indexing and immediate searching (1-3 days), while the majority of searches are handled by the frozen tier. Because the data is moved to searchable snapshots in an object store, the cost of keeping all of the data searchable is dramatically reduced.
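
As a minimal sketch of how data moves between these tiers, the following index lifecycle management (ILM) policy rolls indices over in the hot tier and then moves them to the frozen tier as searchable snapshots. The policy name, the `1d`/`90d` timings, and the `found-snapshots` repository (the default snapshot repository on Elastic Cloud) are illustrative assumptions; adjust them to match your own retention requirements.

[source,console]
----
PUT _ilm/policy/hot-frozen-example
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "1d",
            "max_primary_shard_size": "50gb"
          }
        }
      },
      "frozen": {
        "min_age": "1d",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "found-snapshots"
          }
        }
      },
      "delete": {
        "min_age": "90d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
----

Attach a policy like this to your time series indices (for example, through an index template) so that the hot tier holds only the most recent data while older data is served from the frozen tier.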


TIP: This architecture includes all the necessary components of the Elastic Stack and is not intended for sizing workloads, but rather as a basis to ensure the architecture you deploy is built with sound recommendations.

The most important foundational step to any architecture is designing your deployment to be responsive to production workloads. For more information on planning for production, see https://www.elastic.co/guide/en/elasticsearch/reference/current/scalability.html[Get ready for production].

[discrete]
[[cloud-hot-use-case]]
=== Use case

This architecture is intended for organizations that need to do the following:

* Monitor the performance and health of their applications in real time, including the creation and tracking of SLOs (Service Level Objectives).
* Provide insights and alerts to ensure optimal performance and quick issue resolution for applications.
* Apply machine learning and artificial intelligence to assist engineers and application teams in dealing with terabytes of new data per day.


[discrete]
[[cloud-hot-frozen-architecture]]
=== Architecture

image::images/elastic-cloud-architecture.png["An Elastic Cloud Architecture"]

[discrete]
[[cloud-hot-frozen-configuration]]
=== Example configuration

The following example configuration is based on these specifications:

* An ingest rate of 1TB/day
* 1 day in the hot tier
* 89 days in the frozen tier
* A total of 90 days of searchable data

[discrete]
[[cloud-hot-frozen-aws]]
==== AWS

* Hot tier: 120GB RAM (2 x 60GB RAM nodes x 3 pods x 2 availability zones)
* Frozen tier: 120GB RAM (1 x 60GB RAM node x 3 pods x 2 availability zones)
* Machine learning: 128GB RAM (1 x 64GB node x 3 pods x 2 availability zones)
* Master nodes: 24GB RAM (8GB node x 3 pods x 2 availability zones)
* Kibana: 16GB RAM (16GB node x 3 pods x 2 availability zones)

[discrete]
[[cloud-hot-frozen-azure]]
==== Azure

* Hot tier: 120GB RAM (2 x 60GB RAM nodes x 3 pods x 2 availability zones)
* Frozen tier: 120GB RAM (1 x 60GB RAM node x 3 pods x 2 availability zones)
* Machine learning: 128GB RAM (1 x 64GB node x 3 pods x 2 availability zones)
* Master nodes: 24GB RAM (8GB node x 3 pods x 2 availability zones)
* Kibana: 16GB RAM (16GB node x 3 pods x 2 availability zones)

[discrete]
[[cloud-hot-frozen-gcp]]
==== GCP

* Hot tier: 120GB RAM (2 x 60GB RAM nodes x 3 pods x 2 availability zones)
* Frozen tier: 120GB RAM (1 x 60GB RAM node x 3 pods x 2 availability zones)
* Machine learning: 128GB RAM (1 x 64GB node x 3 pods x 2 availability zones)
* Master nodes: 24GB RAM (8GB node x 3 pods x 2 availability zones)
* Kibana: 16GB RAM (16GB node x 3 pods x 2 availability zones)

[discrete]
[[cloud-hot-frozen-recommended-instance-types]]
==== Recommended instance types per cloud provider

The following table details our recommended node types for this architecture, based on the hardware configurations described previously.

For more details on these instance types, see our documentation on Elastic Cloud hardware for https://www.elastic.co/guide/en/cloud/current/ec-default-aws-configurations.html[AWS], https://www.elastic.co/guide/en/cloud/current/ec-default-azure-configurations.html[Azure], and https://www.elastic.co/guide/en/cloud/current/ec-default-gcp-configurations.html[GCP].

[cols="10, 30, 30, 30"]
|===
| *Type* | *AWS Instance/Type* | *Azure Instance/Type* | *GCP Instance/Type*

| image:images/hot.png["An Elastic Cloud Architecture"]
| aws.es.datahot.c6gd (c6gd)
| azure.es.datahot.fsv2 (f32sv2)
| gcp.es.datahot.n2.68x32x45 (N2)

| image:images/frozen.png["An Elastic Cloud Architecture"]
| aws.es.datafrozen.i3en (i3en)
| azure.es.datafrozen.edsv4 (e8dsv4)
| gcp.es.datafrozen.n2.68x10x95 (N2)

| image:images/machine-learning.png["An Elastic Cloud Architecture"]
| aws.es.ml.m6gd (m6gd)
| azure.es.ml.fsv2 (f32sv2)
| gcp.es.ml.n2.68x32x45 (N2)

| image:images/master.png["An Elastic Cloud Architecture"]
| aws.es.master.c6gd (c6gd)
| azure.es.master.fsv2 (f32sv2)
| gcp.es.master.n2.68x32x45 (N2)

| image:images/kibana.png["An Elastic Cloud Architecture"]
| aws.kibana.c6gd (c6gd)
| azure.kibana.fsv2 (f32sv2)
| gcp.kibana.n2.68x32x45 (N2)
|===

[discrete]
[[cloud-hot-frozen-considerations]]
=== Important considerations

The following are important considerations for this architecture:

* Frozen tiers are read-only. Once data rolls over to the frozen tier, documents can no longer be updated. If you need to update documents for some part of the data lifecycle, you will need either:
** A larger https://www.elastic.co/guide/en/elasticsearch/reference/current/data-tiers.html#hot-tier[hot tier], or

** A https://www.elastic.co/guide/en/elasticsearch/reference/current/data-tiers.html#warm-tier[warm tier] to cover the time period needed for document updates.

* This is a hot/frozen architecture with no warm or cold tier. If you require https://www.elastic.co/guide/en/security/current/about-rules.html[detection rule lookback] or complex dashboards, you may need to add a https://www.elastic.co/guide/en/elasticsearch/reference/current/data-tiers.html#cold-tier[cold tier].

[discrete]
[[cloud-architecture-limitations]]
=== Limitations of this architecture
* This architecture is not intended for disaster recovery, because it is deployed across availability zones within a single cloud region. To make this architecture resilient to the loss of a region, add a second deployment in another cloud region. To learn more, see https://www.elastic.co/guide/en/elasticsearch/reference/current/xpack-ccr.html#ccr-disaster-recovery[disaster recovery].

[discrete]
[[cloud-hot-frozen-resources]]
=== Resources and references
69 changes: 69 additions & 0 deletions docs/reference/reference-architectures/index.asciidoc
@@ -0,0 +1,69 @@
[[reference-architectures]]
= Reference architectures

Elasticsearch reference architectures are blueprints for deploying, managing, and optimizing Elasticsearch clusters tailored to different use cases. Whether you're handling logs, metrics, or sophisticated search applications, these reference architectures ensure scalability, reliability, and efficient resource utilization. Use these guidelines to deploy Elasticsearch for your use case, with minimal risk and complexity.

These architectures are designed by Elastic Solutions Architects to provide standardized, proven solutions that help users follow best practices when deploying Elasticsearch. Some of the key areas of focus are listed below.

* Infrastructure setup
* Data ingestion
* Indexing
* Search performance
* High availability

TIP: These architectures apply to deployments that you run yourself, on-premises or in the cloud. If you are using Elastic serverless, your Elasticsearch clusters are autoscaled and fully managed by Elastic. For all the deployment options, see https://www.elastic.co/guide/en/elasticsearch/reference/current/elasticsearch-intro-deploy.html[Run Elasticsearch].

These reference architectures are recommendations and should be adapted to fit your specific environment and needs. Each solution can vary based on the unique requirements and conditions of your deployment. These architectures focus on how to deploy cluster components. For information about designing ingest architectures to feed content into your cluster, refer to https://www.elastic.co/guide/en/ingest/current/use-case-arch.html[Ingest architectures].

[discrete]
[[reference-architectures-time-series-2]]
=== Architectures

[cols="50, 50"]
|===
| *Architecture* | *When to use*
| <<elastic-cloud-architecture>>

A hot-frozen architecture on Elastic Cloud that is optimized for time series datasets while keeping all of the data fully searchable. It maintains index structures that allow for fast search against data held in cloud object stores.

a|
* You want to use the best practices Elastic Cloud implements to run your cluster.
* You want to leverage cloud provider hardware that has been extensively tested.
* You need long retention periods with the ability to search indices in an object store cost-effectively.
* You want to use the cloud provider's highly available object stores for data integrity so you don't have to manage your own.


| <<multi-region-two-datacenter-architecture>>

A scalable and highly available architecture for Elasticsearch that uses two datacenters in separate geographical regions. Requests are served by the cluster that is geographically closer to the user.

a|
* You need multiple options to scale effectively.
* You need a separate geographical region to reduce network latency.
* You have many small clusters to accommodate, which makes this approach cost-effective.


| <<self-managed-single-datacenter>>

An architecture designed to satisfy high availability requirements during normal processing, as well as resiliency during node maintenance or re-paving activities.

a|
* When you only have one datacenter available.
* When you are just getting started and there is no requirement for data resiliency in your project.

| <<three-availability-zones>>

The three-zone architecture supports rolling upgrades and local failure domains: non-disruptive upgrades and planned or unplanned outages are more sustainable with 33% of the resources offline and 66% still available.

a|
* When you need an architecture that is resilient to unplanned outages.
|===

include::multi-region-two-datacenter-architecture.asciidoc[]

include::elastic-cloud-architecture.asciidoc[]

include::self-managed-single-datacenter.asciidoc[]

include::three-availability-zones.asciidoc[]
@@ -0,0 +1,62 @@
[[multi-region-two-datacenter-architecture]]
== Self-managed hot-frozen multi-region architecture for time series data
++++
<titleabbrev>Architecture: Self-managed - two datacenters</titleabbrev>
++++

This article defines a scalable and highly available architecture for Elasticsearch that uses two datacenters in separate geographical regions. Running a cluster in a second geographical region helps reduce network latency: requests are served by the cluster that is geographically closer to the user.

TIP: This architecture includes all the necessary components of the Elastic Stack and is not intended for sizing workloads, but rather as a basis to ensure the architecture you deploy is built with sound recommendations.

The most important foundational step to any architecture is designing your deployment to be responsive to production workloads. For more information on planning for production, see https://www.elastic.co/guide/en/elasticsearch/reference/current/scalability.html[Get ready for production].

[discrete]
[[multi-region-use-case]]
=== Use case

This architecture is intended for organizations that need to do the following:

* Monitor the performance and health of their applications in real time
* Provide insights and alerts to ensure optimal performance and quick issue resolution for applications

[discrete]
[[multi-region-architecture]]
=== Architecture

image::images/multi-region-two-datacenter.png["A multi-region time-series architecture across two datacenters"]

[discrete]
[[multi-region-configuration]]
=== Example configuration

The following example configuration is based on these specifications:

* An ingest rate of 1TB/day
* 1 day in the hot tier
* 89 days in the frozen tier
* A total of 90 days of searchable data

* Hot tier: 120GB RAM (2 x 60GB RAM nodes x 3 pods x 2 availability zones)
* Frozen tier: 120GB RAM (1 x 60GB RAM node x 3 pods x 2 availability zones)
* Machine learning: 128GB RAM (1 x 64GB node x 3 pods x 2 availability zones)
* Master nodes: 24GB RAM (8GB node x 3 pods x 2 availability zones)
* Kibana: 16GB RAM (16GB node x 3 pods x 2 availability zones)

[discrete]
[[multi-region-considerations]]
=== Important considerations

The following are important considerations for this architecture:

* Hot nodes contain both primary and replica shards. https://www.elastic.co/guide/en/elasticsearch/reference/8.15/modules-cluster.html#shard-allocation-awareness[Shard allocation awareness] should be configured to ensure that primary and replica shards do not end up in the same pod (see the settings sketch after this list).
* Machine learning nodes are optional, but highly recommended for large-scale time series use cases. The amount of data can quickly become overwhelming and difficult to analyze. Applying techniques like machine learning-based anomaly detection can help manage this data effectively.
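
A minimal sketch of the allocation awareness setting, assuming each node is tagged with a custom `node.attr.pod` attribute in its `elasticsearch.yml` (`pod` is a hypothetical attribute name; any custom node attribute works):

[source,console]
----
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "pod"
  }
}
----

With this setting in place, Elasticsearch does not allocate a primary shard and its replicas to nodes that share the same `pod` value.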

[discrete]
[[multi-region-limitations]]
=== Limitations of this architecture
* No region resilience

[discrete]
[[multi-region-resources]]
=== Resources and references

@@ -0,0 +1,75 @@
[[self-managed-single-datacenter]]
== Self-managed hot-frozen single datacenter architecture for time series data
++++
<titleabbrev>Architecture: Self-managed - single datacenter</titleabbrev>
++++

This architecture is designed to ensure high availability during normal operations and during node maintenance. For more information on the elements of this architecture, see https://www.elastic.co/guide/en/elasticsearch/reference/current/high-availability-cluster-design-large-clusters.html#high-availability-cluster-design-two-zones[Resilience in larger clusters - Two-zone clusters].

This architecture provides the following benefits:

* Reduces cost by moving data to the frozen tier as soon as ingest and search patterns allow.
* Significantly reduces the likelihood of hot-spotting because of the sharding strategy.
* Eliminates the network and disk overhead of rebalancing attempts that would otherwise occur during maintenance, by setting forced awareness (see the settings sketch after this list).
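
As a sketch of the forced awareness mentioned in the last point, assuming each node is tagged with a custom `node.attr.pod` attribute in its `elasticsearch.yml` (`pod`, `pod-1`, and `pod-2` are hypothetical names), the following settings spread shard copies across the two data pods and prevent rebalancing onto the surviving pod when the other is offline:

[source,console]
----
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "pod",
    "cluster.routing.allocation.awareness.force.pod.values": "pod-1,pod-2"
  }
}
----

With forced awareness, if one pod is taken down for maintenance, Elasticsearch does not attempt to rebuild the missing replicas on the remaining pod, which avoids the rebalancing overhead described above.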

TIP: This architecture includes all the necessary components of the Elastic Stack and is not intended for sizing workloads, but rather as a basis to ensure the architecture you deploy is built with sound recommendations.

The most important foundational step to any architecture is designing your deployment to be responsive to production workloads. For more information on planning for production, see https://www.elastic.co/guide/en/elasticsearch/reference/current/scalability.html[Get ready for production].

[discrete]
[[single-datacenter-use-case]]
=== Use case

This architecture is intended for organizations that need to do the following:

* Store data that is written once and not updated, such as logs, metrics, or accounting ledgers where balance updates are made with additional offsetting entries.
* Be resilient to hardware failures.
* Ensure availability during operational maintenance of any given pod.

[discrete]
[[single-datacenter-architecture]]
=== Architecture

image::images/single-datacenter.png["A self hosted single datacenter deployment"]

[discrete]
[[single-datacenter-configuration]]
=== Example configuration

The following example configuration is based on these specifications:

* An ingest rate of 1TB/day
* 1 day in the hot tier
* 89 days in the frozen tier
* A total of 90 days of searchable data

* Hot tier: 120GB RAM (2 x 60GB RAM nodes x 2 pods)
* Frozen tier: 120GB RAM (1 x 60GB RAM node x 2 pods)
* Machine learning: 128GB RAM (1 x 64GB node x 2 pods)
* Master nodes: 24GB RAM (8GB node x 3 pods) - The master node in pod 3 is for voting only
* Kibana: 16GB RAM (16GB node x 2 availability zones)

[discrete]
[[single-datacenter-considerations]]
=== Important considerations

The following are important considerations for this architecture:

* You may require more than one copy of the most recently written data to be available. To achieve this, add data nodes to pod 3 and set the https://www.elastic.co/guide/en/elasticsearch/reference/current/size-your-shards.html#create-a-sharding-strategy[sharding strategy] to 1 primary and 2 replicas (see the index template sketch after this list).
* Machine learning nodes in this architecture are optional. If you choose to use machine learning nodes, deploy one per pod.

* Maintenance should be performed one pod at a time.

* A yellow cluster state is acceptable during maintenance, because replica shards on the offline pod are temporarily unassigned.
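
A minimal sketch of an index template that applies the one-primary, two-replica strategy from the first consideration (the template name and index pattern are hypothetical):

[source,console]
----
PUT _index_template/logs-example
{
  "index_patterns": ["logs-example-*"],
  "template": {
    "settings": {
      "index.number_of_shards": 1,
      "index.number_of_replicas": 2
    }
  }
}
----

With two replicas and shard allocation awareness across the three pods, each pod holds a copy of every shard, so the most recent data remains fully available while any single pod is under maintenance.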

[discrete]
[[single-datacenter-limitations]]
=== Limitations of this architecture
* This design does not address cross-region disaster recovery. To learn more, see https://www.elastic.co/guide/en/elasticsearch/reference/current/xpack-ccr.html[Cross-cluster replication].
* During maintenance windows, only a single copy of the latest data not yet captured in a snapshot is available.
* This design assumes the data is written once and not updated.

[discrete]
[[single-datacenter-resources]]
=== Resources and references