-Multicluster has been one of the most requested ambient features -- and as of Istio 1.27, it is available in alpha status!
+Multicluster has been one of the most requested features of ambient -- and as of Istio 1.27, it is available in alpha status!
 We sought to capture the benefits and avoid the complications of multicluster architectures using the same modular design that ambient users love.
-While still in alpha, this release delivers the core functionality of a multicluster mesh and lays the groundwork for a full feature set in upcoming releases.
+This release brings the core functionality of a multicluster mesh and lays the groundwork for a richer feature set in upcoming releases.

 ## The Power & Complexity of Multicluster

-Multicluster architectures increase outage resilience, shrink your blast radius,
-ease adoption of data residence policies, and simplify cost tracking.
+Multicluster architectures increase outage resilience, shrink your blast radius, and scale across data centers.
 That said, connecting multiple clusters poses connectivity, security, and operational challenges.

 In a single Kubernetes cluster, every pod can directly connect to another pod via a unique pod IP or service VIP.
-We lose these guarantees when we start thinking of multicluster architectures.
-IP address spaces of different clusters might overlap.
-Even if they didn't, nodes in one cluster may not know how to route traffic from one cluster to another (depending on how the underlying infrastructure is configured)
+These guarantees break down in multicluster architectures;
+IP address spaces of different clusters might overlap,
+and even without overlap, the underlying infrastructure would need configuration to route cross-cluster traffic.

-Establishing cross-cluster connectivity also presents security challenges.
-Cross-cluster connectivity means that pod-to-pod traffic can leave cluster boundaries -- and that pods may accept connections from outside the cluster.
-Without care, an attacker could connect to a vulnerable pod, or sniff unencrypted traffic.
+Cross-cluster connectivity also presents security challenges.
+Pod-to-pod traffic will leave cluster boundaries and pods need to accept connections from outside the cluster.
+Without strong controls, an attacker could exploit a vulnerable pod, or intercept unencrypted traffic.

-For a multicluster solution to be viable, it must at least securely connect clusters, and do so
-through APIs that are simple enough to keep pace with ever-changing environments.
+A multicluster solution must securely connect clusters and do so
+through simple, declarative APIs that keep pace with dynamic environments.

-## Key Components.
+## Key components

 Ambient multicluster extends ambient with new components and minimal APIs to
-securely connect clusters using the same lightweight, modular architecture.
+securely connect clusters using ambient's lightweight, modular architecture.
+It builds on the namespace sameness model -- a service in namespace `foo` in one cluster is treated as the same logical service as `foo` in another --
+so services keep their existing DNS names across clusters, allowing you to control cross-cluster communication without changing application code.

-### East-West Gateways
+### East-west gateways

-Each cluster deploys an east-west gateway with a globally routable IP that acts as an entrypoint for cross-cluster communication.
-The east-west gateways are configured using Gateway API and controlled by istiod.
-A ztunnel communicates across clusters by connecting to the remote cluster's east-west gateway and sending the destination service FQDN.
-The east-west gateway will then forward the connection to a cluster-local pod of its choosing.
-As such, overlapping IP spaces are of no concern because we never directly address a pod in a remote cluster.
-Ambient multicluster achieves cross-cluster connectivity without changes to cluster networking configuration.
-We can achieve this connectivity using only ambient and declarative APIs.
-There is no need to restart workloads, manage IP address spaces, or configure routing tables.
+Each cluster has an east-west gateway with a globally routable IP acting as an entry point for cross-cluster communication.
+A ztunnel connects to the remote cluster's east-west gateway, identifying the destination service by its namespaced name.
+The gateway then load balances the connection to a local pod.
+Using the gateway’s routable IP removes the need for inter-cluster routing configuration,
+and addressing pods by namespaced name rather than IP eliminates issues with overlapping IP spaces.
+Together, these design choices enable cross-cluster connectivity without changing cluster networking or restarting workloads,
+even as clusters are added or removed.

 ### Double HBONE

-Ambient Multicluster uses nested [HBONE](https://istio.io/latest/docs/ambient/architecture/hbone/) connections to secure traffic traversing cluster boundaries while preserving ambient's strong security.
-An outer HBONE connects the source ztunnel to its east-west gateway while an inner HBONE tunnel extends the connection to the destination.
-The outer HBONE connection encrypts cross-cluster traffic, encrypts the destination service FQDN, and allows the source ztunnel and east-west gateway to verify each other's identity.
-The inner HBONE connection encrypts traffic end-to-end, which allows the source ztunnel and destination ztunnel to verify each other's identity.
-Put together, the two HBONE layers stop unauthenticated access, protect against data sniffing, and allow identity verification at every step.
+An outer HBONE connection encrypts traffic to the east-west gateway and allows the source ztunnel and east-west gateway to verify each other's identity.
+An inner HBONE connection encrypts traffic end-to-end, which allows the source ztunnel and destination ztunnel to verify each other's identity.
 At the same time, the HBONE layers allow ztunnel to effectively reuse cross-cluster connections, minimizing TLS handshakes.

-One drawback is that we encrypt application data twice (once for the outer HBONE and once for the inner HBONE).
-We found this to be an acceptable drawback because it allows us to stick with open standards, and we expect the extra encryption to be negligible compared to the cost of sending data across clusters.
-Once we have securely connected our clusters, we enable cross-cluster communication for a service by marking it global.
-When a service becomes global, istiod will configure east-west gateways to accept and route traffic destined to the global service.
-Istiod will also read remote apiservers and configure ztunnel with the number of pods for the global service per remote cluster.
-Ztunnel can then proxy traffic to the global service across clusters.
+Marking a service global enables cross-cluster communication.
+Istiod configures east-west gateways to accept and route global service traffic to local pods and
+programs ztunnels to load balance global service traffic to remote clusters.

 Mesh administrators define the label-based criteria for global services via the `ServiceScope` API,
-and app developers opt into global behavior by labeling their services accordingly.
+and app developers label their services accordingly.
 The default `ServiceScope` is

 {{< text yaml >}}
-=======
 serviceScopeConfigs:
   - servicesSelector:
       matchExpressions:
@@ -83,24 +76,24 @@ serviceScopeConfigs:
 meaning that any service with the `istio.io/global=true` label is global.
 Although the default value is straightforward, the API is flexible and can express complex conditions using a mix of ANDs and ORs.
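
As a minimal sketch of the opt-in step described above, a Service could carry the default global label as follows. Only the `istio.io/global` label comes from the text; the service name, namespace, selector, and port are hypothetical placeholders.

```yaml
# Hypothetical Service that opts into global (cross-cluster) behavior
# under the default ServiceScope criteria.
apiVersion: v1
kind: Service
metadata:
  name: reviews              # hypothetical service name
  namespace: bookinfo        # hypothetical namespace
  labels:
    istio.io/global: "true"  # label from the text; label values must be strings
spec:
  selector:
    app: reviews             # hypothetical pod selector
  ports:
    - name: http
      port: 9080             # hypothetical port
```

Because the design relies on namespace sameness, a Service with the same name and namespace would presumably be defined in each cluster that hosts the workload.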

-By default, ztunnel will load balance traffic uniformly across all endpoints --even remote ones--, but this can be configured using the service's `trafficDistribution` field to only cross cluster boundaries when there are no local endpoints.
-Thus, users have control over whether and when traffic crosses cluster boundaries.
+By default, ztunnel load balances traffic uniformly across all endpoints, even remote ones,
+but is configurable through the service's `trafficDistribution` field to only cross cluster boundaries when there are no local endpoints.
+Thus, users have control over whether and when traffic crosses cluster boundaries with no changes to application code.
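
The local-first behavior described above might be requested on a global Service roughly like this. The sketch assumes that `PreferClose`, a value Kubernetes defines for `spec.trafficDistribution`, is the setting that keeps traffic in the local cluster while local endpoints exist; the post does not name the value, so treat it as an assumption. The service name, namespace, selector, and port are hypothetical.

```yaml
# Hypothetical global Service that prefers local endpoints.
apiVersion: v1
kind: Service
metadata:
  name: reviews                     # hypothetical service name
  namespace: bookinfo               # hypothetical namespace
  labels:
    istio.io/global: "true"         # marks the service global (label from the text)
spec:
  trafficDistribution: PreferClose  # assumption: value that yields local-first routing
  selector:
    app: reviews                    # hypothetical pod selector
  ports:
    - name: http
      port: 9080                    # hypothetical port
```

Leaving the field unset would keep the default described in the text: uniform load balancing across all endpoints, including remote ones.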

-## Limitations and Roadmap
+## Limitations and roadmap

-Although the current implementation of ambient multicluster has the foundational features for a multicluster implementation,
+Although the current implementation of ambient multicluster has the foundational features for a multicluster solution,
 there is still a lot of work to be done.
+We are looking to improve the following areas:

-For example, currently, we require that global services, attached waypoints, and `ServiceScope` configuration have uniform configuration across all clusters.
-This greatly simplified our alpha implementation, but we are looking to allow for more configuration skew.
+* Service and waypoint configuration must be uniform across all clusters.
+* No cross-cluster L7 failover (L7 policy is applied at the destination cluster).
+* No support for direct pod addressing or headless services.
+* Support only for multi-primary deployment model.
+* Support only for one network per cluster deployment model.

-Similarly, waypoints and L7 policy enforcement have proven difficult since different clusters might have different policies.
-In our alpha implementation, if a service has a waypoint, it will go through said waypoint in the destination cluster.
-This reduces unexpected surprises by enforcing the destination cluster's L7 authorization policy, but remove the ability to perform L7 cross-cluster failover.
-Eventually, we would like to apply L7 policy in both the source and destination cluster.
 We are also looking to improve our reference documentation, guides, testing, and performance.
-Currently, we only support a multi-primary deployment model with a single network per cluster, but would eventually want to support other cluster and network models.

 If you would like to try out ambient multicluster, please follow [this guide](TODO).
-Since many details are in discussion, we would love to hear any of your thoughts, comments, and use cases.
-You can find ways to reach us on the [Istio community page](https://istio.io/latest/about/community/).
+We would love to hear any of your thoughts, comments, and use cases.
+You can find ways to reach us on [GitHub](https://github.com/istio/istio) or [Slack](https://istio.slack.com/).