Commit f1ca494

Another round of edits
1 parent 7478a97 commit f1ca494

File tree

1 file changed

+46
-38
lines changed
  • content/en/blog/2025/ambient-multicluster

content/en/blog/2025/ambient-multicluster/index.md

Lines changed: 46 additions & 38 deletions
@@ -6,59 +6,67 @@ attribution: Jackie Maertens (Microsoft), Keith Mattix (Microsoft), Mikhail Krin
66
keywords: [ambient,multicluster]
77
---
88

9-
Multicluster has been one of the most requested Ambient features — and as of Istio 1.27, it's now available.
9+
Multicluster has been one of the most requested ambient features — and as of Istio 1.27, it's now available.
1010
We sought to capture the benefits and avoid the complications of multicluster architectures using the same modular design that ambient users love.
1111
While still in alpha, this release delivers the core functionality of a multicluster mesh and lays the groundwork for a full feature set in upcoming releases.
1212

1313
## Multicluster's Many Benefits (and Challenges)
1414

15-
Multicluster architectures increase outage resilience, shrink the blast radiuses,
15+
Multicluster architectures increase outage resilience, shrink the blast radii,
1616
ease adoption of data residency policies, and simplify cost tracking.
1717
That said, integrating multiple clusters poses connectivity, security, and operational hurdles.
1818

19-
In a single Kubernetes cluster, every pod can directly connect to another pod via a pod IP or service VIP.
20-
However, in a multicluster deployment, there is no guarantee that the IP address spaces of different clusters are disjoint.
21-
Even if the spaces were disjoint, users would need to configure routing tables to route traffic from one cluster to another.
19+
In a single Kubernetes cluster, every pod can directly connect to another pod via a unique pod IP or service VIP.
20+
We lose these guarantees when we start thinking of multicluster architectures.
21+
IP address spaces of different clusters might overlap.
22+
Even if they didn't, nodes in one cluster would not know how to route traffic from one cluster to another.
23+
24+
Establishing cross-cluster connectivity also presents security challenges.
2225
Cross-cluster connectivity means that pod-to-pod traffic can leave cluster boundaries -- and that pods may accept connections from outside the cluster.
2326
Without care, an attacker could connect to a vulnerable pod, or sniff unencrypted traffic.
24-
All of this must be orchestrated through APIs that are both secure and simple enough to keep pace with ever-changing environments.
27+
28+
For a multicluster solution to be viable, it must at least securely connect clusters, and do so
29+
through APIs that are simple enough to keep pace with ever-changing environments.
2530

2631
## Key Components
2732

2833
Ambient multicluster extends ambient with new components and minimal APIs to
29-
securely connect clusters using the same lightweight, modular architecture of ambient.
34+
securely connect clusters using the same lightweight, modular architecture.
3035

3136
### East-West Gateways
3237

33-
Each cluster deploys an East-West gateway with a globally routable IP that acts as an entrypoint for cross cluster communication.
34-
A ztunnel communicates across clusters by connecting to the east-west gateway and sending the destination service FQDN.
35-
The east-west gateway will then forward the connection to a cluster-local pod of its choosing.
36-
As such, we do not need to worry about overlapping IP spaces because we never directly address a pod in a remote cluster.
37-
Ambient multicluster achieves cross-cluster connectivity without changes to cluster connectivity.
38-
38+
Each cluster deploys an east-west gateway with a globally routable IP that acts as an entrypoint for cross-cluster communication.
3939
The east-west gateways are configured using the Gateway API and controlled by istiod.
40-
By using these ambient and declarative APIs, there is no need to restart workloads, manage IP address spaces, or configure routing tables.
40+
A ztunnel communicates across clusters by connecting to the remote cluster's east-west gateway and sending the destination service FQDN.
41+
The east-west gateway will then forward the connection to a cluster-local pod of its choosing.
42+
As such, overlapping IP spaces are of no concern because we never directly address a pod in a remote cluster.
43+
Ambient multicluster achieves cross-cluster connectivity without changes to cluster networking configuration.
44+
We can achieve this connectivity using only ambient and declarative APIs.
45+
There is no need to restart workloads, manage IP address spaces, or configure routing tables.
4146

4247
### Double HBONE
4348

44-
Ambient Multicluster uses nested [HBONE](https://istio.io/latest/docs/ambient/architecture/hbone/) connections to secure traffic traversing cluster boundaries to extend ambient's strong security.
45-
An outer HBONE connects the source ztunnel to its the east-west gateway while an inner HBONE tunnel extends the outer the connection to the destination.
46-
The outer HBONE connection encrypts cross cluster traffic, encrypts the destination service FQDN, and allows the east-west gateway to verify the source's identity.
47-
The inner HBONE connection encrypts traffic end-to-end, allowing for identity verification of the destination pod.
48-
Put together, the two HBONE layers stop unauthenticated access, protect against data sniffing, and still allow ztunnel to verify the destination’s identity.
49-
At the same time, it allows ztunnel to effectively reuse cross cluster connections, minimizing TLS handshakes.
49+
Ambient multicluster uses nested [HBONE](https://istio.io/latest/docs/ambient/architecture/hbone/) connections to secure traffic traversing cluster boundaries while preserving ambient's strong security.
50+
An outer HBONE connects the source ztunnel to the remote cluster's east-west gateway while an inner HBONE tunnel extends the connection to the destination.
51+
The outer HBONE connection encrypts cross-cluster traffic, encrypts the destination service FQDN, and allows the source ztunnel and east-west gateway to verify each other's identity.
52+
The inner HBONE connection encrypts traffic end-to-end, which allows the source ztunnel and destination ztunnel to verify each other's identity.
53+
Put together, the two HBONE layers stop unauthenticated access, protect against data sniffing, and allow identity verification at every step.
54+
At the same time, the HBONE layers allow ztunnel to effectively reuse cross-cluster connections, minimizing TLS handshakes.
5055

51-
The one drawback is that we encrypt application data twice (once for the outer HBONE and once for the inner HBONE).
56+
One drawback is that we encrypt application data twice (once for the outer HBONE and once for the inner HBONE).
5257
We found this to be an acceptable drawback because it allows us to stick with open standards, and we expect the extra encryption to be negligible compared to the cost of sending data across clusters.
5358

54-
{{< image link="./mc-ambient-traffic-flow.png" caption="Istio Ambient Multicluster traffic Flow" >}}
59+
{{< image link="./mc-ambient-traffic-flow.png" caption="Istio ambient multicluster traffic flow" >}}
60+
61+
### Service discovery and scope
5562

56-
### ServiceScope API
63+
Once we have securely connected our clusters, we enable cross-cluster communication for a service by marking it global.
64+
When a service becomes global, istiod will configure east-west gateways to accept and route traffic destined to the global service.
65+
Istiod will also read remote apiservers and configure ztunnel with the number of pods for the global service per remote cluster.
66+
Ztunnel can then proxy traffic to the global service across clusters.
5767

58-
Once clusters are securely connected, marking services as global to allow cross cluster communication,
59-
the `ServiceScope` API allows mesh administrators to mark which combinations of labels make a service global,
60-
and app developers can label their services accordingly.
61-
A global service is one has endpoints in all clusters and can be accessed from any cluster.
68+
Mesh administrators define the label-based criteria for global services via the `ServiceScope` API,
69+
and app developers opt into global behavior by labeling their services accordingly.
6270
The default `ServiceScope` is
6371

6472
{{< text yaml >}}
@@ -75,24 +83,24 @@ serviceScopeConfigs:
7583
meaning that any service with the `istio.io/global=true` label is global.
7684
Although the default value is straightforward, the API is flexible and can express complex conditions using a mix of ANDs and ORs.
7785

78-
By default, ztunnel will load balance traffic uniformly across clusters, but this can be configured using the service's `trafficDistribution` field to only reach across clusters when there are no local endpoints.
79-
Thus users have control over whether and when traffic crosses cluster boundaries.
86+
By default, ztunnel will load balance traffic uniformly across all endpoints, including remote ones, but this can be configured using the service's `trafficDistribution` field to only cross cluster boundaries when there are no local endpoints.
87+
Thus, users have control over whether and when traffic crosses cluster boundaries.
8088

8189
## Limitations and Roadmap
8290

83-
Although the current implementation of ambient multicluster has strong security and the basic feature set of a multicluster product,
91+
Although the current implementation of ambient multicluster has the foundational features for a multicluster implementation,
8492
there is still a lot of work to be done.
8593

86-
For example, currently, we require that global services, attached waypoints, and serviceScope configuration have uniform configuration across all clusters.
87-
Although this greatly simplified our alpha implementation, we are looking to increase flexibility by allowing for more configuration skew.
94+
For example, currently, we require that global services, attached waypoints, and `ServiceScope` configuration have uniform configuration across all clusters.
95+
This greatly simplified our alpha implementation, but we are looking to allow for more configuration skew.
8896

89-
Similarly, waypoints and L7 policy enforcement have proven difficult since different clusters might have different policy.
97+
Similarly, waypoints and L7 policy enforcement have proven difficult since different clusters might have different policies.
9098
In our alpha implementation, if a service has a waypoint, it will go through said waypoint in the destination cluster.
91-
This reduces unexpected surprises by enforcing the destination cluster's L7 authorization policy, but does take away the ability to perform L7 cross-cluster failover.
92-
Eventually, we would like to also apply L7 policy in the source cluster, but this is not yet implemented.
93-
94-
We are also looking to improve our reference documentation, guides, testing, and performance as well as thinking about deployment models other than multi-primary.
99+
This reduces unexpected surprises by enforcing the destination cluster's L7 authorization policy, but removes the ability to perform L7 cross-cluster failover.
100+
Eventually, we would like to apply L7 policy in both the source and destination clusters.
101+
We are also looking to improve our reference documentation, guides, testing, and performance.
102+
Currently, we only support a multi-primary deployment model with a single network per cluster, but would eventually want to support other cluster and network models.
95103

96-
If you would like to try out Ambient Multicluster, please follow [this guide](TODO).
104+
If you would like to try out ambient multicluster, please follow [this guide](TODO).
97105
Since many details are in discussion, we would love to hear any of your thoughts, comments, and use cases.
98106
You can find ways to reach us on the [Istio community page](https://istio.io/latest/about/community/).
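
Reviewer note on the service scope and load-balancing paragraphs in this diff: the behavior they describe can be modeled in a few lines. The Python sketch below is only a conceptual illustration, not ztunnel's implementation. The names `Endpoint`, `is_global`, `candidate_endpoints`, `pick`, and the `prefer_local` flag are all hypothetical; `prefer_local` stands in for whatever `trafficDistribution` setting keeps traffic local when local endpoints exist. The sketch also assumes a non-global service is reachable only from its own cluster, and it ignores the east-west gateway hop entirely.

```python
import random
from dataclasses import dataclass

# Default criterion quoted in the post: istio.io/global=true marks a service global.
GLOBAL_LABEL_KEY = "istio.io/global"
GLOBAL_LABEL_VALUE = "true"


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical model of one backend pod of a service."""
    cluster: str
    address: str


def is_global(labels):
    """Return True if the service labels satisfy the default global criterion."""
    return labels.get(GLOBAL_LABEL_KEY) == GLOBAL_LABEL_VALUE


def candidate_endpoints(endpoints, local_cluster, labels, prefer_local):
    """Return the endpoints a client in local_cluster may be sent to.

    Assumption: non-global services are only reachable inside their own
    cluster. For global services, prefer_local keeps traffic in the local
    cluster unless no local endpoint exists.
    """
    local = [e for e in endpoints if e.cluster == local_cluster]
    if not is_global(labels):
        return local
    if prefer_local and local:
        return local
    return list(endpoints)


def pick(endpoints, local_cluster, labels, prefer_local=False, rng=random):
    """Choose one endpoint uniformly at random from the candidates, or None."""
    candidates = candidate_endpoints(endpoints, local_cluster, labels, prefer_local)
    if not candidates:
        return None
    return rng.choice(candidates)
```

With `prefer_local=False`, a global service with one local and two remote endpoints spreads requests evenly over all three. With `prefer_local=True`, requests stay on the local endpoint and cross the cluster boundary only when the local list is empty. Note that two endpoints in different clusters may share an address in this model, which mirrors the overlapping IP spaces the post discusses.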
