I have some set of clusters working together and need a way to uniquely identify them within the system that I use to track membership, or determine if a given cluster is in a ClusterSet.
_For example, SIG-Cluster-Lifecycle's Cluster API subproject uses a management cluster to deploy resources to member workload clusters, but today member workload clusters do not have a way to identify their own management cluster or any interesting metadata about it, such as what cloud provider it is hosted on._
#### Joining or moving between ClusterSets
I want the ability to add a previously-isolated cluster to a ClusterSet, or to move a cluster from one ClusterSet to another and be aware of this change.
I have a headless multi-cluster service deployed across clusters in my ClusterSet with similarly named pods in each cluster. I need a way to disambiguate each backend pod via DNS.
_For example, an exported headless service named `myservice` in namespace `test`, backed by pods in two clusters with clusterIDs `clusterA` and `clusterB`, could be disambiguated by DNS names following the pattern `<clusterID>.<svc>.<ns>.svc.clusterset.local`: `clusterA.myservice.test.svc.clusterset.local.` and `clusterB.myservice.test.svc.clusterset.local.`. This way the user can implement whatever load balancing they want (as is usually the case with headless services) by targeting each cluster's available backends directly._
#### Diagnostics
Clusters within my ClusterSet send logs/metrics to a common monitoring solution and I need to be able to identify the cluster from which a given set of events originated.
#### Multi-tenant controllers
My controller interacts with multiple clusters and needs to disambiguate between them to process its business logic.
_For example, [CAPN's virtualcluster project](https://github.com/kubernetes-sigs/cluster-api-provider-nested) is implementing a multi-tenant scheduler that schedules tenant namespaces only in certain parent clusters, and a separate syncer controller running in each parent cluster needs to compare the name of the parent cluster to determine whether the namespace should be synced ([ref](https://github.com/kubernetes/enhancements/issues/2149#issuecomment-768486457))._
### `ClusterClaim` CRD
```
<<[UNRESOLVED]>>
The actual name of the CRD is not finalized and is provisionally titled `ClusterClaim` for the remainder of this document.
<<[/UNRESOLVED]>>
```
The `ClusterClaim` resource provides a way to store identification-related, cluster-scoped information for multi-cluster tools while creating flexibility for implementations. A cluster may have multiple `ClusterClaim`s, each holding a different identification-related value. Each claim contains the following information:
- The name of the claim, a well-known or custom name stored in `metadata.name` (for example, `id.k8s.io`).
- The value of the claim, stored in `spec.value`.

#### Claim: `id.k8s.io`

Contains a unique identifier for the containing cluster.
**Reusing cluster names**: Since this standard places no restriction on whether an `id.k8s.io ClusterClaim` is repeatable, if a cluster unregisters from a ClusterSet it is permitted to rejoin later with the same `id.k8s.io ClusterClaim` it had before. Similarly, a *different* cluster could join a ClusterSet using an `id.k8s.io ClusterClaim` previously used by another cluster, as long as the two clusters do not have membership in the same ClusterSet at the same time. Finally, two or more clusters may have the same `id.k8s.io ClusterClaim` concurrently (though they **should** not; see "Uniqueness" above) *as long as* they do not have membership in the same ClusterSet.
#### Claim: `clusterset.k8s.io`
Contains an identifier that relates the containing cluster to the ClusterSet in which it belongs.
### Rationale behind the `ClusterClaim` CRD
This proposal suggests a CRD composed of objects of the same `Kind`, `ClusterClaim`, that are distinguished using certain well-known values in their `metadata.name` fields. This design avoids cluster-wide singleton `Kind`s for each claim, reduces access competition for the same metadata by making each claim its own resource (instead of storing all claims in one resource), allows RBAC to be applied in a targeted way to individual claims, and supports the user prerogative to store other simple metadata in one centralized CRD by creating CRs of the same `Kind`, `ClusterClaim`, with their own names.
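To illustrate that last point, a user could store their own simple metadata alongside the well-known claims; the claim name and value below are purely hypothetical and not part of this proposal:

```
# A hypothetical user-defined claim, stored alongside the
# well-known claims (name and value are illustrative only):

apiVersion: multicluster.k8s.io/v1
kind: ClusterClaim
metadata:
  name: cloud-provider.example.com
spec:
  value: gcp
```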
Storing arbitrary facts about a cluster can be implemented in other ways. For example, the Cluster API subproject stopgapped its need for cluster name metadata by leveraging the existing `Node` `Kind` and storing metadata there via annotations, such as `cluster.x-k8s.io/cluster-name` ([ref](https://github.com/kubernetes-sigs/cluster-api/pull/4048)). While practical for their case, this KEP avoids adding cluster-level info as annotations on child resources so as not to be dependent on a child resource's existence, to avoid issues maintaining parity across multiple resources of the same `Kind` for identical metadata, and to maintain RBAC separation between the cluster-level metadata and the child resources. Even within the realm of implementing this as a CRD, the API design could instead distinguish each fact by a different `spec.Type` (as `Service` objects do, e.g. `spec.type=ClusterIP` or `spec.type=ExternalName`), or, even more strictly, make each fact a different `Kind`. The former provides no specific advantage, since multiple differently named claims for the same fact are unnecessary, and is less expressive to query (it is easier to query by name directly, like `kubectl get clusterclaims id.k8s.io`). The latter would result in the proliferation of cluster-wide singleton `Kind` resources and be burdensome for users who want to create their own custom claims.
### Implementing the `ClusterClaim` CRD and its admission controllers
#### `id.k8s.io ClusterClaim`
The actual implementation to select and store the identifier of a given cluster could occur local to the cluster. It does not necessarily ever need to be deleted, particularly if the identifier selection mechanism chooses an identifier that is compliant with this specification's broadest restrictions -- namely, being immutable for a cluster's lifetime and unique beyond just the scope of the cluster's membership. A recommended option that meets these broad restrictions is the UUID of the cluster's `kube-system` namespace.
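As an illustrative sketch only (this KEP does not prescribe an implementation, and the function shape here is hypothetical), a cluster-local controller could derive such an identifier from the `kube-system` namespace UID using client-go:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// clusterID returns the UID of the kube-system namespace, which is
// immutable and effectively unique for the lifetime of the cluster.
func clusterID(ctx context.Context) (string, error) {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		return "", err
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return "", err
	}
	ns, err := client.CoreV1().Namespaces().Get(ctx, "kube-system", metav1.GetOptions{})
	if err != nil {
		return "", err
	}
	// The namespace UID is a UUID, e.g. 721ab723-13bc-11e5-aec2-42010af0021e.
	return string(ns.UID), nil
}

func main() {
	id, err := clusterID(context.Background())
	if err != nil {
		panic(err)
	}
	// This value could then be written to the id.k8s.io ClusterClaim's spec.value.
	fmt.Println(id)
}
```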
That being said, for less stringent identifiers, for example a user-specified and human-readable value, a given `id.k8s.io ClusterClaim` may need to change if an identical identifier is in use by another member of the ClusterSet it wants to join. It is likely this would need to happen outside the cluster-local boundary; for example, whatever manages memberships would likely need to deny the incoming cluster, and potentially assign (or prompt the cluster to assign itself) a new ID.
Since this KEP does not formally mandate that the cluster ID *must* be immutable for the lifetime of the cluster, only for the lifetime of its membership in a ClusterSet, any dependent tooling explicitly *cannot* assume the `id.k8s.io ClusterClaim` for a given cluster will stay constant on its own merit. For example, log aggregation of a given cluster ID based on this claim should only be trusted to be referring to the same cluster for as long as it has one ClusterSet membership; similarly, controllers whose logic depends on distinguishing clusters by cluster ID can only trust this claim to disambiguate the same cluster for as long as the cluster has one ClusterSet membership.
Despite this flexibility in the KEP, clusterIDs may still be useful before ClusterSet membership needs to be established, particularly if the implementation chooses the broadest restrictions regarding immutability and uniqueness. Therefore, a controller that initializes the claim early in the lifecycle of the cluster, possibly as part of cluster creation, may be a useful implementation point, though within the bounds of this KEP it is not strictly necessary.
The most common discussion point within the SIG regarding whether an implementation should favor a UUID or a human-readable clusterID string concerns DNS. Since DNS names were originally intended to be a human-readable form of addressing, clunky DNS names composed from long UUIDs seem like an anti-pattern, or at least unfinished. While some extensions to this spec have been discussed as ways to leverage the best parts of both (e.g. using labels on the `id.k8s.io ClusterClaim` to store aliases for DNS), an actual API specification to allow for this is outside the scope of this KEP at this time (see the Non-Goals section).
```
# An example object of `id.k8s.io ClusterClaim`
# using a kube-system ns uuid as the id value (recommended above):

apiVersion: multicluster.k8s.io/v1
kind: ClusterClaim
metadata:
  name: id.k8s.io
spec:
  value: 721ab723-13bc-11e5-aec2-42010af0021e
```

```
# An example object of `id.k8s.io ClusterClaim`
# using a human-readable string as the id value:

apiVersion: multicluster.k8s.io/v1
kind: ClusterClaim
metadata:
  name: id.k8s.io
spec:
  value: cluster-1
```
#### `clusterset.k8s.io ClusterClaim`
A cluster in a ClusterSet is expected to be authoritatively associated with that ClusterSet by an external process and storage mechanism with a purview above the cluster-local boundary, whether that is some form of cluster registry or just a human running kubectl. (The details of any specific mechanism are out of scope for the MCS API and this KEP -- see the Non-Goals section.) Mirroring this information in the cluster-local `ClusterClaim` CRD will necessarily need to be managed above the level of the cluster itself, since the properties of `clusterset.k8s.io` extend beyond the boundaries of a single cluster, and will likely be performed by something that has access to whatever cluster-registry-esque concept is implemented for that multicluster setup. It is expected that the mcs-controller ([as described in the MCS API KEP](https://github.com/kubernetes/enhancements/tree/master/keps/sig-multicluster/1645-multi-cluster-services-api#proposal)) will act as an admission controller to verify individual objects of this claim.
Because there are obligations of the `id.k8s.io ClusterClaim` that are not meaningfully verifiable until a cluster tries to join a ClusterSet and set its `clusterset.k8s.io ClusterClaim`, the admission controller responsible for setting a `clusterset.k8s.io ClusterClaim` will need the ability to reject such an attempt when it is invalid, and alert `[UNRESOLVED]` or possibly effect changes to that cluster's `id.k8s.io ClusterClaim` to make it valid `[/UNRESOLVED]`. Two symptomatic cases of this would be:
1. When a cluster with a given `id.k8s.io ClusterClaim` tries to join a ClusterSet, but a cluster with that same `id.k8s.io ClusterClaim` appears to already be in the set.
2. When a cluster that does not have an `id.k8s.io ClusterClaim` tries to join a ClusterSet.
In situations like these, the admission controller will need to block the invalid cluster from joining the ClusterSet by refusing to set its `clusterset.k8s.io ClusterClaim`, and surface an error that is actionable to make the claim valid.
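A minimal sketch of that validation logic under stated assumptions (the types and the membership lookup are hypothetical; this KEP does not prescribe how membership is tracked):

```go
package admission

import "fmt"

// memberCluster is a hypothetical view of a cluster as seen by whatever
// registry-like mechanism tracks ClusterSet membership.
type memberCluster struct {
	ID string // value of the cluster's id.k8s.io ClusterClaim; "" if unset
}

// validateJoin returns an actionable error if the candidate cluster may not
// have its clusterset.k8s.io ClusterClaim set for this ClusterSet.
func validateJoin(candidate memberCluster, members []memberCluster) error {
	// Case 2: the cluster has no id.k8s.io ClusterClaim at all.
	if candidate.ID == "" {
		return fmt.Errorf("cluster has no id.k8s.io ClusterClaim; set one before joining")
	}
	// Case 1: a current member already holds the same id.k8s.io ClusterClaim.
	for _, m := range members {
		if m.ID == candidate.ID {
			return fmt.Errorf("id.k8s.io ClusterClaim %q is already in use in this ClusterSet", candidate.ID)
		}
	}
	return nil
}
```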
```
# An example object of `clusterset.k8s.io ClusterClaim`:

apiVersion: multicluster.k8s.io/v1
kind: ClusterClaim
metadata:
  name: clusterset.k8s.io
spec:
  value: environ-1
```
### CRD upgrade path
#### To CRD or not to CRD?
_That is the question._
While this document has thus far referred to the `ClusterClaim` resource as being implemented as a CRD, another point of debate has been whether this belongs in the core Kubernetes API, particularly the `id.k8s.io ClusterClaim`. A dependable cluster ID or cluster name has previously been discussed in other forums (such as [this SIG-Architecture thread](https://groups.google.com/g/kubernetes-sig-architecture/c/mVGobfD4TpY/m/nkdbkX1iBwAJ) from 2018, or, as mentioned above, the [Cluster API subproject](https://github.com/kubernetes-sigs/cluster-api/issues/4044), which implemented [their own solution](https://github.com/kubernetes-sigs/cluster-api/pull/4048)). It is the opinion of SIG-Multicluster that the function of the proposed `ClusterClaim` CRD is of broad utility and becomes more useful the more ubiquitous it is, not only in multicluster setups.
This has led to discussion of whether we should pursue adding this resource type not as a CRD associated with SIG-Multicluster, but as a core Kubernetes API implemented in `kubernetes/kubernetes`. A short pro/con list is enclosed at the end of this section.
One effect of that decision relates to the upgrade path. Implementing this resource only in k/k would restrict cluster ID support to clusters at or above the target version of Kubernetes, unless a separate backport CRD were made available to them. At that point, with two install options, other issues arise: how do backported clusters migrate their CRD data to the core k/k objects during upgrade -- will the code around the formal k/k implementation be sensitive to the backport CRD and migrate itself, or will users have to handle upgrades in a bespoke manner?
| | CRD | Core API |
| --- | --- | --- |
| Deployment | Must be installed by the cluster lifecycle management, or as a manual setup step | In every cluster at or above the target milestone |
| Schema validation | OpenAPI v3 validation | Can use the built-in Kubernetes schema validation |
| Blockers | Official API review if using `*.k8s.io` | Official API review |
| Conformance testing | Not possible now, and no easy path forward | Standard |
**In the end, SIG-Multicluster discussed this with SIG-Architecture and it was decided to stick with the plan to use a CRD.** Notes from this conversation are in the [SIG-Architecture meeting agenda](https://docs.google.com/document/d/1BlmHq5uPyBUDlppYqAAzslVbAO8hilgjqZUTaNXUhKM/preview) for 3/25/2021.
### Test Plan
#### Alpha -> Beta Graduation
- Determine whether an `id.k8s.io ClusterClaim` must be strictly a valid DNS label, or is allowed to be a subdomain.
- To CRD or not to CRD (see section above)
#### Beta -> GA criteria
- At least one headless implementation using clusterID for MCS DNS