AutoScalingConfig defines the configuration for the horizontal pod autoscaler.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| minReplicas integer | MinReplicas is the lower limit for the number of replicas for the target resource. The horizontal pod autoscaler uses it as the minimum number of replicas to scale in to. | | |
| maxReplicas integer | MaxReplicas is the upper limit for the number of replicas to which the autoscaler can scale up. It cannot be less than minReplicas. | | |
| metrics MetricSpec array | Metrics contains the specifications used to calculate the desired replica count (the maximum replica count across all metrics will be used). The desired replica count is calculated by multiplying the ratio between the target value and the current value by the current number of pods. Ergo, metrics used must decrease as the pod count is increased, and vice versa. See the individual metric source types for more information about how each type of metric must respond. If not set, the default metric will be set to 80% average CPU utilization. | | |

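
The replica calculation described for metrics can be sketched as follows. This is a minimal illustration, not the autoscaler's code: it reads the ratio as current value over target value (as the Kubernetes HPA algorithm does), ignores the HPA tolerance window, and the function name `desired_replicas` and its argument shapes are assumptions made for the example.

```python
import math


def desired_replicas(current_replicas, metrics, min_replicas, max_replicas):
    """Return the replica count proposed by a set of metrics.

    metrics is a list of (current_value, target_value) pairs (hypothetical shape).
    """
    # One proposal per metric: ceil(currentReplicas * current / target).
    proposals = [
        math.ceil(current_replicas * current / target)
        for current, target in metrics
    ]
    # The largest proposal across all metrics is used ...
    wanted = max(proposals)
    # ... and is then clamped to the [minReplicas, maxReplicas] range.
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    print(desired_replicas(4, [(90, 80)], min_replicas=2, max_replicas=10))
```
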
Underlying type: string
CliqueStartupType defines the order in which each PodClique is started.
Validation:
- Enum: [CliqueStartupTypeAnyOrder CliqueStartupTypeInOrder CliqueStartupTypeExplicit]
Appears in:
| Field | Description |
|---|---|
| CliqueStartupTypeAnyOrder | CliqueStartupTypeAnyOrder defines that the cliques can be started in any order. This allows for concurrent starts of cliques. This is the default CliqueStartupType. |
| CliqueStartupTypeInOrder | CliqueStartupTypeInOrder defines that the cliques should be started in the order they are defined in the PodGang Cliques slice. |
| CliqueStartupTypeExplicit | CliqueStartupTypeExplicit defines that the cliques should be started after the cliques defined in PodClique.StartsAfter have started. |

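
One way to picture the three startup types is as a mapping from each clique to the cliques it waits for. The sketch below is illustrative only: the function name and the `(name, starts_after)` tuple shape are hypothetical, and modelling `CliqueStartupTypeInOrder` as "wait for the immediately preceding clique" is an assumption about how the ordering could be enforced.

```python
def startup_dependencies(cliques, startup_type):
    """Map each clique name to the clique names it must wait for.

    cliques is an ordered list of (name, starts_after) tuples (hypothetical shape).
    """
    names = [name for name, _ in cliques]
    if startup_type == "CliqueStartupTypeAnyOrder":
        # No ordering: every clique may start concurrently.
        return {name: [] for name in names}
    if startup_type == "CliqueStartupTypeInOrder":
        # Assumed model: each clique waits for the one defined just before it.
        return {
            name: ([names[i - 1]] if i > 0 else [])
            for i, name in enumerate(names)
        }
    if startup_type == "CliqueStartupTypeExplicit":
        # Dependencies come straight from each clique's StartsAfter list.
        return {name: list(starts_after) for name, starts_after in cliques}
    raise ValueError(f"unknown startup type: {startup_type}")
```
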
ClusterTopology defines the topology hierarchy for the cluster. This resource is immutable after creation.
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | grove.io/v1alpha1 | | |
| kind string | ClusterTopology | | |
| metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| spec ClusterTopologySpec | Spec defines the topology hierarchy specification. | | |

ClusterTopologySpec defines the topology hierarchy specification.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| levels TopologyLevel array | Levels is an ordered list of topology levels from broadest to narrowest scope. The order in this list defines the hierarchy (index 0 = broadest level). This field is immutable after creation. | | MaxItems: 7 MinItems: 1 |

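
The size limits and the index-based hierarchy of `levels` can be illustrated with a short sketch. The helper names are hypothetical, and representing each level by its domain name alone is a simplification made for the example.

```python
MIN_LEVELS, MAX_LEVELS = 1, 7  # MinItems / MaxItems from the validation column


def validate_levels(levels):
    """Check the item-count limits on an ordered list of domain names."""
    if not MIN_LEVELS <= len(levels) <= MAX_LEVELS:
        raise ValueError(
            f"levels must contain {MIN_LEVELS} to {MAX_LEVELS} items, got {len(levels)}"
        )
    return levels


def is_same_or_narrower(levels, domain, reference):
    """True if domain sits at the same index as reference or later (narrower)."""
    # Index 0 is the broadest level, so a larger index means a narrower scope.
    return levels.index(domain) >= levels.index(reference)
```
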
Underlying type: string
ErrorCode is a custom error code that uniquely identifies an error.
Appears in:
HeadlessServiceConfig defines the config options for the headless service.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| publishNotReadyAddresses boolean | PublishNotReadyAddresses, if set to true, publishes the DNS records of pods even if the pods are not ready. If not set, it defaults to true. | true | |

LastError captures the last error observed by the controller when reconciling an object.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| code ErrorCode | Code is the error code that uniquely identifies the error. | | |
| description string | Description is a human-readable description of the error. | | |
| observedAt Time | ObservedAt is the time at which the error was observed. | | |

Underlying type: string
LastOperationState is a string alias for the state of the last operation.
Appears in:
| Field | Description |
|---|---|
| Processing | LastOperationStateProcessing indicates that the last operation is in progress. |
| Succeeded | LastOperationStateSucceeded indicates that the last operation succeeded. |
| Error | LastOperationStateError indicates that the last operation completed with errors and will be retried. |

Underlying type: string
LastOperationType is a string alias for the type of the last operation.
Appears in:
| Field | Description |
|---|---|
| Reconcile | LastOperationTypeReconcile indicates that the last operation was a reconcile operation. |
| Delete | LastOperationTypeDelete indicates that the last operation was a delete operation. |

PodClique is a set of pods running the same image.
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | grove.io/v1alpha1 | | |
| kind string | PodClique | | |
| metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| spec PodCliqueSpec | Spec defines the specification of a PodClique. | | |
| status PodCliqueStatus | Status defines the status of a PodClique. | | |

PodCliqueRollingUpdateProgress provides details about the ongoing rolling update of the PodClique.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| updateStartedAt Time | UpdateStartedAt is the time at which the rolling update started. | | |
| updateEndedAt Time | UpdateEndedAt is the time at which the rolling update ended. It will be set to nil if the rolling update is still in progress. | | |
| podCliqueSetGenerationHash string | PodCliqueSetGenerationHash is the PodCliqueSet generation hash corresponding to the PodCliqueSet spec that is being rolled out. While the update is in progress PodCliqueStatus.CurrentPodCliqueSetGenerationHash will not match this hash. Once the update is complete the value of this field will be copied to PodCliqueStatus.CurrentPodCliqueSetGenerationHash. | | |
| podTemplateHash string | PodTemplateHash is the PodClique template hash corresponding to the PodClique spec that is being rolled out. While the update is in progress PodCliqueStatus.CurrentPodTemplateHash will not match this hash. Once the update is complete the value of this field will be copied to PodCliqueStatus.CurrentPodTemplateHash. | | |
| readyPodsSelectedToUpdate PodsSelectedToUpdate | ReadyPodsSelectedToUpdate captures the pod names of ready Pods that are either currently being updated or have been previously updated. | | |

PodCliqueScalingGroup is the schema that defines a scaling group, which is used to scale a group of PodCliques together. An instance of this custom resource is created for every pod clique scaling group defined as part of a PodCliqueSet.
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | grove.io/v1alpha1 | | |
| kind string | PodCliqueScalingGroup | | |
| metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| spec PodCliqueScalingGroupSpec | Spec is the specification of the PodCliqueScalingGroup. | | |
| status PodCliqueScalingGroupStatus | Status is the status of the PodCliqueScalingGroup. | | |

PodCliqueScalingGroupConfig is a group of PodCliques that are scaled together. Each member's PodClique.Replicas is computed as the product of PodCliqueScalingGroupConfig.Replicas and PodCliqueTemplateSpec.Spec.Replicas. NOTE: If a PodCliqueScalingGroupConfig is defined, an individual AutoScalingConfig cannot be defined for its member PodCliques.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| name string | Name is the name of the PodCliqueScalingGroupConfig. This should be unique within the PodCliqueSet. It allows consumers to give a semantic name to a group of PodCliques that need to be scaled together. | | |
| cliqueNames string array | CliqueNames is the list of names of the PodCliques that are part of the scaling group. | | |
| replicas integer | Replicas is the desired number of replicas for the scaling group at template level. This allows one to control the replicas of the scaling group at startup. If not specified, it defaults to 1. | 1 | |
| minAvailable integer | MinAvailable serves two purposes: Gang Scheduling: It defines the minimum number of replicas that are guaranteed to be gang scheduled. Gang Termination: It defines the minimum requirement of available replicas for a PodCliqueScalingGroup. Violation of this threshold for a duration beyond TerminationDelay will result in termination of the PodCliqueSet replica that it belongs to. Default: If not specified, it defaults to 1. Constraints: MinAvailable cannot be greater than Replicas. If ScaleConfig is defined then its MinAvailable should not be less than ScaleConfig.MinReplicas. | 1 | |
| scaleConfig AutoScalingConfig | ScaleConfig is the horizontal pod autoscaler configuration for the pod clique scaling group. | | |
| topologyConstraint TopologyConstraint | TopologyConstraint defines topology placement requirements for PodCliqueScalingGroup. Must be equal to or stricter than parent PodCliqueSet constraints. | | |

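
The replica arithmetic and the basic MinAvailable constraint for a scaling group can be sketched as below. The function names are hypothetical, and only the constraints stated in the table are checked (defaults of 1 and MinAvailable not exceeding Replicas); the ScaleConfig relationship is left out.

```python
def member_clique_replicas(group_replicas, template_replicas):
    """Effective PodClique.Replicas for a member of a scaling group."""
    # Product of PodCliqueScalingGroupConfig.Replicas and
    # PodCliqueTemplateSpec.Spec.Replicas.
    return group_replicas * template_replicas


def validate_scaling_group(replicas=1, min_available=1):
    """Return a list of validation errors; both fields default to 1."""
    errors = []
    if min_available > replicas:
        errors.append("minAvailable cannot be greater than replicas")
    return errors
```
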
PodCliqueScalingGroupReplicaRollingUpdateProgress provides details about the rolling update progress of ready replicas of PodCliqueScalingGroup that have been selected for update.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| current integer | Current is the index of the PodCliqueScalingGroup replica that is currently being updated. | | |
| completed integer array | Completed is the list of indices of PodCliqueScalingGroup replicas that have been updated to the latest PodCliqueSet spec. | | |

PodCliqueScalingGroupRollingUpdateProgress provides details about the ongoing rolling update of the PodCliqueScalingGroup.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| updateStartedAt Time | UpdateStartedAt is the time at which the rolling update started. | | |
| updateEndedAt Time | UpdateEndedAt is the time at which the rolling update ended. | | |
| podCliqueSetGenerationHash string | PodCliqueSetGenerationHash is the PodCliqueSet generation hash corresponding to the PodCliqueSet spec that is being rolled out. While the update is in progress PodCliqueScalingGroupStatus.CurrentPodCliqueSetGenerationHash will not match this hash. Once the update is complete the value of this field will be copied to PodCliqueScalingGroupStatus.CurrentPodCliqueSetGenerationHash. | | |
| updatedPodCliques string array | UpdatedPodCliques is the list of PodClique names that have been updated to the latest PodCliqueSet spec. | | |
| readyReplicaIndicesSelectedToUpdate PodCliqueScalingGroupReplicaRollingUpdateProgress | ReadyReplicaIndicesSelectedToUpdate provides the rolling update progress of ready replicas of PodCliqueScalingGroup that have been selected for update. PodCliqueScalingGroup replicas that are either pending or unhealthy will be force updated and the update will not wait for these replicas to become ready. For all ready replicas, one replica is chosen at a time to update; once it is updated and becomes ready, the next ready replica is chosen for update. | | |

PodCliqueScalingGroupSpec is the specification of the PodCliqueScalingGroup.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| replicas integer | Replicas is the desired number of replicas for the PodCliqueScalingGroup. If not specified, it defaults to 1. | 1 | |
| minAvailable integer | MinAvailable specifies the minimum number of ready replicas required for a PodCliqueScalingGroup to be considered operational. A PodCliqueScalingGroup replica is considered "ready" when its associated PodCliques have sufficient ready or starting pods. If MinAvailable is breached, it will be used to signal that the PodCliqueScalingGroup is no longer operating with the desired availability. MinAvailable cannot be greater than Replicas. If ScaleConfig is defined then its MinAvailable should not be less than ScaleConfig.MinReplicas. It serves two main purposes: 1. Gang Scheduling: MinAvailable defines the minimum number of replicas that are guaranteed to be gang scheduled. 2. Gang Termination: MinAvailable is used as a lower bound below which a PodGang becomes a candidate for Gang termination. If not specified, it defaults to 1. | 1 | |
| cliqueNames string array | CliqueNames is the list of PodClique names that are configured in the matching PodCliqueScalingGroup in PodCliqueSet.Spec.Template.PodCliqueScalingGroupConfigs. | | |

PodCliqueScalingGroupStatus is the status of the PodCliqueScalingGroup.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| replicas integer | Replicas is the observed number of replicas for the PodCliqueScalingGroup. | | |
| scheduledReplicas integer | ScheduledReplicas is the number of replicas that are scheduled for the PodCliqueScalingGroup. A replica of PodCliqueScalingGroup is considered "scheduled" when at least MinAvailable number of pods in each constituent PodClique has been scheduled. | 0 | |
| availableReplicas integer | AvailableReplicas is the number of PodCliqueScalingGroup replicas that are available. A PodCliqueScalingGroup replica is considered available when all constituent PodCliques have PodClique.Status.ReadyReplicas greater than or equal to PodClique.Spec.MinAvailable. | 0 | |
| updatedReplicas integer | UpdatedReplicas is the number of PodCliqueScalingGroup replicas that correspond with the latest PodCliqueSetGenerationHash. | 0 | |
| selector string | Selector is the selector used to identify the pods that belong to this scaling group. | | |
| observedGeneration integer | ObservedGeneration is the most recent generation observed by the controller. | | |
| lastErrors LastError array | LastErrors captures the last errors observed by the controller when reconciling the PodClique. | | |
| conditions Condition array | Conditions represents the latest available observations of the PodCliqueScalingGroup by its controller. | | |
| currentPodCliqueSetGenerationHash string | CurrentPodCliqueSetGenerationHash establishes a correlation to PodCliqueSet generation hash indicating that the spec of the PodCliqueSet at this generation is fully realized in the PodCliqueScalingGroup. | | |
| rollingUpdateProgress PodCliqueScalingGroupRollingUpdateProgress | RollingUpdateProgress provides details about the ongoing rolling update of the PodCliqueScalingGroup. | | |

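
The availability rule for `availableReplicas` can be expressed as a small counting function. This is a sketch only; the function name and the nested-list input shape are assumptions chosen for the example.

```python
def available_replicas(pcsg_replicas):
    """Count PodCliqueScalingGroup replicas that are available.

    pcsg_replicas is a list with one entry per replica; each entry is a list of
    (ready_replicas, min_available) pairs, one per constituent PodClique
    (hypothetical shape).
    """
    return sum(
        1
        for cliques in pcsg_replicas
        # Available only when every constituent PodClique has
        # ReadyReplicas >= MinAvailable.
        if all(ready >= min_available for ready, min_available in cliques)
    )
```
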
PodCliqueSet is a set of PodGangs. It defines how to spread and manage a gang of pods and how to monitor their status.
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | grove.io/v1alpha1 | | |
| kind string | PodCliqueSet | | |
| metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| spec PodCliqueSetSpec | Spec defines the specification of the PodCliqueSet. | | |
| status PodCliqueSetStatus | Status defines the status of the PodCliqueSet. | | |

PodCliqueSetReplicaRollingUpdateProgress captures the progress of a rolling update for a specific PodCliqueSet replica.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| replicaIndex integer | ReplicaIndex is the replica index of the PodCliqueSet that is being updated. | | |
| updateStartedAt Time | UpdateStartedAt is the time at which the rolling update started for this PodCliqueSet replica index. | | |

PodCliqueSetRollingUpdateProgress captures the progress of a rolling update of the PodCliqueSet.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| updateStartedAt Time | UpdateStartedAt is the time at which the rolling update started for the PodCliqueSet. | | |
| updateEndedAt Time | UpdateEndedAt is the time at which the rolling update ended for the PodCliqueSet. | | |
| updatedPodCliqueScalingGroups string array | UpdatedPodCliqueScalingGroups is a list of PodCliqueScalingGroup names that have been updated to the desired PodCliqueSet generation hash. | | |
| updatedPodCliques string array | UpdatedPodCliques is a list of PodClique names that have been updated to the desired PodCliqueSet generation hash. | | |
| currentlyUpdating PodCliqueSetReplicaRollingUpdateProgress | CurrentlyUpdating captures the progress of the PodCliqueSet replica that is currently being updated. | | |

PodCliqueSetSpec defines the specification of a PodCliqueSet.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| replicas integer | Replicas is the number of desired replicas of the PodCliqueSet. | 0 | |
| template PodCliqueSetTemplateSpec | Template describes the template spec for PodGangs that will be created in the PodCliqueSet. | | |

PodCliqueSetStatus defines the status of a PodCliqueSet.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| observedGeneration integer | ObservedGeneration is the most recent generation observed by the controller. | | |
| conditions Condition array | Conditions represents the latest available observations of the PodCliqueSet by its controller. | | |
| lastErrors LastError array | LastErrors captures the last errors observed by the controller when reconciling the PodCliqueSet. | | |
| replicas integer | Replicas is the total number of PodCliqueSet replicas created. | | |
| updatedReplicas integer | UpdatedReplicas is the number of replicas that have been updated to the desired revision of the PodCliqueSet. | 0 | |
| availableReplicas integer | AvailableReplicas is the number of PodCliqueSet replicas that are available. A PodCliqueSet replica is considered available when all standalone PodCliques within that replica have MinAvailableBreached condition = False AND all PodCliqueScalingGroups (PCSG) within that replica have MinAvailableBreached condition = False. | 0 | |
| hpaPodSelector string | Selector is the label selector that determines which pods are part of the PodGang. PodGang is a unit of scale and this selector is used by HPA to scale the PodGang based on metrics captured for the pods that match this selector. | | |
| podGangStatuses PodGangStatus array | PodGangStatuses captures the status for all the PodGangs that are part of the PodCliqueSet. | | |
| currentGenerationHash string | CurrentGenerationHash is a hash value generated out of a collection of fields in a PodCliqueSet. Since only a subset of fields is taken into account when generating the hash, not every change in the PodCliqueSetSpec will be accounted for when generating this hash value. A field in PodCliqueSetSpec is included if a change to it triggers a rolling update of PodCliques and/or PodCliqueScalingGroups. A rolling update needs to be triggered only if this value is not nil and the newly computed hash value is different from the persisted CurrentGenerationHash value. | | |
| rollingUpdateProgress PodCliqueSetRollingUpdateProgress | RollingUpdateProgress represents the progress of a rolling update. | | |

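
The role of `currentGenerationHash` can be illustrated with a short sketch: hash only the fields that should trigger a rollout, then compare against the persisted value. The SHA-256-over-JSON scheme, the function names, and the example field names are all assumptions made for illustration; the source does not state which fields or hash function are used.

```python
import hashlib
import json


def generation_hash(spec, tracked_fields):
    """Hash only the tracked subset of a spec (hypothetical hashing scheme)."""
    subset = {key: spec[key] for key in tracked_fields if key in spec}
    payload = json.dumps(subset, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def needs_rolling_update(persisted_hash, computed_hash):
    """A rollout is needed only if a hash was persisted and it now differs."""
    return persisted_hash is not None and persisted_hash != computed_hash
```
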
PodCliqueSetTemplateSpec defines a template spec for a PodGang. A PodGang does not have a RestartPolicy field because the restart policy is predefined: If the number of pods in any of the cliques falls below the threshold, the entire PodGang will be restarted. The threshold is determined by either:
- The value of "MinReplicas", if specified in the ScaleConfig of that clique, or
- The "Replicas" value of that clique
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| cliques PodCliqueTemplateSpec array | Cliques is a slice of cliques that make up the PodGang. There should be at least one PodClique. | | |
| cliqueStartupType CliqueStartupType | StartupType defines the type of startup dependency amongst the cliques within a PodGang. If it is not defined then the default of CliqueStartupTypeAnyOrder is used. | CliqueStartupTypeAnyOrder | Enum: [CliqueStartupTypeAnyOrder CliqueStartupTypeInOrder CliqueStartupTypeExplicit] |
| priorityClassName string | PriorityClassName is the name of the PriorityClass to be used for the PodCliqueSet. If specified, indicates the priority of the PodCliqueSet. "system-node-critical" and "system-cluster-critical" are two special keywords which indicate the highest priorities with the former being the highest priority. Any other name must be defined by creating a PriorityClass object with that name. If not specified, the pod priority will be default or zero if there is no default. | | |
| headlessServiceConfig HeadlessServiceConfig | HeadlessServiceConfig defines the config options for the headless service. If present, a headless service is created for each PodGang. | | |
| topologyConstraint TopologyConstraint | TopologyConstraint defines topology placement requirements for PodCliqueSet. | | |
| terminationDelay Duration | TerminationDelay is the delay after which the gang termination will be triggered. A gang is a candidate for termination if the number of running pods falls below a threshold for any PodClique. If a PodGang remains a candidate past TerminationDelay then it will be terminated. This gives the kube-scheduler additional time to re-schedule sufficient pods in the PodGang so that the total number of running pods rises above the threshold. Defaults to 4 hours. | | |
| podCliqueScalingGroups PodCliqueScalingGroupConfig array | PodCliqueScalingGroupConfigs is a list of scaling groups for the PodCliqueSet. | | |

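
The `terminationDelay` behaviour amounts to a grace-period check. The sketch below assumes the controller tracks when a PodGang first became a termination candidate; the function name and the `breached_since` argument are hypothetical, while the 4-hour default comes from the field description.

```python
from datetime import datetime, timedelta

DEFAULT_TERMINATION_DELAY = timedelta(hours=4)  # documented default


def should_terminate(breached_since, now, termination_delay=DEFAULT_TERMINATION_DELAY):
    """Decide whether a PodGang has been a termination candidate for too long.

    breached_since is the time the gang became a candidate, or None if it is
    currently above the threshold (hypothetical bookkeeping).
    """
    if breached_since is None:
        return False
    # Terminate only once the gang has remained a candidate past the delay.
    return now - breached_since > termination_delay
```
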
PodCliqueSpec defines the specification of a PodClique.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| roleName string | RoleName is the name of the role that this PodClique will assume. | | |
| podSpec PodSpec | Spec is the spec of the pods in the clique. | | |
| replicas integer | Replicas is the number of replicas of the pods in the clique. It cannot be less than 1. | | |
| minAvailable integer | MinAvailable serves two purposes: 1. It defines the minimum number of pods that are guaranteed to be gang scheduled. 2. It defines the minimum requirement of available pods in a PodClique. Violation of this threshold will result in termination of the PodGang that it belongs to. If MinAvailable is not set, then it will default to the template Replicas. | | |
| startsAfter string array | StartsAfter provides a way to explicitly define the startup dependencies amongst cliques. If CliqueStartupType in PodGang has been set to 'CliqueStartupTypeExplicit', then StartsAfter can be used to create an ordered start amongst PodCliques. A forest of DAGs can be defined to model any start order dependencies. If more than one PodClique is defined and StartsAfter is not set for any of them, then their startup order is random at best and must not be relied upon. Validations: 1. If StartsAfter has been defined and one or more cycles are detected in the DAGs then it will be flagged as a validation error. 2. If StartsAfter is defined and does not identify any PodClique then it will be flagged as a validation error. | | |
| autoScalingConfig AutoScalingConfig | ScaleConfig is the horizontal pod autoscaler configuration for a PodClique. | | |

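
The two `startsAfter` validations (no cycles, no references to unknown cliques) can be sketched with a depth-first search. This is an illustrative check under stated assumptions, not the project's validation code: the function name, the dict input shape, and the error messages are hypothetical.

```python
def validate_starts_after(starts_after):
    """Validate startup dependencies.

    starts_after maps each clique name to the list of clique names it starts
    after (hypothetical shape). Returns a list of error strings.
    """
    errors = []
    # Validation 2: every referenced name must identify a defined PodClique.
    for name, deps in starts_after.items():
        for dep in deps:
            if dep not in starts_after:
                errors.append(f"{name}: startsAfter references unknown clique {dep!r}")

    # Validation 1: depth-first search with three states to detect cycles.
    unvisited, in_progress, done = 0, 1, 2
    state = {name: unvisited for name in starts_after}

    def has_cycle(node):
        state[node] = in_progress
        for dep in starts_after[node]:
            if dep not in state:
                continue  # unknown names were already reported above
            if state[dep] == in_progress:
                return True  # reached a node that is still on the DFS stack
            if state[dep] == unvisited and has_cycle(dep):
                return True
        state[node] = done
        return False

    for name in starts_after:
        if state[name] == unvisited and has_cycle(name):
            errors.append(f"startsAfter cycle detected involving {name!r}")
            break
    return errors
```
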
PodCliqueStatus defines the status of a PodClique.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| observedGeneration integer | ObservedGeneration is the most recent generation observed by the controller. | | |
| lastErrors LastError array | LastErrors captures the last errors observed by the controller when reconciling the PodClique. | | |
| replicas integer | Replicas is the total number of non-terminated Pods targeted by this PodClique. | | |
| readyReplicas integer | ReadyReplicas is the number of ready Pods targeted by this PodClique. | 0 | |
| updatedReplicas integer | UpdatedReplicas is the number of Pods that have been updated and are at the desired revision of the PodClique. | 0 | |
| scheduleGatedReplicas integer | ScheduleGatedReplicas is the number of Pods that have been created with one or more scheduling gate(s) set. Sum of ReadyReplicas and ScheduleGatedReplicas will always be <= Replicas. | 0 | |
| scheduledReplicas integer | ScheduledReplicas is the number of Pods that have been scheduled by the kube-scheduler. | 0 | |
| hpaPodSelector string | Selector is the label selector that determines which pods are part of the PodClique. PodClique is a unit of scale and this selector is used by HPA to scale the PodClique based on metrics captured for the pods that match this selector. | | |
| conditions Condition array | Conditions represents the latest available observations of the clique by its controller. | | |
| currentPodCliqueSetGenerationHash string | CurrentPodCliqueSetGenerationHash establishes a correlation to PodCliqueSet generation hash indicating that the spec of the PodCliqueSet at this generation is fully realized in the PodClique. | | |
| currentPodTemplateHash string | CurrentPodTemplateHash establishes a correlation to PodClique template hash indicating that the spec of the PodClique at this template hash is fully realized in the PodClique. | | |
| rollingUpdateProgress PodCliqueRollingUpdateProgress | RollingUpdateProgress provides details about the ongoing rolling update of the PodClique. | | |

PodCliqueTemplateSpec defines a template spec for a PodClique.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| name string | Name must be unique within a PodCliqueSet and is used to denote a role. Once set it cannot be updated. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names#names | | |
| labels object (keys:string, values:string) | Labels is a map of string keys and values that can be used to organize and categorize (scope and select) objects. May match selectors of replication controllers and services. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels | | |
| annotations object (keys:string, values:string) | Annotations is an unstructured key value map stored with a resource that may be set by external tools to store and retrieve arbitrary metadata. They are not queryable and should be preserved when modifying objects. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations | | |
| topologyConstraint TopologyConstraint | TopologyConstraint defines topology placement requirements for PodClique. Must be equal to or stricter than parent resource constraints. | | |
| spec PodCliqueSpec | Specification of the desired behavior of a PodClique. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status | | |

Underlying type: string
PodGangPhase represents the phase of a PodGang.
Validation:
- Enum: [Pending Starting Running Failed Succeeded]
Appears in:
| Field | Description |
|---|---|
| Pending | PodGangPending indicates that the pods in a PodGang have not yet been taken up for scheduling. |
| Starting | PodGangStarting indicates that the pods are bound to nodes by the scheduler and are starting. |
| Running | PodGangRunning indicates that all the pods in a PodGang are running. |
| Failed | PodGangFailed indicates that one or more pods in a PodGang have failed. This is a terminal state and is typically used for batch jobs. |
| Succeeded | PodGangSucceeded indicates that all the pods in a PodGang have succeeded. This is a terminal state and is typically used for batch jobs. |

PodGangStatus defines the status of a PodGang.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| name string | Name is the name of the PodGang. | | |
| phase PodGangPhase | Phase is the current phase of the PodGang. | | Enum: [Pending Starting Running Failed Succeeded] |
| conditions Condition array | Conditions represents the latest available observations of the PodGang by its controller. | | |

PodsSelectedToUpdate captures the current and previous set of pod names that have been selected for update in a rolling update.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| current string | Current captures the current pod name that is a target for update. | | |
| completed string array | Completed captures the pod names that have already been updated. | | |

TopologyConstraint defines topology placement requirements.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| packDomain TopologyDomain | PackDomain specifies the topology domain for grouping replicas. It controls the placement constraint for EACH individual replica instance. Must be one of: region, zone, datacenter, block, rack, host, numa. Example: "rack" means each replica is independently placed within one rack. Note: This does NOT constrain all replicas to the same rack together. Different replicas can be in different topology domains. | | Enum: [region zone datacenter block rack host numa] |

Underlying type: string
TopologyDomain represents a level in the cluster topology hierarchy.
Appears in:
| Field | Description |
|---|---|
| region | TopologyDomainRegion represents the region level in the topology hierarchy. |
| zone | TopologyDomainZone represents the zone level in the topology hierarchy. |
| datacenter | TopologyDomainDataCenter represents the datacenter level in the topology hierarchy. |
| block | TopologyDomainBlock represents the block level in the topology hierarchy. |
| rack | TopologyDomainRack represents the rack level in the topology hierarchy. |
| host | TopologyDomainHost represents the host level in the topology hierarchy. |
| numa | TopologyDomainNuma represents the numa level in the topology hierarchy. |

TopologyLevel defines a single level in the topology hierarchy. It maps a platform-agnostic domain to a platform-specific node label key, giving workload operators a consistent way to reference topology levels when defining TopologyConstraints.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| domain TopologyDomain | Domain is a platform provider-agnostic level identifier. Must be one of: region, zone, datacenter, block, rack, host, numa. | | Enum: [region zone datacenter block rack host numa] Required: {} |
| key string | Key is the node label key that identifies this topology domain. Must be a valid Kubernetes label key (qualified name). Examples: "topology.kubernetes.io/zone", "kubernetes.io/hostname" | | MaxLength: 63 MinLength: 1 Pattern: ^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9]/)?([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9]$ Required: {} |

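
The validation rules for a TopologyLevel can be checked locally before submitting a resource. The sketch below applies the enum, length limits, and pattern listed in the table; the function name and error messages are hypothetical, and the API server remains the authority on what is accepted.

```python
import re

DOMAINS = {"region", "zone", "datacenter", "block", "rack", "host", "numa"}
# Pattern copied from the validation column of the key field.
KEY_PATTERN = re.compile(
    r"^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9]/)?([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9]$"
)


def validate_topology_level(domain, key):
    """Return a list of validation errors for one topology level."""
    errors = []
    if domain not in DOMAINS:
        errors.append(f"domain {domain!r} is not one of {sorted(DOMAINS)}")
    if not 1 <= len(key) <= 63:
        errors.append("key must be between 1 and 63 characters long")
    elif KEY_PATTERN.fullmatch(key) is None:
        errors.append(f"key {key!r} is not a valid label key")
    return errors
```
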
AuthorizerConfig defines the configuration for the authorizer admission webhook.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| enabled boolean | Enabled indicates whether the authorizer is enabled. | | |
| exemptServiceAccountUserNames string array | ExemptServiceAccountUserNames is a list of service account usernames that are exempt from authorizer checks. Each service account username in ExemptServiceAccountUserNames should be of the following format: `system:serviceaccount:<namespace>:<name>`. ServiceAccounts are represented in this format when checking the username in authenticationv1.UserInfo.Name. | | |

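
Kubernetes represents a service account's username as `system:serviceaccount:<namespace>:<name>`, so an exemption check reduces to a string comparison. The helper names below are hypothetical, and the namespace and account names in any call are placeholders, not values from this project.

```python
def service_account_username(namespace, name):
    """Build the username Kubernetes assigns to a service account."""
    return f"system:serviceaccount:{namespace}:{name}"


def is_exempt(username, exempt_usernames):
    """True if the request's username is in the exempt list."""
    return username in set(exempt_usernames)
```
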
Underlying type: string
CertProvisionMode defines how webhook certificates are provisioned.
Validation:
- Enum: [auto manual]
Appears in:
| Field | Description |
|---|---|
| auto | CertProvisionModeAuto enables automatic certificate generation and management via cert-controller. cert-controller automatically generates self-signed certificates and stores them in the Secret. |
| manual | CertProvisionModeManual expects certificates to be provided externally (e.g., by cert-manager, cluster admin). |

ClientConnectionConfiguration defines the configuration for constructing a client.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| qps float | QPS controls the number of queries per second allowed for this connection. | | |
| burst integer | Burst allows extra queries to accumulate when a client is exceeding its rate. | | |
| contentType string | ContentType is the content type used when sending data to the server from this client. | | |
| acceptContentTypes string | AcceptContentTypes defines the Accept header sent by clients when connecting to the server, overriding the default value of 'application/json'. This field will control all connections to the server used by a particular client. | | |

ControllerConfiguration defines the configuration for the controllers.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| podCliqueSet PodCliqueSetControllerConfiguration | PodCliqueSet is the configuration for the PodCliqueSet controller. | | |
| podClique PodCliqueControllerConfiguration | PodClique is the configuration for the PodClique controller. | | |
| podCliqueScalingGroup PodCliqueScalingGroupControllerConfiguration | PodCliqueScalingGroup is the configuration for the PodCliqueScalingGroup controller. | | |

DebuggingConfiguration defines the configuration for debugging.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| enableProfiling boolean | EnableProfiling enables profiling via host:port/debug/pprof/ endpoints. | | |

LeaderElectionConfiguration defines the configuration for the leader election.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| enabled boolean | Enabled specifies whether leader election is enabled. Set this to true when running replicated instances of the operator for high availability. | | |
| leaseDuration Duration | LeaseDuration is the duration that non-leader candidates will wait after observing a leadership renewal until attempting to acquire leadership of the occupied but un-renewed leader slot. This is effectively the maximum duration that a leader can be stopped before it is replaced by another candidate. This is only applicable if leader election is enabled. | | |
| renewDeadline Duration | RenewDeadline is the interval between attempts by the acting leader to renew its leadership before it stops leading. This must be less than or equal to the lease duration. This is only applicable if leader election is enabled. | | |
| retryPeriod Duration | RetryPeriod is the duration leader elector clients should wait between attempting acquisition and renewal of leadership. This is only applicable if leader election is enabled. | | |
| resourceLock string | ResourceLock determines which resource lock to use for leader election. This is only applicable if leader election is enabled. | | |
| resourceName string | ResourceName determines the name of the resource that leader election will use for holding the leader lock. This is only applicable if leader election is enabled. | | |
| resourceNamespace string | ResourceNamespace determines the namespace in which the leader election resource will be created. This is only applicable if leader election is enabled. | | |

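
The relationship between the timing fields is easier to see in a concrete fragment. The sketch below assumes a parent key named `leaderElection` and uses illustrative durations, lock type, and resource names; none of these are documented defaults for this operator.

```yaml
# Hypothetical fragment; the parent key, durations, and names are assumptions.
leaderElection:
  enabled: true            # turn on when running more than one operator replica
  leaseDuration: 15s       # how long candidates wait before taking over an un-renewed lease
  renewDeadline: 10s       # must be less than or equal to leaseDuration
  retryPeriod: 2s          # wait between acquisition and renewal attempts
  resourceLock: leases     # assumed lock type
  resourceName: example-operator-leader-election   # hypothetical name
  resourceNamespace: example-system                # hypothetical namespace
```

With values like these, a stopped leader would be replaced after at most roughly the lease duration, since other candidates only attempt acquisition once the lease has gone un-renewed for that long.
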
Underlying type: string
LogFormat defines the format of the log.
Appears in:
| Field | Description |
|---|---|
| json | LogFormatJSON is the JSON log format. |
| text | LogFormatText is the text log format. |

Underlying type: string
LogLevel defines the log level.
Appears in:
| Field | Description |
|---|---|
| debug | DebugLevel is the debug log level, i.e. the most verbose. |
| info | InfoLevel is the default log level. |
| error | ErrorLevel is a log level where only errors are logged. |

NetworkAcceleration defines the configuration for network acceleration features.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| autoMNNVLEnabled boolean | AutoMNNVLEnabled indicates whether automatic MNNVL (Multi-Node NVLink) support is enabled. When enabled, the operator will automatically create and manage ComputeDomain resources for GPU workloads. If the cluster doesn't have the NVIDIA DRA driver installed, the operator will exit with a non-zero exit code. Default: false | | |

PodCliqueControllerConfiguration defines the configuration for the PodClique controller.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| concurrentSyncs integer | ConcurrentSyncs is the number of workers used for the controller to concurrently work on events. | | |

PodCliqueScalingGroupControllerConfiguration defines the configuration for the PodCliqueScalingGroup controller.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| concurrentSyncs integer | ConcurrentSyncs is the number of workers used for the controller to concurrently work on events. | | |

PodCliqueSetControllerConfiguration defines the configuration for the PodCliqueSet controller.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| concurrentSyncs integer | ConcurrentSyncs is the number of workers used for the controller to concurrently work on events. | | |

SchedulerConfiguration configures the scheduler profiles and designates which one is the default.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| profiles SchedulerProfile array | Profiles is the list of scheduler profiles. Each profile has a backend name and an optional config. The kube-scheduler backend is always enabled; use the profile name "kube-scheduler" to configure it or to set it as the default. Valid profile names: "kube-scheduler", "kai-scheduler". Use defaultProfileName to designate the default backend; if it is not set, it defaults to "kube-scheduler". | | |
| defaultProfileName string | DefaultProfileName is the name of the default scheduler profile. If unset, it defaults to "kube-scheduler". | | |

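
A minimal sketch of a scheduler section that enables both backends and makes KAI the default. The `scheduler.profiles[].name` path follows the SchedulerName description in this reference; the choice of default is purely illustrative, and the backend-specific `config` is omitted because its schema is not described here.

```yaml
# Illustrative fragment of the scheduler section of OperatorConfiguration.
scheduler:
  profiles:
    - name: kube-scheduler   # always enabled; listed here only to make it explicit
    - name: kai-scheduler    # optional backend-specific config omitted
  defaultProfileName: kai-scheduler   # if unset, defaults to "kube-scheduler"
```
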
Underlying type: string
SchedulerName defines the name of the scheduler backend (used in OperatorConfiguration scheduler.profiles[].name).
Appears in:
| Field | Description |
|---|---|
| kai-scheduler | SchedulerNameKai is the KAI scheduler backend. |
| kube-scheduler | SchedulerNameKube is the profile name for the Kubernetes default scheduler in OperatorConfiguration. |

SchedulerProfile defines a scheduler backend profile with optional backend-specific config.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| name SchedulerName | Name is the scheduler profile name. Valid values: "kube-scheduler", "kai-scheduler". For the Kubernetes default scheduler use "kube-scheduler"; Pod.Spec.SchedulerName will be set to "default-scheduler". | | Enum: [kai-scheduler kube-scheduler] Required: {} |
| config RawExtension | Config holds backend-specific options. The operator unmarshals it into the config type for this backend (see backend config types). | | |

Server contains information for HTTP(S) server configuration.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| bindAddress string | BindAddress is the IP address on which to listen for the specified port. | | |
| port integer | Port is the port on which to serve requests. | | |

ServerConfiguration defines the configuration for the HTTP(S) servers.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| webhooks WebhookServer | Webhooks is the configuration for the HTTP(S) webhook server. | | |
| healthProbes Server | HealthProbes is the configuration for serving the healthz and readyz endpoints. | | |
| metrics Server | Metrics is the configuration for serving the metrics endpoint. | | |

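
The three server sections share the `bindAddress` and `port` fields, which the following sketch makes concrete. The parent key `servers`, the addresses, and the port numbers are all assumptions for illustration only.

```yaml
# Hypothetical fragment; the parent key, addresses, and ports are assumptions.
servers:
  webhooks:
    bindAddress: 0.0.0.0
    port: 9443
  healthProbes:          # serves the healthz and readyz endpoints
    bindAddress: 0.0.0.0
    port: 8081
  metrics:               # serves the metrics endpoint
    bindAddress: 0.0.0.0
    port: 8080
```
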
TopologyAwareSchedulingConfiguration defines the configuration for topology-aware scheduling.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| enabled boolean | Enabled indicates whether topology-aware scheduling is enabled. | | |
| levels TopologyLevel array | Levels is an ordered list of topology levels from broadest to narrowest scope. Used to create/update the TopologyAwareScheduling CR at operator startup. | | |

WebhookServer defines the configuration for the HTTP(S) webhook server.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| bindAddress string | BindAddress is the IP address on which to listen for the specified port. | | |
| port integer | Port is the port on which to serve requests. | | |
| serverCertDir string | ServerCertDir is the directory containing the server certificate and key. | | |
| secretName string | SecretName is the name of the Kubernetes Secret containing webhook certificates. The Secret must contain tls.crt, tls.key, and ca.crt. | grove-webhook-server-cert | |
| certProvisionMode CertProvisionMode | CertProvisionMode controls how webhook certificates are provisioned. | auto | Enum: [auto manual] |

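
As a sketch of the manual provisioning path, the fragment below switches `certProvisionMode` from its default `auto` to `manual` and points `secretName` at an externally managed Secret. The parent key `webhooks` nesting, the Secret name, the port, and the certificate directory are hypothetical; in manual mode the referenced Secret is expected to already contain `tls.crt`, `tls.key`, and `ca.crt`.

```yaml
# Hypothetical fragment; names, port, and path are assumptions.
webhooks:
  bindAddress: 0.0.0.0
  port: 9443
  serverCertDir: /etc/example-operator/webhook-certs   # hypothetical path
  secretName: my-webhook-server-cert                   # hypothetical Secret, e.g. issued by cert-manager
  certProvisionMode: manual                            # default is auto
```
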