Commit 9c79b62

update goldens.yaml
1 parent 1e16b95 commit 9c79b62

2 files changed: 223 additions & 18 deletions

goldens/Cluster_create_private.txt

Lines changed: 1 addition & 14 deletions
@@ -212,17 +212,4 @@ kubectl apply -f a63aa3c4593c38ad90671fd8b067d1886f6313ad558379b364b51791aa50f4e
 kubectl apply -f 1d13ddebae3c90a05ba26b312df088982dd0df0edc4f4013b88384e476c20486
 [XPK] GKE commands done! Resources are created.
 [XPK] See your GKE Cluster here: https://console.cloud.google.com/kubernetes/clusters/details/us-central1/golden-cluster-private/details?project=golden-project
-Traceback (most recent call last):
-File "/usr/local/google/home/lidanny/Desktop/Project/diagon_xpk/.xpkenv/bin/xpk", line 7, in <module>
-sys.exit(main())
-~~~~^^
-File "/usr/local/google/home/lidanny/Desktop/Project/diagon_xpk/cienet_xpk/xpk/src/xpk/main.py", line 82, in main
-main_args.func(main_args)
-~~~~~~~~~~~~~~^^^^^^^^^^^
-File "/usr/local/google/home/lidanny/Desktop/Project/diagon_xpk/cienet_xpk/xpk/src/xpk/commands/cluster.py", line 765, in cluster_create_pathways
-cluster_create(args)
-~~~~~~~~~~~~~~^^^^^^
-File "/usr/local/google/home/lidanny/Desktop/Project/diagon_xpk/cienet_xpk/xpk/src/xpk/commands/cluster.py", line 411, in cluster_create
-if args.managed_mldiagnostics:
-^^^^^^^^^^^^^^^^^^^^^^^^^^
-AttributeError: 'Namespace' object has no attribute 'managed_mldiagnostics'
+[XPK] Exiting XPK cleanly
Lines changed: 222 additions & 4 deletions
@@ -1,4 +1,222 @@
-$ xpk cluster create-pathways --project=golden-project --zone=us-central1-a --enable-autoprovisioning --cluster=golden-cluster --tpu-type=tpu7x-8 --on-demand --dry-run --managed-mldiagnostics
-usage: xpk [-h]
-{workload,storage,cluster,inspector,info,batch,job,kind,shell,version,config,run} ...
-xpk: error: unrecognized arguments: --managed-mldiagnostics
+$ xpk cluster create-pathways --project=golden-project --zone=us-central1-a --enable-autoprovisioning --cluster=golden-cluster --tpu-type=tpu7x-8 --on-demand --dry-run
+[XPK] Starting xpk v0.14.3
+[XPK] Starting cluster create for cluster golden-cluster:
+[XPK] Working on golden-project and us-central1-a
+[XPK] Task: `Determine server supported GKE versions for default rapid gke version` is implemented by the following command not running since it is a dry run.
+gcloud container get-server-config --project=golden-project --region=us-central1 --flatten="channels" --filter="channels.channel=RAPID" --format="value(channels.defaultVersion)"
+[XPK] Task: `Determine server supported GKE versions for valid versions` is implemented by the following command not running since it is a dry run.
+gcloud container get-server-config --project=golden-project --region=us-central1 --flatten="channels" --filter="channels.channel=RAPID" --format="value(channels.validVersions)"
+[XPK] Task: `Find if Cluster Exists` is implemented by the following command not running since it is a dry run.
+gcloud container clusters list --project=golden-project --filter=location~"us-central1.*" --format="csv[no-heading](name)"
+[XPK] Task: `GKE Cluster Create` is implemented by the following command not running since it is a dry run.
+gcloud beta container clusters create golden-cluster --project=golden-project --region=us-central1 --node-locations=us-central1-a --cluster-version=0 --machine-type=e2-standard-16 --enable-autoscaling --total-min-nodes 1 --total-max-nodes 1000 --num-nodes 6 --enable-dns-access --autoscaling-profile=optimize-utilization --labels=gke_product_type=xpk --location-policy=BALANCED --scopes=storage-full,gke-default --enable-ip-alias
+[XPK] Task: `Find cluster region or zone` is implemented by the following command not running since it is a dry run.
+gcloud container clusters list --project=golden-project --filter=name=golden-cluster --format="value(location)"
+[XPK] Task: `Check if Private Nodes is enabled in cluster.` is implemented by the following command not running since it is a dry run.
+gcloud container clusters describe golden-cluster --project=golden-project --location=us-central1 --format="value(privateClusterConfig.enablePrivateNodes)"
+[XPK] Private Nodes is not enabled on the cluster.
+[XPK] Cluster is public and no need to authorize networks.
+[XPK] Try 1: get-credentials-dns-endpoint to cluster golden-cluster
+[XPK] Task: `get-credentials-dns-endpoint to cluster golden-cluster` is implemented by the following command not running since it is a dry run.
+gcloud container clusters get-credentials golden-cluster --location=us-central1 --dns-endpoint --project=golden-project && kubectl config view && kubectl config set-context --current --namespace=default
+[XPK] Testing credentials with kubectl...
+[XPK] Task: `kubectl get pods` is implemented by the following command not running since it is a dry run.
+kubectl get pods
+[XPK] Credentials test succeeded.
+[XPK] Finished get-credentials and kubectl setup.
+[XPK] Task: 'Checking CoreDNS deployment existence' in progress for namespace: kube-system
+[XPK] Task: `Check CoreDNS deployment in kube-system` is implemented by the following command not running since it is a dry run.
+kubectl get deployment coredns -n kube-system
+[XPK] Now verifying CoreDNS readiness...
+[XPK] Task: `Waiting for kubeDNS to be checked.` is implemented by the following command not running since it is a dry run.
+kubectl get deployment kube-dns -n kube-system --ignore-not-found
+[XPK] kube-dns deployment not found.
+[XPK] Verifying if CoreDNS is available...
+[XPK] Task: `Wait for coredns available` is implemented by the following command not running since it is a dry run.
+kubectl wait deployment/coredns --for=condition=Available=true --namespace=kube-system --timeout=240s
+[XPK] CoreDNS has successfully started and passed verification.
+[XPK] CoreDNS deployment 'coredns' found in namespace 'kube-system'.
+[XPK] Skipping CoreDNS deployment since it already exists.
+[XPK] Task: `Determine current gke master version` is implemented by the following command not running since it is a dry run.
+gcloud beta container clusters describe golden-cluster --location us-central1 --project golden-project --format="value(currentMasterVersion)"
+[XPK] Creating 1 node pool or pools of tpu7x-8
+We assume that the underlying system is: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=TPU, device_type='tpu7x-8', supports_sub_slicing=False, requires_workload_policy=True)
+[XPK] Task: `Get All Node Pools` is implemented by the following command not running since it is a dry run.
+gcloud beta container node-pools list --cluster golden-cluster --project=golden-project --location=us-central1 --format="csv[no-heading](name)"
+[XPK] Creating 1 node pool or pools of tpu7x-8
+Underlyingly, we assume that means: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=TPU, device_type='tpu7x-8', supports_sub_slicing=False, requires_workload_policy=True)
+[XPK] Task: `Get Node Pool Zone` is implemented by the following command not running since it is a dry run.
+gcloud beta container node-pools describe 0 --cluster golden-cluster --project=golden-project --location=us-central1 --format="value(locations)"
+[XPK] Task: `GKE Cluster Get ConfigMap` is implemented by the following command not running since it is a dry run.
+kubectl get configmap golden-cluster-resources-configmap -o=custom-columns="ConfigData:data" --no-headers=true
+[XPK] Existing node pool names ['0']
+[XPK] Task: `Retrieve resource policy` is implemented by the following command not running since it is a dry run.
+gcloud compute resource-policies describe tpu7x-8-2x2x1-placement-policy --project=golden-project --region=us-central1
+[XPK] To complete NodepoolCreate-golden-cluster-np-0 we are executing gcloud beta container node-pools create golden-cluster-np-0 --location=us-central1 --cluster=golden-cluster --project=golden-project --node-locations=us-central1-a --machine-type=tpu7x-standard-4t --host-maintenance-interval=AS_NEEDED --placement-policy=tpu7x-8-2x2x1-placement-policy --enable-gvnic --node-version=0 --num-nodes=1 --scopes=storage-full,gke-default,"https://www.googleapis.com/auth/cloud-platform" --max-pods-per-node 15
+[XPK] To complete NodepoolCreate-cpu-np we are executing gcloud beta container node-pools create cpu-np --node-version=0 --cluster=golden-cluster --project=golden-project --node-locations=us-central1-a --location=us-central1 --num-nodes=1 --machine-type=n2-standard-64 --scopes=storage-full,gke-default,"https://www.googleapis.com/auth/cloud-platform" --enable-autoscaling --min-nodes=1 --max-nodes=20
+[XPK] Breaking up a total of 2 commands into 1 batches
+[XPK] Pretending all the jobs succeeded
+[XPK] Create or delete node pool request complete.
+[XPK] Enabling Autoprovisioning
+[XPK] Default Chips quota is minimum: 0, maximum: 4.
+[XPK] Chips quota is minimum: 0, maximum: 4. XPK will autoprovision 4 chips based on incoming workload requests, keeping at least 0 available at all times, and maximum of 4. If the difference (4 chips) is small, rescaling will not work well.
+[XPK] Task: `Update cluster with autoprovisioning enabled` is implemented by the following command not running since it is a dry run.
+gcloud container clusters update golden-cluster --project=golden-project --location=us-central1 --enable-autoprovisioning --autoprovisioning-config-file 6062bfee91f21efca86f2c3261129f06b1896ad9b68d2ecdba9589bea9e15ddf
+[XPK] Task: `Update cluster with autoscaling-profile` is implemented by the following command not running since it is a dry run.
+gcloud container clusters update golden-cluster --project=golden-project --location=us-central1 --autoscaling-profile=optimize-utilization
+[XPK] Task: `Get All Node Pools` is implemented by the following command not running since it is a dry run.
+gcloud beta container node-pools list --cluster golden-cluster --project=golden-project --location=us-central1 --format="csv[no-heading](name)"
+[XPK] Breaking up a total of 0 commands into 0 batches
+[XPK] Pretending all the jobs succeeded
+[XPK] Creating ConfigMap for cluster
+[XPK] Breaking up a total of 2 commands into 1 batches
+[XPK] Pretending all the jobs succeeded
+[XPK] Enabling the jobset API on our cluster, to be deprecated when Jobset is globally available
+[XPK] Try 1: Install Jobset on golden-cluster
+[XPK] Task: `Install Jobset on golden-cluster` is implemented by the following command not running since it is a dry run.
+kubectl apply --server-side --force-conflicts -f https://github.com/kubernetes-sigs/jobset/releases/download/v0.8.0/manifests.yaml
+[XPK] Task: `Count total nodes` is implemented by the following command not running since it is a dry run.
+kubectl get node --no-headers | wc -l
+[XPK] Try 1: Updating jobset Controller Manager resources
+[XPK] Task: `Updating jobset Controller Manager resources` is implemented by the following command not running since it is a dry run.
+kubectl apply -f 1b31e624e490f9c8c4ef4e369f08d3fa467990af5a261e4405bd045265d70e95
+[XPK] Try 1: Install PathwaysJob on golden-cluster
+[XPK] Task: `Install PathwaysJob on golden-cluster` is implemented by the following command not running since it is a dry run.
+kubectl apply --server-side -f https://github.com/google/pathways-job/releases/download/v0.1.4/install.yaml
+[XPK] Enabling Kueue on the cluster
+[XPK] Task: `Get kueue version on server` is implemented by the following command not running since it is a dry run.
+kubectl get deployment kueue-controller-manager -n kueue-system -o jsonpath='{.spec.template.spec.containers[0].image}'
+[XPK] Installing Kueue version v0.14.3...
+[XPK] Try 1: Install Kueue
+[XPK] Task: `Install Kueue` is implemented by the following command not running since it is a dry run.
+kubectl apply --server-side --force-conflicts -f https://github.com/kubernetes-sigs/kueue/releases/download/v0.14.3/manifests.yaml
+[XPK] Task: `Wait for Kueue to be available` is implemented by the following command not running since it is a dry run.
+kubectl wait deploy/kueue-controller-manager -n kueue-system --for=condition=available --timeout=10m
+[XPK] Applying following Kueue resources:
+apiVersion: kueue.x-k8s.io/v1beta1
+kind: ResourceFlavor
+metadata:
+name: "1xtpu7x-8"
+spec:
+nodeLabels: {"cloud.google.com/gke-tpu-accelerator": "tpu7x"}
+
+---
+
+apiVersion: kueue.x-k8s.io/v1beta1
+kind: ResourceFlavor
+metadata:
+name: "cpu-user"
+spec:
+nodeLabels: {"cloud.google.com/gke-nodepool": "cpu-np"}
+
+---
+
+apiVersion: kueue.x-k8s.io/v1beta1
+kind: AdmissionCheck
+metadata:
+name: dws-prov
+spec:
+controllerName: kueue.x-k8s.io/provisioning-request
+parameters:
+apiGroup: kueue.x-k8s.io
+kind: ProvisioningRequestConfig
+name: dws-config
+---
+apiVersion: kueue.x-k8s.io/v1beta1
+kind: ProvisioningRequestConfig
+metadata:
+name: dws-config
+spec:
+provisioningClassName: queued-provisioning.gke.io
+podSetUpdates:
+nodeSelector:
+- key: autoscaling.gke.io/provisioning-request
+valueFromProvisioningClassDetail: ResizeRequestName
+managedResources:
+- google.com/tpu
+---
+apiVersion: kueue.x-k8s.io/v1beta1
+kind: ClusterQueue
+metadata:
+name: "cluster-queue"
+spec:
+preemption:
+reclaimWithinCohort: Never # Don't preempt other queues in the cohort.
+withinClusterQueue: LowerPriority
+namespaceSelector: {} # match all.
+resourceGroups: [{'coveredResources': ['google.com/tpu'], 'flavors': [{'name': '1xtpu7x-8', 'resources': [{'name': 'google.com/tpu', 'nominalQuota': 4}]}]}, {'coveredResources': ['cpu', 'memory'], 'flavors': [{'name': 'cpu-user', 'resources': [{'name': 'cpu', 'nominalQuota': 480}, {'name': 'memory', 'nominalQuota': '2000G'}]}]}]
+
+---
+apiVersion: kueue.x-k8s.io/v1beta1
+kind: LocalQueue
+metadata:
+namespace: default
+name: multislice-queue
+spec:
+clusterQueue: cluster-queue
+---
+apiVersion: scheduling.k8s.io/v1
+kind: PriorityClass
+metadata:
+name: very-low
+value: 100
+globalDefault: false
+description: "Very Low"
+---
+apiVersion: scheduling.k8s.io/v1
+kind: PriorityClass
+metadata:
+name: low
+value: 250
+globalDefault: false
+description: "Low"
+---
+apiVersion: scheduling.k8s.io/v1
+kind: PriorityClass
+metadata:
+name: medium
+value: 500
+globalDefault: false
+description: "Medium"
+---
+apiVersion: scheduling.k8s.io/v1
+kind: PriorityClass
+metadata:
+name: high
+value: 750
+globalDefault: false
+description: "High"
+---
+apiVersion: scheduling.k8s.io/v1
+kind: PriorityClass
+metadata:
+name: very-high
+value: 1000
+globalDefault: false
+description: "Very High"
+[XPK] Task: `Applying Kueue Custom Resources` is implemented by the following command not running since it is a dry run.
+kubectl apply -f f89effb1f55aef327018037d75f743b5c62d59f1f62fddadaaa31f72e5e07bdf
+[XPK] Task: `Count total nodes` is implemented by the following command not running since it is a dry run.
+kubectl get node --no-headers | wc -l
+[XPK] Try 1: Updating Kueue Controller Manager resources
+[XPK] Task: `Updating Kueue Controller Manager resources` is implemented by the following command not running since it is a dry run.
+kubectl patch deployment kueue-controller-manager -n kueue-system --type='strategic' --patch='{"spec": {"template": {"spec": {"containers": [{"name": "manager", "resources": {"limits": {"memory": "4096Mi"}}}]}}}}'
+[XPK] Verifying kjob installation
+[XPK] Task: `Verify kjob installation ` is implemented by the following command not running since it is a dry run.
+kubectl-kjob help
+[XPK] kjob found
+[XPK] Applying kjob CDRs
+[XPK] Task: `Create kjob CRDs on cluster` is implemented by the following command not running since it is a dry run.
+kubectl kjob printcrds | kubectl apply --server-side -f -
+[XPK] Creating kjob CRDs succeeded
+[XPK] Task: `GKE Cluster Get ConfigMap` is implemented by the following command not running since it is a dry run.
+kubectl get configmap golden-cluster-resources-configmap -o=custom-columns="ConfigData:data" --no-headers=true
+[XPK] Task: `Creating JobTemplate` is implemented by the following command not running since it is a dry run.
+kubectl apply -f 4abb796ed6e7c9d7256a51f13124efd989fc12ee83839bed432fcf7d64f68e61
+[XPK] Task: `Creating PodTemplate` is implemented by the following command not running since it is a dry run.
+kubectl apply -f a63aa3c4593c38ad90671fd8b067d1886f6313ad558379b364b51791aa50f4e8
+[XPK] Task: `Creating AppProfile` is implemented by the following command not running since it is a dry run.
+kubectl apply -f 1d13ddebae3c90a05ba26b312df088982dd0df0edc4f4013b88384e476c20486
+[XPK] GKE commands done! Resources are created.
+[XPK] See your GKE Cluster here: https://console.cloud.google.com/kubernetes/clusters/details/us-central1/golden-cluster/details?project=golden-project
+[XPK] Exiting XPK cleanly
