
Commit 47a838b

Add namespaces per cluster limits test workload

1 parent 9f23952 commit 47a838b

File tree

6 files changed: +542 -0 lines


docs/README.md

Lines changed: 2 additions & 0 deletions
```diff
@@ -14,6 +14,7 @@
 | [Deployments Per Namespace](deployments-per-ns.md) | Maximum Deployments | None |
 | [PVCscale](pvscale.md) | PVCScale test | Working storageclass |
 | [Conformance](conformance.md) | OCP/Kubernetes e2e tests | None |
+| [Namespaces per cluster](namespaces-per-cluster.md) | Maximum Namespaces | None |
 
 * Baseline job without a tooled cluster just idles a cluster. The goal is to capture resource consumption over a period of time to characterize resource requirements thus tooling is required. (For now)
 
@@ -43,3 +44,4 @@ Each workload will implement a form of pass/fail criteria in order to flag if th
 | [Deployments Per Namespace](deployments-per-ns.md) | No |
 | [PVCscale](pvscale.md) | No |
 | [Conformance](conformance.md) | No |
+| [Namespaces per cluster](namespaces-per-cluster.md) | Yes: Exit code, Test Duration |
```

docs/namespaces-per-cluster.md

Lines changed: 109 additions & 0 deletions
# Namespaces per cluster Workload

The Namespaces per cluster workload playbook is `workloads/cluster-limits-namespaces-per-cluster.yml`; it runs the Namespaces per cluster workload on your cluster.

The Namespaces per cluster test is a cluster-limits-focused test that creates as many namespaces across the cluster as possible.

Running from CLI:

```sh
$ cp workloads/inventory.example inventory
$ # Add orchestration host to inventory
$ # Edit vars in workloads/vars/namespaces-per-cluster.yml or define Environment vars (See below)
$ time ansible-playbook -vv -i inventory workloads/cluster-limits-namespaces-per-cluster.yml
```
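Instead of editing the vars file, the same variables can be supplied from the environment before invoking the playbook. A minimal sketch, with illustrative values only (the playbook invocation is shown commented out):

```sh
# Illustrative values; any variable from the Environment variables section
# below can be exported the same way before running the playbook.
export NAMESPACES_PER_CLUSTER_COUNT=2000
export NAMESPACES_PER_CLUSTER_CLEANUP=true
echo "will create up to ${NAMESPACES_PER_CLUSTER_COUNT} namespaces"
# time ansible-playbook -vv -i inventory workloads/cluster-limits-namespaces-per-cluster.yml
```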
## Environment variables

### PUBLIC_KEY
Default: `~/.ssh/id_rsa.pub`
Public ssh key file for Ansible.

### PRIVATE_KEY
Default: `~/.ssh/id_rsa`
Private ssh key file for Ansible.

### ORCHESTRATION_USER
Default: `root`
User for Ansible to log in as. Must authenticate with PUBLIC_KEY/PRIVATE_KEY.

### WORKLOAD_IMAGE
Default: `quay.io/openshift-scale/scale-ci-workload`
Container image that runs the workload script.

### WORKLOAD_JOB_NODE_SELECTOR
Default: `false`
Enables/disables the node selector that places the workload job on the `workload` node.

### WORKLOAD_JOB_TAINT
Default: `false`
Enables/disables the toleration on the workload job to permit the `workload` taint.

### WORKLOAD_JOB_PRIVILEGED
Default: `false`
Enables/disables running the workload pod as privileged.

### KUBECONFIG_FILE
Default: `~/.kube/config`
Location of the kubeconfig on the orchestration host.

### PBENCH_INSTRUMENTATION
Default: `false`
Enables/disables running the workload wrapped by pbench-user-benchmark. When enabled, pbench agents can then be enabled (`ENABLE_PBENCH_AGENTS`) for further instrumentation data, and pbench-copy-results can be enabled (`ENABLE_PBENCH_COPY`) to export captured data for further analysis.

### ENABLE_PBENCH_AGENTS
Default: `false`
Enables/disables the collection of pbench data on the pbench agent Pods. These Pods are deployed by the tooling playbook.

### ENABLE_PBENCH_COPY
Default: `false`
Enables/disables the copying of pbench data to a remote results server for further analysis.

### PBENCH_SSH_PRIVATE_KEY_FILE
Default: `~/.ssh/id_rsa`
Location of the ssh private key used to authenticate to the pbench results server.

### PBENCH_SSH_PUBLIC_KEY_FILE
Default: `~/.ssh/id_rsa.pub`
Location of the ssh public key used to authenticate to the pbench results server.

### PBENCH_SERVER
Default: There is no public default.
DNS address of the pbench results server.

### SCALE_CI_RESULTS_TOKEN
Default: There is no public default.
Reserved for future use by the pbench and prometheus scraper to place results into a git repo that holds results data.

### JOB_COMPLETION_POLL_ATTEMPTS
Default: `360`
Number of times Ansible polls to check whether the workload job has completed. Polls are 10s apart, with some additional time taken per polling action depending on the orchestration host setup.
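As a rough back-of-envelope, these defaults bound how long the playbook will wait before giving up on the job (ignoring the per-poll overhead mentioned above):

```sh
# Worst-case wait implied by the polling defaults.
attempts=360   # JOB_COMPLETION_POLL_ATTEMPTS default
delay=10       # seconds between polls
echo "max wait: $((attempts * delay))s (~$((attempts * delay / 60)) minutes)"
# prints: max wait: 3600s (~60 minutes)
```

Raise `JOB_COMPLETION_POLL_ATTEMPTS` accordingly when running with a namespace count far above the default, since the job will take proportionally longer.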
### NAMESPACES_PER_CLUSTER_TEST_PREFIX
Default: `namespaces_per_cluster`
Prefix for the pbench results of this test.

### NAMESPACES_PER_CLUSTER_CLEANUP
Default: `true`
Enables/disables cluster loader cleanup of this workload on completion.

### NAMESPACES_PER_CLUSTER_BASENAME
Default: `namespaces-per-cluster`
Basename used by cluster loader for the project(s) it creates.

### NAMESPACES_PER_CLUSTER_COUNT
Default: `1000`
Maximum number of projects that will be created by the Namespaces per cluster workload. Large scale tests typically use much higher values than the default.

### EXPECTED_NAMESPACES_PER_CLUSTER_DURATION
Default: `600`
Pass/fail criteria. Value used to determine whether the Namespaces per cluster workload completed within the expected duration.

## Smoke test variables

```
NAMESPACES_PER_CLUSTER_TEST_PREFIX=namespaces_per_cluster_smoke
NAMESPACES_PER_CLUSTER_CLEANUP=true
NAMESPACES_PER_CLUSTER_BASENAME=namespaces-per-cluster
NAMESPACES_PER_CLUSTER_COUNT=10
```
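After a run, one quick sanity check is counting the projects that carry the basename. A stubbed sketch (the real command would be `oc get projects -o name`; a fixed list stands in here so the filtering is visible, and the project names are hypothetical):

```sh
# Stand-in for `oc get projects -o name` output.
projects="project.project.openshift.io/namespaces-per-cluster-0
project.project.openshift.io/namespaces-per-cluster-1
project.project.openshift.io/default"

# Count projects created by the workload (matches NAMESPACES_PER_CLUSTER_BASENAME).
echo "$projects" | grep -c "namespaces-per-cluster"
# prints 2
```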
workloads/cluster-limits-namespaces-per-cluster.yml

Lines changed: 131 additions & 0 deletions
```yaml
---
#
# Runs Namespaces per cluster limits test on an existing RHCOS cluster
#

- name: Runs Namespaces per cluster on a RHCOS cluster
  hosts: orchestration
  gather_facts: true
  remote_user: "{{orchestration_user}}"
  vars_files:
    - vars/namespaces-per-cluster.yml
  vars:
    workload_job: "namespaces-per-cluster"
  tasks:
    - name: Create scale-ci-tooling directory
      file:
        path: "{{ansible_user_dir}}/scale-ci-tooling"
        state: directory

    - name: Copy workload files
      copy:
        src: "{{item.src}}"
        dest: "{{item.dest}}"
      with_items:
        - src: scale-ci-tooling-ns.yml
          dest: "{{ansible_user_dir}}/scale-ci-tooling/scale-ci-tooling-ns.yml"
        - src: workload-namespaces-per-cluster-script-cm.yml
          dest: "{{ansible_user_dir}}/scale-ci-tooling/workload-namespaces-per-cluster-script-cm.yml"

    - name: Slurp kubeconfig file
      slurp:
        src: "{{kubeconfig_file}}"
      register: kubeconfig_file_slurp

    - name: Slurp ssh private key file
      slurp:
        src: "{{pbench_ssh_private_key_file}}"
      register: pbench_ssh_private_key_file_slurp

    - name: Slurp ssh public key file
      slurp:
        src: "{{pbench_ssh_public_key_file}}"
      register: pbench_ssh_public_key_file_slurp

    - name: Template workload templates
      template:
        src: "{{item.src}}"
        dest: "{{item.dest}}"
      with_items:
        - src: pbench-cm.yml.j2
          dest: "{{ansible_user_dir}}/scale-ci-tooling/pbench-cm.yml"
        - src: pbench-ssh-secret.yml.j2
          dest: "{{ansible_user_dir}}/scale-ci-tooling/pbench-ssh-secret.yml"
        - src: kubeconfig-secret.yml.j2
          dest: "{{ansible_user_dir}}/scale-ci-tooling/kubeconfig-secret.yml"
        - src: workload-job.yml.j2
          dest: "{{ansible_user_dir}}/scale-ci-tooling/workload-job.yml"
        - src: workload-env.yml.j2
          dest: "{{ansible_user_dir}}/scale-ci-tooling/workload-namespaces-per-cluster-env.yml"

    - name: Check if scale-ci-tooling namespace exists
      shell: |
        oc get project scale-ci-tooling
      ignore_errors: true
      changed_when: false
      register: scale_ci_tooling_ns_exists

    # The delete is retried until `oc delete` itself fails, which happens once
    # the job is gone; hence the inverted rc checks below.
    - name: Ensure any stale scale-ci-namespaces-per-cluster job is deleted
      shell: |
        oc delete job scale-ci-namespaces-per-cluster -n scale-ci-tooling
      register: scale_ci_tooling_project
      failed_when: scale_ci_tooling_project.rc == 0
      until: scale_ci_tooling_project.rc == 1
      retries: 60
      delay: 1
      when: scale_ci_tooling_ns_exists.rc == 0

    - name: Block for non-existing tooling namespace
      block:
        - name: Create tooling namespace
          shell: |
            oc create -f {{ansible_user_dir}}/scale-ci-tooling/scale-ci-tooling-ns.yml

        - name: Create tooling service account
          shell: |
            oc create serviceaccount useroot -n scale-ci-tooling
            oc adm policy add-scc-to-user privileged -z useroot -n scale-ci-tooling
          when: enable_pbench_agents|bool
      when: scale_ci_tooling_ns_exists.rc != 0

    - name: Create/replace kubeconfig secret
      shell: |
        oc replace --force -n scale-ci-tooling -f "{{ansible_user_dir}}/scale-ci-tooling/kubeconfig-secret.yml"

    - name: Create/replace the pbench configmap
      shell: |
        oc replace --force -n scale-ci-tooling -f "{{ansible_user_dir}}/scale-ci-tooling/pbench-cm.yml"

    - name: Create/replace pbench ssh secret
      shell: |
        oc replace --force -n scale-ci-tooling -f "{{ansible_user_dir}}/scale-ci-tooling/pbench-ssh-secret.yml"

    - name: Create/replace workload script configmap
      shell: |
        oc replace --force -n scale-ci-tooling -f "{{ansible_user_dir}}/scale-ci-tooling/workload-namespaces-per-cluster-script-cm.yml"

    - name: Create/replace workload script environment configmap
      shell: |
        oc replace --force -n scale-ci-tooling -f "{{ansible_user_dir}}/scale-ci-tooling/workload-namespaces-per-cluster-env.yml"

    - name: Create/replace workload job that runs the workload script
      shell: |
        oc replace --force -n scale-ci-tooling -f "{{ansible_user_dir}}/scale-ci-tooling/workload-job.yml"

    - name: Poll until job pod is running
      shell: |
        oc get pods --selector=job-name=scale-ci-namespaces-per-cluster -n scale-ci-tooling -o json
      register: pod_json
      retries: 60
      delay: 2
      until: pod_json.stdout | from_json | json_query('items[0].status.phase==`Running`')

    - name: Poll until job is complete
      shell: |
        oc get job scale-ci-namespaces-per-cluster -n scale-ci-tooling -o json
      register: job_json
      retries: "{{job_completion_poll_attempts}}"
      delay: 10
      until: job_json.stdout | from_json | json_query('status.succeeded==`1` || status.failed==`1`')
      failed_when: job_json.stdout | from_json | json_query('status.succeeded==`1`') == false
      when: job_completion_poll_attempts|int > 0
```
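The two polling tasks above reduce to a retry loop over `oc get ... -o json`. A stubbed shell equivalent of the completion poll (the `oc` call is replaced by a function returning a fixed JSON string so the loop logic is visible; all names here are hypothetical):

```sh
# Stand-in for `oc get job scale-ci-namespaces-per-cluster -n scale-ci-tooling -o json`.
get_job_json() {
  echo '{"status":{"succeeded":1}}'
}

max_attempts=360   # mirrors JOB_COMPLETION_POLL_ATTEMPTS
attempt=0
while [ "$attempt" -lt "$max_attempts" ]; do
  # Stop polling once the job reports either success or failure.
  if get_job_json | grep -Eq '"(succeeded|failed)": *1'; then
    break
  fi
  attempt=$((attempt + 1))
  sleep 10   # mirrors the task's delay
done

# Succeed/fail on the final state, like the task's failed_when expression.
if get_job_json | grep -q '"succeeded": *1'; then
  echo "job succeeded"
else
  echo "job failed"
fi
```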
