
Commit 54308f3

Merge pull request #72 from chaitanyaenr/services_per_ns
Add services per ns cluster limits workload
2 parents b7338dc + 8fec7d9

File tree

6 files changed: +487 −0 lines changed


docs/README.md

Lines changed: 2 additions & 0 deletions

@@ -15,6 +15,7 @@
 | [PVCscale](pvscale.md) | PVCScale test | Working storageclass |
 | [Conformance](conformance.md) | OCP/Kubernetes e2e tests | None |
 | [Namespaces per cluster](namespaces-per-cluster.md) | Maximum Namespaces | None |
+| [Services per namespace](services-per-namespace.md) | Maximum services per namespace | None |

 * Baseline job without a tooled cluster just idles a cluster. The goal is to capture resource consumption over a period of time to characterize resource requirements thus tooling is required. (For now)

@@ -45,3 +46,4 @@ Each workload will implement a form of pass/fail criteria in order to flag if th
 | [PVCscale](pvscale.md) | No |
 | [Conformance](conformance.md) | No |
 | [Namespaces per cluster](namespaces-per-cluster.md) | Yes: Exit code, Test Duration |
+| [Services per namespace](services-per-namespace.md) | Yes: Exit code, Test Duration |

docs/services-per-namespace.md

Lines changed: 114 additions & 0 deletions

# Services per namespace Workload

The services per namespace workload playbook is `workloads/cluster-limits-services-per-namespace.yml` and will run the services per namespace workload on your cluster.

Services per namespace is a cluster-limits-focused test which creates the maximum possible number of services in each namespace.

Running from CLI:

```sh
$ cp workloads/inventory.example inventory
$ # Add orchestration host to inventory
$ # Edit vars in workloads/vars/services-per-namespace.yml or define Environment vars (See below)
$ time ansible-playbook -vv -i inventory workloads/cluster-limits-services-per-namespace.yml
```

## Environment variables

### PUBLIC_KEY
Default: `~/.ssh/id_rsa.pub`
Public ssh key file for Ansible.

### PRIVATE_KEY
Default: `~/.ssh/id_rsa`
Private ssh key file for Ansible.

### ORCHESTRATION_USER
Default: `root`
User for Ansible to log in as. Must authenticate with PUBLIC_KEY/PRIVATE_KEY.

### WORKLOAD_IMAGE
Default: `quay.io/openshift-scale/scale-ci-workload`
Container image that runs the workload script.

### WORKLOAD_JOB_NODE_SELECTOR
Default: `false`
Enables/disables the node selector that places the workload job on the `workload` node.

### WORKLOAD_JOB_TAINT
Default: `false`
Enables/disables the toleration on the workload job to permit the `workload` taint.

### WORKLOAD_JOB_PRIVILEGED
Default: `false`
Enables/disables running the workload pod as privileged.

### KUBECONFIG_FILE
Default: `~/.kube/config`
Location of the kubeconfig on the orchestration host.

### PBENCH_INSTRUMENTATION
Default: `false`
Enables/disables wrapping the workload in pbench-user-benchmark. When enabled, pbench agents can also be enabled (`ENABLE_PBENCH_AGENTS`) for further instrumentation data, and pbench-copy-results can be enabled (`ENABLE_PBENCH_COPY`) to export the captured data for further analysis.

### ENABLE_PBENCH_AGENTS
Default: `false`
Enables/disables the collection of pbench data on the pbench agent Pods. These Pods are deployed by the tooling playbook.

### ENABLE_PBENCH_COPY
Default: `false`
Enables/disables copying pbench data to a remote results server for further analysis.

### PBENCH_SSH_PRIVATE_KEY_FILE
Default: `~/.ssh/id_rsa`
Location of the ssh private key used to authenticate to the pbench results server.

### PBENCH_SSH_PUBLIC_KEY_FILE
Default: `~/.ssh/id_rsa.pub`
Location of the ssh public key used to authenticate to the pbench results server.

### PBENCH_SERVER
Default: There is no public default.
DNS address of the pbench results server.

### SCALE_CI_RESULTS_TOKEN
Default: There is no public default.
Reserved for future use: allows the pbench and prometheus scraper to place results into the git repo that holds results data.

### JOB_COMPLETION_POLL_ATTEMPTS
Default: `360`
Number of retries for Ansible to poll whether the workload job has completed. Polls are 10s apart, with some additional time taken by each polling action depending on the orchestration host setup.
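The upper bound this places on the total polling time is simple arithmetic: attempts multiplied by the 10s delay, ignoring the per-poll overhead noted above. A quick sketch with the default value:

```sh
# Worst-case wait for job completion with the default settings:
# JOB_COMPLETION_POLL_ATTEMPTS retries at a 10s delay each.
attempts=360
delay=10
echo "$(( attempts * delay ))s"   # 3600s, i.e. about one hour
```

Raise `JOB_COMPLETION_POLL_ATTEMPTS` accordingly for large-scale runs that are expected to exceed an hour.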
### SERVICES_PER_NAMESPACE_TEST_PREFIX
Default: `services_per_namespace`
Prefix used for the pbench results of this test.

### SERVICES_PER_NAMESPACE_CLEANUP
Default: `true`
Enables/disables cluster loader cleanup of this workload on completion.

### SERVICES_PER_NAMESPACE_BASENAME
Default: `services-per-namespace`
Basename used by cluster loader for the project(s) it creates.

### SERVICES_PER_NAMESPACE_PROJECTS
Default: `2`
Maximum number of projects created by the services per namespace workload. Large-scale tests typically use much higher values than the default.

### SERVICES_PER_NAMESPACE_COUNT
Default: `5000`
Maximum number of services per namespace created by the services per namespace workload.

### EXPECTED_SERVICES_PER_NAMESPACE_DURATION
Default: `600`
Pass/fail criteria. Value used to determine whether the services per namespace workload executed in the expected duration.

## Smoke test variables

```
SERVICES_PER_NAMESPACE_TEST_PREFIX=services_per_namespace_smoke
SERVICES_PER_NAMESPACE_CLEANUP=true
SERVICES_PER_NAMESPACE_BASENAME=services-per-namespace
SERVICES_PER_NAMESPACE_PROJECTS=2
SERVICES_PER_NAMESPACE_COUNT=100
```
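The smoke-test values can be exported as environment variables before invoking the playbook in the usual way; a sketch (the final invocation, left commented here, is the same one shown under "Running from CLI"):

```sh
# Export the smoke-test values from the list above, then run the
# playbook exactly as in the "Running from CLI" section.
export SERVICES_PER_NAMESPACE_TEST_PREFIX=services_per_namespace_smoke
export SERVICES_PER_NAMESPACE_CLEANUP=true
export SERVICES_PER_NAMESPACE_BASENAME=services-per-namespace
export SERVICES_PER_NAMESPACE_PROJECTS=2
export SERVICES_PER_NAMESPACE_COUNT=100
# time ansible-playbook -vv -i inventory workloads/cluster-limits-services-per-namespace.yml
```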
Lines changed: 131 additions & 0 deletions

---
#
# Runs services per namespace limits test on an existing OCP cluster
#

- name: Runs services per namespace on a RHCOS cluster
  hosts: orchestration
  gather_facts: true
  remote_user: "{{orchestration_user}}"
  vars_files:
    - vars/services-per-namespace.yml
  vars:
    workload_job: "services-per-namespace"
  tasks:
    - name: Create scale-ci-tooling directory
      file:
        path: "{{ansible_user_dir}}/scale-ci-tooling"
        state: directory

    - name: Copy workload files
      copy:
        src: "{{item.src}}"
        dest: "{{item.dest}}"
      with_items:
        - src: scale-ci-tooling-ns.yml
          dest: "{{ansible_user_dir}}/scale-ci-tooling/scale-ci-tooling-ns.yml"
        - src: workload-services-per-namespace-script-cm.yml
          dest: "{{ansible_user_dir}}/scale-ci-tooling/workload-services-per-namespace-script-cm.yml"

    - name: Slurp kubeconfig file
      slurp:
        src: "{{kubeconfig_file}}"
      register: kubeconfig_file_slurp

    - name: Slurp ssh private key file
      slurp:
        src: "{{pbench_ssh_private_key_file}}"
      register: pbench_ssh_private_key_file_slurp

    - name: Slurp ssh public key file
      slurp:
        src: "{{pbench_ssh_public_key_file}}"
      register: pbench_ssh_public_key_file_slurp

    - name: Template workload templates
      template:
        src: "{{item.src}}"
        dest: "{{item.dest}}"
      with_items:
        - src: pbench-cm.yml.j2
          dest: "{{ansible_user_dir}}/scale-ci-tooling/pbench-cm.yml"
        - src: pbench-ssh-secret.yml.j2
          dest: "{{ansible_user_dir}}/scale-ci-tooling/pbench-ssh-secret.yml"
        - src: kubeconfig-secret.yml.j2
          dest: "{{ansible_user_dir}}/scale-ci-tooling/kubeconfig-secret.yml"
        - src: workload-job.yml.j2
          dest: "{{ansible_user_dir}}/scale-ci-tooling/workload-job.yml"
        - src: workload-env.yml.j2
          dest: "{{ansible_user_dir}}/scale-ci-tooling/workload-services-per-namespace-env.yml"

    - name: Check if scale-ci-tooling namespace exists
      shell: |
        oc get project scale-ci-tooling
      ignore_errors: true
      changed_when: false
      register: scale_ci_tooling_ns_exists

    - name: Ensure any stale scale-ci-services-per-namespace job is deleted
      shell: |
        oc delete job scale-ci-services-per-namespace -n scale-ci-tooling
      register: scale_ci_tooling_project
      failed_when: scale_ci_tooling_project.rc == 0
      until: scale_ci_tooling_project.rc == 1
      retries: 60
      delay: 1
      when: scale_ci_tooling_ns_exists.rc == 0

    - name: Block for non-existing tooling namespace
      block:
        - name: Create tooling namespace
          shell: |
            oc create -f {{ansible_user_dir}}/scale-ci-tooling/scale-ci-tooling-ns.yml

        - name: Create tooling service account
          shell: |
            oc create serviceaccount useroot -n scale-ci-tooling
            oc adm policy add-scc-to-user privileged -z useroot -n scale-ci-tooling
          when: enable_pbench_agents|bool
      when: scale_ci_tooling_ns_exists.rc != 0

    - name: Create/replace kubeconfig secret
      shell: |
        oc replace --force -n scale-ci-tooling -f "{{ansible_user_dir}}/scale-ci-tooling/kubeconfig-secret.yml"

    - name: Create/replace the pbench configmap
      shell: |
        oc replace --force -n scale-ci-tooling -f "{{ansible_user_dir}}/scale-ci-tooling/pbench-cm.yml"

    - name: Create/replace pbench ssh secret
      shell: |
        oc replace --force -n scale-ci-tooling -f "{{ansible_user_dir}}/scale-ci-tooling/pbench-ssh-secret.yml"

    - name: Create/replace workload script configmap
      shell: |
        oc replace --force -n scale-ci-tooling -f "{{ansible_user_dir}}/scale-ci-tooling/workload-services-per-namespace-script-cm.yml"

    - name: Create/replace workload script environment configmap
      shell: |
        oc replace --force -n scale-ci-tooling -f "{{ansible_user_dir}}/scale-ci-tooling/workload-services-per-namespace-env.yml"

    - name: Create/replace workload job that runs the workload script
      shell: |
        oc replace --force -n scale-ci-tooling -f "{{ansible_user_dir}}/scale-ci-tooling/workload-job.yml"

    - name: Poll until job pod is running
      shell: |
        oc get pods --selector=job-name=scale-ci-services-per-namespace -n scale-ci-tooling -o json
      register: pod_json
      retries: 60
      delay: 2
      until: pod_json.stdout | from_json | json_query('items[0].status.phase==`Running`')

    - name: Poll until job is complete
      shell: |
        oc get job scale-ci-services-per-namespace -n scale-ci-tooling -o json
      register: job_json
      retries: "{{job_completion_poll_attempts}}"
      delay: 10
      until: job_json.stdout | from_json | json_query('status.succeeded==`1` || status.failed==`1`')
      failed_when: job_json.stdout | from_json | json_query('status.succeeded==`1`') == false
      when: job_completion_poll_attempts|int > 0
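The final task's `until`/`failed_when` expressions apply a `json_query` (JMESPath) filter to the Job's status: the loop stops once either `status.succeeded` or `status.failed` reaches 1, and the task fails unless it was `succeeded`. The completion check is equivalent to the following sketch, with python3 standing in for the JMESPath filter on a hypothetical `oc get job ... -o json` payload:

```sh
# Hypothetical `oc get job ... -o json` output for a finished job.
job_json='{"status":{"succeeded":1}}'

# Same condition as the playbook's `until` clause: complete when
# status.succeeded == 1 or status.failed == 1.
complete=$(printf '%s' "$job_json" | python3 -c '
import json, sys
status = json.load(sys.stdin).get("status", {})
print("yes" if status.get("succeeded") == 1 or status.get("failed") == 1 else "no")
')
echo "$complete"   # yes
```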
