Commit 440ecef

Added Readme for test folder (#1399)
* Added Readme for test folder
* Added test details
* Improved Teams post with logs
* updated readme
* Resolved comments
* fix devskim build
1 parent df933d5 commit 440ecef

5 files changed

+258
-14
lines changed

.github/workflows/devskim.yml

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ on:
1212
jobs:
1313
lint:
1414
name: DevSkim
15-
runs-on: ubuntu-20.04
15+
runs-on: ubuntu-latest
1616
permissions:
1717
actions: read
1818
contents: read

test/README.md

Lines changed: 220 additions & 0 deletions
@@ -0,0 +1,220 @@
1+
# File Directory Structure
2+
```
3+
├── test - e2e test suites to run on clusters. Unit tests are included alongside the golang files.
4+
│ ├── README.md - Info about setting up, writing, and running the tests.
5+
│ ├── containerlog-scale-tests - Contains YAML files for log scale testing and scripts for deployment and cleanup.
6+
│ │ ├── 400logspersec-2klogentrysize.yaml
7+
│ │ ├── 400logspersec-5klogentrysize.yaml
8+
│ │ ├── ci-log-scale-4kpersec-5klogline.yaml
9+
│ │ ├── cleanup.sh
10+
│ │ ├── containerlogv2/ - Subdirectory for container log v2 specific tests.
11+
│ │ ├── deploy.sh
12+
│ │ ├── log-generator-job-app.yaml
13+
│ ├── e2e - End-to-end test configurations and source files for Azure ARC conformance testing.
14+
│ │ ├── conformance.yaml - Configuration and image info for ARC conformance test.
15+
│ │ ├── e2e-tests.yaml - Tests for conformance validation.
16+
│ │ ├── src/ - Source code for e2e tests.
17+
│ ├── fluent-bit-windows - Fluent Bit configuration for Windows.
18+
│ │ ├── fluent-bit-windows.yaml
19+
│ ├── ginkgo-e2e - Ginkgo-based e2e test utilities and configurations.
20+
│ │ ├── containerstatus/ - Test container logs have no errors, containers are running, and all processes are running.
21+
│ │ ├── livenessprobe/ - Test that the pods detect and restart when a process is not running.
22+
│ │ ├── querylogs/ - Test that the data is flowing to log analytics workspace as expected.
23+
│ │ ├── utils/ - Generalized utils functions for the test suites to use.
24+
│ ├── onboarding-templates-legacy-auth - Templates for onboarding with legacy authentication.
25+
│ │ ├── existingClusterOnboarding.json
26+
│ │ ├── existingClusterParam.json
27+
│ ├── prometheus-scraping - Prometheus scraping configurations and reference apps.
28+
│ │ ├── prom-service-for-rs-scraping.yaml
29+
│ │ ├── prometheus-reference-app.yaml
30+
│ │ ├── win-prometheus-ref-app-ltsc2019.yml
31+
│ │ ├── win-prometheus-ref-app-ltsc2022.yml
32+
│ ├── scenario - Scenario-based test configurations and YAML files.
33+
│ │ ├── log-app-win-ltsc2019.yml
34+
│ │ ├── log-app-win-ltsc2022.yml
35+
│ │ ├── log-generator-app.yaml
36+
│ │ ├── multiline/ - Subdirectory for multiline log tests.
37+
│ │ ├── yamls/ - Subdirectory for additional YAML configurations.
38+
│ ├── testkube - Testkube-related configurations and scripts.
39+
│ │ ├── api-server-permissions.yaml - Permissions for the TestKube runner pods to call the API server.
40+
│ │ ├── custom-job-template.yaml - Custom job template which makes sure testkube executors run only on linux nodes.
41+
│ │ ├── executors.json - Testkube executors used for ginkgo in compact json format. The base64 encoded string of this will be used in testkube helm chart.
42+
│ │ ├── helm-testkube-values.yaml - Customized testkube helm chart values which pull the images from MCR and schedule all the pods on linux nodes.
43+
│ │ ├── install-and-execute-testkube-tests.sh - The script used to install and execute testkube tests on a given cluster. This is used in .pipelines\azure_pipeline_testframework.yaml.
44+
│ │ ├── testkube-test-crs.yaml - CRs for TestKube test suites and tests for AKS CI/CD clusters.
45+
│ ├── unit-tests - Unit test drivers and canned API responses.
46+
│ │ ├── canned-api-responses/ - Subdirectory for canned API responses.
47+
│ │ ├── run_go_tests.sh
48+
│ │ ├── run_ruby_tests.sh
49+
│ │ ├── test_driver.rb
50+
```
51+
52+
This document covers the ginkgo-e2e and testkube folders in detail. Documentation for the other folders will be added later.
53+
54+
# Current Tests
55+
- Container Status
56+
- All daemonset pods are scheduled on each node:
57+
- ama-logs
58+
- ama-logs-win for `label=windows`
59+
- Each Container on each pod that we deploy has status `Running`. Pods include:
60+
- ama-logs
61+
- ama-logs-rs
62+
- All expected processes are running on the containers on linux nodes:
63+
- fluent-bit
64+
- fluentd
65+
- mdsd
66+
- telegraf (checked only in the daemonset, where it always runs; in the replicaset it runs only if the agent config is deployed)
67+
- All expected processes are running on the containers on windows nodes:
68+
- fluent-bit
69+
- MonAgentLauncher
70+
- MonAgentHost
71+
- MonAgentManager
72+
- MonAgentCore
73+
- The container logs should not contain any errors.
74+
75+
- Liveness Probe:
76+
- When one of the following processes is not running in the ama-logs or ama-logs-rs containers, the container should restart:
77+
- fluent-bit
78+
- fluentd
79+
- mdsd
80+
- For the Windows ama-logs-windows container, the liveness probe monitors the following processes:
81+
- fluent-bit
82+
- MonAgentLauncher
83+
84+
- Query Logs:
85+
- All tables should have logs in the last 15 minutes (configurable):
86+
- Perf
87+
- InsightsMetrics
88+
- ContainerLog (or ContainerLogV2, if configured)
89+
- ContainerInventory
90+
- ContainerNodeInventory
91+
- KubeNodeInventory
92+
- KubePodInventory
93+
- KubePVInventory
94+
- ContainerInventory should not have any empty values in the following columns:
95+
- Image
96+
- ImageID
97+
- ImageTag
98+
- Repository
99+
- Data for all pods and nodes is flowing to the respective tables:
100+
- Each pod should be present in KubePodInventory
101+
- Each node should be present in KubeNodeInventory
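
The Query Logs checks above can also be spot-checked by hand. The sketch below builds a Kusto query that counts the rows a table received in a recent time window. The helper name `recent_rows_query` is hypothetical and is not part of this repo; the actual tests live in `querylogs/`.

```bash
# Hypothetical helper: print a Kusto query that counts rows written to a
# table within the last N minutes (default 15, matching the README).
recent_rows_query() {
  table=$1
  minutes=${2:-15}
  printf '%s | where TimeGenerated > ago(%sm) | count' "$table" "$minutes"
}

# Print one query per table; table names are taken from the list above.
for table in Perf InsightsMetrics ContainerInventory KubeNodeInventory KubePodInventory; do
  recent_rows_query "$table" 15
  echo
done
```

Each printed query can be pasted into the Log Analytics workspace query editor, or passed to `az monitor log-analytics query --workspace <workspace id> --analytics-query "<query>"`. A count of zero suggests that no data arrived in that window.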
102+
103+
104+
105+
# Ginkgo
106+
Tests are run using the [Ginkgo](https://onsi.github.io/ginkgo/) test framework, which is built on the regular Go test framework. Its advantages are that it:
107+
- Has an easily readable test structure using the `Behavior-Driven Development` model that's used in many languages and is applicable outside of GoLang. This model follows a `Given..., When..., Then...` structure. This is implemented in Ginkgo using the `Describe()`, `Context()`, and `It()`/`Specify()` functions. The Ginkgo documentation on [Writing Specs](https://onsi.github.io/ginkgo/#writing-specs) has many examples of this.
108+
- Utilizes the [Gomega assertion package](https://onsi.github.io/gomega/) for easily understandable test failure errors with the goal that the output will tell you exactly what failed.
109+
- Has good support for parallelization and structuring which tests should be run in series and which can be run at the same time to speed up the tests.
110+
- Has extensive documentation and examples from the OSS community.
111+
112+
Ginkgo can be used for any tests written in golang, whether they are unit, integration, or e2e tests.
113+
114+
## Bootstrap a Dev Cluster to Run Ginkgo Tests
115+
- Follow [this quickstart](https://learn.microsoft.com/en-us/azure/aks/learn/quick-kubernetes-deploy-portal?tabs=azure-cli) to create a test cluster and connect to it.
116+
- Install [Ginkgo](https://onsi.github.io/ginkgo/#getting-started).
117+
- Navigate to any folder in ./ginkgo-e2e and run the `ginkgo` command to trigger the tests on the running cluster.
118+
- Testkube does not need to be installed on the cluster to run the tests locally against it.
119+
- You can customize which tests are run with `--label-filter`:
120+
- `--label-filter='!/./'` is an expression that runs all tests that don't have a label.
121+
- `--label-filter='!/./ || LABELNAME'` is an expression that runs all tests that don't have a label and tests that have the label `LABELNAME`.
122+
- `--label-filter='!(arc-extension,windows)'` is an expression that runs all tests, including those with labels, except for tests labeled `arc-extension` or `windows`.
123+
- To run only one package of tests, add the path to the tests in the command. For example, to only run the livenessprobe tests on your cluster:
124+
```
125+
ginkgo -p -r --keep-going ./livenessprobe
126+
```
127+
- For more uses of the Ginkgo CLI, refer to the [docs](https://onsi.github.io/ginkgo/#ginkgo-cli-overview).
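
The label filter expressions above can be assembled in a small script when several labels need to be included. This is a minimal sketch; the helper name `build_label_filter` is hypothetical, and the label names are the ones mentioned in this README.

```bash
# Hypothetical helper: build a filter that runs all unlabeled tests plus
# the tests carrying any of the labels given as arguments.
build_label_filter() {
  filter='!/./'
  for label in "$@"; do
    filter="$filter || $label"
  done
  printf '%s' "$filter"
}

# Print the resulting command instead of running it, so it can be reviewed first.
echo "ginkgo -p -r --keep-going --label-filter='$(build_label_filter windows)' ./containerstatus"
```

Printing the command rather than executing it keeps the sketch safe to run without a cluster; remove the `echo` wrapper to run the tests.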
128+
129+
130+
#### Packages
131+
- [k8s.io/client-go/kubernetes](https://pkg.go.dev/k8s.io/client-go/kubernetes)
132+
- [k8s.io/api/core/v1](https://pkg.go.dev/k8s.io/api/core/v1)
133+
- [azure-sdk-for-go](https://github.com/Azure/azure-sdk-for-go)
134+
135+
# TestKube
136+
[Testkube](https://docs.testkube.io/) is an OSS runner framework for running the tests inside a Kubernetes cluster. It is deployed as a helm release on the cluster. Ginkgo is included as one of the out-of-the-box executors supported.
137+
138+
Behind the scenes, tests and executors are custom resources. Running a test starts a job that deploys the test executor pod which runs the Ginkgo tests (or a different framework setup).
139+
140+
Some highlights:
141+
- Includes test history, pass rate, and execution times.
142+
- Friendly user interface and easy Golang integration with out-of-the-box Ginkgo runner.
143+
- A [Teams channel notification](https://docs.testkube.io/articles/webhooks#microsoft-teams) can be integrated with testkube to notify when a test fails. Tests can be run after every merge to main or scheduled to run on an interval.
144+
- Test suites can be composed from tests, with a dependency flow that controls whether tests run at the same time, one after another, or only if an earlier test succeeds.
145+
- There are many other test framework integrations including curl and postman for testing Kubernetes services and their APIs. There is also a k6 and jmeter integration for performance testing Kubernetes services.
146+
147+
148+
## Getting Started
149+
- Install the CLI on linux/WSL:
150+
```bash
151+
wget -qO - https://repo.testkube.io/key.pub | sudo apt-key add -
152+
echo "deb https://repo.testkube.io/linux linux main" | sudo tee -a /etc/apt/sources.list
153+
sudo apt-get update
154+
sudo apt-get install -y testkube
155+
```
156+
Other OS installation instructions are [here](https://docs.testkube.io/articles/install-cli/).
157+
- Install the [helm chart](https://docs.testkube.io/articles/helm-chart/) on your cluster:
158+
```bash
159+
cd ./testkube
160+
helm repo add kubeshop https://kubeshop.github.io/helm-charts
161+
helm repo update
162+
helm upgrade --install --create-namespace testkube kubeshop/testkube -n testkube -f ./helm-testkube-values.yaml
163+
```
164+
- The helm chart will install in the namespace `testkube`.
165+
- To uninstall testkube:
166+
```bash
167+
helm uninstall testkube -n testkube
168+
```
169+
- Create a test connected to the GitHub repository and branch. Tests are custom resources behind the scenes and can be created with the CLI or by applying a CR. Create testkube tests/suites on the cluster:
170+
```bash
171+
cd ./testkube
172+
kubectl apply -f testkube-test-crs.yaml
```
173+
- Apply the yaml [api-server-permissions.yaml](./testkube/api-server-permissions.yaml) to update the permissions needed for the Ginkgo executor to be able to make calls to the API server:
174+
```bash
175+
cd ./testkube
176+
kubectl apply -f api-server-permissions.yaml
177+
```
178+
- Run the tests on the cluster using testkube:
179+
```bash
180+
cd ./testkube
181+
kubectl testkube run testsuite <test suite name> --job-template ./custom-job-template.yaml --verbose
182+
```
183+
To run querylogs, update the tenant ID and client ID in testkube-test-crs.yaml with those of a managed identity that has the "Log Analytics Reader" permission, then re-apply the file to the cluster.
184+
The above command returns the execution ID of the running test/suite. You can watch the tests run, or get the logs once the job has completed, using:
185+
```bash
186+
kubectl testkube watch testsuiteexecution $execution_id
187+
188+
kubectl testkube get testsuiteexecution $execution_id
189+
```
190+
191+
## Issues and Fixes for CI/CD Clusters
192+
This section is specific to the CI/CD clusters set up for testing. Testkube installation was failing on these clusters due to the following issues:
193+
1. Azure policy applied on the cluster only allows the images to be pulled from MCR or ACR.
194+
- [Fix]: Pulled all the images used in testkube (the list comes from the helm chart values) and uploaded them to ACR, which internally syncs to MCR. Changed each image's registry and repository so the images are pulled from MCR.
195+
- Some images have a tag in the format `image_tag`, e.g. `mongodb_6.0.5-debian-11-r64`, which means the mongodb image with tag `6.0.5-debian-11-r64` was pulled from the original repo and pushed to ACR. Others use a name like `testkube-api-server`, which means the latest `testkube-api-server` image was pulled and pushed to ACR. The choice follows the tag used in the helm chart values to pull the image from the original repository (i.e. Docker).
196+
2. Testkube pods were getting scheduled on Windows nodes.
197+
- [Fix]: Testkube doesn't support Windows nodes. Used the nodeSelector setting to ensure the pods are always scheduled on linux nodes. Note that 'nats' uses a different format to ensure its pod is scheduled on linux nodes.
198+
3. Testkube executors were getting scheduled on Windows nodes.
199+
- [Fix]: Create a json of all the executors required i.e. init-executor and ginkgo-executor from [supported executors](https://github.com/kubeshop/helm-charts/blob/ed3bf1ca91e7c50f582c8528c2b0531ec3a5e9ef/charts/testkube-api/templates/_executors.json.tpl). Create a base64-encoded string out of it and replace the value of "executors" attribute in the helm chart values.
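
The base64 step in fix 3 can be sketched as follows. The helper name `encode_executors` is hypothetical, and the sketch assumes the compact `executors.json` from the `testkube` folder is in the current directory; it only prints the encoded value and does not modify the helm values file.

```bash
# Hypothetical helper: base64-encode a file as a single line.
# Some base64 implementations wrap long output, so newlines are stripped.
encode_executors() {
  base64 < "$1" | tr -d '\n'
}

# Print the encoded string only when the file is present.
if [ -f ./executors.json ]; then
  encode_executors ./executors.json
  echo
fi
```

The printed string is what would replace the value of the "executors" attribute in `helm-testkube-values.yaml`.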
200+
201+
202+
## Upgrading
203+
### Upgrade Testkube version
204+
1. Connect to the CI/CD cluster so that the kubeconfig in your terminal points to it.
205+
2. Have the latest version of the [TestKube CLI](https://docs.testkube.io/articles/install/cli) installed in your terminal.
206+
3. Export the latest helm chart values file; see the [documentation](https://docs.testkube.io/articles/install/install-with-helm#installing) for details.
207+
4. Pull all the images locally with the tags mentioned in the values file and push them to MCR.
208+
5. Update `helm-testkube-values.yaml` to pull the updated images from MCR.
209+
210+
### Upgrade Golang Version
211+
1. The required Golang version in the `go.mod` files in the `ginkgo-e2e` directory will always need to be `<=` the Golang version of the TestKube Ginkgo runner.
212+
2. Check the Golang version of the TestKube Ginkgo runner in the [Dockerfile](https://github.com/kubeshop/testkube/blob/main/contrib/executor/ginkgo/build/agent/Dockerfile) of the TestKube repo.
213+
3. Update the version in the `go.mod` files.
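
The version constraint in step 1 can be checked with a short script. This is a sketch under stated assumptions: the helper names `gomod_go_version` and `version_lte` are hypothetical, the runner version shown is a placeholder to be replaced with the value read from the runner Dockerfile, and `sort -V` requires GNU coreutils.

```bash
# Hypothetical helper: print the version from the "go" directive of a go.mod file.
gomod_go_version() {
  awk '$1 == "go" { print $2; exit }' "$1"
}

# Hypothetical helper: succeed when version $1 is less than or equal to version $2.
# Relies on version sort (GNU sort -V) placing the smaller version first.
version_lte() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

runner_version="1.21"  # placeholder; read the real value from the runner Dockerfile
if [ -f ./go.mod ]; then
  if version_lte "$(gomod_go_version ./go.mod)" "$runner_version"; then
    echo "go.mod version is compatible with the runner"
  else
    echo "go.mod requires a newer Go than the runner provides"
  fi
fi
```

Run it from a test package directory under `ginkgo-e2e` that contains a `go.mod` file.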
214+
215+
216+
## Creating a New Test or Test Suite
217+
- Follow the [Ginkgo getting-started guide](https://onsi.github.io/ginkgo/#getting-started) to write a new test/suite.
218+
- Any test added inside a test suite will automatically be picked up to run after merging to main.
219+
- Any test suite added should be included in [testkube-test-crs.yaml](./testkube/testkube-test-crs.yaml) that will be applied on the CI/CD clusters.
220+
- Any additional permissions needed for access to the API server should be added to [api-server-permissions.yaml](./testkube/api-server-permissions.yaml).

test/ginkgo-e2e/containerstatus/containerstatus_test.go

Lines changed: 0 additions & 1 deletion
@@ -94,7 +94,6 @@ var _ = DescribeTable("All processes are running",
9494
"MonAgentHost",
9595
"MonAgentManager",
9696
"MonAgentCore",
97-
"telegraf",
9897
},
9998
Label(utils.WindowsLabel),
10099
FlakeAttempts(3),

test/testkube/install-and-execute-testkube-tests.sh

Lines changed: 37 additions & 2 deletions
@@ -13,6 +13,9 @@ do
1313
esac
1414
done
1515

16+
cluster="$(kubectl config current-context)"
17+
echo "Current cluster: $cluster"
18+
1619
echo "Install testkube CLI"
1720
wget -qO - https://repo.testkube.io/key.pub | sudo apt-key add -
1821
echo "deb https://repo.testkube.io/linux linux main" | sudo tee -a /etc/apt/sources.list
@@ -28,8 +31,6 @@ echo "Install testkube CRIs"
2831
export AZURE_CLIENT_ID=$AzureClientId
2932
export AZURE_TENANT_ID=$AzureTenantId
3033
export WEBHOOK_URI=$TeamsWebhookUri
31-
envsubst < ./testkube-teams-integration.yaml > ./testkube-teams-integration-updated.yaml
32-
kubectl apply -f ./testkube-teams-integration-updated.yaml
3334
kubectl apply -f ./api-server-permissions.yaml
3435
envsubst < ./testkube-test-crs.yaml > ./testkube-test-crs-updated.yaml
3536
kubectl apply -f ./testkube-test-crs-updated.yaml
@@ -65,6 +66,40 @@ if [[ $(jq -r '.status' testkube-results.json) == "failed" ]]; then
6566
# Remove superfluous logs of everything before the last occurrence of 'go downloading'.
6667
# The actual errors can be viewed from the ADO run, instead of needing to view the testkube dashboard.
6768
cat error.log | tac | awk '/go: downloading/ {exit} 1' | tac
69+
70+
result=$(cat error.log | tac | awk '/------------------------------/ {exit} 1' | tac | awk '{gsub(/\x1B\[[0-9;]*[mK]/, ""); print}')
71+
72+
payload=$(cat <<EOF
73+
{
74+
"@type": "MessageCard",
75+
"@context": "http://schema.org/extensions",
76+
"themeColor": "0076D7",
77+
"summary": "Test run failed",
78+
"sections": [{
79+
"activityTitle": "Test Execution Failed",
80+
"activitySubtitle": "CI Test Automation",
81+
"activityImage": "https://adaptivecards.io/content/cats/1.png",
82+
"facts": [{
83+
"name": "Cluster",
84+
"value": "**$cluster**"
85+
},{
86+
"name": "Test",
87+
"value": "**$testName**"
88+
}, {
89+
"name": "Execution Id",
90+
"value": "$id"
91+
}, {
92+
"name": "Result",
93+
"value": "$result"
94+
}],
95+
"markdown": true
96+
}]
97+
}
98+
EOF
99+
)
100+
101+
curl -X POST -H "Content-Type: application/json" -d "$payload" "$WEBHOOK_URI"
102+
68103
done
69104

70105
# Explicitly fail the ADO task since at least one test failed
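
The `$result` value in the payload above is interpolated into the JSON as-is, so quotes, backslashes, or newlines in the log tail could produce an invalid request body. A minimal sketch of one way to escape the value first, assuming a hypothetical helper named `json_escape` that is not part of the script in this commit:

```bash
# Hypothetical helper: escape a string for use inside a JSON string literal.
# Handles backslashes, double quotes, and newlines only; other control
# characters (tabs, for example) are not covered by this sketch.
json_escape() {
  printf '%s' "$1" \
    | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' \
    | awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }'
}

# Example: a log line containing quotes becomes safe to embed in the payload.
result='FAIL: expected "Running"'
printf '{"name": "Result", "value": "%s"}\n' "$(json_escape "$result")"
```

In the script, the escaped value would be substituted where `$result` appears in the heredoc.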

test/testkube/testkube-teams-integration.yaml

Lines changed: 0 additions & 10 deletions
This file was deleted.
