
Commit e10a1c9

Merge pull request #61653 from ShaunaDiaz/OSDOCS-5631
OSDOCS-5631: Greenboot app health check script tutorial
2 parents 99426be + 3a42a7d commit e10a1c9

6 files changed

+231
-44
lines changed

_topic_maps/_topic_map_ms.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -151,6 +151,8 @@ Topics:
151151
File: microshift-applications
152152
- Name: Operators
153153
File: microshift-operators
154+
- Name: Greenboot workload health check scripts
155+
File: microshift-greenboot-workload-scripts
154156
---
155157
Name: Troubleshooting
156158
Dir: microshift_troubleshooting
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
1+
:_content-type: ASSEMBLY
2+
[id="microshift-greenboot-workload-scripts"]
3+
= Greenboot workload health check scripts
4+
include::_attributes/attributes-microshift.adoc[]
5+
:context: microshift-greenboot-workload-scripts
6+
7+
toc::[]
8+
9+
Greenboot health check scripts are helpful on edge devices where direct serviceability is either limited or non-existent. If you installed the `microshift-greenboot` RPM package, you can also create health check scripts to assess the health of your workloads and applications. These additional health check scripts are useful components of software problem checks and automatic system rollbacks.
10+
11+
A {product-title} health check script is included in the `microshift-greenboot` RPM. You can also create your own health check scripts based on the workloads you are running. For example, you can write one that verifies that a service has started.
12+
13+
include::modules/microshift-greenboot-how-workload-health-check-scripts-work.adoc[leveloffset=+1]
14+
15+
include::modules/microshift-greenboot-included-health-checks.adoc[leveloffset=+1]
16+
17+
include::modules/microshift-greenboot-create-health-check-script.adoc[leveloffset=+1]
18+
19+
include::modules/microshift-greenboot-testing-workload-script.adoc[leveloffset=+1]
20+
21+
[id="additional-resources_microshift-greenboot-workload-scripts"]
22+
[role="_additional-resources"]
23+
.Additional resources
24+
* xref:../microshift_install/microshift-greenboot.adoc#microshift-greenboot[The greenboot health check]
25+
* xref:../microshift_running_apps/microshift-applications.adoc#microshift-manifests-example_applications-microshift[Auto applying manifests]
Lines changed: 101 additions & 44 deletions
@@ -1,57 +1,114 @@
1-
// Module included in the following assemblies:
1+
//Updated title and ID:
2+
//Module included in the following assemblies:
23
//
3-
// * microshift_running applications/microshift-greenboot.adoc
4+
//* microshift_running_apps/microshift-greenboot-workload-scripts.adoc
45

5-
:_content-type: PROCEDURE
6-
[id="microshift-greenboot-create-health-check-script_{context}"]
7-
= Creating a health check script
6+
:_content-type: CONCEPT
7+
[id="microshift-greenboot-app-health-check-script_{context}"]
8+
= How to create a health check script for your application
89

9-
You can create a health check script for installed workloads by placing them in the `/etc/greenboot/check/required.d` directory. The following procedure provides an example of installing the busybox application and creating a health check script for busybox. You can use this example as a general guide for creating health check scripts for your applications.
10+
You can create workload or application health check scripts in the text editor of your choice using the example in this documentation. Save the scripts in the `/etc/greenboot/check/required.d` directory. When a script in the `/etc/greenboot/check/required.d` directory exits with an error, greenboot triggers a reboot in an attempt to heal the system.
1011

11-
.Prerequisite
12+
[NOTE]
13+
====
14+
Any script in the `/etc/greenboot/check/required.d` directory triggers a reboot if it exits with an error.
15+
====
1216

13-
* You have installed a workload. For this example, the busybox application is used as a workload. The "Additional resources" section that follows this procedure has a link to instructions on deploying workloads using manifests.
17+
If your health check logic requires any post-check steps, you can also create additional scripts and save them in the relevant greenboot directories. For example:
1418

15-
.Procedure
19+
* You can also place shell scripts you want to run after a boot has been declared successful in `/etc/greenboot/green.d`.
20+
* You can place shell scripts you want to run after a boot has been declared failed in `/etc/greenboot/red.d`. For example, if you have steps to heal the system before restarting, you can create scripts for your use case and place them in the `/etc/greenboot/red.d` directory.
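As a minimal sketch of such a post-check script, the following hypothetical `red.d` script appends a timestamped line to a log file when a boot is declared failed. The file name `50_myapp_record_failure.sh`, the `record_failed_boot` function, and the log path are assumptions made for illustration; they are not part of greenboot or {product-title}.

```bash
#!/bin/bash
# Hypothetical /etc/greenboot/red.d/50_myapp_record_failure.sh (illustrative only).
set -u

# Log destination. A temporary path is the default so that the sketch runs
# without root access; choose a persistent path on a real device.
LOG_FILE="${LOG_FILE:-${TMPDIR:-/tmp}/myapp-greenboot.log}"

# Append one timestamped line describing the failed boot.
record_failed_boot() {
    printf '%s boot declared failed by greenboot\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" >> "${LOG_FILE}"
}

record_failed_boot
```

A script like this must be executable, and it runs only after greenboot has declared the boot failed, so it should finish quickly and avoid depending on the workload that failed.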
1621
17-
. To create a health check script, run the following command:
18-
+
19-
[source, terminal]
20-
----
21-
$ SCRIPT_FILE=/etc/greenboot/check/required.d/50_busybox_running_check.sh
22-
sudo curl -s https://raw.githubusercontent.com/openshift/microshift/3b7f6025cd77bd1bf827416fd026783ead82b7c8/docs/config/busybox_running_check.sh \
23-
-o ${SCRIPT_FILE} && echo SUCCESS || echo ERROR
24-
sudo chmod 755 ${SCRIPT_FILE}
25-
----
26-
+
27-
In this example, the script verifies that busybox is running as expected. You can replace `/etc/greenboot/check/required.d/50_busybox_running_check.sh` with your own workload details.
28-
+
29-
[NOTE]
22+
[id="microshift-greenboot-about-workload-health-check-script-example_{context}"]
23+
== About the workload health check script example
24+
25+
The following example uses the {product-title} health check script as a template. You can use this example with the provided libraries as a guide for creating basic health check scripts for your applications.
26+
27+
[id="microshift-greenboot-app-health-check-basic-prereqs_{context}"]
28+
=== Basic prerequisites for creating a health check script
29+
30+
* The workload must be installed.
31+
* You must have root access.
32+
33+
[id="microshift-greenboot-app-health-check-ex-reqs_{context}"]
34+
=== Example and functional requirements
35+
36+
You can start with the following example health check script. Modify it for your use case. In your workload health check script, you must complete the following minimum steps:
37+
38+
* Set the environment variables.
39+
* Define the user workload namespaces.
40+
* List the expected pod count.
41+
42+
[IMPORTANT]
3043
====
31-
In this example, the {product-title} core service health checks run before the user workload health checks.
44+
Choose a name prefix for your application that ensures it runs after the `40_microshift_running_check.sh` script, which implements the {product-title} health check procedure for its core services.
3245
====
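To illustrate the naming advice, the following sketch lists two script names in the order that shell glob expansion returns them. It assumes that the scripts run in sorted file-name order, as the prefix guidance implies. The name `50_myapp_running_check.sh` is hypothetical, and a temporary directory stands in for a greenboot check directory.

```bash
#!/bin/bash
# Temporary stand-in for a greenboot check directory, so no root access is needed.
CHECK_DIR=$(mktemp -d)
touch "${CHECK_DIR}/50_myapp_running_check.sh" \
      "${CHECK_DIR}/40_microshift_running_check.sh"

# Glob expansion is sorted, so the 40_ script is listed before the 50_ script.
ORDERED=()
for script in "${CHECK_DIR}"/*.sh; do
    ORDERED+=("$(basename "${script}")")
done
printf '%s\n' "${ORDERED[@]}"
```

Any two-digit prefix greater than `40` gives the same ordering in this sketch.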
3346

34-
. To test that your script is running as expected:
35-
36-
.. Restart the system.
47+
.Example workload health check script
3748

38-
.. Once the system has restarted, run the following command:
39-
+
40-
[source, terminal]
41-
----
42-
$ sudo journalctl -o cat -u greenboot-healthcheck.service
49+
[source, bash]
4350
----
44-
+
45-
.Example output for the busybox health check script
46-
+
47-
[source, terminal]
48-
----
49-
...
50-
...
51-
STARTED
52-
Waiting 300s for pod image(s) from the 'busybox' namespace to be downloaded
53-
Waiting 300s for 1 pod(s) from the 'busybox' namespace to be in 'Ready' state
54-
Checking pod restart count in the 'busybox' namespace
55-
FINISHED
56-
Script '50_busybox_running_check.sh' SUCCESS
51+
#!/bin/bash
52+
set -e
53+
54+
SCRIPT_NAME=$(basename $0)
55+
PODS_NS_LIST=(<user_workload_namespace1> <user_workload_namespace2>)
56+
PODS_CT_LIST=(<user_workload_namespace1_pod_count> <user_workload_namespace2_pod_count>)
57+
# Update the two preceding lines with at least one namespace and the matching pod count for each of your workloads. Use the Kubernetes <namespace> where your workload is deployed.
58+
59+
# Set greenboot to read and execute the workload health check functions library.
60+
source /usr/share/microshift/functions/greenboot.sh
61+
62+
# Set the exit handler to log the exit status.
63+
trap 'script_exit' EXIT
64+
65+
# Set the script exit handler to log a `FAILURE` or `FINISHED` message depending on the exit status of the last command.
66+
# args: None
67+
# return: None
68+
function script_exit() {
69+
[ "$?" -ne 0 ] && status=FAILURE || status=FINISHED
70+
echo $status
71+
}
72+
73+
# Set the system to automatically stop the script if the user running it is not 'root'.
74+
if [ $(id -u) -ne 0 ] ; then
75+
echo "The '${SCRIPT_NAME}' script must be run with the 'root' user privileges"
76+
exit 1
77+
fi
78+
79+
echo "STARTED"
80+
81+
# Set the script to stop without reporting an error if the MicroShift service is not running.
82+
if [ "$(systemctl is-enabled microshift.service 2>/dev/null)" != "enabled" ] ; then
83+
echo "MicroShift service is not enabled. Exiting..."
84+
exit 0
85+
fi
86+
87+
# Set the wait timeout for the current check based on the boot counter.
88+
WAIT_TIMEOUT_SECS=$(get_wait_timeout)
89+
90+
# Set the script to wait for the pod images to be downloaded.
91+
for i in ${!PODS_NS_LIST[@]}; do
92+
CHECK_PODS_NS=${PODS_NS_LIST[$i]}
93+
94+
echo "Waiting ${WAIT_TIMEOUT_SECS}s for pod image(s) from the ${CHECK_PODS_NS} namespace to be downloaded"
95+
wait_for ${WAIT_TIMEOUT_SECS} namespace_images_downloaded
96+
done
97+
98+
# Set the script to wait for pods to enter ready state.
99+
for i in ${!PODS_NS_LIST[@]}; do
100+
CHECK_PODS_NS=${PODS_NS_LIST[$i]}
101+
CHECK_PODS_CT=${PODS_CT_LIST[$i]}
102+
103+
echo "Waiting ${WAIT_TIMEOUT_SECS}s for ${CHECK_PODS_CT} pod(s) from the ${CHECK_PODS_NS} namespace to be in 'Ready' state"
104+
wait_for ${WAIT_TIMEOUT_SECS} namespace_pods_ready
105+
done
106+
107+
# Verify that pods are not restarting, which could indicate a crash loop.
108+
for i in ${!PODS_NS_LIST[@]}; do
109+
CHECK_PODS_NS=${PODS_NS_LIST[$i]}
110+
111+
echo "Checking pod restart count in the ${CHECK_PODS_NS} namespace"
112+
namespace_pods_not_restarting ${CHECK_PODS_NS}
113+
done
57114
----
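Because the example script reads `PODS_NS_LIST` and `PODS_CT_LIST` by the same index, the two arrays must have the same length. The following sketch shows filled-in values and a hypothetical `validate_workload_lists` helper that checks this. The `myapp` and `mydb` namespaces, the pod counts, and the helper itself are assumptions for illustration and are not part of the provided library.

```bash
#!/bin/bash
# Hypothetical namespaces and expected pod counts, paired by index.
PODS_NS_LIST=(myapp mydb)
PODS_CT_LIST=(2 1)

# Return 0 only if both arrays are non-empty and have the same number of entries.
validate_workload_lists() {
    [ "${#PODS_NS_LIST[@]}" -gt 0 ] &&
        [ "${#PODS_NS_LIST[@]}" -eq "${#PODS_CT_LIST[@]}" ]
}

if validate_workload_lists; then
    for i in "${!PODS_NS_LIST[@]}"; do
        echo "namespace=${PODS_NS_LIST[$i]} expected_pods=${PODS_CT_LIST[$i]}"
    done
else
    echo "PODS_NS_LIST and PODS_CT_LIST must be non-empty and the same length" >&2
fi
```

A check like this could run before the wait loops, so that a mismatched configuration is reported clearly instead of surfacing as an empty pod count.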
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
1+
//Module included in the following assemblies:
2+
//
3+
//* microshift_running_apps/microshift-greenboot-workload-scripts.adoc
4+
5+
:_content-type: CONCEPT
6+
[id="microshift-greenboot-how-workload-health-check-scripts-work_{context}"]
7+
= How workload health check scripts work
8+
9+
The workload or application health check script described in this tutorial uses the {product-title} health check functions that are available in the `/usr/share/microshift/functions/greenboot.sh` file. This enables you to reuse procedures already implemented for the {product-title} core services.
10+
11+
The script starts by checking that the basic functions of the workload are operating as expected. To run the script successfully:
12+
13+
* Execute the script from a root user account.
14+
* Enable the {product-title} service.
15+
16+
The health check performs the following actions:
17+
18+
* Gets a wait timeout of the current boot cycle for the `wait_for` function.
19+
* Calls the `namespace_images_downloaded` function to wait until pod images are available.
20+
* Calls the `namespace_pods_ready` function to wait until pods are ready.
21+
* Calls the `namespace_pods_not_restarting` function to verify pods are not restarting.
22+
23+
[NOTE]
24+
====
25+
Restarting pods can indicate a crash loop.
26+
====
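The wait-and-retry pattern behind these steps can be sketched as follows. This is not the implementation in `greenboot.sh`. `retry_until` is a hypothetical helper written for illustration, which reruns a command about once per second until the command succeeds or a timeout in seconds expires.

```bash
#!/bin/bash
# Hypothetical helper (not the library's wait_for): retry a command until it
# succeeds or the timeout, given in seconds, has elapsed.
# args: timeout_seconds command [command_args...]
# return: 0 if the command succeeded in time, 1 otherwise
retry_until() {
    local timeout=$1
    shift
    local start=${SECONDS}
    while ! "$@"; do
        if [ $(( SECONDS - start )) -ge "${timeout}" ]; then
            return 1
        fi
        sleep 1
    done
    return 0
}

# Stand-in check that succeeds on its third call.
ATTEMPTS=0
third_time_lucky() {
    ATTEMPTS=$(( ATTEMPTS + 1 ))
    [ "${ATTEMPTS}" -ge 3 ]
}

retry_until 10 third_time_lucky && echo "ready after ${ATTEMPTS} attempts"
```

In a real health check, the stand-in function would be replaced by a check such as a pod readiness query, and the timeout would come from the boot-cycle-dependent value described above.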
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
1+
//Module included in the following assemblies:
2+
//
3+
//* microshift_running_apps/microshift-greenboot-workload-scripts.adoc
4+
5+
:_content-type: CONCEPT
6+
[id="microshift-greenboot-included-health-checks_{context}"]
7+
= Included greenboot health checks
8+
9+
Health check scripts are available in `/usr/lib/greenboot/check`, a read-only directory in RPM-OSTree systems. The following health checks are included with the `greenboot-default-health-checks` framework.
10+
11+
* Check if repository URLs are still DNS resolvable:
12+
+
13+
This script is under `/usr/lib/greenboot/check/required.d/01_repository_dns_check.sh` and ensures that DNS queries to repository URLs are still available.
14+
15+
* Check if update platforms are still reachable:
16+
+
17+
This script is under `/usr/lib/greenboot/check/wanted.d/01_update_platform_check.sh` and tries to connect and get a 2XX or 3XX HTTP code from the update platforms defined in `/etc/ostree/remotes.d`.
18+
19+
* Check if the current boot has been triggered by the hardware watchdog:
20+
+
21+
This script is under `/usr/lib/greenboot/check/required.d/02_watchdog.sh` and checks whether the current boot has been watchdog-triggered or not.
22+
23+
** If the watchdog-triggered reboot occurs within the grace period, the current boot is marked as red. Greenboot does not trigger a rollback to the previous deployment.
24+
** If the watchdog-triggered reboot occurs after the grace period, the current boot is not marked as red. Greenboot does not trigger a rollback to the previous deployment.
25+
** A 24-hour grace period is enabled by default. You can disable this grace period by setting `GREENBOOT_WATCHDOG_CHECK_ENABLED` to `false` in `/etc/greenboot/greenboot.conf`, or change its length by setting the `GREENBOOT_WATCHDOG_GRACE_PERIOD=number_of_hours` variable value in `/etc/greenboot/greenboot.conf`.
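
For example, a `/etc/greenboot/greenboot.conf` file that keeps the watchdog check enabled but shortens the grace period might contain the following lines. The value `12` is an arbitrary illustration, not a recommendation.

```bash
# /etc/greenboot/greenboot.conf (fragment; the 12-hour value is an assumed example)
GREENBOOT_WATCHDOG_CHECK_ENABLED=true
GREENBOOT_WATCHDOG_GRACE_PERIOD=12
```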
Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
1+
//Module included in the following assemblies:
2+
//
3+
//* microshift_running_apps/microshift-greenboot-workload-scripts.adoc
4+
5+
:_content-type: PROCEDURE
6+
[id="microshift-greenboot-test-workload-health-check-script_{context}"]
7+
= Testing a workload health check script
8+
9+
.Prerequisites
10+
11+
* You have root access.
12+
* You have installed a workload.
13+
* You have created a health check script for the workload.
14+
* The {product-title} service is enabled.
15+
16+
.Procedure
17+
18+
. To test that greenboot is running a health check script file, reboot the host by running the following command:
19+
+
20+
[source, terminal]
21+
----
22+
$ sudo reboot
23+
----
24+
25+
. Examine the output of greenboot health checks by running the following command:
26+
+
27+
[source, terminal]
28+
----
29+
$ sudo journalctl -o cat -u greenboot-healthcheck.service
30+
----
31+
+
32+
[NOTE]
33+
====
34+
{product-title} core service health checks run before the workload health checks.
35+
====
36+
+
37+
.Example output
38+
39+
[source, terminal]
40+
----
41+
GRUB boot variables:
42+
boot_success=0
43+
boot_indeterminate=0
44+
Greenboot variables:
45+
GREENBOOT_WATCHDOG_CHECK_ENABLED=true
46+
...
47+
...
48+
FINISHED
49+
Script '40_microshift_running_check.sh' SUCCESS
50+
Running Wanted Health Check Scripts...
51+
Finished greenboot Health Checks Runner.
52+
----
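To focus on the per-script results in a long journal, you can filter for the `Script '...'` lines that appear in the example output. The following sketch defines a hypothetical `summarize_results` helper that reads journal text on standard input. It assumes that every result line starts with `Script '` and that successful results end with `SUCCESS`; the format of other result lines is not assumed.

```bash
#!/bin/bash
# Hypothetical helper: print the per-script result lines found in greenboot
# journal text on stdin, then report how many of them do not end with SUCCESS.
summarize_results() {
    local results not_ok
    results=$(grep "^Script '" || true)
    [ -n "${results}" ] && printf '%s\n' "${results}"
    not_ok=$(printf '%s\n' "${results}" | grep -c -v -e 'SUCCESS$' -e '^$' || true)
    echo "scripts not reporting SUCCESS: ${not_ok}"
}

# Sample input taken from the example output. On a device, pipe in the output of
# `sudo journalctl -o cat -u greenboot-healthcheck.service` instead.
SUMMARY=$(printf '%s\n' "FINISHED" \
    "Script '40_microshift_running_check.sh' SUCCESS" | summarize_results)
echo "${SUMMARY}"
```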
