202 changes: 67 additions & 135 deletions Runner/suites/Kernel/Baseport/Reboot_health_check/Readme.md
@@ -1,139 +1,71 @@
Overview

This script automates a full reboot validation and health check for any embedded Linux system.
After each reboot, it verifies that:

The system boots correctly to a shell

Key directories (/proc, /sys, /tmp, /dev) are available

The kernel version is accessible

The networking stack is functional


It supports auto-retry on failures, with a configurable maximum number of retries.

It has no dependency on cron, systemd, or Yocto specifics, so it is portable.


---

Features

Automatic setup of a temporary boot hook

Reboot and post-boot health validations

Detailed logs with PASS/FAIL results

Auto-retry mechanism up to a configurable limit

Safe cleanup of temp files and hooks after success or failure

Color-coded outputs for easy reading

Lightweight and BusyBox compatible



---

Usage

Step 1: Copy the script to your device

scp reboot_health_check_autoretry.sh root@<device_ip>:/tmp/

Step 2: Make it executable

chmod +x /tmp/reboot_health_check_autoretry.sh

Step 3: Run the script

/tmp/reboot_health_check_autoretry.sh

The script will automatically:

Create a flag and copy itself so that it survives the reboot

Set up a temporary /etc/init.d/ hook

Force reboot

On reboot, validate the system

Retry if needed



# Reboot_health_check

Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
SPDX-License-Identifier: BSD-3-Clause-Clear

## Overview
This test case validates the system's ability to reboot and recover correctly by:
- Automatically creating a systemd service to run the test after reboot
- Performing post-reboot health checks such as:
  - Root shell access
  - Filesystem availability
  - Kernel version
  - Network stack
- Retrying the test up to 3 times if any check fails
- Logging results and cleaning up after success or failure
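
The retry behaviour listed above can be sketched as follows. This is a minimal illustration, assuming a counter file and the same `-ge` comparison that `run.sh` uses; `record_failure` is a hypothetical helper name, not a function from the test suite.

```sh
#!/bin/sh
# Sketch only: record_failure is a hypothetical helper, not part of run.sh.
MAX_RETRIES=3

# Increment the counter stored in file $1 and print the next action:
# "retry" while the count is below MAX_RETRIES, "stop" once it is reached.
record_failure() {
    count_file="$1"
    [ -f "$count_file" ] || echo "0" > "$count_file"
    count=$(cat "$count_file")
    count=$((count + 1))
    echo "$count" > "$count_file"
    if [ "$count" -ge "$MAX_RETRIES" ]; then
        echo "stop"
    else
        echo "retry"
    fi
}
```

With `MAX_RETRIES=3`, the first two recorded failures print `retry` and the third prints `stop`.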

## Usage
Instructions:
1. **Copy repo to Target Device**: Use `scp` to transfer the scripts from the host to the target device. The scripts should be copied to any directory on the target device.
2. **Verify Transfer**: Ensure that the repo has been successfully copied to the target device.
3. **Run Scripts**: Navigate to the directory where these files are copied on the target device and execute the scripts as needed.

Run the Reboot_health_check test using:
---

Log File

All output is stored in /tmp/reboot_test.log

The log summarizes all individual tests and the overall result



#### Quick Example
```sh
git clone <this-repo>
cd <this-repo>
scp -r common Runner user@target_device_ip:<Path in device>
ssh user@target_device_ip
cd <Path in device>/Runner && ./run-test.sh Reboot_health_check
```
---

Configuration

Modify these inside the script if needed:


---

Pass/Fail Criteria


## Prerequisites
1. Root access is required
2. systemctl, ifconfig or ip, and uname must be available
3. The system must support systemd and reboot functionality
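
These prerequisites could be checked up front with a small script. The sketch below is an assumption about how one might do so; `have_cmd` and `check_prereqs` are hypothetical helpers and are not provided by the test suite.

```sh
#!/bin/sh
# Sketch only: have_cmd and check_prereqs are hypothetical helpers.

# Succeed if the named command is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print each missing prerequisite; return non-zero if any is missing.
check_prereqs() {
    rc=0
    if [ "$(id -u)" -ne 0 ]; then
        echo "missing: root access"
        rc=1
    fi
    for cmd in systemctl uname reboot; do
        if ! have_cmd "$cmd"; then
            echo "missing: $cmd"
            rc=1
        fi
    done
    # Either ifconfig or ip satisfies the networking requirement.
    if ! have_cmd ifconfig && ! have_cmd ip; then
        echo "missing: ifconfig or ip"
        rc=1
    fi
    return "$rc"
}
```

Calling `check_prereqs || exit 1` before starting the test would stop early on a target that cannot run it.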
---

Limitations

Requires basic /bin/sh shell (ash, bash, dash supported)

Needs writable /tmp/ and /etc/init.d/

Does not rely on systemd, cron, or external daemons



---

Cleanup

Script automatically:

Removes temporary boot hook

Deletes self-copy after successful completion

Cleans retry counters


You don't need to manually intervene.


---

Example Run Output

2025-04-26 19:45:20 [START] Reboot Health Test Started
2025-04-26 19:45:21 [STEP] Preparing system for reboot test...
2025-04-26 19:45:23 [INFO] System will reboot now to perform validation.
(reboots)

2025-04-26 19:46:10 [STEP] Starting post-reboot validation...
2025-04-26 19:46:11 [PASS] Boot flag detected. System reboot successful.
2025-04-26 19:46:12 [PASS] Shell is responsive.
2025-04-26 19:46:12 [PASS] Directory /proc exists.
2025-04-26 19:46:12 [PASS] Directory /sys exists.
2025-04-26 19:46:12 [PASS] Directory /tmp exists.
2025-04-26 19:46:12 [PASS] Directory /dev exists.
2025-04-26 19:46:12 [PASS] Kernel version: 6.6.65
2025-04-26 19:46:13 [PASS] Network stack active (ping localhost successful).
2025-04-26 19:46:13 [OVERALL PASS] Reboot + Health Check successful!

## Result Format
The test result is saved in `Reboot_health_check.res` as:

## Pass Criteria
All health checks pass successfully
System reboots and recovers correctly
Reboot_health_check PASS

## Fail Criteria
Any health check fails after 3 retries
Reboot_health_check FAIL
## Output
A .res file is generated in the same directory:
`Reboot_health_check PASS` OR `Reboot_health_check FAIL`
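
A caller could act on the `.res` file along these lines. This is a minimal sketch, assuming the file holds exactly one of the two lines above; `res_passed` is a hypothetical helper name.

```sh
#!/bin/sh
# Sketch only: res_passed is a hypothetical helper, not part of the test suite.

# Succeed only if the given .res file exists and contains the PASS line.
res_passed() {
    [ -f "$1" ] && grep -q "^Reboot_health_check PASS$" "$1"
}
```

For example, `res_passed ./Reboot_health_check.res && echo "reboot test passed"` prints the message only when the PASS line is present.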

## Sample Log
```
[INFO] 1980-01-06 00:23:09 - ------------------- Starting Reboot_health_check Test ----------------------------
[INFO] 1980-01-06 00:23:09 - === Test Initialization ===
[INFO] 1980-01-06 00:23:09 - Creating systemd service and Rebooting...
[INFO] 1980-01-06 00:23:12 - System will reboot in 2 seconds...
sh-5.2#

[INFO] 1980-01-06 00:00:00 - ------------------- Starting Reboot_health_check Test ----------------------------
[INFO] 1980-01-06 00:00:00 - === Test Initialization ===
[INFO] 1980-01-06 00:00:00 - Post-reboot validation
[INFO] 1980-01-06 00:00:00 - Retry Count: 0
[PASS] 1980-01-06 00:00:00 - Reboot_health_check PASS
[INFO] 1980-01-06 00:00:00 - ------------------- Completed Reboot_health_check Test ----------------------------
sh-5.2#

```
114 changes: 80 additions & 34 deletions Runner/suites/Kernel/Baseport/Reboot_health_check/run.sh
@@ -2,7 +2,6 @@

# Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
# SPDX-License-Identifier: BSD-3-Clause-Clear

# Robustly find and source init_env
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
INIT_ENV=""
@@ -30,57 +29,104 @@ fi
. "$TOOLS/functestlib.sh"

TESTNAME="Reboot_health_check"
test_path=$(find_test_case_by_name "$TESTNAME")
cd "$test_path" || exit 1
# shellcheck disable=SC2034
res_file="./$TESTNAME.res"

log_info "-----------------------------------------------------------------------------------------"
log_info "-------------------Starting $TESTNAME Testcase----------------------------"
log_info "=== Test Initialization ==="
cd "$SCRIPT_DIR" || exit 1

# Directory for health check files
HEALTH_DIR="/var/reboot_health"
RETRY_FILE="$HEALTH_DIR/reboot_retry_count"
LOG_FILE="$SCRIPT_DIR/reboot_test.log"
RES_FILE="$SCRIPT_DIR/${TESTNAME}.res"
MARKER="$SCRIPT_DIR/reboot_marker"
RETRY_FILE="$SCRIPT_DIR/reboot_retry_count"
SERVICE_FILE="/etc/systemd/system/reboot-health.service"
MAX_RETRIES=3

# Make sure health directory exists
mkdir -p "$HEALTH_DIR"
log_info "-------------------- Starting $TESTNAME Test ----------------------------"
log_info "=== Test Initialization ==="

# Initialize retry count if not exist
# Initialize retry count
if [ ! -f "$RETRY_FILE" ]; then
echo "0" > "$RETRY_FILE"
fi

# Read current retry count
RETRY_COUNT=$(cat "$RETRY_FILE")

log_info "--------------------------------------------"
log_info "Boot Health Check Started - $(date)"
log_info "Current Retry Count: $RETRY_COUNT"
# Create systemd service on first run
if [ ! -f "$MARKER" ]; then
log_info "Creating systemd service and Rebooting..."

# Health Check: You can expand this check
if [ "$(whoami)" = "root" ]; then
log_pass "System booted successfully and root shell obtained."
log_info "Test Completed Successfully after $RETRY_COUNT retries."

# Optional: clean retry counter after success
echo "0" > "$RETRY_FILE"

cat <<EOF > "$SERVICE_FILE"
[Unit]
Description=Reboot Health Check Service
After=network.target

[Service]
Type=oneshot
ExecStart=$SCRIPT_DIR/run.sh
StandardOutput=append:${LOG_FILE}
StandardError=append:${LOG_FILE}
RemainAfterExit=true

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reexec
systemctl daemon-reload
systemctl enable reboot-health.service

touch "$MARKER"
log_info "System will reboot in 2 seconds..."
sleep 2
reboot
exit 0
fi

log_info "Post-reboot validation"
log_info "Retry Count: $RETRY_COUNT"

pass=true

if ! whoami | grep -q "root"; then
log_fail "Root shell not accessible"
pass=false
fi

for path in /proc /sys /dev /tmp; do
if [ ! -d "$path" ]; then
log_fail "Missing or inaccessible: $path"
pass=false
fi
done

if ! uname -a >/dev/null 2>&1; then
log_fail "Kernel version not available"
pass=false
fi

if ! ifconfig >/dev/null 2>&1 && ! ip a >/dev/null 2>&1; then
log_fail "Networking stack failed"
pass=false
fi

if $pass; then
log_pass "$TESTNAME PASS"
echo "$TESTNAME PASS" > "$RES_FILE"
echo "0" > "$RETRY_FILE"
else
log_fail "Root shell not available!"

log_fail "$TESTNAME FAIL"
echo "$TESTNAME FAIL" > "$RES_FILE"
RETRY_COUNT=$((RETRY_COUNT + 1))
echo "$RETRY_COUNT" > "$RETRY_FILE"

if [ "$RETRY_COUNT" -ge "$MAX_RETRIES" ]; then
log_error "[ERROR] Maximum retries ($MAX_RETRIES) reached. Stopping test."
log_error "Max retries ($MAX_RETRIES) reached. Stopping."
rm -f "$MARKER"
exit 1
else
log_info "Rebooting system for retry #$RETRY_COUNT..."
sync
log_info "Rebooting for retry #$RETRY_COUNT..."
sleep 2
reboot -f
exit 0
fi
fi

rm -f "$MARKER"
log_info "------------------- Completed $TESTNAME Test ----------------------------"
exit 0
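
In the lines shown here, the script removes the marker file but has no step that disables `reboot-health.service` or removes its unit file. If that cleanup is wanted, a minimal sketch might look like the following; `cleanup_service` and the `SYSTEMCTL` override are hypothetical names introduced for illustration.

```sh
#!/bin/sh
# Sketch only: cleanup_service and SYSTEMCTL are hypothetical, not part of run.sh.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Disable the unit, remove its file ($1), and reload systemd's unit files.
# Failures from systemctl are ignored so that cleanup never aborts the test.
cleanup_service() {
    service_file="$1"
    "$SYSTEMCTL" disable reboot-health.service >/dev/null 2>&1 || true
    rm -f "$service_file"
    "$SYSTEMCTL" daemon-reload >/dev/null 2>&1 || true
}
```

It could be called as `cleanup_service "$SERVICE_FILE"` just before the final `exit 0`, and before the `exit 1` taken when the maximum retry count is reached.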