Commit 23716a8

update deploy script and update samples alert readme
1 parent 6d043fe commit 23716a8

2 files changed: 16 additions and 15 deletions

monitoring/bin/deploy_monitoring_cluster.sh

Lines changed: 7 additions & 15 deletions
@@ -371,29 +371,21 @@ versionstring="$(get_helm_versionstring "$KUBE_PROM_STACK_CHART_VERSION")"
 log_debug "Installing Helm chart from artifact [$chart2install]"

 # Alerts
-if [ -d "monitoring/alerting/rules" ]; then
-    log_verbose "Creating Grafana alert rules ConfigMap"
+log_verbose "Creating Grafana alert rules ConfigMap"
+CUSTOM_ALERT_CONFIG_DIR="$USER_DIR/monitoring/alerting/"

-    # Start with required file
-    CM_ARGS=(--from-file="monitoring/alerting/rules")
-
-    CUSTOM_ALERT_CONFIG_DIR="$USER_DIR/monitoring/alerting/"
-
-    # Add optional custom directory if it exists
-    if [ -d "$CUSTOM_ALERT_CONFIG_DIR" ]; then
-        log_debug "Including notifiers and additional alert rules from '$CUSTOM_ALERT_CONFIG_DIR'"
-        CM_ARGS+=(--from-file="$CUSTOM_ALERT_CONFIG_DIR")
-    else
-        log_debug "No custom alert config directory found at '$CUSTOM_ALERT_CONFIG_DIR'. Skipping."
-    fi
+# Add optional custom directory if it exists
+if [ -d "$CUSTOM_ALERT_CONFIG_DIR" ]; then
+    log_debug "Including notifiers and additional alert rules from '$CUSTOM_ALERT_CONFIG_DIR'"
+    CM_ARGS=(--from-file="$CUSTOM_ALERT_CONFIG_DIR")

     # Run the kubectl command with all arguments
     kubectl create configmap grafana-alert-rules \
         "${CM_ARGS[@]}" \
         -n "$MON_NS" \
         --dry-run=client -o yaml | kubectl apply -f -
 else
-    log_debug "No alert rules file found at 'monitoring/alerting/rules'"
+    log_debug "No custom alert config directory found at '$CUSTOM_ALERT_CONFIG_DIR'. Skipping."
 fi

 # shellcheck disable=SC2086
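
Read as a whole, this hunk makes the `grafana-alert-rules` ConfigMap depend only on the user-supplied directory: the bundled `monitoring/alerting/rules` source is no longer added. A minimal sketch of the resulting flow as a standalone function is shown below. The function name `apply_alert_configmap` and its positional arguments are hypothetical, `echo` stands in for the script's `log_debug` helper, and the single `--from-file` source is passed directly rather than through the `CM_ARGS` array.

```shell
#!/usr/bin/env bash

# Hypothetical helper mirroring the post-change logic of the hunk above.
# $1: directory holding custom alert rules and notifiers
# $2: monitoring namespace
apply_alert_configmap() {
    local config_dir="$1"
    local namespace="$2"

    if [ -d "$config_dir" ]; then
        echo "Including notifiers and additional alert rules from '$config_dir'"

        # Render the ConfigMap client-side, then pipe it to apply, so a rerun
        # updates an existing ConfigMap instead of failing on "already exists".
        kubectl create configmap grafana-alert-rules \
            --from-file="$config_dir" \
            -n "$namespace" \
            --dry-run=client -o yaml | kubectl apply -f -
    else
        echo "No custom alert config directory found at '$config_dir'. Skipping."
    fi
}
```

With this shape, a deployment that has no `$USER_DIR/monitoring/alerting/` directory creates no ConfigMap at all, which appears to be the intent of the change.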

samples/alerts/README.md

Lines changed: 9 additions & 0 deletions
@@ -83,6 +83,15 @@ Adjust thresholds based on your environment size and requirements:
 - `platform/high-viya-api-latency.yaml`: > 1 second (95th percentile)
 - `database/crunchy-pgdata-usage-high.yaml` and `database/crunchy-backrest-repo.yaml`: > 50% full

+#### 5. Verify Metric Availability
+Ensure the following metrics are available in your Prometheus instance:
+- CAS metrics: `cas_thread_count`, `cas_grid_uptime_seconds_total`
+- Database metrics: `sas_db_pool_connections`, `pg_stat_activity_count`, `pg_settings_max_connections`
+- RabbitMQ metrics: `rabbitmq_queue_messages_ready`, `rabbitmq_queue_messages_unacknowledged`
+- Kubernetes metrics: `kube_pod_container_status_restarts_total`, `kube_pod_container_status_ready`
+- HTTP metrics: `http_server_requests_duration_seconds_bucket`
+- SAS Job Launcher: `:sas_launcher_pod_status:` (recording rule)
+
 ### Alert Expression Format

 Alert expressions in these samples use a multi-part approach for better compatibility with newer Grafana versions:
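
One way to carry out the check described in the added step 5 is to compare the required metric names against the names Prometheus reports. The sketch below assumes the available names have already been saved to a file, one per line; the function name `missing_metrics`, the file layout, and the `$PROM_URL` variable in the comment are hypothetical and not part of this commit.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print each required metric that is absent from a
# file listing available metric names, one per line.
# $1: file of available metric names; remaining args: required metric names
missing_metrics() {
    local available_file="$1"
    shift
    local metric
    for metric in "$@"; do
        # -x matches whole lines only, -F treats the name as a literal string
        if ! grep -qxF -- "$metric" "$available_file"; then
            echo "$metric"
        fi
    done
}

# The names file could be produced with something like the following
# (assumes curl, jq, and a reachable Prometheus at the hypothetical $PROM_URL):
#   curl -s "$PROM_URL/api/v1/label/__name__/values" | jq -r '.data[]' > names.txt
#   missing_metrics names.txt cas_thread_count ':sas_launcher_pod_status:'
```

An empty result means every listed metric name was found. Any name that is printed points to an exporter or recording rule that may not be deployed in that environment.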
