Commit f0c3d02

Update references to the forms-runner ECS task for queue alerts
Solid Queue now runs in the forms-runner-queue-worker ECS task. Update alarm descriptions to reflect this change.
1 parent d2b8ba2 commit f0c3d02

6 files changed: 17 additions and 16 deletions

infra/deployments/forms/health/alerts/delete-submissions-job-not-run.tf

Lines changed: 4 additions & 4 deletions
@@ -1,13 +1,13 @@
 resource "aws_cloudwatch_metric_alarm" "delete_submissions_job_not_run" {
 alarm_name = "${var.environment}-delete-submissions-job-not-run"
 alarm_description = <<EOF
-The forms-runner job to delete submissions has not in the ${var.environment} environment in the past 2 hours. It is
-expected that the job is run every hour. This job is run by Solid Queue, which is started in the forms-runner ECS
-task.
+The forms-runner job to delete submissions has not run in the ${var.environment} environment in the past 2 hours. It
+is expected that the job is run every hour. This job is run by Solid Queue, which is started in the
+forms-runner-queue-worker ECS task.

 NEXT STEPS:
 1. Check the Splunk logs and Sentry for any errors running the job.
-2. Restart the forms-runner ECS tasks and check whether the job starts running.
+2. Restart the forms-runner-queue-worker ECS tasks and check whether the job starts running.

 EOF
 comparison_operator = "LessThanOrEqualToThreshold"

infra/deployments/forms/health/alerts/receive-submission-bounces-and-complaints-job-not-run.tf

Lines changed: 4 additions & 4 deletions
@@ -1,13 +1,13 @@
 resource "aws_cloudwatch_metric_alarm" "receive_submission_bounces_and_complaints_job_not_run" {
 alarm_name = "${var.environment}-receive-submission-bounces-and-complaints-job-not-run"
 alarm_description = <<EOF
-The forms-runner job to receive SQS notifications about bounces and complaints has not in the ${var.environment} environment in
-the past 30 minutes. It is expected that the job is run every 10 minutes. This job is run by Solid Queue, which is
-started in the forms-runner ECS task.
+The forms-runner job to receive SQS notifications about bounces and complaints has not run in the ${var.environment}
+environment in the past 30 minutes. It is expected that the job is run every 10 minutes. This job is run by Solid
+Queue, which is started in the forms-runner-queue-worker ECS task.

 NEXT STEPS:
 1. Check the Splunk logs and Sentry for any errors running the job.
-2. Restart the forms-runner ECS tasks and check whether the job starts running.
+2. Restart the forms-runner-queue-worker ECS tasks and check whether the job starts running.

 EOF
 comparison_operator = "LessThanOrEqualToThreshold"

infra/deployments/forms/health/alerts/receive-submission-deliveries-job-not-run.tf

Lines changed: 3 additions & 3 deletions
@@ -1,13 +1,13 @@
 resource "aws_cloudwatch_metric_alarm" "receive_submission_deliveries_job_not_run" {
 alarm_name = "${var.environment}-receive-submission-deliveries-job-not-run"
 alarm_description = <<EOF
-The forms-runner job to receive SQS notifications about submission deliveries has not in the ${var.environment}
+The forms-runner job to receive SQS notifications about submission deliveries has not run in the ${var.environment}
 environment in the past 30 minutes. It is expected that the job is run every 10 minutes. This job is run by Solid
-Queue, which is started in the forms-runner ECS task.
+Queue, which is started in the forms-runner-queue-worker ECS task.

 NEXT STEPS:
 1. Check the Splunk logs and Sentry for any errors running the job.
-2. Restart the forms-runner ECS tasks and check whether the job starts running.
+2. Restart the forms-runner-queue-worker ECS tasks and check whether the job starts running.

 EOF
 comparison_operator = "LessThanOrEqualToThreshold"

infra/deployments/forms/health/alerts/schedule-daily-batch-deliveries-job-not-run.tf

Lines changed: 2 additions & 2 deletions
@@ -3,11 +3,11 @@ resource "aws_cloudwatch_metric_alarm" "schedule_daily_batch_deliveries_job_not_
 alarm_description = <<EOF
 The forms-runner job to schedule daily batch submission deliveries has not run in the ${var.environment} environment
 in the past 25 hours. It is expected that the job is run every day. This job is run by Solid Queue, which is started
-in the forms-runner ECS task.
+in the forms-runner-queue-worker ECS task.

 NEXT STEPS:
 1. Check the Splunk logs and Sentry for any errors running the job.
-2. Restart the forms-runner ECS tasks and check whether the job starts running.
+2. Restart the forms-runner-queue-worker ECS tasks and check whether the job starts running.

 EOF
 comparison_operator = "LessThanOrEqualToThreshold"

infra/deployments/forms/health/alerts/send-submission-job-failures.tf

Lines changed: 3 additions & 2 deletions
@@ -2,10 +2,11 @@ resource "aws_cloudwatch_metric_alarm" "send_submission_job_failures" {
 alarm_name = "${var.environment}-send-submission-job-failures"
 alarm_description = <<EOF
 The forms-runner job to send submissions has failed more than 10 times in the last 15 minutes in the
-${var.environment} environment. This job is run by Solid Queue, which is started in the forms-runner ECS task.
+${var.environment} environment. This job is run by Solid Queue, which is started in the forms-runner-queue-worker
+ECS task.

 NEXT STEPS:
-1. Check whether there are any forms-runner errors in Sentry related to the SendSubmissionJob. We retry to job
+1. Check whether there are any forms-runner errors in Sentry related to the SendSubmissionJob. We retry the job
 automatically if there are errors calling AWS, but for all other errors we immediately send the error to Sentry and
 the job is not scheduled for retry.


infra/deployments/forms/health/alerts/submission-time-to-send.tf

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ resource "aws_cloudwatch_metric_alarm" "submission_time_to_send" {
 The average time to send a submission from the time it was scheduled to be sent is greater than 1 minute in
 the ${var.environment} environment. This suggests that we are unable to keep up with demand for the number of
 submissions we need to process. The job to send submission emails is run by Solid Queue, which is started in the
-forms-runner ECS task.
+forms-runner-queue-worker ECS task.

 NEXT STEPS:
 1. Search in Splunk for "event=form_submission_email_sent" to see the rate of submission emails being sent and the
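
All six hunks repeat the same ECS task name inside a heredoc. As an illustration only, and not something this commit does, the name could be held in a shared local so that a future rename touches one line. The fragment below is a minimal sketch under stated assumptions: the local name `queue_worker_task_name` is hypothetical, the description text is shortened, and the alarm's metric arguments are left out because the diff does not show them.

```hcl
# Hypothetical refactor sketch; not part of commit f0c3d02.
locals {
  # Assumed local name. The task name itself is taken from the commit message.
  queue_worker_task_name = "forms-runner-queue-worker"
}

resource "aws_cloudwatch_metric_alarm" "delete_submissions_job_not_run" {
  alarm_name        = "${var.environment}-delete-submissions-job-not-run"
  alarm_description = <<EOF
The forms-runner job to delete submissions has not run in the ${var.environment} environment in the past 2 hours.
This job is run by Solid Queue, which is started in the ${local.queue_worker_task_name} ECS task.

NEXT STEPS:
1. Check the Splunk logs and Sentry for any errors running the job.
2. Restart the ${local.queue_worker_task_name} ECS tasks and check whether the job starts running.
EOF

  comparison_operator = "LessThanOrEqualToThreshold"

  # The remaining alarm arguments (metric, period, threshold and so on) are
  # omitted from this fragment; the real resource defines them.
}
```

Interpolation works inside a `<<EOF` heredoc in the same way as the existing `${var.environment}` references, so the rendered alarm description would be unchanged.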
