Commit 51e041d

Updated so the watchdog CloudWatch alarm can handle the SNS topic being in another region.
1 parent 9b20b3e commit 51e041d

3 files changed

+115
-23
lines changed

Monitoring/monitor-ontap-services/README.md

Lines changed: 4 additions & 3 deletions
@@ -101,12 +101,13 @@ To install the program using the CloudFormation template, you will need to do th
 |CheckInterval|The interval, in minutes, that the EventBridge schedule will trigger the Lambda function. The default is 15 minutes.|
 |CreateCloudWatchAlarm|Set to "true" if you want to create a CloudWatch alarm that will alert you if the Lambda function fails.|
 |CreateSecretsManagerEndpoint|Set to "true" if you want to create a Secrets Manager endpoint. **NOTE:** If an SecretsManager Endpoint already exist for the specified Subnet the endpoint creation will fail, causing the entire CloudFormation stack to fail. Please read the [Endpoints for AWS services](#endpoints-for-aws-services) for more information.|
-|CreateSNSEndpoint|Set to "true" if you want to create an SNS endpoint. **NOTE:** If an SNS Endpoint already exist for the specified Subnet the endpoint creation will fail, causing the entire CloudFormation stack to fail. Please read the [Endpoints for AWS services](#endpoints-for-aws-services) for more information.|
+|CreateLambdaEndpoint|Set to "true" if you want to create an Lambda endpoint. **NOTE:** If an Lambda Endpoint already exist for the specified Subnet the endpoint creation will fail, causing the entire CloudFormation stack to fail. Please read the [Endpoints for AWS services](#endpoints-for-aws-services) for more information.|
 |CreateCWEndpoint|Set to "true" if you want to create a CloudWatch endpoint. **NOTE:** If an CloudWatch Endpoint already exist for the specified Subnet the endpoint creation will fail, causing the entire CloudFormation stack to fail. Please read the [Endpoints for AWS services](#endpoints-for-aws-services) for more information.|
 |CreateS3Endpoint|Set to "true" if you want to create an S3 endpoint. **NOTE:** If an S3 Gateway Endpoint already exist for the specified VPC the endpoint creation will fail, causing the entire CloudFormation stack to fail. Note that this will be a "Gateway" type endpoint, since they are free to use. Please read the [Endpoints for AWS services](#endpoints-for-aws-services) for more information.|
 |RoutetableIds|The route table IDs to update to use the S3 endpoint. Since the S3 endpoint is of type `Gateway` route tables have to be updated to use it. This parameter is only needed if you are creating an S3 endpoint.|
-|VpcId|The ID of a VPC where the subnets provided above are located. This is only needed if you are creating an endpoint.|
-|EndpointSecurityGroupIds|The security group IDs that the endpoint will be attached to. The security group must allow traffic over TCP port 443 from the Lambda function. This is only needed if you are creating an SNS, CloudWatch or SecretsManager endpoint.|
+|VpcId|The ID of a VPC where the subnets provided above are located. Required if you are creating an endpoint, not needed otherwise.|
+|EndpointSecurityGroupIds|The security group IDs that the endpoint will be attached to. The security group must allow traffic over TCP port 443 from the Lambda function. This is required if you are creating an Lambda, CloudWatch or SecretsManager endpoint.|
+|watchdogRoleArn|The ARN of the role that the Lambda function that the Watchdog CloudWatch alert will use to send SNS alerts if something goes wrong with the monitoring Lambda function. The only required permission is to publish to the SNS topic listed above, although highly recommended that you also add the AWS managed "AWSLambdaBasicExecutionRole" policy that allows the Lambda function to create and write to a CloudWatch log stream so it can provide diagnostic output of something goes wrong. Only required if creating a CloudWatch alert and you want to provide your own role. If left blank a role will be created for you if needed.|
 |LambdaRoleArn|The ARN of the role that the Lambda function will use. This role must have the permissions listed in the [Create an AWS Role](#create-an-aws-role) section below. If left blank a role will be created for you.|
 |SchedulerRoleArn|The ARN of the role that the EventBridge schedule will use to trigger the Lambda function. It just needs the permission to invoke a Lambda function. If left blank a role will be created for you.|
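
The conditional requirements in the parameter table above (which parameters become mandatory once an endpoint flag is set to "true") can be checked before launching the stack. The following is only a minimal sketch, not part of the commit: the function name `missing_parameters` is hypothetical, and the keys follow the README spelling of the parameter names, which differs slightly from the names used inside `cloudformation.yaml`.

```python
# Hypothetical pre-flight check, assuming parameters are passed as a dict of strings
# keyed by the names used in the README table.
INTERFACE_ENDPOINT_FLAGS = (
    "CreateSecretsManagerEndpoint",
    "CreateLambdaEndpoint",
    "CreateCWEndpoint",
)


def missing_parameters(params):
    """Return the parameters the README marks as required but that are absent or empty."""

    def enabled(name):
        return params.get(name, "false") == "true"

    wants_interface = any(enabled(flag) for flag in INTERFACE_ENDPOINT_FLAGS)
    wants_s3 = enabled("CreateS3Endpoint")

    missing = []
    # VpcId is required whenever any endpoint is created.
    if (wants_interface or wants_s3) and not params.get("VpcId"):
        missing.append("VpcId")
    # Security groups are required for the Lambda, CloudWatch and Secrets Manager endpoints.
    if wants_interface and not params.get("EndpointSecurityGroupIds"):
        missing.append("EndpointSecurityGroupIds")
    # Route tables are only needed for the S3 gateway endpoint.
    if wants_s3 and not params.get("RoutetableIds"):
        missing.append("RoutetableIds")
    return missing


if __name__ == "__main__":
    print(missing_parameters({"CreateLambdaEndpoint": "true"}))
```

An empty list means the combination satisfies the rules stated in the table; it says nothing about whether the referenced resources actually exist.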

Monitoring/monitor-ontap-services/cloudformation.yaml

Lines changed: 107 additions & 16 deletions
@@ -19,14 +19,15 @@ Metadata:
           - checkInterval
           - createWatchdogAlarm
           - createSecretsManagerEndpoint
-          - createSNSEndpoint
+          - createLambdaEndpoint
           - createCloudWatchLogsEndpoint
           - createS3Endpoint
           - routeTableIds
           - vpcId
           - endpointSecurityGroupIds
           - LambdaRoleArn
           - SchedulerRoleArn
+          - watchdogRoleArn
       - Label:
           default: "Alert Parameters"
         Parameters:
@@ -74,7 +75,7 @@ Parameters:
     Type: String

   cloudWatchLogGroupArn:
-    Description: "The ARN of the CloudWatch log group to send alerts to. If left blank, alerts will not be sent to CloudWatch. Note that the log group must already exist."
+    Description: "The ARN of the CloudWatch log group to send alerts to. If left blank, alerts will not be sent to CloudWatch. Note that the log group must already exist. Also note that the ARN should end with ':*'."
     Type: String
     Default: ""

@@ -98,14 +99,19 @@ Parameters:
     Default: "true"
     AllowedValues: ["true", "false"]

+  watchdogRoleArn:
+    Description: "The ARN of the role to use for the Lambda function that will publish messages to the SNS topic if the monitoring function doesn't run properly. This is only needed if you are having the CloudWatch alarm created and if you want to provide an existing role, otherwise an appropriate one will be created for you."
+    Type: String
+    Default: ""
+
   createSecretsManagerEndpoint:
     Description: "Create a Secrets Manager endpoint."
     Type: String
     Default: "false"
     AllowedValues: ["true", "false"]

-  createSNSEndpoint:
-    Description: "Create an SNS endpoint."
+  createLambdaEndpoint:
+    Description: "Create an Lambda endpoint."
     Type: String
     Default: "false"
     AllowedValues: ["true", "false"]
@@ -265,14 +271,14 @@ Parameters:

 Conditions:
   CreateSecretsManagerEndpoint: !Equals [!Ref createSecretsManagerEndpoint, "true"]
-  CreateSNSEndpoint: !Equals [!Ref createSNSEndpoint, "true"]
+  CreateLambdaEndpoint: !Equals [!Ref createLambdaEndpoint, "true"]
   CreateS3Endpoint: !Equals [!Ref createS3Endpoint, "true"]
   CreateCloudWatchLogsEndpoint: !Equals [!Ref createCloudWatchLogsEndpoint, "true"]
   CreateWatchdogAlarm: !Equals [!Ref createWatchdogAlarm, "true"]
   CreateLambdaRoleWithCW: !And [!Equals [!Ref LambdaRoleArn, ""], !Not [!Equals [!Ref cloudWatchLogGroupArn, ""]]]
   CreateLambdaRoleWithoutCW: !And [!Equals [!Ref LambdaRoleArn, ""], !Equals [!Ref cloudWatchLogGroupArn, ""]]
   CreateSchedulerRole: !Equals [!Ref SchedulerRoleArn, ""]
-  IncludeCloudWatchPermissions: !Not [!Equals [!Ref cloudWatchLogGroupArn, ""]]
+  CreateWatchdogRole: !Equals [!Ref watchdogRoleArn, ""]

 Resources:
   SecretManagerEndpoint:
@@ -297,12 +303,12 @@ Resources:
       SubnetIds: !Ref subNetIds
       SecurityGroupIds: !Ref endpointSecurityGroupIds

-  SNSEndpoint:
+  LambdaEndpoint:
     Type: AWS::EC2::VPCEndpoint
-    Condition: CreateSNSEndpoint
+    Condition: CreateLambdaEndpoint
     Properties:
       VpcId: !Ref vpcId
-      ServiceName: !Sub "com.amazonaws.${AWS::Region}.sns"
+      ServiceName: !Sub "com.amazonaws.${AWS::Region}.lambda"
       VpcEndpointType: 'Interface'
       PrivateDnsEnabled: true
       SubnetIds: !Ref subNetIds
@@ -316,8 +322,92 @@ Resources:
       ServiceName: !Sub "com.amazonaws.${AWS::Region}.s3"
       VpcEndpointType: 'Gateway'
       RouteTableIds: !Ref routeTableIds
-
-  watchDogAlarm:
+  #
+  # Allow the Watchdog Lambda function to publish to the SNS topic.
+  LambdaRoleWatchdog:
+    Type: "AWS::IAM::Role"
+    Condition: CreateWatchdogRole
+    Properties:
+      RoleName: !Sub "mon-ontap-services-watchdog-${AWS::StackName}"
+      AssumeRolePolicyDocument:
+        Version: "2012-10-17"
+        Statement:
+          - Effect: "Allow"
+            Principal:
+              Service: "lambda.amazonaws.com"
+            Action: "sts:AssumeRole"
+      Policies:
+        - PolicyName: "LambdaPolicyWatchdog"
+          PolicyDocument:
+            Version: "2012-10-17"
+            Statement:
+              - Effect: "Allow"
+                Action:
+                  - "sns:Publish"
+                Resource: !Ref snsTopicArn
+      ManagedPolicyArns:
+        - "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
+  #
+  # This allows the Watchdog CloudWatch alarm to invoke the Lambda function.
+  resourceBasedPermission:
+    Type: "AWS::Lambda::Permission"
+    Condition: CreateWatchdogAlarm
+    Properties:
+      Action: "lambda:InvokeFunction"
+      FunctionName: !Sub "monitor-ontap-services-watchdog-${AWS::StackName}"
+      Principal: "lambda.alarms.cloudwatch.amazonaws.com"
+      SourceArn: !GetAtt watchdogAlarm.Arn
+  #
+  # Use a Lambda function to publish to an SNS topic so it can reside in a different region.
+  watchdogLambdaFunction:
+    Type: "AWS::Lambda::Function"
+    Condition: CreateWatchdogAlarm
+    Properties:
+      FunctionName: !Sub "monitor-ontap-services-watchdog-${AWS::StackName}"
+      PackageType: "Zip"
+      Runtime: "python3.12"
+      Handler: "index.lambda_handler"
+      Timeout: 10
+      Role: !If [CreateWatchdogRole, !GetAtt LambdaRoleWatchdog.Arn, !Ref watchdogRoleArn]
+      Environment:
+        Variables:
+          snsTopicArn: !Ref snsTopicArn
+      Code:
+        ZipFile: |
+          import boto3
+          import os
+
+          def lambda_handler(event, context):
+            snsTopicArn = os.environ.get('snsTopicArn')
+            if snsTopicArn is not None:
+              region = snsTopicArn.split(":")[3]
+              snsClient = boto3.client('sns', region_name=region)
+              #
+              # This is for future developement when the monitor-ontap-services
+              # Lambda function will be able to send messages to the SNS topic.
+              cmd = event.get("cmd")
+              #
+              # If the cmd is None, then assume a CloudWatch alarm triggered this function.
+              if cmd is None:
+                message = f'Error! Lambda function {event["alarmData"]["alarmName"].replace("-watchdog-", "")} failed to execute properly.'
+                snsClient.publish(
+                  TopicArn = snsTopicArn,
+                  Subject = 'Error! Monitoring ONTAP services has failed to execute',
+                  Message = message
+                )
+              elif cmd == "sendSns":
+                message = event.get("message")
+                subject = event.get("subject")
+                snsClient.publish(
+                  TopicArn = snsTopicArn,
+                  Subject = subject,
+                  Message = message
+                )
+  #
+  # This is the CloudWatch alarm that will trigger when the monitor-ontap-services
+  # Lambda function fails to run successfully. It will invoke the watchdogLambdaFunction
+  # to send a message to the SNS topic.
+  watchdogAlarm:
     Type: "AWS::CloudWatch::Alarm"
     Condition: CreateWatchdogAlarm
     Properties:
@@ -335,7 +425,7 @@ Resources:
       Threshold: 0.5
       ComparisonOperator: "GreaterThanThreshold"
       AlarmActions:
-        - !Ref snsTopicArn
+        - !GetAtt watchdogLambdaFunction.Arn

   LambdaRoleWithoutCW:
     Type: "AWS::IAM::Role"
@@ -368,8 +458,8 @@ Resources:
                 Resource:
                   - !Ref secretArn
                   - !Ref snsTopicArn
-                  - !Sub "arn:aws:s3:::{s3BucketName}"
-                  - !Sub "arn:aws:s3:::{s3BucketName}/*"
+                  - !Sub "arn:aws:s3:::${s3BucketName}"
+                  - !Sub "arn:aws:s3:::${s3BucketName}/*"

   LambdaRoleWithCW:
     Type: "AWS::IAM::Role"
@@ -477,6 +567,7 @@ Resources:
           secretUsernameKey: !Ref secretUsernameKey
           secretPasswordKey: !Ref secretPasswordKey
           snsTopicArn: !Ref snsTopicArn
+          sendSnsFunctionName: !Sub "monitor-ontap-services-watchdog-${AWS::StackName}"
           cloudWatchLogGroupArn: !Ref cloudWatchLogGroupArn

           initialVersionChangeAlert: !Ref versionChangeAlert
@@ -521,8 +612,8 @@ Resources:
           # "matching conditions." It is intended to be run as a Lambda function, but
           # can be run as a standalone program.
           #
-          # Version: v2.14
-          # Date: 2025-04-29-12:53:45
+          # Version: v2.15
+          # Date: 2025-05-19-08:28:59
           ################################################################################

           import json
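
The watchdog handler added in this file derives the SNS client region from the topic ARN and then chooses between two message shapes depending on whether the event carries a `cmd` key. The sketch below restates that logic as pure functions so it can be exercised without AWS credentials. It is an illustration under stated assumptions, not the code in the commit: `region_from_arn` and `build_publish_args` are hypothetical names, and the ARN validation is an addition the original handler does not perform.

```python
def region_from_arn(arn):
    """Return the region field of an ARN (arn:partition:service:region:account:resource)."""
    parts = arn.split(":")
    # Assumption: reject anything that does not look like an ARN; the original
    # handler simply indexes field 3 without checking.
    if len(parts) < 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {arn!r}")
    return parts[3]


def build_publish_args(event, topic_arn):
    """Build the keyword arguments the handler would pass to sns.publish, or None."""
    cmd = event.get("cmd")
    if cmd is None:
        # No cmd: assume a CloudWatch alarm invoked the function.
        name = event["alarmData"]["alarmName"].replace("-watchdog-", "")
        return {
            "TopicArn": topic_arn,
            "Subject": "Error! Monitoring ONTAP services has failed to execute",
            "Message": f"Error! Lambda function {name} failed to execute properly.",
        }
    if cmd == "sendSns":
        # Caller-supplied subject and message are forwarded unchanged.
        return {
            "TopicArn": topic_arn,
            "Subject": event.get("subject"),
            "Message": event.get("message"),
        }
    # Unknown commands are ignored, matching the handler's if/elif structure.
    return None


if __name__ == "__main__":
    arn = "arn:aws:sns:us-west-2:123456789012:example-topic"  # made-up ARN
    print(region_from_arn(arn))
    print(build_publish_args({"cmd": "sendSns", "subject": "s", "message": "m"}, arn))
```

In a real handler the returned dictionary would be passed as `snsClient.publish(**args)` on a client created with `boto3.client('sns', region_name=region_from_arn(arn))`, which is what allows the topic to live in a different region from the alarm.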

Monitoring/monitor-ontap-services/updateMonOntapServiceCFTemplate

Lines changed: 4 additions & 4 deletions
@@ -13,11 +13,11 @@ tmpfile2=$(mktemp /tmp/tmpfile2.XXXXXX)
 trap "rm -f $tmpfile1 $tmpfile2" exit
 #
 # First get the monitoring code out of the CF template.
-sed -e '1,/ZipFile/d' cloudformation.yaml > cloudformation.yaml.tmp
+sed -e '1,\(#!/bin/python3(d' cloudformation.yaml > cloudformation.yaml.tmp
 #
 # Now get the Date and Version lines out of both files.
 egrep -v '^ # Date:|^ # Version' cloudformation.yaml.tmp > $tmpfile1
-egrep -v '^# Date:|^# Version:' $file > $tmpfile2
+sed -e 1d $file | egrep -v '^# Date:|^# Version:' > $tmpfile2

 if diff -w $tmpfile1 $tmpfile2 > /dev/null; then
   echo "No changes to the monitor code."
@@ -36,8 +36,8 @@ fi
 version="v${majorVersionNum}.${minorVersionNum}"
 #
 # Strip out the monitoring code.
-sed -e '/ZipFile/,$d' cloudformation.yaml > cloudformation.yaml.tmp
-echo " ZipFile: |" >> cloudformation.yaml.tmp
+sed -e '\(#!/bin/python3(,$d' cloudformation.yaml > cloudformation.yaml.tmp
+#echo " ZipFile: |" >> cloudformation.yaml.tmp
 #
 # Add the monitoring code to the CF template while updating the version and date.
 cat "$file" | sed -e 's/^/ /' -e "s/%%VERSION%%/${version}/" -e "s/%%DATE%%/$(date +%Y-%m-%d-%H:%M:%S)/" >> cloudformation.yaml.tmp
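
The script now anchors on the `#!/bin/python3` line instead of `ZipFile`, presumably because the template gained a second `ZipFile` block for the watchdog function. The `\(` ... `(` form is sed's custom-delimiter syntax for an address, used here so the `/` in the marker needs no escaping. The comparison step can be sketched in Python as follows. This is a rough equivalent under assumptions, not a port of the script: the helper names are hypothetical, and whitespace handling only approximates `diff -w`.

```python
import re

MARKER = "#!/bin/python3"


def split_at_marker(template_text, marker=MARKER):
    """Split template lines into (lines before the marker line, lines after it)."""
    lines = template_text.splitlines()
    for index, line in enumerate(lines):
        if marker in line:
            return lines[:index], lines[index + 1:]
    raise ValueError(f"marker {marker!r} not found")


def comparable(lines):
    """Drop Date/Version comment lines and remove all whitespace, roughly like diff -w."""
    kept = []
    for line in lines:
        if re.match(r"^\s*# (Date:|Version)", line):
            continue
        kept.append("".join(line.split()))
    return kept


def monitor_code_unchanged(template_text, source_text):
    """True when the embedded code matches the source file, ignoring its first line."""
    _, embedded = split_at_marker(template_text)
    source_lines = source_text.splitlines()[1:]  # mirrors `sed -e 1d $file`
    return comparable(embedded) == comparable(source_lines)


if __name__ == "__main__":
    template = "Code:\n  ZipFile: |\n    #!/bin/python3\n    # Version: v2.15\n    import json\n"
    source = "#!/bin/python3\n# Version: %%VERSION%%\nimport json\n"
    print(monitor_code_unchanged(template, source))
```

Dropping the first line of the source file mirrors the shell script, which deletes the marker line from the template side and therefore has to skip the shebang on the file side as well.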
