Commit 4ea80b7

Initial Version
1 parent 6438b70 commit 4ea80b7

5 files changed

+183
-29
lines changed

Monitoring/monitor_fsxn_with_harvest_on_ec2/README.md

Lines changed: 72 additions & 29 deletions
Original file line number | Diff line number | Diff line change
@@ -1,6 +1,6 @@
11
# Deploy NetApp Harvest on EC2
22

3-
Harvest installation for monitoring Amazon FSxN using Promethues and Grafana stack, integrating AWS Secret Manager for FSxN credentials.
3+
Harvest installation for monitoring Amazon FSxN using Prometheus and Grafana stack, integrating AWS Secret Manager for FSxN credentials.
44

55
## Introduction
66

@@ -11,22 +11,68 @@ Harvest installation will result in the following:
1111
* Collecting metrics about your FSxNs and adding existing Grafana dashboards for better visualization.
1212

1313
### Prerequisites
14-
* A FSx for ONTAP running in the same VPC.
15-
* If not running an AWS based Linux, ensure that the `aws` command has been instealled and configured.
14+
* A FSx for ONTAP file system running in the same VPC as the EC2 instance.
15+
* If not running an AWS based Linux, ensure that the `aws` command has been installed and configured.
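A quick way to confirm the second prerequisite is sketched below. The function name `check_aws_cli` is hypothetical; it assumes that a successful `aws sts get-caller-identity` call is an acceptable signal that the CLI is configured with working credentials.

```sh
# Hypothetical helper: succeed only if the aws CLI is installed and has usable credentials.
check_aws_cli() {
  # Is the aws command on the PATH at all?
  command -v aws > /dev/null 2>&1 || { echo "aws CLI not found" >&2; return 1; }
  # Can it make an authenticated call? (assumption: STS access is not blocked)
  aws sts get-caller-identity > /dev/null 2>&1 || {
    echo "aws CLI is installed but not configured with valid credentials" >&2
    return 1
  }
}
```

Running `check_aws_cli && echo "aws CLI ready"` before the installation steps surfaces a missing or unconfigured CLI early.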
1616

1717
## Installation Steps
1818

1919
### 1. Create AWS Secret Manager with Username and Password for each FSxN
20+
Since this solution uses an AWS Secrets Manager secret to authenticate with the FSx for ONTAP file system,
21+
you will need to create a secret for each FSxN you want to monitor. You can use the following command to create a secret:
2022

2123
```sh
2224
aws secretsmanager create-secret --name <YOUR-SECRET-NAME> --secret-string '{"username":"fsxadmin","password":"<YOUR-PASSWORD>"}'
2325
```
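The policy in step 2.1 needs the ARN of each secret. A minimal sketch for looking it up after creation follows; the function name `secret_arn` is hypothetical, and it assumes the caller is allowed to run `secretsmanager:DescribeSecret` on the secret.

```sh
# Hypothetical helper: print the ARN of the secret whose name is passed as $1.
secret_arn() {
  aws secretsmanager describe-secret --secret-id "$1" --query ARN --output text
}
```

For example, `secret_arn <YOUR-SECRET-NAME>` prints the ARN that can be pasted into `harvest-policy.json`.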
2426

25-
### 2. Create Instance Profile with Permission to AWS Secret Manager and cloudwatch metrics
27+
### 2. Create Instance Profile with Permission to AWS Secret Manager and CloudWatch metrics
2628

2729
#### 2.1. Create Policy with Permissions to AWS Secret Manager
2830

29-
Edit the harvest-policy.json file found in this repo with the ARN of the AWS Secret Manager secret created above.
31+
Edit the harvest-policy.json file found in this repo with the ARN of the AWS Secret Manager secrets created above.
32+
If you only have one FSxN and therefore only one secret, remove the comma after the one secret ARN (i.e. the last
33+
entry should not have a comma after it).
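Because a stray trailing comma makes the file invalid JSON, it can help to check the edited file before creating the policy. The sketch below assumes `python3` is available on the instance; the function name `validate_policy` is hypothetical.

```sh
# Hypothetical helper: return 0 if the file parses as JSON, non-zero otherwise
# (for example, when a comma is left after the last ARN).
validate_policy() {
  python3 -m json.tool "$1" > /dev/null
}
```

Running `validate_policy harvest-policy.json && echo "policy file is valid JSON"` catches syntax errors locally. It only checks JSON syntax, not whether the ARNs or actions are correct.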
34+
35+
```
36+
{
37+
"Statement": [
38+
{
39+
"Effect": "Allow",
40+
"Action": [
41+
"secretsmanager:GetSecretValue",
42+
"secretsmanager:DescribeSecret",
43+
"secretsmanager:ListSecrets"
44+
],
45+
"Resource": [
46+
"<your_secret_1_arn>",
47+
"<your_secret_2_arn>"
48+
]
49+
},
50+
{
51+
"Effect": "Allow",
52+
"Action": [
53+
"tag:GetResources",
54+
"cloudwatch:GetMetricData",
55+
"cloudwatch:GetMetricStatistics",
56+
"cloudwatch:ListMetrics",
57+
"apigateway:GET",
58+
"aps:ListWorkspaces",
59+
"autoscaling:DescribeAutoScalingGroups",
60+
"dms:DescribeReplicationInstances",
61+
"dms:DescribeReplicationTasks",
62+
"ec2:DescribeTransitGatewayAttachments",
63+
"ec2:DescribeSpotFleetRequests",
64+
"shield:ListProtections",
65+
"storagegateway:ListGateways",
66+
"storagegateway:ListTagsForResource",
67+
"iam:ListAccountAliases"
68+
],
69+
"Resource": [
70+
"*"
71+
]
72+
}
73+
],
74+
"Version": "2012-10-17"
75+
}
```
3076
3177
```sh
3278
POLICY_ARN=$(aws iam create-policy --policy-name harvest-policy --policy-document file://harvest-policy.json --query Policy.Arn --output text)
@@ -45,20 +91,20 @@ Note that the `trust-policy.json` file can be found in this repo.
4591

4692
### 3. Create EC2 Instance
4793

48-
We recommend using a `t2.xlarge` instance type with 20GB disk and attaching the instance profile.
94+
We recommend using a `t2.xlarge` or larger instance type with 20GB disk.
4995

50-
If you already have an ec2 instance, you can use the following command to attach the instance profile:
96+
Once you have created your EC2 instance, you can use the following command to attach the instance profile:
5197

5298
```sh
5399
aws ec2 associate-iam-instance-profile --instance-id <INSTANCE-ID> --iam-instance-profile Arn=<Instance-Profile-ARN>,Name=HarvestProfile
54100
```
55101
You should get the instance profile ARN from step 2.2 above.
56102

57-
If your exiting ec2 instance already had an instance profile, then simply add the policy create in step 2.2 above.
103+
If your existing EC2 instance already has an instance profile, simply add the policy created in step 2.2 above to that instance profile's role.
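Adding the policy to an instance profile that is already attached could look like the sketch below. The function name `attach_harvest_policy` is hypothetical, and it assumes the instance profile contains exactly one role and that you have the policy ARN (for example, in `$POLICY_ARN`) from step 2.

```sh
# Hypothetical helper: attach a managed policy to the role behind an instance profile.
#   $1 = instance profile name, $2 = policy ARN
attach_harvest_policy() {
  profile_name="$1"
  policy_arn="$2"
  # Look up the role associated with the instance profile (assumes a single role).
  role_name=$(aws iam get-instance-profile --instance-profile-name "$profile_name" \
    --query 'InstanceProfile.Roles[0].RoleName' --output text) || return 1
  # Attach the policy to that role.
  aws iam attach-role-policy --role-name "$role_name" --policy-arn "$policy_arn"
}
```

A call would then be `attach_harvest_policy <EXISTING-PROFILE-NAME> "$POLICY_ARN"`, where the profile name is a placeholder for your own.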
58104

59105
### 4. Install Docker and Docker Compose
60106

61-
Use the following commands if you are running an Red Hat based Linux:
107+
To install Docker, use the following commands if you are running a Red Hat based Linux:
62108
```sh
63109
sudo yum install docker
64110
sudo curl -L https://download.docker.com/linux/centos/7/x86_64/stable/Packages/docker-compose-plugin-2.6.0-3.el7.x86_64.rpm -o ./compose-plugin.rpm
@@ -75,12 +121,6 @@ sudo docker run hello-world
75121

76122
You should get output similar to the following:
77123
```
78-
Unable to find image 'hello-world:latest' locally
79-
latest: Pulling from library/hello-world
80-
e6590344b1a5: Pull complete
81-
Digest: sha256:bfbb0cc14f13f9ed1ae86abc2b9f11181dc50d779807ed3a3c5e55a6936dbdd5
82-
Status: Downloaded newer image for hello-world:latest
83-
84124
Hello from Docker!
85125
This message shows that your installation appears to be working correctly.
86126
@@ -104,11 +144,12 @@ For more examples and ideas, visit:
104144
```
105145
### 5. Install Harvest on EC2
106146

107-
To install Harvest on your EC2 instance following the following steps:
147+
Perform the following steps to install Harvest on your EC2 instance:
108148

109149
#### 5.1. Generate Harvest Configuration File
110150

111-
Create `harvest.yml` file with your cluster details, below is an example with annotated comments. Modify as needed for your scenario:
151+
Modify the `harvest.yml` file found in this repo with your cluster details. In most cases you only need to change each `<FSxN_ip_X>` to the IP address of your FSxN.
152+
Add as many pollers as you need to monitor all your FSxNs. There should be an AWS Secrets Manager secret for each FSxN.
112153

113154
```yaml
114155
Exporters:
@@ -162,20 +203,22 @@ docker run --rm \
162203
--output harvest-compose.yml
163204
```
164205

165-
:warning:**NOTE** Ignore the command that it outputs used to start Harvest.
206+
:warning: Ignore the command in the output that claims to start the cluster.
166207

167208
#### 5.3. Replace Harvest images in the harvest-compose.yml:
168209

169-
Replace the Harvest image that supports using AWS Secret Manager for FSxN credentials:
210+
Replace the Harvest image with one that supports using AWS Secret Manager for FSxN credentials:
170211

171212
```sh
172213
sed -i 's|ghcr.io/netapp/harvest:latest|ghcr.io/tlvdevops/harvest-fsx:latest|g' harvest-compose.yml
173214
```
174215

175216
#### 5.4. Add AWS Secret Manager Names to Docker Compose Environment Variables
176217

177-
`SECRET_NAME` and `AWS_REGION` are required for the credentials script.
218+
Edit the `harvest-compose.yml` file by adding the "environment" section for each FSxN with the two variables: `SECRET_NAME` and `AWS_REGION`.
219+
These environment variables are required for the credentials script.
178220

221+
For example:
179222
```yaml
180223
services:
181224
fsx01:
@@ -209,33 +252,33 @@ AWS has useful metrics regarding the FSxN file system that ONTAP doesn't provide
209252
an exporter that will expose these metrics. The following steps show how to install a recommended exporter.
210253

211254
##### 5.6.1 Create the yace configuration file.
212-
Use the text in the box below to create the configuration file named `yace-config.yaml`. Replace `<your_region>`, in both places, with the region where your FSxN resides:
213-
255+
Edit the `yace-config.yaml` file found in this repo and replace `<aws_region>`, in both places, with the region where your FSxN resides:
214256
```yaml
215257
apiVersion: v1alpha1
216-
sts-region: <your_region>
258+
sts-region: <aws_region>
217259
discovery:
218260
jobs:
219261
- type: AWS/FSx
220-
regions: [<your_region>]
262+
regions: [<aws_region>]
221263
period: 300
222264
length: 300
223265
metrics:
224266
- name: DiskReadOperations
225-
statistics: [Average]
267+
statistics: [Sum]
226268
- name: DiskWriteOperations
227-
statistics: [Average]
269+
statistics: [Sum]
228270
- name: DiskReadBytes
229-
statistics: [Average]
271+
statistics: [Sum]
230272
- name: DiskWriteBytes
231-
statistics: [Average]
273+
statistics: [Sum]
232274
- name: DiskIopsUtilization
233275
statistics: [Average]
234276
- name: NetworkThroughputUtilization
235277
statistics: [Average]
236278
- name: FileServerDiskThroughputUtilization
237279
statistics: [Average]
238-
280+
- name: CPUUtilization
281+
statistics: [Average]
239282
```
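The region placeholder can also be filled in from the command line, in the same style as the `sed` command used in step 5.3. This is a sketch that assumes GNU `sed` (for `-i`) and a region name that does not contain the `|` character; the function name `set_yace_region` is hypothetical.

```sh
# Hypothetical helper: replace every <aws_region> placeholder in a file.
#   $1 = region (for example us-west-2), $2 = path to yace-config.yaml
set_yace_region() {
  sed -i "s|<aws_region>|$1|g" "$2"
}
```

For example, `set_yace_region us-west-2 yace-config.yaml` updates both the `sts-region` and `regions` entries in one pass.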
240283
241284
##### 5.6.2 Add Yet-Another-Exporter to harvest-compose.yaml
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
1+
{
2+
"Statement": [
3+
{
4+
"Effect": "Allow",
5+
"Action": [
6+
"secretsmanager:GetSecretValue",
7+
"secretsmanager:DescribeSecret",
8+
"secretsmanager:ListSecrets"
9+
],
10+
"Resource": [
11+
"<your_secret_1_arn>"
12+
]
13+
},
14+
{
15+
"Effect": "Allow",
16+
"Action": [
17+
"tag:GetResources",
18+
"cloudwatch:GetMetricData",
19+
"cloudwatch:GetMetricStatistics",
20+
"cloudwatch:ListMetrics",
21+
"apigateway:GET",
22+
"aps:ListWorkspaces",
23+
"autoscaling:DescribeAutoScalingGroups",
24+
"dms:DescribeReplicationInstances",
25+
"dms:DescribeReplicationTasks",
26+
"ec2:DescribeTransitGatewayAttachments",
27+
"ec2:DescribeSpotFleetRequests",
28+
"shield:ListProtections",
29+
"storagegateway:ListGateways",
30+
"storagegateway:ListTagsForResource",
31+
"iam:ListAccountAliases"
32+
],
33+
"Resource": [
34+
"*"
35+
]
36+
}
37+
],
38+
"Version": "2012-10-17"
39+
}
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
1+
Exporters:
2+
prometheus1:
3+
exporter: Prometheus
4+
port_range: 12990-14000
5+
add_meta_tags: false
6+
Defaults:
7+
use_insecure_tls: true
8+
Pollers:
9+
fsx01:
10+
datacenter: fsx
11+
addr: <FSxN_ip_1>
12+
collectors:
13+
- Rest
14+
- RestPerf
15+
- Ems
16+
exporters:
17+
- prometheus1
18+
credentials_script:
19+
path: /opt/fetch-credentails
20+
schedule: 3h
21+
timeout: 10s
22+
fsx02:
23+
datacenter: fsx
24+
addr: <FSxN_ip_2>
25+
collectors:
26+
- Rest
27+
- RestPerf
28+
- Ems
29+
exporters:
30+
- prometheus1
31+
credentials_script:
32+
path: /opt/fetch-credentails
33+
schedule: 3h
34+
timeout: 10s
Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
{
2+
"Version": "2012-10-17",
3+
"Statement": [
4+
{
5+
"Effect": "Allow",
6+
"Principal": {
7+
"Service": "ec2.amazonaws.com"
8+
},
9+
"Action": "sts:AssumeRole"
10+
}
11+
]
12+
}
13+
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
1+
apiVersion: v1alpha1
2+
sts-region: <aws_region>
3+
discovery:
4+
jobs:
5+
- type: AWS/FSx
6+
regions: [<aws_region>]
7+
period: 300
8+
length: 300
9+
metrics:
10+
- name: DiskReadOperations
11+
statistics: [Sum]
12+
- name: DiskWriteOperations
13+
statistics: [Sum]
14+
- name: DiskReadBytes
15+
statistics: [Sum]
16+
- name: DiskWriteBytes
17+
statistics: [Sum]
18+
- name: DiskIopsUtilization
19+
statistics: [Average]
20+
- name: NetworkThroughputUtilization
21+
statistics: [Average]
22+
- name: FileServerDiskThroughputUtilization
23+
statistics: [Average]
24+
- name: CPUUtilization
25+
statistics: [Average]
