update harvest solution #216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
7 commits
977e5e2 update harvest solution (adizalmanovich1)
6c9794d update harvest files (adizalmanovich1)
0530cb1 update harvest files (adizalmanovich1)
024eebe update harvest files (adizalmanovich1)
a29adbc update harvest files (adizalmanovich1)
68bcea3 update harvest files (adizalmanovich1)
8b333f8 update harvest files (adizalmanovich1)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,232 @@ | ||
| # Deploy NetApp Harvest on EKS | ||
|
|
||
| A Harvest Helm chart for monitoring multiple Amazon FSx for NetApp ONTAP (FSxN) file systems on an existing monitoring stack, with AWS Secrets Manager integration for the FSxN credentials. | ||
|
|
||
|
|
||
|
|
||
| ## Introduction | ||
|
|
||
| ### What to expect | ||
|
|
||
| Installing the Harvest Helm chart results in the following: | ||
| * The latest version of NetApp Harvest is installed on your EKS cluster. | ||
| * Each FSxN cluster is represented by a Kubernetes pod on the cluster. | ||
| * Metrics are collected from your FSxNs, and existing Grafana dashboards are added for better visualization. | ||
|
|
||
| ### Prerequisites | ||
| * `Helm` - for resource installation. | ||
| * NetApp FSxN running in the same VPC as the EKS cluster. | ||
| * Existing `Prometheus` running on your EKS cluster. | ||
| * Existing `Grafana` running on your EKS cluster. | ||
| * Existing AWS `Secrets Manager` secret in the same region as the FSxN. | ||
|
|
||
|
|
||
| ### Deployment | ||
| ### User Input | ||
|
|
||
| Parameter | Description | | ||
| --- | --- | | ||
| fsxs.clusters.name | FSxN cluster name | | ||
| fsxs.clusters.managment_lif | Management IP address of the FSx for NetApp ONTAP file system | | ||
| fsxs.clusters.secretName | Name of the AWS Secrets Manager secret that holds the FSxN credentials | | ||
| fsxs.clusters.region | Region of the FSxN and of the AWS Secrets Manager secret | | ||
| fsxs.clusters.promPort | Port on which Harvest exposes metrics to Prometheus | | ||
| promethues | Name of the existing Prometheus release, used for discovery | | ||
|
|
||
| ### Integration with AWS Secret Manager | ||
|
|
||
| The installation supports integration with AWS Secrets Manager. You can store your FSxN credentials in an existing or a new AWS Secrets Manager secret. | ||
| Harvest invokes the script specified in the `credentials_script` path section, which is already mapped into the Harvest container. | ||
| Harvest uses a ServiceAccount with permissions to fetch the secrets. | ||
| The credentials script expects to fetch the `USERNAME` and `PASSWORD` values from Secrets Manager. | ||
| The ServiceAccount should be created during the installation with sufficient permissions. | ||
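The flow described above can be sketched as a small credentials script. This is only an illustration, not the script shipped in the Harvest image. It assumes that `boto3` is available in the container, that the secret name and region arrive in `SECRET_NAME` and `AWS_REGION` environment variables, and that Harvest accepts a short YAML document with `username` and `password` keys on standard output. The function names are hypothetical.

```python
import json
import os
import sys


def format_credentials(secret_string):
    """Turn the secret's JSON payload into the YAML text printed for Harvest."""
    secret = json.loads(secret_string)
    # json.dumps emits double-quoted strings, which YAML parsers also accept,
    # so passwords containing ':' or '#' are not misread.
    return "username: {}\npassword: {}\n".format(
        json.dumps(secret["USERNAME"]), json.dumps(secret["PASSWORD"])
    )


def fetch_secret_string(client, secret_name):
    """Read the secret payload; `client` is a boto3 Secrets Manager client."""
    return client.get_secret_value(SecretId=secret_name)["SecretString"]


def main():
    # Imported here so the helpers above can be used without boto3 installed.
    import boto3

    # Assumption: these two environment variables are set for the container.
    client = boto3.client("secretsmanager", region_name=os.environ["AWS_REGION"])
    payload = fetch_secret_string(client, os.environ["SECRET_NAME"])
    sys.stdout.write(format_credentials(payload))
```

To use it as a script, call `main()` at the end of the file. The ServiceAccount's IAM role supplies the AWS credentials, so nothing secret needs to be stored in the chart.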
|
|
||
|
|
||
| ### Monitoring multiple FSxNs | ||
|
|
||
| The Helm chart supports monitoring multiple FSxNs. | ||
| You can add multiple FSxNs by configuring them in `values.yaml`. | ||
| For example: | ||
| ``` | ||
| fsxs: | ||
|   clusters: | ||
|     - name: fsx1 | ||
|       managment_lif: 1.1.1.1 | ||
|       promPort: 12990 | ||
|       secretName: secret1 | ||
|       region: us-east-1 | ||
|     - name: fsx2 | ||
|       managment_lif: 1.1.1.1 | ||
|       promPort: 12990 | ||
|       secretName: secret2 | ||
|       region: us-east-1 | ||
| ``` | ||
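Before installing, it can help to sanity-check the cluster entries. The sketch below works on a plain Python list that mirrors `fsxs.clusters`. It assumes that every entry needs all five keys from the parameter table and that names must be unique because each cluster becomes its own pod. Neither rule is confirmed by the chart, and `check_clusters` is a hypothetical helper.

```python
# Keys taken from the parameter table; treating all of them as required is an assumption.
REQUIRED_KEYS = ("name", "managment_lif", "promPort", "secretName", "region")


def check_clusters(clusters):
    """Return a list of problems found in the `fsxs.clusters` entries."""
    problems = []
    seen_names = set()
    for index, cluster in enumerate(clusters):
        missing = [key for key in REQUIRED_KEYS if key not in cluster]
        if missing:
            problems.append("entry {}: missing {}".format(index, ", ".join(missing)))
        name = cluster.get("name")
        if name in seen_names:
            problems.append("entry {}: duplicate name {!r}".format(index, name))
        if name is not None:
            seen_names.add(name)
    return problems


# Deliberately broken input: the second entry reuses a name and lacks secretName.
clusters = [
    {"name": "fsx1", "managment_lif": "1.1.1.1", "promPort": 12990,
     "secretName": "secret1", "region": "us-east-1"},
    {"name": "fsx1", "managment_lif": "1.1.1.1", "promPort": 12990,
     "region": "us-east-1"},
]
for problem in check_clusters(clusters):
    print(problem)
```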
|
|
||
| ### Installation | ||
| Install the Harvest Helm chart from this GitHub repository. The custom Helm chart includes: | ||
| * `deplyment.yaml` - Harvest deployment using the latest Harvest image. | ||
| * `harvest-config.yaml` - Harvest backend configuration. | ||
| * `harvest-cm.yaml` - Environment variable configuration for the credentials script. | ||
| * `service-monitor.yaml` - Prometheus ServiceMonitor for collecting Harvest metrics. | ||
|
|
||
| 1. **(Optional) Create an AWS Secrets Manager secret** | ||
| ``` | ||
| aws secretsmanager create-secret \ | ||
| --region <REGION> \ | ||
| --name <SECRET_NAME> \ | ||
| --secret-string '{"USERNAME":"fsxadmin", "PASSWORD":"<YOUR_FSX_PASSWORD>"}' | ||
| ``` | ||
|
|
||
| 2. **Create a ServiceAccount with permissions to AWS Secrets Manager** | ||
|
|
||
| **Create a policy with permissions to AWS Secrets Manager:** | ||
|
|
||
| The following IAM policy can be used to grant all the permissions Harvest needs to fetch the secrets: | ||
|
|
||
| ``` | ||
| { | ||
|   "Statement": [ | ||
|     { | ||
|       "Action": [ | ||
|         "secretsmanager:GetSecretValue", | ||
|         "secretsmanager:DescribeSecret", | ||
|         "secretsmanager:ListSecrets" | ||
|       ], | ||
|       "Effect": "Allow", | ||
|       "Resource": [ | ||
|         "<your_secret_manager_arn_1>", | ||
|         "<your_secret_manager_arn_2>" | ||
|       ] | ||
|     } | ||
|   ], | ||
|   "Version": "2012-10-17" | ||
| } | ||
|
|
||
| ``` | ||
| * Keep the POLICY_ARN of the created policy for the ServiceAccount creation. | ||
|
|
||
|
|
||
| **Create ServiceAccount**: | ||
|
|
||
| **Note**: the namespace must already exist.\ | ||
| If it does not exist, create it with the following command: | ||
| ``` | ||
| kubectl create ns <NAMESPACE> | ||
| ``` | ||
| ``` | ||
| eksctl create iamserviceaccount --name harvest-sa --region=<REGION> --namespace <NAMESPACE> --role-name harvest-role --cluster <YOUR_CLUSTER_NAME> --attach-policy-arn "<POLICY_ARN>" --approve | ||
| ``` | ||
|
|
||
| 3. **Install Harvest helm chart** | ||
| ```text | ||
| helm upgrade --install harvest -f values.yaml ./ --namespace=<NAMESPACE> --set promethues=<your_prometheus_release_name> | ||
| ``` | ||
|
|
||
| Once the deployment is complete, Harvest should be listed as a target in Prometheus. | ||
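Hand-quoting the JSON in steps 1 and 2 is error-prone, especially when a password contains quotes or backslashes. The sketch below builds both documents with the standard `json` module. The function names are hypothetical, and the policy has the same actions and structure as the one shown in step 2.

```python
import json


def build_secret_string(username, password):
    """JSON payload for `--secret-string`, with quotes and backslashes escaped."""
    return json.dumps({"USERNAME": username, "PASSWORD": password})


def build_policy(secret_arns):
    """IAM policy document granting read access to the listed secrets."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "secretsmanager:GetSecretValue",
                    "secretsmanager:DescribeSecret",
                    "secretsmanager:ListSecrets",
                ],
                "Resource": list(secret_arns),
            }
        ],
    }


# Placeholder ARNs, as in step 2; replace them with one ARN per FSxN secret.
arns = ["<your_secret_manager_arn_1>", "<your_secret_manager_arn_2>"]
print(json.dumps(build_policy(arns), indent=2))
```

One way to use the output is to save it to a file (for example a hypothetical `harvest-policy.json`) and pass it to `aws iam create-policy --policy-name <POLICY_NAME> --policy-document file://harvest-policy.json --query Policy.Arn --output text`, which prints the POLICY_ARN needed for the ServiceAccount.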
|
|
||
| ### Import FSxN CloudWatch metrics into your monitoring stack | ||
| AWS provides additional metrics that Harvest cannot collect. | ||
| We recommend using yet-another-cloudwatch-exporter (by the Prometheus community) to collect metrics from CloudWatch. See: https://github.com/nerdswords/helm-charts | ||
|
|
||
| #### Installation #### | ||
| 1. **Create a ServiceAccount with permissions to AWS CloudWatch** | ||
| The following IAM policy can be used to grant all the permissions yet-another-cloudwatch-exporter needs to fetch the CloudWatch metrics: | ||
|
|
||
| ``` | ||
| { | ||
|   "Version": "2012-10-17", | ||
|   "Statement": [ | ||
|     { | ||
|       "Action": [ | ||
|         "tag:GetResources", | ||
|         "cloudwatch:GetMetricData", | ||
|         "cloudwatch:GetMetricStatistics", | ||
|         "cloudwatch:ListMetrics", | ||
|         "apigateway:GET", | ||
|         "aps:ListWorkspaces", | ||
|         "autoscaling:DescribeAutoScalingGroups", | ||
|         "dms:DescribeReplicationInstances", | ||
|         "dms:DescribeReplicationTasks", | ||
|         "ec2:DescribeTransitGatewayAttachments", | ||
|         "ec2:DescribeSpotFleetRequests", | ||
|         "shield:ListProtections", | ||
|         "storagegateway:ListGateways", | ||
|         "storagegateway:ListTagsForResource" | ||
|       ], | ||
|       "Effect": "Allow", | ||
|       "Resource": "*" | ||
|     } | ||
|   ] | ||
| } | ||
|
|
||
| ``` | ||
| Run the following command in order to create the policy: | ||
|
|
||
| POLICY_ARN=$(aws iam create-policy --policy-name yace-exporter-policy --policy-document file://yace-exporter-policy.json --query Policy.Arn --output text) | ||
|
|
||
| 2. **Create ServiceAccount**: | ||
|
|
||
| **Note**: the namespace must already exist.\ | ||
| If it does not exist, create it with the following command: | ||
| ``` | ||
| kubectl create ns <NAMESPACE> | ||
| ``` | ||
| ``` | ||
| eksctl create iamserviceaccount --name yace-exporter-sa --region=<REGION> --namespace <NAMESPACE> --role-name yace-cloudwatch-exporter-role --cluster <YOUR_CLUSTER_NAME> --attach-policy-arn "$POLICY_ARN" --approve | ||
| ``` | ||
|
|
||
| 3. **Install yace-exporter helm chart** | ||
|
|
||
| ```text | ||
| helm repo add nerdswords https://nerdswords.github.io/helm-charts | ||
| ``` | ||
|
|
||
| Set the Prometheus release name for ServiceMonitor creation in yace-override-values.yaml: | ||
| ``` | ||
| serviceMonitor: | ||
|   enabled: true | ||
|   labels: | ||
|     release: <Prometheus_Name> | ||
| ``` | ||
|
|
||
| Set the region name to the FSxN's region in yace-override-values.yaml: | ||
| ``` | ||
| apiVersion: v1alpha1 | ||
| sts-region: <Region_Name> | ||
| discovery: | ||
|   jobs: | ||
|     - type: AWS/FSx | ||
|       regions: | ||
|         - <Region_Name> | ||
|       period: 300 | ||
|       length: 300 | ||
|       metrics: | ||
|         - name: DiskReadOperations | ||
|           statistics: [Average] | ||
|         - name: DiskWriteOperations | ||
|           statistics: [Average] | ||
|         - name: DiskReadBytes | ||
|           statistics: [Average] | ||
|         - name: DiskWriteBytes | ||
|           statistics: [Average] | ||
|         - name: DiskIopsUtilization | ||
|           statistics: [Average] | ||
|         - name: NetworkThroughputUtilization | ||
|           statistics: [Average] | ||
|         - name: FileServerDiskThroughputUtilization | ||
|           statistics: [Average] | ||
| ``` | ||
| NOTE: if you changed the `ServiceAccount` or `role` name, update it in yace-override-values.yaml as well. | ||
|
|
||
| ```text | ||
| helm install yet-another-cloudwatch-exporter nerdswords/yet-another-cloudwatch-exporter -f yace-override-values.yaml -n <namespace> | ||
| ``` | ||
|
|
||
|
|
||
|
|
||
| ### Add Grafana dashboards and visualize your FSxN metrics in Grafana | ||
| Import the existing dashboards into your Grafana: | ||
| * [How to import Grafana dashboards](https://grafana.com/docs/grafana/latest/dashboards/build-dashboards/import-dashboards/) | ||
| * Example Grafana dashboards are located in the dashboards folder. | ||
| #### Note | ||
| By default, the `fsxadmin` user does not have full permissions to collect all metrics. |
145 changes: 145 additions & 0 deletions
Monitoring/monitor_fsxn_with_grafana/cloudformation/README.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,145 @@ | ||
| # Harvest and Grafana Deployment using AWS CloudFormation | ||
|
|
||
| This guide provides instructions to deploy the Harvest and Grafana environment to monitor your Amazon FSx for NetApp ONTAP resources. The deployment process takes about five minutes. | ||
|
|
||
| ## Prerequisites | ||
|
|
||
| Before you start, ensure you have the following: | ||
| - An FSx for ONTAP file system running in an Amazon Virtual Private Cloud (Amazon VPC) in your AWS account. | ||
| - The parameter information for the template. | ||
|
|
||
| ## Yet Another CloudWatch Exporter (YACE) | ||
|
|
||
| YACE, or Yet Another CloudWatch Exporter, is a Prometheus exporter for AWS CloudWatch metrics. It is written in Go and uses the official AWS SDK. YACE supports auto-discovery of resources via tags, structured logging, filtering monitored resources via regex, and more[1](https://github.com/prometheus-community/yet-another-cloudwatch-exporter). This deployment includes YACE to enhance monitoring capabilities for your FSx for ONTAP resources. | ||
|
|
||
| ## Overview | ||
|
|
||
| This deployment includes: | ||
| - **Yet Another CloudWatch Exporter (YACE)**: Collects FSxN CloudWatch metrics. | ||
| - **Harvest**: Collects ONTAP metrics. | ||
|
|
||
| ## Deployment Steps | ||
|
|
||
| 1. **Download the Template** | ||
| - Download the `fsx-ontap-harvest-grafana.template` AWS CloudFormation template. | ||
|
|
||
| 2. **Create the Stack** | ||
| - Open the AWS CloudFormation console. | ||
| - Choose **Create stack** and upload the `fsx-ontap-harvest-grafana.template` file. | ||
|
|
||
| 3. **Specify Stack Details** | ||
| - **Parameters**: Review and modify the parameters as needed for your file system. The default values are: | ||
|   - **InstanceType**: `t3.micro` (Other options: `t3.small`, `t3.medium`, `t3.large`, `t3.xlarge`, `t3.2xlarge`, etc.) | ||
|   - **KeyPair**: No default value. Specify the key pair to access the EC2 instance. | ||
|   - **SecurityGroup**: No default value. Ensure inbound ports 3000 and 9090 are open. | ||
|   - **SubnetType**: No default value. Choose `public` or `private`. | ||
|   - **Subnet**: No default value. Specify the same subnet as your FSx for ONTAP file system's preferred subnet. | ||
|   - **LatestLinuxAmiId**: `/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2` | ||
|   - **FSxEndPoint**: No default value. Specify the management endpoint IP address of your FSx file system. | ||
|   - **SecretName**: No default value. Specify the AWS Secrets Manager secret name containing the password for the `fsxadmin` user. | ||
|
|
||
| 4. **Configure Stack Options** | ||
| - Choose **Next** for stack options. | ||
|
|
||
| 5. **Review and Create** | ||
| - Review the stack details and confirm the settings. | ||
| - Select the check box to acknowledge that the template creates IAM resources. | ||
| - Choose **Create stack**. | ||
|
|
||
| 6. **Monitor Stack Creation** | ||
| - Monitor the status of the stack in the AWS CloudFormation console. The status should change to `CREATE_COMPLETE` in about five minutes. | ||
|
|
||
| ## Accessing Grafana | ||
|
|
||
| - After the deployment is complete, log in to the Grafana dashboard using your browser: | ||
|   - URL: `http://<EC2_instance_IP>:3000` | ||
|   - Default credentials: | ||
|     - Username: `admin` | ||
|     - Password: `admin` | ||
|   - **Note**: Change your password immediately after logging in. | ||
|
|
||
| ## Supported Harvest Dashboards | ||
|
|
||
| Amazon FSx for NetApp ONTAP exposes a different set of metrics than on-premises NetApp ONTAP. Therefore, only the following out-of-the-box Harvest dashboards tagged with `fsx` are currently supported for use with FSx for ONTAP. Some panels in these dashboards may be missing information that is not supported: | ||
|
|
||
| - **FSxN_Clusters** | ||
| - **FSxN_CW_Utilization** | ||
| - **FSxN_Data_protection** | ||
| - **FSxN_LUN** | ||
| - **FSxN_SVM** | ||
| - **FSxN_Volume** | ||
|
|
||
| --- | ||
|
|
||
| ## Monitor More FSxN | ||
|
|
||
| To monitor additional FSxN resources, follow these steps: | ||
|
|
||
| 1. **Move to the Harvest Directory** | ||
| - Navigate to the Harvest directory: | ||
| ```bash | ||
| cd /opt/harvest | ||
| ``` | ||
|
|
||
| 2. **Configure Additional FSxN in `harvest.yml`** | ||
| - Edit the `harvest.yml` file to add the new FSxN configuration. For example: | ||
| ```yaml | ||
| fsx02: | ||
|   datacenter: fsx | ||
|   addr: <FSxN_ip_2> | ||
|   collectors: | ||
|     - Rest | ||
|     - RestPerf | ||
|     - Ems | ||
|   exporters: | ||
|     - prometheus1 | ||
|   credentials_script: | ||
|     path: /opt/fetch-credentials | ||
|     schedule: 3h | ||
|     timeout: 10s | ||
| ``` | ||
|
|
||
| 3. **Update `harvest-compose` with the Additional FSxN** | ||
| - Edit the `harvest-compose.yml` file to include the new FSxN configuration: | ||
| ```yaml | ||
| fsx02: | ||
|   image: ghcr.io/tlvdevops/harvest-fsx:latest | ||
|   container_name: poller-fsx02 | ||
|   restart: unless-stopped | ||
|   ports: | ||
|     - "12991:12991" | ||
|   command: '--poller fsx02 --promPort 12991 --config /opt/harvest.yml' | ||
|   volumes: | ||
|     - ./cert:/opt/harvest/cert | ||
|     - ./harvest.yml:/opt/harvest.yml | ||
|     - ./conf:/opt/harvest/conf | ||
|   environment: | ||
|     - SECRET_NAME=<your_secret_2> | ||
|     - AWS_REGION=<your_region> | ||
| ``` | ||
| - **Note**: Change the `container_name`, `ports`, `promPort`, and `SECRET_NAME` as needed. | ||
|
|
||
| 4. **Add FSxN to Prometheus Targets** | ||
| - Edit the `harvest_targets.yml` file to add the new FSxN target: | ||
| ```yaml | ||
| - targets: ['<container_name>:<container-port>'] | ||
| ``` | ||
|
|
||
| 5. **Restart Docker Compose** | ||
| - Bring down the Docker Compose stack: | ||
| ```bash | ||
| docker compose -f prom-stack.yml -f harvest-compose.yml down | ||
| ``` | ||
| - Bring the Docker Compose stack back up: | ||
| ```bash | ||
| docker compose -f prom-stack.yml -f harvest-compose.yml up -d --remove-orphans | ||
| ``` | ||
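Steps 2 to 4 repeat the same poller name and port in three files, so a typo in one of them leaves Prometheus scraping nothing. The sketch below generates the three fragments from one set of inputs so that they stay consistent. It copies the values from the `fsx02` example above; the helper names and the `fsx03` inputs are hypothetical, and the output still has to be pasted into the right place in each file by hand.

```python
def poller_entry(name, addr):
    """harvest.yml entry for one more FSxN (step 2)."""
    lines = [
        "{}:".format(name),
        "  datacenter: fsx",
        "  addr: {}".format(addr),
        "  collectors:",
        "    - Rest",
        "    - RestPerf",
        "    - Ems",
        "  exporters:",
        "    - prometheus1",
        "  credentials_script:",
        "    path: /opt/fetch-credentials",
        "    schedule: 3h",
        "    timeout: 10s",
    ]
    return "\n".join(lines)


def compose_service(name, port, secret_name, region):
    """harvest-compose.yml service for the same poller (step 3)."""
    lines = [
        "{}:".format(name),
        "  image: ghcr.io/tlvdevops/harvest-fsx:latest",
        "  container_name: poller-{}".format(name),
        "  restart: unless-stopped",
        "  ports:",
        '    - "{0}:{0}"'.format(port),
        "  command: '--poller {} --promPort {} --config /opt/harvest.yml'".format(name, port),
        "  volumes:",
        "    - ./cert:/opt/harvest/cert",
        "    - ./harvest.yml:/opt/harvest.yml",
        "    - ./conf:/opt/harvest/conf",
        "  environment:",
        "    - SECRET_NAME={}".format(secret_name),
        "    - AWS_REGION={}".format(region),
    ]
    return "\n".join(lines)


def prometheus_target(name, port):
    """harvest_targets.yml line for the same poller (step 4)."""
    return "- targets: ['poller-{}:{}']".format(name, port)


# Hypothetical third file system; every value here is a placeholder.
print(poller_entry("fsx03", "<FSxN_ip_3>"))
print(compose_service("fsx03", 12992, "<your_secret_3>", "<your_region>"))
print(prometheus_target("fsx03", 12992))
```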
|
|
||
| --- | ||
|
|
||
| Replace the placeholders (`<FSxN_ip_2>`, `<your_secret_2>`, `<your_region>`, `<container_name>`, `<container-port>`) with your specific details. | ||
| ## Additional Information | ||
|
|
||
|
|
||
| --- | ||
|
|
||
| [1](https://github.com/prometheus-community/yet-another-cloudwatch-exporter): [Yet Another CloudWatch Exporter on GitHub](https://github.com/prometheus-community/yet-another-cloudwatch-exporter) |
To me it isn't clear what "SubnetType" does. Maybe add a note saying that selecting "public" means that a public IP will be allocated to the EC2 instance. You could update the description of the variable to match that as well.
I took it from the Amazon documentation.
I will add more description about it.