Commit 716d85d

DevOps-Cheatsheet: Add new version control and monitoring tools to DevOps cheatsheet
- Added new cheatsheets for version control:
  - GitLab.md
  - Bitbucket.md
  - GitHub.md
- Added CloudWatch.md to the Monitoring category.

Signed-off-by: NotHarshhaa <[email protected]>
1 parent a689daf commit 716d85d

4 files changed, +1406 -0 lines changed

Monitoring/CloudWatch.md

Lines changed: 318 additions & 0 deletions
@@ -0,0 +1,318 @@
# CloudWatch Cheatsheet

![text](https://imgur.com/BU5g7ce.png)

Amazon CloudWatch is a monitoring and management service for AWS and hybrid cloud applications. This guide moves from basic concepts to advanced configurations and shows how to use CloudWatch for performance monitoring, troubleshooting, and operational insights.

---

## **1. Introduction to CloudWatch**

### What is CloudWatch?

- Amazon CloudWatch is a monitoring and observability service for AWS resources and custom applications.
- Provides actionable insights through metrics, logs, alarms, and dashboards.
- Supports both infrastructure and application-level monitoring.

### Key Features:

- **Metrics**: Collect and monitor key performance data.
- **Logs**: Aggregate, analyze, and search logs.
- **Alarms**: Set thresholds for metrics to trigger automated actions.
- **Dashboards**: Visualize data in near real time.
- **CloudWatch Events**: Trigger actions based on changes in AWS resources (this capability is now provided by Amazon EventBridge).

---

## **2. CloudWatch Architecture Overview**

- **Data Sources**:
  - AWS Services: EC2, RDS, Lambda, etc.
  - On-premises servers or hybrid setups using the CloudWatch Agent.
- **Core Components**:
  - **Metrics**: Quantifiable data points (e.g., CPU utilization).
  - **Logs**: Application and system logs.
  - **Alarms**: Notifications or automated responses.
  - **Dashboards**: Custom visualizations.
  - **Insights**: Advanced log analytics.

---

## **3. Setting Up CloudWatch**

### Accessing CloudWatch

1. Go to the **AWS Management Console**.
2. Navigate to **CloudWatch** under the **Management & Governance** section.

### CloudWatch Agent Installation

To monitor custom metrics or on-premises resources:

1. Install the CloudWatch Agent on your instance (Amazon Linux shown):

   ```bash
   sudo yum install -y amazon-cloudwatch-agent
   ```

2. Configure the agent with the interactive wizard, which saves its answers to `config.json` in the same directory:

   ```bash
   sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
   ```

3. Load that configuration and start the agent (use `-m onPremise` instead of `-m ec2` on on-premises servers):

   ```bash
   sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json
   ```
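The wizard in step 2 produces a JSON configuration file. As a rough sketch of what a small configuration can look like, the Python below builds one that collects memory and root-disk usage. The namespace `CustomAgentMetrics` and the 60-second interval are illustrative assumptions, not values from this cheatsheet.

```python
import json


def build_agent_config(namespace: str = "CustomAgentMetrics",
                       interval_seconds: int = 60) -> dict:
    """Return a minimal CloudWatch Agent configuration as a dict.

    The namespace and interval defaults are hypothetical placeholders.
    """
    return {
        "agent": {"metrics_collection_interval": interval_seconds},
        "metrics": {
            "namespace": namespace,
            "metrics_collected": {
                # Memory usage as a percentage.
                "mem": {"measurement": ["mem_used_percent"]},
                # Disk usage for the root filesystem only.
                "disk": {"measurement": ["used_percent"], "resources": ["/"]},
            },
        },
    }


config = build_agent_config()
print(json.dumps(config, indent=2))
```

Saving the printed JSON to a file and passing it with `-c file:<path>` in step 3 would be one way to skip the wizard.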
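
### Setting IAM Permissions

Attach the **CloudWatchAgentServerPolicy** managed policy to the IAM role used by instances that run the agent. Reserve **CloudWatchFullAccess** for the IAM role or user that administers CloudWatch itself, and prefer narrower policies where possible.

---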

## **4. Metrics Monitoring**

### Viewing Metrics

1. In the CloudWatch console, go to **Metrics**.
2. Select a namespace (e.g., `AWS/EC2`, `AWS/Lambda`).
3. Choose metrics like `CPUUtilization`, `DiskWriteOps`, etc.

### Common Metrics:

- **EC2**:
  - `CPUUtilization`
  - `DiskReadBytes`
  - `NetworkIn`, `NetworkOut`
- **RDS**:
  - `DatabaseConnections`
  - `ReadIOPS`
  - `WriteLatency`
- **Lambda**:
  - `Invocations`
  - `Duration`
  - `Errors`

### Custom Metrics

To send custom metrics:

1. Install and configure the AWS CLI.
2. Publish a metric:

   ```bash
   aws cloudwatch put-metric-data --namespace "CustomNamespace" --metric-name "MetricName" --value 100
   ```
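When a data point needs a unit, dimensions, or an explicit timestamp, the CLI also accepts a JSON list through `--metric-data file://metrics.json` in place of `--metric-name` and `--value`. The sketch below builds one such entry with the standard library only; the helper name `build_metric_datum` and the `Service` dimension are hypothetical.

```python
import json
from datetime import datetime, timezone


def build_metric_datum(name, value, unit="Count", dimensions=None, timestamp=None):
    """Build one entry for the --metric-data JSON list.

    `timestamp` should be a timezone-aware datetime; it defaults to now (UTC).
    """
    ts = (timestamp or datetime.now(timezone.utc)).astimezone(timezone.utc)
    datum = {
        "MetricName": name,
        "Value": float(value),
        "Unit": unit,
        "Timestamp": ts.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    if dimensions:
        # Dimensions are name/value pairs that identify the metric's source.
        datum["Dimensions"] = [
            {"Name": key, "Value": val} for key, val in sorted(dimensions.items())
        ]
    return datum


# Hypothetical example: one data point tagged with a service name.
metric_data = [build_metric_datum("MetricName", 100, dimensions={"Service": "checkout"})]
print(json.dumps(metric_data, indent=2))
```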

---

## **5. CloudWatch Logs**

### Setting Up Log Groups and Streams

1. Navigate to **Logs** in the CloudWatch console.
2. Create a **Log Group** (e.g., `/aws/lambda/my-function`).
3. Each application/service writes to a **Log Stream** under the group.

### Exporting Logs to S3

1. Go to **Logs** → Select a log group.
2. Click **Actions** → **Export data to Amazon S3**.
3. Configure the export with the desired time range.

### Querying Logs with CloudWatch Logs Insights

1. Navigate to **Logs Insights**.
2. Select the log groups to query, then write a query. This one returns the 20 most recent messages containing `ERROR`:

   ```sql
   fields @timestamp, @message
   | filter @message like /ERROR/
   | sort @timestamp desc
   | limit 20
   ```
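To make the stages of the query above concrete, here is a rough local equivalent in Python: filter by regular expression, sort newest first, keep the first 20. It runs on an in-memory list rather than on CloudWatch, and the sample events are made up for illustration.

```python
import re


def latest_errors(events, pattern=r"ERROR", limit=20):
    """Mimic `filter ... like /ERROR/ | sort @timestamp desc | limit 20`.

    `events` is an iterable of (timestamp, message) tuples.
    """
    regex = re.compile(pattern)
    matches = [event for event in events if regex.search(event[1])]
    matches.sort(key=lambda event: event[0], reverse=True)  # newest first
    return matches[:limit]


# Hypothetical sample data: (epoch seconds, message).
sample = [
    (1700000000, "INFO service started"),
    (1700000005, "ERROR database timeout"),
    (1700000010, "ERROR retry failed"),
]
print(latest_errors(sample))
```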

---

## **6. CloudWatch Alarms**

### Creating an Alarm

1. Go to **Alarms** in the CloudWatch console.
2. Click **Create Alarm**.
3. Select a metric (e.g., `CPUUtilization`).
4. Set a threshold (e.g., `> 80%` for 5 minutes).
5. Choose an action (e.g., send an SNS notification).

### Alarm States:

- **OK**: Metric is within the defined threshold.
- **ALARM**: Metric has breached the threshold.
- **INSUFFICIENT_DATA**: The alarm has just started, the metric is unavailable, or there is not enough data to determine the state.
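
The three states can be illustrated with a deliberately simplified model: look at the most recent datapoints and compare each to the threshold. This sketch is an assumption-laden approximation, not CloudWatch's actual evaluation logic; it ignores "M out of N" settings and the missing-data treatment options.

```python
def evaluate_alarm(datapoints, threshold=80.0, evaluation_periods=1):
    """Return a simplified alarm state for a 'greater than threshold' alarm.

    `datapoints` is a list of metric values, oldest first.
    """
    if evaluation_periods < 1:
        raise ValueError("evaluation_periods must be at least 1")
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "INSUFFICIENT_DATA"  # not enough data to decide
    if all(value > threshold for value in recent):
        return "ALARM"  # every recent datapoint breaches the threshold
    return "OK"


print(evaluate_alarm([]))                                   # no data yet
print(evaluate_alarm([85.0]))                               # one breaching point
print(evaluate_alarm([85.0, 70.0], evaluation_periods=2))   # recovered
```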

### Advanced Alarm Configurations

- Composite Alarms: Combine the states of multiple alarms into one.
- Actions:
  - Notify via SNS.
  - Trigger Lambda functions.
  - Stop, terminate, reboot, or recover EC2 instances.

---

## **7. CloudWatch Dashboards**

### Creating a Dashboard

1. Go to **Dashboards** in the CloudWatch console.
2. Click **Create Dashboard**.
3. Add widgets:
   - **Line** for metrics.
   - **Number** for single values.
   - **Text** for notes.

### Customizing Widgets

- Choose metrics from different namespaces.
- Configure time ranges and granularity.

### Example: Multi-Service Dashboard

- **EC2 Metrics**: CPU, Disk, Network.
- **RDS Metrics**: Connections, IOPS.
- **Lambda Metrics**: Invocations, Errors.
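
A dashboard like this can also be described as a JSON body and created with `aws cloudwatch put-dashboard --dashboard-name <name> --dashboard-body file://dashboard.json`. The sketch below builds such a body with one widget per service; the instance ID, database identifier, function name, and region are placeholders, and the widget sizes are arbitrary choices on the 24-column grid.

```python
import json


def metric_widget(title, metrics, x, y, region="us-east-1", width=8, height=6):
    """Build one metric widget; x/width are columns on the 24-column grid."""
    return {
        "type": "metric",
        "x": x,
        "y": y,
        "width": width,
        "height": height,
        "properties": {
            "title": title,
            "metrics": metrics,  # each entry: [namespace, metric, dimension name, dimension value]
            "period": 300,
            "stat": "Average",
            "region": region,
            "view": "timeSeries",
        },
    }


# All resource identifiers below are hypothetical placeholders.
dashboard_body = {
    "widgets": [
        metric_widget("EC2 CPU", [["AWS/EC2", "CPUUtilization", "InstanceId", "i-0123456789abcdef0"]], x=0, y=0),
        metric_widget("RDS connections", [["AWS/RDS", "DatabaseConnections", "DBInstanceIdentifier", "my-database"]], x=8, y=0),
        metric_widget("Lambda errors", [["AWS/Lambda", "Errors", "FunctionName", "my-function"]], x=16, y=0),
    ]
}
print(json.dumps(dashboard_body, indent=2))
```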

---

## **8. CloudWatch Events (EventBridge)**

### Creating Rules

1. Navigate to **Rules** in the Amazon EventBridge console (formerly **Events** → **Rules** in the CloudWatch console).
2. Create a rule with an event pattern (e.g., EC2 state change).
3. Add a target (e.g., SNS, Lambda, Step Functions).

### Example: Notify When an Instance Stops

1. Event Pattern:

   ```json
   {
     "source": ["aws.ec2"],
     "detail-type": ["EC2 Instance State-change Notification"],
     "detail": {
       "state": ["stopped"]
     }
   }
   ```

2. Target: Send an SNS notification.
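
To show how a pattern like the one above selects events, here is a minimal matcher covering only the subset used in this example: nested objects, and lists of exact values where any one value may match. Real EventBridge patterns support more operators (prefix, numeric, and so on), which this sketch does not attempt. The sample event and instance ID are made up.

```python
def matches(pattern, event):
    """Return True if `event` satisfies `pattern` (exact-value subset only)."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the nested event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # `expected` is a list of allowed values; any one may match.
            return False
    return True


pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped"]},
}

# Hypothetical event, trimmed to the fields the pattern inspects.
event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
}
print(matches(pattern, event))
```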

---

## **9. Advanced Configurations**

### Cross-Account Monitoring

1. Create a cross-account role with permissions to access CloudWatch in the target account.
2. Grant the role the `cloudwatch:ListMetrics` and `cloudwatch:GetMetricData` actions, then call those APIs after assuming it.

### Anomaly Detection

Enable anomaly detection for metrics:

1. Go to **Metrics** → Select a metric.
2. Click **Actions** → **Enable anomaly detection**.

### Metric Math

Perform calculations across metrics:

- Example: Average CPU utilization across two instances, where `m1` and `m2` are the IDs assigned to each instance's metric.

```text
(m1+m2)/2
```
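Metric math expressions are evaluated per timestamp. As a small illustration of what `(m1+m2)/2` computes, the sketch below averages two series point by point; the series values and timestamps are invented for the example.

```python
def average_series(m1, m2):
    """Compute (m1 + m2) / 2 for every timestamp present in both series."""
    shared = sorted(m1.keys() & m2.keys())
    return {ts: (m1[ts] + m2[ts]) / 2 for ts in shared}


# Hypothetical CPU utilization (percent) for two instances.
m1 = {"12:00": 40.0, "12:05": 60.0}
m2 = {"12:00": 20.0, "12:05": 80.0}
print(average_series(m1, m2))
```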

---

## **10. Integration with Other Services**

### AWS Lambda

- Write to standard output (e.g., `console.log()` in Node.js or `print()` in Python); Lambda forwards the output to CloudWatch Logs.
- Monitor Lambda-specific metrics like `Errors` and `Throttles`.

### ECS/EKS

- Enable CloudWatch Container Insights for detailed monitoring.
- On ECS, use the `awslogs` log driver to send container logs to CloudWatch.

### Integration with Third-Party Tools

- Use **Datadog** or **Grafana** for enhanced visualization.
- Integrate CloudWatch metrics into these platforms using APIs.

---

## **11. Security Best Practices**

### Log Retention

- Set retention policies for logs to reduce costs:

```bash
aws logs put-retention-policy --log-group-name "/aws/lambda/my-function" --retention-in-days 30
```

### Fine-Grained Access Control

- Use IAM policies to restrict access to specific metrics, logs, or dashboards.
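
As one possible shape for such a policy, the sketch below generates an IAM policy document that allows read-only access to a single log group. The account ID `123456789012`, the region, and the chosen actions are illustrative assumptions; adjust them to your own least-privilege requirements.

```python
import json


def read_only_log_group_policy(region, account_id, log_group):
    """Build an IAM policy allowing read access to one log group and its streams."""
    base_arn = f"arn:aws:logs:{region}:{account_id}:log-group:{log_group}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "logs:DescribeLogStreams",
                    "logs:GetLogEvents",
                    "logs:FilterLogEvents",
                ],
                # The group itself plus everything under it (its log streams).
                "Resource": [base_arn, base_arn + ":*"],
            }
        ],
    }


# Placeholder account ID and region.
policy = read_only_log_group_policy("us-east-1", "123456789012", "/aws/lambda/my-function")
print(json.dumps(policy, indent=2))
```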

---

## **12. CloudWatch Pricing**

### Pricing Model

1. **Metrics**: Charged per metric, per month.
2. **Logs**:
   - Ingestion: Cost per GB ingested.
   - Storage: Cost per GB stored.
3. **Dashboards**: Charged per dashboard, per month.

### Cost Optimization Tips

- Collect only the metrics and logs you need; every custom metric and every GB of logs ingested is billed.
- Set shorter retention periods for logs.

---

## **13. Best Practices**

1. **Organize Log Groups**:
   - Use consistent naming conventions (e.g., `/application/environment/service`).

2. **Use Alarms Wisely**:
   - Avoid too many alarms to prevent alert fatigue.
   - Use composite alarms to group related alarms.

3. **Automate Monitoring**:
   - Automate alarm and dashboard creation using CloudFormation or Terraform.

4. **Optimize Log Storage**:
   - Export logs to S3 for long-term storage and analysis.

5. **Enable Anomaly Detection**:
   - Enable anomaly detection on critical metrics so unusual behavior is flagged without hand-tuned thresholds.

---

## **14. References and Resources**

- [CloudWatch Documentation](https://docs.aws.amazon.com/cloudwatch/)
- [Metric Math Syntax Guide](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/using-metric-math.html)
- [CloudWatch Pricing](https://aws.amazon.com/cloudwatch/pricing/)
