Commit 1462aea

Author: Chris Fane
Amazon CloudFront Std Logging V2 Sample, demonstrating Kinesis, CloudWatch and S3 Partitioned outputs
1 parent e054ffc commit 1462aea

9 files changed: 710 additions, 0 deletions
Lines changed: 202 additions & 0 deletions
@@ -0,0 +1,202 @@
# CloudFront V2 Logging with AWS CDK (Python)

This project demonstrates how to set up Amazon CloudFront with the new CloudFront Standard Logging V2 feature using AWS CDK in Python. The example shows how to configure multiple logging destinations for CloudFront access logs, including:

1. Amazon CloudWatch Logs
2. Amazon S3 (with Parquet format)
3. Amazon Kinesis Data Firehose (with JSON format)

## Architecture

![CloudFront V2 Logging Architecture](./architecture.drawio.png)

The project deploys the following resources:

- An S3 bucket to host a simple static website
- A CloudFront distribution with Origin Access Control (OAC) to serve the website
- A logging S3 bucket with appropriate lifecycle policies
- CloudFront Standard Logging V2 configuration with multiple delivery destinations
- Kinesis Data Firehose delivery stream
- CloudWatch Logs group
- Necessary IAM roles and permissions

## Prerequisites

- [AWS CLI](https://aws.amazon.com/cli/) configured with appropriate credentials
- [AWS CDK](https://aws.amazon.com/cdk/) installed (v2.x)
- Python 3.8 or later (check the `aws-cdk-lib` version pinned in `requirements.txt` for its exact minimum)
- Node.js 14.x or later (for CDK)

## Setup

1. Create and activate a virtual environment:

   ```bash
   python3 -m venv .venv
   source .venv/bin/activate  # On Windows: .venv\Scripts\activate.bat
   ```

2. Install the required dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Synthesize the CloudFormation template:

   ```bash
   cdk synth
   ```

4. Deploy the stack:

   ```bash
   cdk deploy
   ```

   You can customize the log retention periods by providing parameters:

   ```bash
   cdk deploy --parameters LogRetentionDays=90 --parameters CloudWatchLogRetentionDays=60
   ```

5. After deployment, the CloudFront distribution domain name is displayed in the stack outputs. You can access your website using this domain.

## How It Works

This example demonstrates CloudFront Standard Logging V2, which provides more flexibility in how you collect and analyze CloudFront access logs:

- **CloudWatch Logs**: Logs are delivered in JSON format for real-time monitoring and analysis
- **S3 (Parquet)**: Logs are delivered in Parquet format with Hive-compatible paths for efficient querying with services like Amazon Athena
- **Kinesis Data Firehose**: Logs are streamed in JSON format, allowing for real-time processing and transformation

The CDK stack creates all necessary resources and configures the appropriate permissions for log delivery.
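
The stack source is not reproduced in this README, so the following is only a rough sketch of the kind of wiring involved, not the project's actual code. It assumes the CloudFormation resource types `AWS::Logs::DeliverySource`, `AWS::Logs::DeliveryDestination`, and `AWS::Logs::Delivery`, with one shared source for the distribution and one destination/delivery pair per output. The function name, logical IDs, resource names, and ARNs are hypothetical placeholders.

```python
from typing import Dict, Tuple

# Output formats accepted by AWS::Logs::DeliveryDestination.
VALID_FORMATS = {"json", "plain", "w3c", "raw", "parquet"}

SOURCE_ID = "AccessLogSource"            # hypothetical logical ID
SOURCE_NAME = "cloudfront-access-logs"   # hypothetical delivery source name


def logging_resources(distribution_arn: str,
                      destinations: Dict[str, Tuple[str, str]]) -> dict:
    """Build a CloudFormation resource map: one source, N destinations/deliveries.

    `destinations` maps a logical name to (destination ARN, output format).
    """
    resources = {
        SOURCE_ID: {
            "Type": "AWS::Logs::DeliverySource",
            "Properties": {
                "Name": SOURCE_NAME,
                "LogType": "ACCESS_LOGS",
                "ResourceArn": distribution_arn,
            },
        }
    }
    for name, (arn, output_format) in destinations.items():
        if output_format not in VALID_FORMATS:
            raise ValueError(f"unsupported output format: {output_format}")
        resources[f"{name}Destination"] = {
            "Type": "AWS::Logs::DeliveryDestination",
            "Properties": {
                "Name": f"{name.lower()}-destination",
                "DestinationResourceArn": arn,
                "OutputFormat": output_format,
            },
        }
        # The delivery links the source (by name) to the destination (by ARN).
        resources[f"{name}Delivery"] = {
            "Type": "AWS::Logs::Delivery",
            "DependsOn": [SOURCE_ID],
            "Properties": {
                "DeliverySourceName": SOURCE_NAME,
                "DeliveryDestinationArn": {"Fn::GetAtt": [f"{name}Destination", "Arn"]},
            },
        }
    return resources


# Placeholder ARNs for the three destinations described above.
example = logging_resources(
    "arn:aws:cloudfront::111122223333:distribution/EDFDVBD6EXAMPLE",
    {
        "CloudWatch": ("arn:aws:logs:us-east-1:111122223333:log-group:cf-logs", "json"),
        "S3": ("arn:aws:s3:::your-logging-bucket", "parquet"),
        "Firehose": ("arn:aws:firehose:us-east-1:111122223333:deliverystream/cf-logs", "json"),
    },
)
```

Sharing one delivery source across all destinations keeps the distribution-side configuration in a single place; each destination then only differs in its ARN and output format.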

## Example Log Outputs

### CloudWatch Logs (JSON format)
```json
{
  "timestamp": "2023-03-15T20:12:34Z",
  "c-ip": "192.0.2.100",
  "time-to-first-byte": 0.002,
  "sc-status": 200,
  "sc-bytes": 2326,
  "cs-method": "GET",
  "cs-uri-stem": "/index.html",
  "cs-protocol": "https",
  "cs-host": "d111111abcdef8.cloudfront.net",
  "cs-user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
  "cs-referer": "https://www.example.com/",
  "x-edge-location": "IAD79-C2",
  "x-edge-request-id": "tLAGM_r7TyiRgwgk_4U5Xb-vv4JHOjzGCh61ER9nM_2UFY8hTKdEoQ=="
}
```
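
As a minimal sketch of consuming such a record, the snippet below parses a trimmed copy of the example above using only the standard library. The function name `parse_record` and the derived keys are hypothetical. Delivered records may encode numbers as strings, so the sketch coerces them explicitly.

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the example record shown above.
SAMPLE = """{
  "timestamp": "2023-03-15T20:12:34Z",
  "c-ip": "192.0.2.100",
  "time-to-first-byte": 0.002,
  "sc-status": 200,
  "sc-bytes": 2326,
  "cs-method": "GET",
  "cs-uri-stem": "/index.html"
}"""


def parse_record(text: str) -> dict:
    """Parse one JSON log record into typed, Python-friendly fields."""
    record = json.loads(text)
    when = datetime.strptime(record["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
    status = int(record["sc-status"])  # coerce in case the value arrives as a string
    return {
        "time": when.replace(tzinfo=timezone.utc),
        "client_ip": record["c-ip"],
        "status": status,
        "is_error": status >= 400,
        "path": record["cs-uri-stem"],
        "ttfb_ms": float(record["time-to-first-byte"]) * 1000.0,
    }


parsed = parse_record(SAMPLE)
```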

### S3 Parquet Format
The Parquet format is a columnar storage format that provides efficient compression and encoding schemes. The logs are stored in a Hive-compatible directory structure:

```
s3://your-logging-bucket/s3_delivery/EDFDVBD6EXAMPLE/2023/03/15/20/
```

### Kinesis Data Firehose (JSON format)
Firehose delivers logs in JSON format with a timestamp-based prefix:

```
s3://your-logging-bucket/firehose_delivery/year=2023/month=03/day=15/delivery-stream-1-2023-03-15-20-12-34-a1b2c3d4.json.gz
```
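
When locating objects programmatically, the two layouts shown above can be handled with a couple of small helpers. This is a sketch that simply mirrors the example paths in this README; the function names are hypothetical, and the actual prefixes depend on how the stack configures each destination.

```python
import re
from datetime import datetime


def s3_delivery_prefix(distribution_id: str, when: datetime) -> str:
    """Build the S3 key prefix for one hour of Parquet logs (layout as shown above)."""
    return f"s3_delivery/{distribution_id}/{when:%Y/%m/%d/%H}/"


# Matches key=value path segments such as "year=2023".
_PARTITION = re.compile(r"(?:^|/)(year|month|day)=(\d+)")


def firehose_partitions(key: str) -> dict:
    """Extract the year/month/day partition values from a Firehose object key."""
    return dict(_PARTITION.findall(key))


prefix = s3_delivery_prefix("EDFDVBD6EXAMPLE", datetime(2023, 3, 15, 20))
parts = firehose_partitions(
    "firehose_delivery/year=2023/month=03/day=15/"
    "delivery-stream-1-2023-03-15-20-12-34-a1b2c3d4.json.gz"
)
```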

## Querying Logs with Athena

You can use Amazon Athena to query the Parquet logs stored in S3. The statements below create an external table over the logs, load its partitions, and list the most requested URLs:

```sql
CREATE EXTERNAL TABLE IF NOT EXISTS cloudfront_logs (
  `timestamp` string,
  `c-ip` string,
  `time-to-first-byte` float,
  `sc-status` int,
  `sc-bytes` bigint,
  `cs-method` string,
  `cs-uri-stem` string,
  `cs-protocol` string,
  `cs-host` string,
  `cs-user-agent` string,
  `cs-referer` string,
  `x-edge-location` string,
  `x-edge-request-id` string
)
PARTITIONED BY (
  `distributionid` string,
  `year` string,
  `month` string,
  `day` string,
  `hour` string
)
STORED AS PARQUET
LOCATION 's3://your-logging-bucket/s3_delivery/';

-- Load partitions (MSCK REPAIR TABLE only discovers Hive-style key=value prefixes)
MSCK REPAIR TABLE cloudfront_logs;

-- Example query to find the top requested URLs (hyphenated column names need double quotes)
SELECT "cs-uri-stem", COUNT(*) AS request_count
FROM cloudfront_logs
WHERE year='2023' AND month='03' AND day='15'
GROUP BY "cs-uri-stem"
ORDER BY request_count DESC
LIMIT 10;
```
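
For a quick check without Athena, the same "top requested URLs" aggregation can be run locally over JSON log records, for example lines taken from a decompressed Firehose object. This is a minimal sketch that assumes one JSON object per line with a `cs-uri-stem` field; the function name `top_urls` is hypothetical.

```python
import json
from collections import Counter
from typing import Iterable, List, Tuple


def top_urls(lines: Iterable[str], limit: int = 10) -> List[Tuple[str, int]]:
    """Count requests per URL path and return the most requested ones."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        counts[json.loads(line)["cs-uri-stem"]] += 1
    return counts.most_common(limit)


sample_lines = [
    '{"cs-uri-stem": "/index.html", "sc-status": 200}',
    '{"cs-uri-stem": "/about.html", "sc-status": 200}',
    '',
    '{"cs-uri-stem": "/index.html", "sc-status": 304}',
]
ranking = top_urls(sample_lines, limit=2)
```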

## Troubleshooting

### Common Issues

1. **Logs not appearing in CloudWatch**
   - Check that the CloudFront distribution is receiving traffic
   - Verify the IAM permissions for the log delivery service
   - Check CloudWatch service quotas if you have high traffic volumes

2. **Parquet files not appearing in S3**
   - Verify bucket permissions allow the log delivery service to write
   - Check for any errors in CloudTrail related to log delivery

3. **Firehose delivery errors**
   - Check the Firehose error prefix in S3 for error logs
   - Verify IAM role permissions for Firehose
   - Monitor Firehose metrics in CloudWatch

### Useful Commands

- Check CloudFront distribution status:
  ```bash
  aws cloudfront get-distribution --id <distribution-id>
  ```

- List log files in S3:
  ```bash
  aws s3 ls s3://your-logging-bucket/s3_delivery/ --recursive
  ```

- View CloudWatch logs:
  ```bash
  aws logs get-log-events --log-group-name <log-group-name> --log-stream-name <log-stream-name>
  ```

## Cleanup

To avoid incurring charges, delete the deployed resources when you're done:

```bash
cdk destroy
```

## Security Considerations

This example includes several security best practices:

- S3 buckets are configured with encryption, SSL enforcement, and public access blocking
- CloudFront uses Origin Access Control (OAC) to secure S3 content
- IAM permissions follow the principle of least privilege
- Logging bucket has appropriate lifecycle policies to manage log retention
Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
#!/usr/bin/env python3
import os
import aws_cdk as cdk
from aws_cdk import Aspects
from cdk_nag import AwsSolutionsChecks, NagSuppressions

from cloudfront_v2_logging.cloudfront_v2_logging_stack import CloudfrontV2LoggingStack

app = cdk.App()
stack = CloudfrontV2LoggingStack(app, "CloudfrontV2LoggingStack")

# Add CDK-NAG to check for best practices
Aspects.of(app).add(AwsSolutionsChecks())

# Add suppressions at the stack level
NagSuppressions.add_stack_suppressions(
    stack,
    [
        {
            "id": "AwsSolutions-IAM4",
            "reason": "Suppressing managed policy warning as permissions are appropriate"
        },
        {
            "id": "AwsSolutions-L1",
            "reason": "Lambda runtime is 3.11 and managed by CDK BucketDeployment construct, and so out of scope for this project"
        },
        {
            "id": "AwsSolutions-CFR1",
            "reason": "Geo restrictions not required for this demo"
        },
        {
            "id": "AwsSolutions-CFR2",
            "reason": "WAF integration not required for this demo"
        },
        {
            "id": "AwsSolutions-CFR3",
            "reason": "Using CloudFront V2 logging instead of traditional access logging"
        },
        {
            "id": "AwsSolutions-S1",
            "reason": "S3 access logging not required for this demo as we're demonstrating CloudFront V2 logging"
        },
        {
            "id": "AwsSolutions-IAM5",
            "reason": "Wildcard permissions are required for PUT actions for the CDK BucketDeployment construct and Firehose role"
        },
        {
            "id": "AwsSolutions-CFR4",
            "reason": "Using TLSv1.2_2021 security policy which is the latest supported version."
        }
    ]
)

app.synth()
Binary file (52.2 KB)
Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
{
  "app": "python3 app.py",
  "watch": {
    "include": [
      "**"
    ],
    "exclude": [
      "README.md",
      "cdk*.json",
      "requirements*.txt",
      "source.bat",
      "**/__init__.py",
      "**/__pycache__",
      "tests"
    ]
  },
  "context": {
    "@aws-cdk/aws-lambda:recognizeLayerVersion": true,
    "@aws-cdk/core:checkSecretUsage": true,
    "@aws-cdk/core:target-partitions": [
      "aws",
      "aws-cn"
    ],
    "@aws-cdk-containers/ecs-service-extensions:enableDefaultLogDriver": true,
    "@aws-cdk/aws-ec2:uniqueImdsv2TemplateName": true,
    "@aws-cdk/aws-ecs:arnFormatIncludesClusterName": true,
    "@aws-cdk/aws-iam:minimizePolicies": true,
    "@aws-cdk/core:validateSnapshotRemovalPolicy": true,
    "@aws-cdk/aws-codepipeline:crossAccountKeyAliasStackSafeResourceName": true,
    "@aws-cdk/aws-s3:createDefaultLoggingPolicy": true,
    "@aws-cdk/aws-sns-subscriptions:restrictSqsDescryption": true,
    "@aws-cdk/aws-apigateway:disableCloudWatchRole": true,
    "@aws-cdk/core:enablePartitionLiterals": true,
    "@aws-cdk/aws-events:eventsTargetQueueSameAccount": true,
    "@aws-cdk/aws-ecs:disableExplicitDeploymentControllerForCircuitBreaker": true,
    "@aws-cdk/aws-iam:importedRoleStackSafeDefaultPolicyName": true,
    "@aws-cdk/aws-s3:serverAccessLogsUseBucketPolicy": true,
    "@aws-cdk/aws-route53-patters:useCertificate": true,
    "@aws-cdk/customresources:installLatestAwsSdkDefault": false,
    "@aws-cdk/aws-rds:databaseProxyUniqueResourceName": true,
    "@aws-cdk/aws-codedeploy:removeAlarmsFromDeploymentGroup": true,
    "@aws-cdk/aws-apigateway:authorizerChangeDeploymentLogicalId": true,
    "@aws-cdk/aws-ec2:launchTemplateDefaultUserData": true,
    "@aws-cdk/aws-secretsmanager:useAttachedSecretResourcePolicyForSecretTargetAttachments": true,
    "@aws-cdk/aws-redshift:columnId": true,
    "@aws-cdk/aws-stepfunctions-tasks:enableEmrServicePolicyV2": true,
    "@aws-cdk/aws-ec2:restrictDefaultSecurityGroup": true,
    "@aws-cdk/aws-apigateway:requestValidatorUniqueId": true,
    "@aws-cdk/aws-kms:aliasNameRef": true,
    "@aws-cdk/aws-autoscaling:generateLaunchTemplateInsteadOfLaunchConfig": true,
    "@aws-cdk/core:includePrefixInUniqueNameGeneration": true,
    "@aws-cdk/aws-efs:denyAnonymousAccess": true,
    "@aws-cdk/aws-opensearchservice:enableOpensearchMultiAzWithStandby": true,
    "@aws-cdk/aws-lambda-nodejs:useLatestRuntimeVersion": true,
    "@aws-cdk/aws-efs:mountTargetOrderInsensitiveLogicalId": true,
    "@aws-cdk/aws-rds:auroraClusterChangeScopeOfInstanceParameterGroupWithEachParameters": true,
    "@aws-cdk/aws-appsync:useArnForSourceApiAssociationIdentifier": true,
    "@aws-cdk/aws-rds:preventRenderingDeprecatedCredentials": true,
    "@aws-cdk/aws-codepipeline-actions:useNewDefaultBranchForCodeCommitSource": true,
    "@aws-cdk/aws-cloudwatch-actions:changeLambdaPermissionLogicalIdForLambdaAction": true,
    "@aws-cdk/aws-codepipeline:crossAccountKeysDefaultValueToFalse": true,
    "@aws-cdk/aws-codepipeline:defaultPipelineTypeToV2": true,
    "@aws-cdk/aws-kms:reduceCrossAccountRegionPolicyScope": true,
    "@aws-cdk/aws-eks:nodegroupNameAttribute": true,
    "@aws-cdk/aws-ec2:ebsDefaultGp3Volume": true,
    "@aws-cdk/aws-ecs:removeDefaultDeploymentAlarm": true,
    "@aws-cdk/custom-resources:logApiResponseDataPropertyTrueDefault": false,
    "@aws-cdk/aws-s3:keepNotificationInImportedBucket": false,
    "@aws-cdk/aws-ecs:enableImdsBlockingDeprecatedFeature": false,
    "@aws-cdk/aws-ecs:disableEcsImdsBlocking": true,
    "@aws-cdk/aws-ecs:reduceEc2FargateCloudWatchPermissions": true,
    "@aws-cdk/aws-dynamodb:resourcePolicyPerReplica": true,
    "@aws-cdk/aws-ec2:ec2SumTImeoutEnabled": true,
    "@aws-cdk/aws-appsync:appSyncGraphQLAPIScopeLambdaPermission": true,
    "@aws-cdk/aws-rds:setCorrectValueForDatabaseInstanceReadReplicaInstanceResourceId": true,
    "@aws-cdk/core:cfnIncludeRejectComplexResourceUpdateCreatePolicyIntrinsics": true,
    "@aws-cdk/aws-lambda-nodejs:sdkV3ExcludeSmithyPackages": true,
    "@aws-cdk/aws-stepfunctions-tasks:fixRunEcsTaskPolicy": true,
    "@aws-cdk/aws-ec2:bastionHostUseAmazonLinux2023ByDefault": true,
    "@aws-cdk/aws-route53-targets:userPoolDomainNameMethodWithoutCustomResource": true,
    "@aws-cdk/aws-elasticloadbalancingV2:albDualstackWithoutPublicIpv4SecurityGroupRulesDefault": true,
    "@aws-cdk/aws-iam:oidcRejectUnauthorizedConnections": true,
    "@aws-cdk/core:enableAdditionalMetadataCollection": true,
    "@aws-cdk/aws-lambda:createNewPoliciesWithAddToRolePolicy": true
  }
}

python/cloudfront-v2-logging/cloudfront_v2_logging/__init__.py

Whitespace-only changes.
