This project simulates a real-world scenario of data pipeline automation and monitoring using AWS Lambda, Glue, EventBridge, and CloudWatch.
It demonstrates how to automatically trigger and monitor an AWS Glue job through a serverless architecture.
Every day at 9:00 AM BRT (12:00 UTC), Amazon EventBridge triggers an AWS Lambda function that starts a Glue job.
The entire process is monitored using Amazon CloudWatch Logs and CloudWatch Alarms, with an optional email alert via SNS in case of job failure.
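
The schedule described above can be sketched as follows. EventBridge rule cron expressions are evaluated in UTC, so 9:00 AM BRT has to be converted first. This is a minimal sketch, not the project's actual setup: the helper names, the rule name, and the target ID are hypothetical, and the fixed UTC-3 offset for BRT is an assumption.

```python
from datetime import datetime, timedelta, timezone


def local_time_to_eventbridge_cron(hour, minute=0, utc_offset_hours=-3):
    """Build a daily EventBridge cron expression (UTC) from a local time.

    utc_offset_hours=-3 is an assumed fixed offset for BRT.
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    # The date is arbitrary; only the time-of-day conversion matters.
    local = datetime(2024, 1, 1, hour, minute, tzinfo=local_tz)
    utc = local.astimezone(timezone.utc)
    # Fields: minutes hours day-of-month month day-of-week year
    return f"cron({utc.minute} {utc.hour} * * ? *)"


def schedule_daily_trigger(events_client, rule_name, lambda_arn, expression):
    """Create or update the rule and register the Lambda as its target."""
    events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=expression,
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        # "glue-trigger-lambda" is a hypothetical target ID.
        Targets=[{"Id": "glue-trigger-lambda", "Arn": lambda_arn}],
    )
```

With Boto3, `events_client` would be `boto3.client("events")` and the expression `local_time_to_eventbridge_cron(9)`. The Lambda function also needs a resource-based permission that allows `events.amazonaws.com` to invoke it, which this sketch does not set up.
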
Workflow:
- EventBridge triggers the Lambda function daily at the specified time.
- Lambda (Python + Boto3) invokes the Glue Job programmatically.
- CloudWatch Logs capture all execution logs.
- CloudWatch Alarms monitor job status and send notifications (optionally via SNS).

Technologies:
- AWS Lambda – Job orchestration (Python)
- AWS Glue – Data processing pipeline
- Amazon EventBridge – Scheduling (cron expression)
- Amazon CloudWatch Logs – Structured logging and monitoring
- Amazon CloudWatch Alarms – Failure detection and alerts
- Amazon SNS (Optional) – Email notifications
- Python + Boto3 – AWS SDK for Lambda automation

Features:
- Automated triggering of AWS Glue Job via Lambda
- Daily scheduling with EventBridge cron expression
- Structured logging with CloudWatch Logs
- Failure detection using CloudWatch Alarms
- (Optional) Email alerts through SNS
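
One possible way to wire up the failure detection listed above is an alarm on the Lambda function's `Errors` metric. The source does not say which metric the project's alarm watches, so the metric choice, the alarm name, the period, and the threshold below are all assumptions, and `build_lambda_error_alarm` is a hypothetical helper.

```python
def build_lambda_error_alarm(function_name, sns_topic_arn=None):
    """Return keyword arguments for CloudWatch put_metric_alarm.

    Assumption: the alarm fires when the Lambda reports at least one
    error in a 5-minute window (for example, start_job_run raised).
    """
    params = {
        "AlarmName": f"{function_name}-errors",
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": 300,
        "EvaluationPeriods": 1,
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # No invocations in a window should not count as a failure.
        "TreatMissingData": "notBreaching",
    }
    if sns_topic_arn:
        # Optional email alert: the SNS topic is notified on ALARM.
        params["AlarmActions"] = [sns_topic_arn]
    return params
```

The result could be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`. Note that this only catches failures to start the job; a Glue run that starts and later fails would need a separate signal.
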

Lambda function (Python + Boto3):

    import logging

    import boto3

    logger = logging.getLogger()
    logger.setLevel(logging.INFO)


    def lambda_handler(event, context):
        glue = boto3.client('glue')
        job_name = 'job-email-phishing-glue'  # Change as needed

        try:
            response = glue.start_job_run(JobName=job_name)
            logger.info(f"Job started successfully: {response['JobRunId']}")
        except Exception as e:
            logger.error(f"Error starting job: {str(e)}")
            raise  # Re-raise so the invocation is reported as failed
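
The handler above logs plain text messages. For the structured logging mentioned in the feature list, one option is a JSON formatter built on the standard `logging` module. This is a sketch under assumptions: `JsonFormatter` and the `job_name` / `job_run_id` field names are hypothetical and not taken from the project.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Values passed via `extra=` become attributes on the record.
        for key in ("job_name", "job_run_id"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)
```

A handler configured with `handler.setFormatter(JsonFormatter())` would then emit one JSON object per log call, for example from `logger.info("Job started", extra={"job_run_id": run_id})`. This makes fields such as the run ID easier to filter on in CloudWatch Logs.
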

This project helped me strengthen my understanding of:
- Event-driven and serverless architectures in AWS
- Cloud automation and orchestration with Lambda
- Monitoring, logging, and alerting best practices in CloudWatch
- Real-world data engineering workflow reliability