joaovnovais/lambda-glue-cloudwatch-monitoring

AWS Lambda + Glue Automation & Monitoring

This project simulates a real-world scenario of data pipeline automation and monitoring using AWS Lambda, Glue, EventBridge, and CloudWatch.
It demonstrates how to automatically trigger and monitor an AWS Glue job through a serverless architecture.

Overview

Every day at 9:00 AM BRT (12:00 UTC), Amazon EventBridge triggers an AWS Lambda function that starts a Glue job.
The entire process is monitored using Amazon CloudWatch Logs and CloudWatch Alarms, with an optional email alert via SNS in case of job failure.
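
EventBridge rule cron expressions are evaluated in UTC, so the 9:00 AM BRT schedule has to be shifted by three hours. The sketch below shows one way to build the expression and register the rule. The helper names, the rule name, and the target ID are hypothetical and not taken from this repository; `events_client` is assumed to be a `boto3.client("events")` instance.

```python
def daily_cron_utc(local_hour, local_minute=0, utc_offset_hours=-3):
    """Build an EventBridge cron expression for a daily run at a local time.

    The default offset of -3 corresponds to BRT (UTC-3).
    """
    utc_hour = (local_hour - utc_offset_hours) % 24
    # Fields: minutes hours day-of-month month day-of-week year
    return f"cron({local_minute} {utc_hour} * * ? *)"


def schedule_lambda(events_client, rule_name, lambda_arn, schedule_expression):
    """Create or update the rule, then point it at the Lambda function."""
    events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=schedule_expression,
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        # "start-glue-job" is an illustrative target ID.
        Targets=[{"Id": "start-glue-job", "Arn": lambda_arn}],
    )
```

For example, `daily_cron_utc(9)` yields `cron(0 12 * * ? *)`. The Lambda function also needs a resource-based permission that allows `events.amazonaws.com` to invoke it, which this sketch does not set up.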

Architecture

Workflow:

  1. EventBridge triggers the Lambda function daily at the specified time.
  2. Lambda (Python + Boto3) invokes the Glue Job programmatically.
  3. CloudWatch Logs captures the execution logs.
  4. CloudWatch Alarms monitor job status and send notifications (optionally via SNS).
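
The README does not say which metric the alarm watches. As one possible setup, the sketch below builds parameters for an alarm on the Lambda function's `Errors` metric in the `AWS/Lambda` namespace, with an optional SNS topic as the alarm action. The function name, alarm name, and period are assumptions for illustration.

```python
def lambda_error_alarm_params(function_name, sns_topic_arn=None):
    """Build keyword arguments for CloudWatch put_metric_alarm.

    The alarm fires when the function reports at least one error
    within a 5-minute period (an assumed, illustrative window).
    """
    params = {
        "AlarmName": f"{function_name}-errors",  # hypothetical naming scheme
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": 300,
        "EvaluationPeriods": 1,
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # No invocations means no data points; treat that as healthy.
        "TreatMissingData": "notBreaching",
    }
    if sns_topic_arn:
        # Optional email alert through an SNS topic subscription.
        params["AlarmActions"] = [sns_topic_arn]
    return params
```

The result can be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`. An alarm like this only catches failures to start the job; a failure inside the Glue run itself needs a separate signal, such as Glue job state change events.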

Technologies Used

  • AWS Lambda – Job orchestration (Python)
  • AWS Glue – Data processing pipeline
  • Amazon EventBridge – Scheduling (cron expression)
  • Amazon CloudWatch Logs – Structured logging and monitoring
  • Amazon CloudWatch Alarms – Failure detection and alerts
  • Amazon SNS (Optional) – Email notifications
  • Python + Boto3 – AWS SDK for Lambda automation

Features

  • Automated triggering of AWS Glue Job via Lambda
  • Daily scheduling with EventBridge cron expression
  • Structured logging with CloudWatch Logs
  • Failure detection using CloudWatch Alarms
  • (Optional) Email alerts through SNS
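
To illustrate the structured logging feature, here is a minimal sketch of a JSON log formatter that makes entries easier to filter in CloudWatch Logs. The `JsonFormatter` class, the `configure_json_logging` helper, and the `job_name` and `job_run_id` fields are assumptions for illustration, not code from this repository.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Copy optional fields supplied through the `extra=` argument.
        for key in ("job_name", "job_run_id"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)


def configure_json_logging(logger):
    """Apply the JSON formatter to every handler on the given logger."""
    if not logger.handlers:
        # Outside Lambda there may be no handler yet, so add one.
        logger.addHandler(logging.StreamHandler())
    for handler in logger.handlers:
        handler.setFormatter(JsonFormatter())
    logger.setLevel(logging.INFO)
    return logger
```

A call such as `logger.info("Job started", extra={"job_name": name, "job_run_id": run_id})` would then produce one JSON line per event.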

Lambda Function (Python + Boto3)

import logging

import boto3

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def lambda_handler(event, context):
    glue = boto3.client('glue')
    job_name = 'job-email-phishing-glue'  # Change as needed

    try:
        # Start the Glue job and log the run ID for traceability.
        response = glue.start_job_run(JobName=job_name)
        logger.info(f"Job started successfully: {response['JobRunId']}")
    except Exception as e:
        # Log the failure, then re-raise so the invocation is marked as failed.
        logger.error(f"Error starting job: {str(e)}")
        raise
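
The handler above only starts the job and does not wait for the result. As a hedged sketch of how the run's outcome could be checked afterwards, the helpers below read the run state through the Glue `get_job_run` call. The helper names are hypothetical, `glue_client` is assumed to be a `boto3.client("glue")` instance, and treating `STOPPED` as a failure is an assumption.

```python
# States treated as unsuccessful here; including STOPPED is an assumption.
FAILED_STATES = {"FAILED", "TIMEOUT", "ERROR", "STOPPED"}


def get_job_run_state(glue_client, job_name, run_id):
    """Return the current state of a Glue job run, e.g. RUNNING or SUCCEEDED."""
    response = glue_client.get_job_run(JobName=job_name, RunId=run_id)
    return response["JobRun"]["JobRunState"]


def has_failed(state):
    """Report whether a job run state counts as a failure."""
    return state in FAILED_STATES
```

The `run_id` argument is the `JobRunId` value returned by `start_job_run`.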

Key Takeaways

This project helped me strengthen my understanding of:

  • Event-driven and serverless architectures in AWS
  • Cloud automation and orchestration with Lambda
  • Monitoring, logging, and alerting best practices in CloudWatch
  • Reliability of real-world data engineering workflows

About

This project implements an automated, serverless data pipeline on AWS. An EventBridge schedule triggers a Lambda function to run an AWS Glue job daily. The entire process is monitored with CloudWatch for robust logging and failure alerts, ensuring a reliable and observable workflow.
