360-degree observability solution for data platforms - deployed on AWS Free Tier
Combines technical observability (infrastructure) with data observability (data quality) in one unified solution:
- Infrastructure monitoring (Prometheus, Grafana, CloudWatch)
- Data quality checks (freshness, volume, schema, nulls, lineage)
- Unified dashboards showing both layers
- Real-time alerting for data and infrastructure issues
- 100% AWS Free Tier eligible
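The data quality checks listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code shipped in this repo: the function names, the 50% volume tolerance, and the row-as-dict representation are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def check_freshness(last_updated, max_age, now=None):
    """Return True if the data was updated within max_age (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    return (now - last_updated) <= max_age


def check_volume(row_count, expected, tolerance=0.5):
    """Return True if row_count is within tolerance (a fraction) of expected."""
    return abs(row_count - expected) <= tolerance * expected


def null_rate(rows, column):
    """Fraction of rows where column is missing or None; 0.0 for an empty batch."""
    if not rows:
        return 0.0
    nulls = sum(1 for row in rows if row.get(column) is None)
    return nulls / len(rows)


def check_schema(rows, expected_columns):
    """Return True if every row has exactly the expected columns."""
    expected = set(expected_columns)
    return all(set(row.keys()) == expected for row in rows)


if __name__ == "__main__":
    # Illustrative batch; the column names are assumptions.
    batch = [{"id": 1, "name": "a"}, {"id": 2, "name": None}]
    print("schema ok:", check_schema(batch, ["id", "name"]))
    print("null rate (name):", null_rate(batch, "name"))
```

Keeping each check a pure function makes it easy to run the same logic locally and inside a Lambda handler.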
EC2 (t2.micro) -> Docker -> Prometheus + Grafana
|
Lambda -> Data Quality Checks -> CloudWatch Metrics
|
S3 Data Lake + Glue Catalog
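As a rough sketch of the Lambda -> CloudWatch Metrics step in the diagram, the function below builds a payload in the shape that boto3's `put_metric_data` accepts. The metric names, the `Table` dimension, and the namespace mentioned in the comment are hypothetical; the actual Lambda in this repo may publish different metrics.

```python
def build_metric_data(table, freshness_seconds, row_count, null_percent):
    """Build a CloudWatch MetricData list for one table (names are illustrative)."""
    dimensions = [{"Name": "Table", "Value": table}]
    return [
        {
            "MetricName": "FreshnessSeconds",
            "Dimensions": dimensions,
            "Value": float(freshness_seconds),
            "Unit": "Seconds",
        },
        {
            "MetricName": "RowCount",
            "Dimensions": dimensions,
            "Value": float(row_count),
            "Unit": "Count",
        },
        {
            "MetricName": "NullPercent",
            "Dimensions": dimensions,
            "Value": float(null_percent),
            "Unit": "Percent",
        },
    ]


if __name__ == "__main__":
    payload = build_metric_data("orders", 120, 5000, 1.5)
    # Inside a Lambda, this list could be passed to boto3, for example:
    #   boto3.client("cloudwatch").put_metric_data(
    #       Namespace="DataObservability",  # assumed namespace
    #       MetricData=payload,
    #   )
    for metric in payload:
        print(metric["MetricName"], metric["Value"], metric["Unit"])
```

Three metrics per table means roughly three tables fit under the 10-metric allowance listed in this README.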
Uses only AWS Free Tier services:
- EC2 t2.micro (750 hrs/month)
- S3 (5 GB storage)
- Lambda (1M requests/month)
- CloudWatch (10 custom metrics)
Condition: keep the deployment up for less than 2 hours, then destroy everything.
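To sanity-check a planned run against the allowances listed above, a small comparison like the following can help. The limits are the ones stated in this README; the usage figures and the `over_limit` helper are hypothetical, and real limits depend on your account, so treat this as arithmetic rather than a billing guarantee.

```python
# Allowances as listed in this README.
FREE_TIER = {
    "ec2_hours": 750,
    "s3_gb": 5,
    "lambda_requests": 1_000_000,
    "cloudwatch_metrics": 10,
}


def over_limit(usage, limits=FREE_TIER):
    """Return only the usage entries that exceed their allowance."""
    return {name: value for name, value in usage.items() if value > limits[name]}


if __name__ == "__main__":
    # Assumed usage for a single 2-hour deployment.
    planned = {
        "ec2_hours": 2,
        "s3_gb": 0.1,
        "lambda_requests": 500,
        "cloudwatch_metrics": 9,
    }
    exceeded = over_limit(planned)
    print("exceeded:", exceeded if exceeded else "none")
```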
# 1. Clone
git clone https://github.com/YOUR_USERNAME/aws-data-observability-poc.git
cd aws-data-observability-poc
# 2. Configure
cp terraform/terraform.tfvars.example terraform/terraform.tfvars
# Edit terraform.tfvars with your values
# 3. Deploy
./scripts/deploy.sh
# 4. Access services (wait 5 mins)
# Grafana: http://EC2_IP:3000 (admin/admin)
# Prometheus: http://EC2_IP:9090
# 5. Take screenshots (see docs/SCREENSHOTS.md)
# 6. Destroy (CRITICAL!)
./scripts/destroy.sh
Test without AWS:
cd docker
docker-compose up -d
# Access: localhost:3000 (Grafana), localhost:9090 (Prometheus)
docker-compose down
- Set billing alerts BEFORE deploying
- Destroy within 2 hours
- Verify destruction in AWS Console
- Check costs after 24 hours
This supports my technical article: Complete Data Observability: 360-degree Monitoring for Data Platforms
MIT - See LICENSE
Built for the data engineering community