Senior Data Engineer | Data Lakehouse Architect | Enterprise Data Strategy | Fraud & Risk Engineering
I am a Senior Data Platform Engineer and Architect driven by complex engineering challenges. I currently work at BrightSource, building IoT-driven Lakehouses that optimize power consumption and operating strategy for renewable energy.
My background spans from Quantitative Risk at top-tier banks (JPMorgan Chase, Citi) to real-time industrial telemetry. I specialize in the Modern Data Stack: building AWS Data Lakehouses with Apache Iceberg and Flink to unify streaming and batch workflows.
I don't just move data; I engineer robust platforms using software engineering rigor (CI/CD, Infrastructure as Code, Automated Testing) to ensure that pipelines are as reliable as the application code they support.
- 🔭 I'm currently architecting real-time streaming platforms with Flink and Iceberg table formats on AWS.
- 🌱 I'm deep into DataOps, Lakehouse patterns, and building Data as a Product pipelines.
- 💬 Ask me about Apache Iceberg, Flink, AWS, Spark, Airflow, Python (OOP), or defeating fraud at scale.
- 📫 Let's connect: [email protected]
My approach is built on engineering rigor and architectural excellence. Here's how I deliver production-grade systems:
| Principle | Description | Key Technologies |
|---|---|---|
| Modern Lakehouse Architecture | Architecting platforms with separated compute and storage using open table formats like Iceberg. Building scalable data products that serve entire organizations with ACID guarantees and time travel. | Apache Iceberg, Delta Lake, S3, AWS Glue, Databricks |
| DataOps & CI/CD | Engineering automated pipelines with rigorous testing, version control, and continuous deployment. Every commit triggers validation; every deployment is reproducible and auditable. | GitHub Actions, Jenkins, Terraform, Docker, pytest, dbt |
| Streaming & Batch Processing | Deploying real-time event processing and batch orchestration at scale, from Kafka ingestion to Flink transformations to Spark aggregations. | Apache Flink, Apache Spark, Kafka, Airflow, PySpark |
| Resilient Systems Design | Building fault-tolerant architectures with monitoring, alerting, and observability. Designing for failure, testing for chaos, optimizing for recovery. | Prometheus, Grafana, CloudWatch, Data Quality Checks |
Technologies I use daily to build high-performance distributed data systems.
| Category | Technologies |
|---|---|
| Cloud & Infrastructure | |
| Data Lakehouse & Storage | |
| Streaming Processing | |
| Batch Processing & Orchestration | |
| Languages & Core Skills | |
| CI/CD & DevOps | |
| Data Warehousing | |
Projects showcasing distributed systems engineering, lakehouse architecture, and real-time data platforms.
| Project Name | Description | Technologies Used |
|---|---|---|
| Data Science Analytical Handbook | Comprehensive technical interview guide covering data engineering patterns, system design, and analytical problem-solving. Deployed as a live handbook with 43+ stars. | Python, Markdown, GitHub Pages, Data Modeling |
| Economic Real-Time Analytics Platform | Architected a real-time economic data pipeline with automated workflows, streaming ingestion, and interactive dashboards for market analysis. | Streamlit, Python, APIs, GitHub Actions, Data Visualization |
| Economic Dashboard API Service | Built a production-grade REST API backend for serving economic data at scale with automated deployment and monitoring. | Python, FastAPI, Docker, CI/CD |
| Practice Questions Platform | Engineered an interactive coding platform for data engineering and analytical problem-solving with automated test suites. | Python, OOP Design Patterns, Testing Frameworks |
| AI Omniscient Architect | Developed an AI-powered system architecture tool leveraging LLMs for intelligent code analysis and architectural recommendations. | Python, AI/ML, System Design, Automation |
| Databricks Solution Architect Handbook | Technical documentation and patterns for architecting lakehouse solutions on Databricks with Iceberg and Delta Lake. | Databricks, Apache Iceberg, Delta Lake, Jupyter Notebooks |
➡️ View All My Repositories
I'm always open to discussing distributed systems, lakehouse architecture, or opportunities in the data engineering space.






