A collection of data engineering projects showcasing my skills in building robust, scalable, and secure data pipelines using modern tools and cloud platforms.
This portfolio demonstrates experience across the full data engineering lifecycle — from data ingestion and transformation to orchestration, governance, and monitoring.
Key Technologies:
- Data Lake: Google Cloud Storage
- Data Warehouse: Google BigQuery
- IaC: Terraform
- Data Transformation: dbt, PySpark
- Workflow Orchestration: Kestra, Airflow
- Containerization: Docker
ETL pipeline orchestrated with Kestra that extracts .json files from GH Archive, loads them into a Google Cloud Storage bucket, and transforms them into BigQuery tables via dbt. The data are visualized in Looker Studio. A minimal sketch of the extract-and-load step follows the tool list below.
- Docker (containerization)
- Terraform (infrastructure as code)
- Kestra (workflow orchestration)
- Google Cloud Storage (data lake)
- BigQuery (data warehouse)
- dbt (data transformation)
- Looker Studio (data visualization)
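The real flow is defined declaratively in Kestra YAML; the Python sketch below shows the equivalent extract-and-load logic for one hourly GH Archive dump. The bucket name and `raw/` object prefix are hypothetical.

```python
import requests
from google.cloud import storage  # pip install google-cloud-storage

BUCKET_NAME = "gh-archive-data-lake"  # hypothetical bucket name

def extract_and_load(date: str, hour: int) -> None:
    """Download one hourly GH Archive dump and upload it to GCS."""
    file_name = f"{date}-{hour}.json.gz"
    # GH Archive serves one gzipped NDJSON file per hour.
    url = f"https://data.gharchive.org/{file_name}"
    response = requests.get(url, timeout=60)
    response.raise_for_status()

    # Upload the raw archive to the data-lake bucket as-is; dbt models
    # running in BigQuery own all downstream transformation.
    client = storage.Client()
    blob = client.bucket(BUCKET_NAME).blob(f"raw/{file_name}")
    blob.upload_from_string(response.content, content_type="application/gzip")

if __name__ == "__main__":
    extract_and_load("2024-01-15", 0)
```

Landing the raw files untouched keeps the bucket a faithful data lake, so the BigQuery/dbt layer can be rebuilt from scratch at any time.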
Simple ETL pipeline orchestrated with Airflow that reads data from the CoinCap API and loads it into a PostgreSQL database using a PySpark cluster. A condensed sketch of the PySpark task follows the tool list below.
- Airflow (workflow orchestration)
- PySpark (data transformation)
- PostgreSQL (database)
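A condensed Python sketch of the PySpark task. In the repository this logic runs inside an Airflow DAG; the JDBC connection settings and table name here are hypothetical.

```python
import requests
from pyspark.sql import Row, SparkSession

# Hypothetical connection settings; in practice these would come from
# Airflow connections or environment variables, not hard-coded values.
JDBC_URL = "jdbc:postgresql://postgres:5432/coincap"
JDBC_PROPS = {"user": "etl", "password": "etl", "driver": "org.postgresql.Driver"}

def load_assets() -> None:
    spark = (
        SparkSession.builder.appName("coincap-etl")
        # Pull the PostgreSQL JDBC driver so the write below can find it.
        .config("spark.jars.packages", "org.postgresql:postgresql:42.7.3")
        .getOrCreate()
    )

    # CoinCap's /v2/assets endpoint returns current data for ~100 crypto
    # assets; numeric fields are served as strings, so cast them here.
    payload = requests.get("https://api.coincap.io/v2/assets", timeout=30).json()
    rows = [
        Row(
            id=a["id"],
            symbol=a["symbol"],
            price_usd=float(a["priceUsd"] or 0),
            market_cap_usd=float(a["marketCapUsd"] or 0),
        )
        for a in payload["data"]
    ]

    # Append the snapshot to PostgreSQL over JDBC.
    spark.createDataFrame(rows).write.jdbc(
        JDBC_URL, table="assets", mode="append", properties=JDBC_PROPS
    )

if __name__ == "__main__":
    load_assets()
```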
This repository contains a Nextflow pipeline that takes a file of sgRNA sequences and reports where they align in the human genome (GRCh38). It includes steps to convert and compare gene information, and builds a simple gene expression matrix from two breast cancer (TCGA-BRCA) samples (a sketch of the matrix-building step follows the tool list below).
- Nextflow (workflow orchestration)
- Docker (containerization)
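The matrix-building step, shown as a hypothetical standalone Python sketch. In the pipeline it runs as a Nextflow process; the file names and the two-column counts format are assumptions.

```python
import pandas as pd

# Hypothetical file names; the pipeline fetches per-sample counts for
# two TCGA-BRCA samples upstream.
SAMPLES = {
    "BRCA_sample_1": "sample1_counts.tsv",
    "BRCA_sample_2": "sample2_counts.tsv",
}

def build_expression_matrix() -> pd.DataFrame:
    """Join per-sample gene counts into one genes x samples matrix."""
    columns = []
    for sample, path in SAMPLES.items():
        # Each counts file is assumed to have two columns: gene_id <tab> count.
        counts = pd.read_csv(
            path, sep="\t", names=["gene_id", sample], index_col="gene_id"
        )
        columns.append(counts)
    # The outer join keeps genes present in either sample; genes missing
    # from one sample get a count of 0.
    return pd.concat(columns, axis=1).fillna(0).astype(int)

if __name__ == "__main__":
    build_expression_matrix().to_csv("expression_matrix.tsv", sep="\t")
```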