
🌀 Churn Model Evaluation Platform

Table of Contents

  1. Problem Statement
  2. Key Features
  3. Customer Churn Data Source
  4. Platform Processes
  5. Platform Infrastructure Diagram
  6. S3 File Drop Folder Structure
  7. Project Folders & Files
  8. Security Limitations & Future Improvements
  9. Installation Prerequisites
  10. Docker Local Image Storage Space Requirements
  11. Library Dependencies & Version Numbers
  12. How to Install Platform
  13. Platform ECS Services
  14. How to Upload Data
  15. How to Evaluate Data Drift & Model
  16. Data Drift & Prediction Score Email Alerts
  17. Unit Test Examples
  18. Integration Test Examples
  19. Pre-Commit Hooks
  20. Makefile Targets
  21. CI-CD Implementation
  22. DataTalksClub MLOps Zoomcamp Evaluation Criteria

Problem Statement

  • Companies rely on churn prediction models to proactively retain valuable customers, an effort that is typically more cost-effective than acquiring new ones.
  • However, once deployed, these models risk losing accuracy over time as customer behavior and demographics shift.
  • This project addresses this challenge by providing Data Scientists, Machine Learning Engineers, and their stakeholders a platform to continuously train, deploy, and monitor churn models, enabling organizations to detect drift, maintain model quality, and adapt to changing customer dynamics.

Key Features

🧠 Model Development
  • Jupyter Notebook ↗ provides a data scientist-friendly environment for exploratory data analysis and feature engineering.
  • Training logic extracted into a reusable module, ensuring consistency between training and inference.
  • Fast experimentation enabled via Optuna ↗ for Bayesian optimization and MLflow ↗ for experiment tracking.
  • Developer-friendly codebase includes unit/integration tests, GitHub Actions-based CI/CD ↗, and pre-commit hooks for linting and formatting.
📊 Model Evaluation
  • Evaluates model on both training and holdout datasets to establish baseline bias and variance for future improvement.
  • MLflow UI enables deep comparison of experiment runs, visualizations (confusion matrices, precision-recall curves), and SHAP explanatory plots.
🚀 Model Deployment
  • Each model version is packaged in the MLflow Model Registry ↗ with metadata, dependencies, and signatures for reproducibility.
  • Alias-based promotion supports decoupling development from deployment.
  • Deployment to AWS ECS ↗ via Prefect ↗ provides scalable and observable on-demand inference.
📈 Model Monitoring
  • Evidently.ai ↗ generates automated reports for data drift and prediction performance after each inference run.
  • Pre-built Grafana ↗ dashboard visualizes model metrics over time, helping distinguish anomalies from signals indicating the need for model development.
🔁 Model Maintenance
  • Email alerts triggered if data drift exceeds threshold or model prediction scores (F1, precision, recall, accuracy) fall below acceptable limits.
  • Enables manual model retraining and defines a framework extensible to automated retraining pipelines.
🤝 Team Collaboration
  • Provides visibility into all model lifecycle stages through MLOps tool UIs (e.g., MLflow, Prefect, Evidently).
  • Stores activity metadata in PostgreSQL databases for advanced querying and auditability.
  • Enhances transparency and cross-functional collaboration, accelerating model iteration and stakeholder alignment.

Customer Churn Data Source

  • The labeled customer churn data used to train the model was randomly collected from an Iranian telecom company on 4/8/2020 and made available for download by the UC Irvine Machine Learning Repository ↗.
  • This repository contains a data/ folder with several CSV files prefixed with customer_churn_*.
  • The following files were split from the original dataset:
    • customer_churn_0.csv (used as training set)
    • customer_churn_1.csv
    • customer_churn_2.csv
  • Additional testing was performed using customer_churn_synthetic_*.csv files, which were generated from the original dataset using Gretel.ai ↗.

Platform Processes

Two independent processes were enabled by this project, joined by the MLflow Model Registry:

Model Development and Deployment

The following process is implemented with two files inside the code/orchestration/modeling/ folder:

  • Jupyter Notebook: churn_model_training.ipynb
    • Main location for EDA, hyperparameter tuning, and calling upon the modeling.churn_model_training module below
  • Python Module: churn_model_training.py
flowchart TD
    A[Download training data]
    B[Prepare data]
    C[Tune hyperparameters]
    D[Narrow parameter search space using Optuna UI]
    E[Train model]
    F[Evaluate model on training set using MLflow UI]
    G[Evaluate model on holdout set using MLflow UI]
    H[Is model performance sufficient?]
    I[Promote model in MLflow Registry]
    A --> B --> C --> E --> F --> G --> H;
    C --> D --> C;
    H --> |Yes| I;
    H --> |No| B;
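The final "Promote model in MLflow Registry" step above can be pictured with a minimal MLflow sketch. The tiny stand-in dataset and the registered model name below are illustrative; this is not the project's actual training code:

import os
import mlflow
import numpy as np
import xgboost as xgb
from mlflow.tracking import MlflowClient

mlflow.set_tracking_uri(os.environ["MLFLOW_TRACKING_URI"])
client = MlflowClient()

# Tiny stand-in for the real churn training data.
X = np.random.rand(32, 4)
y = np.random.randint(0, 2, 32)

with mlflow.start_run() as run:
    model = xgb.XGBClassifier(n_estimators=10).fit(X, y)
    mlflow.xgboost.log_model(model, artifact_path="model")
    version = mlflow.register_model(f"runs:/{run.info.run_id}/model", "XGBoostChurnModel")

# Alias-based promotion: point the "staging" alias at the new version so
# downstream consumers can load "models:/XGBoostChurnModel@staging".
client.set_registered_model_alias("XGBoostChurnModel", "staging", version.version)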

Model Inference, Reporting, and Evaluation

  • The following process is orchestrated by the Prefect Flow code/orchestration/churn_prediction_pipeline.py.
  • A new flow run is created for each file dropped into the S3 File Drop input folder (see S3 File Drop Folder Structure).
flowchart TD
    A[Drop new customer<br/>churn data into S3]
    B[Load latest promoted model in MLflow Registry]
    C[Validate file input]
    D[Prepare data, reusing training logic]
    E[Generate predictions]
    F[Append predictions to file input]
    G[Generate Evidently data drift and prediction performance report]
    H[Save report to database]
    I[Did drift exceed threshold?]
    J[Send drift email alert]
    K[Did prediction performance drop below threshold?]
    L[Send prediction score email alert]
    M[Evaluate detailed drift and performance report in Evidently UI]
    N[Evaluate drift and performance over time in Grafana UI]
    A-->B-->C-->D-->E-->F-->G-->H-->I-->K-->M-->N;
    I--> |Yes| J;
    K--> |Yes| L;
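A heavily simplified sketch of how a pipeline of this shape can be wired together with Prefect tasks is shown below; the task bodies are placeholders, not the actual contents of churn_prediction_pipeline.py:

# Simplified shape of an S3-triggered evaluation pipeline in Prefect.
# Task bodies are placeholders, not the project's actual code.
from prefect import flow, task

@task
def validate_file_input(bucket: str, key: str) -> str:
    # e.g., check the CSV header and move the file to data/processing/
    return key

@task
def generate_predictions(key: str) -> str:
    # e.g., load the "staging" model from the MLflow Registry and score rows
    return key

@task
def report_and_alert(key: str) -> None:
    # e.g., run Evidently, save the report, send SNS alerts on threshold breach
    pass

@flow
def churn_prediction_pipeline(bucket: str, key: str) -> None:
    validated = validate_file_input(bucket, key)
    scored = generate_predictions(validated)
    report_and_alert(scored)

if __name__ == "__main__":
    churn_prediction_pipeline("your-project-id", "data/input/customer_churn_1.csv")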

Platform Infrastructure Diagram


MLOps Churn Platform Infrastructure Diagram

S3 File Drop Folder Structure

s3://your_project_id
└── data
    ├── input        # Customer churn files uploaded here
    ├── processing   # Files moved here during processing
    ├── logs         # Log file created for each dropped file
    ├── processed    # Files moved here on successful processing
    └── errored      # Files moved here if error occurred during processing
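Because S3 has no native move operation, transitions between these folders are typically implemented as a copy followed by a delete. A minimal boto3 sketch of that pattern (bucket and key names are illustrative):

# Moving a file between S3 "folders" is a copy followed by a delete;
# bucket and key names below are illustrative.
import boto3

s3 = boto3.client("s3")
bucket = "your_project_id"
src_key = "data/input/customer_churn_1.csv"
dst_key = "data/processing/customer_churn_1.csv"

s3.copy_object(Bucket=bucket, CopySource={"Bucket": bucket, "Key": src_key}, Key=dst_key)
s3.delete_object(Bucket=bucket, Key=src_key)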

Project Folders & Files

This project consists mainly of the following folders and files:

code/grafana/
  • Contains Dockerfile that packages Grafana Enterprise image with:
    • grafana-postgres-datasource.yml: Postgres Data Source configuration
    • churn-model-evaluation.json: Pre-created Data Drift and Prediction Score Evaluation Dashboard
code/orchestration/
  • Contains Dockerfile that packages Prefect Flow pipeline consisting of:
    • churn_prediction_pipeline.py: Contains main Prefect flow and tasks
    • modeling/
      • Contains model training and registry deployment logic:
        • churn_model_training.ipynb for continued exploratory data analysis and model development
        • churn_model_training.py for extracting training logic to reuse in Prefect pipeline
    • tests/
code/s3_to_prefect_lambda/
  • Contains Dockerfile that packages lambda_function.py and its dependencies for notifying Prefect pipeline of new file drops
  • Invoked by the S3 Bucket Notification configured by the Terraform s3-to-prefect-lambda module (see the handler sketch after this list)
data/
  • These files were split from the original dataset:
    • customer_churn_0.csv: File used to train the model
    • customer_churn_1.csv
    • customer_churn_2_majority_drifted.csv: File that exhibits data drift exceeding threshold (email notification will be sent)
  • customer_churn_synthetic_*.csv: Generated using Gretel.ai ↗
infrastructure/
readme-assets/
  • Screenshots for this readme
.env (generated)
  • Contains the following environment variables:
    • Optuna DB Connection URL
    • MLflow, Optuna, Prefect, Evidently, and Grafana UI URLs
    • Prefect Server API URL
    • Your Project ID
.pre-commit-config.yaml (generated)
  • Configures Pre-Commit hooks that execute prior to every commit (see Pre-Commit Hooks section)
Makefile
  • Contains several targets to accelerate platform setup, development, and testing (see Makefile Targets section)
upload_simulation_script.py
  • Script that helps generate metrics over time for viewing in Grafana UI
  • Uploads the non-training data files into S3 File Drop Input folder 30 seconds apart
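To illustrate the s3_to_prefect_lambda hand-off mentioned above, here is a minimal sketch of an S3-triggered handler that kicks off a Prefect deployment run. The deployment name and flow parameters are assumptions, not the repo's actual lambda_function.py:

# Hypothetical sketch of an S3-notification Lambda that triggers a Prefect
# deployment run; the deployment and parameter names are assumptions.
from prefect.deployments import run_deployment

def lambda_handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # Create a new flow run for the dropped file (requires PREFECT_API_URL).
        run_deployment(
            name="churn-prediction-pipeline/s3-file-drop",
            parameters={"bucket": bucket, "key": key},
            timeout=0,  # do not wait for the flow run to finish
        )
    return {"statusCode": 200}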

Full Project Folder Tree

The full project folder tree contents can be viewed here.

Security Limitations & Future Improvements

  • Limitation: The IAM policy used is intentionally broad to reduce setup complexity.
    Future improvement: Replace with least-privilege policies tailored to each service role.
  • Limitation: Public subnets are required to simplify RDS access from ECS and local machines.
    Future improvement: Migrate to private subnets with a NAT Gateway and use bastion or VPN access for local clients.
  • Limitation: The Prefect API ALB endpoint is publicly accessible to enable GitHub Actions deployment.
    Future improvement: Restrict access to GitHub Actions IP ranges using ingress rules or CloudFront.
  • Limitation: The MLflow ALB endpoint is publicly accessible to allow ECS Workers to reach the Model Registry.
    Future improvement: Limit access to internal ECS security groups only.
  • Limitation: The Prefect API ALB endpoint is visible in cleartext as an environment variable in the .github/workflows/deploy-prefect.yml file, which may pose a security risk if your GitHub repo is publicly visible.
    Future improvement: Migrate this variable to a GitHub Repository secret and automatically upsert the value as a new Terraform action post-apply.

Installation Prerequisites

  • Python 3.10.x
  • AWS Account ↗
    • AWS Account required to deploy the pipeline to the cloud and run it as a user
    • AWS Account NOT required to run unit and integration tests
  • AWS User with the Required IAM Permissions policies
  • AWS CLI ↗ installed with aws configure run to store AWS credentials locally
  • Docker ↗ installed and Docker Engine is running
  • Pip ↗ and Pipenv ↗
  • Terraform ↗
  • Prefect ↗
  • Pre-commit ↗
  • GitHub Account
    • At this time, pushing the repo to your GitHub account and running the GitHub Actions workflow is the only supported way to deploy the Prefect flow to the Prefect Server (short of manual workarounds)

Required IAM Permissions

A user with the following AWS Managed Permissions policies was used when creating this Platform. Please note that this list is overly permissive and may be updated in the future.

  • AmazonEC2ContainerRegistryFullAccess
  • AmazonEC2FullAccess
  • AmazonECS_FullAccess
  • AmazonRDSFullAccess
  • AmazonS3FullAccess
  • AmazonSNSFullAccess
  • AWSLambda_FullAccess
  • CloudWatchLogsFullAccess
  • IAMFullAccess

Docker Local Image Storage Space Requirements

The Docker images required for the following components occupy approximately 5.4 GB of local disk space:

  • Custom Grafana Bundle
    • Packages database configuration and dashboard files with Grafana Enterprise
    • Uses Grafana grafana/grafana-enterprise:12.0.2-security-01 image
  • S3-to-Prefect Lambda Function
    • Invokes orchestration flow when new files are dropped into S3
    • Uses AWS public.ecr.aws/lambda/python:3.12 image
  • Testcontainers + LocalStack
    • Used by integration tests to mock AWS S3 service using LocalStack
    • Uses LocalStack localstack/localstack:4.7.0 image

After deployment, remove these local Docker images to conserve space.

Library Dependencies & Version Numbers

See the Pipfile and Pipfile.lock files within the following folders for the full lists of library dependencies and version numbers used:

  • code/orchestration/
  • code/s3_to_prefect_lambda/

How to Install Platform

  1. Install the prerequisites
  2. Ensure your Docker Engine is running
  3. Create an S3 bucket to store the state of your Terraform infrastructure (e.g. churn-platform-tf-state-<some random number>)
  4. Clone churn-model-evaluation-platform repository locally
  5. Edit root Terraform configuration to store state within S3
    1. Edit file: {REPO_DIR}/infrastructure/main.tf
    2. Change terraform.backend.s3.bucket to the name of the bucket you created
    3. Change terraform.backend.s3.region to your AWS region
  6. Copy Terraform infrastructure/vars/stg.template.tfvars file to new infrastructure/vars/stg.tfvars file and define values for each key within:
  • project_id: Used as a prefix for many AWS resources, including the S3 bucket where files will be dropped and generated. Must be a valid S3 name (e.g. unique, no underscores) and 20 characters or less to avoid exceeding resource naming limits. Example: mlops-churn-pipeline
  • vpc_id: Your AWS VPC ID. Example: vpc-0a1b2c3d4e5f6g7h8
  • aws_region: Your AWS Region. Example: us-east-2
  • db_username: Username for the Postgres database used to store MLflow, Prefect, and Evidently metrics. Must conform to Postgres rules (e.g. lowercase letters, numbers, underscores only). Example: my_super_secure_db_name
  • db_password: Password for the Postgres database. Use best practices and avoid spaces. Example: Th1s1sAStr0ng#Pwd!
  • grafana_admin_user: Username for the Grafana account used to view data drift and model prediction scores over time. Example: grafana_FTW
  • grafana_admin_password: Password for the Grafana account. Example: Grafana4Lyfe!123
  • subnet_ids: AWS Subnet IDs. Must be public subnet IDs from different Availability Zones to allow the Postgres RDS instance to be accessed by ECS services (and optionally your IP address). Example: ["subnet-123abc456def78901", "subnet-234bcd567efg89012"]
  • my_ip: IP address that will be granted access to the Grafana UI, Optuna UI, and Postgres DB. Example: 203.0.113.42
  • my_email_address: Email address that will be notified if the majority of inferenced data columns exhibit data drift or prediction scores fall below threshold. Example: your.name@example.com
  7. cd {REPO_DIR}/infrastructure then run terraform init. If successful, this command populates the Terraform state S3 bucket you created in Step 3 with the files necessary to capture the state of your infrastructure across Terraform command invocations.
  8. cd {REPO_DIR}/code/orchestration then run pipenv shell
  9. cd {REPO_DIR}
  10. Run make plan and review the infrastructure to be created (see Platform Infrastructure Diagram)
  11. Run make apply to build the Terraform infrastructure, set Prefect Secrets, update the GitHub Actions workflow YAML, and start the ECS services.
    1. After Terraform completes instantiating each ECS Service, it executes the wait_for_services.sh script to poll the ALB URLs until each service instantiates its ECS Task and is ready for use.
    2. For convenience, each tool's URL is displayed once it is ready for use (see Platform ECS Services).
  12. Click each of the 5 ECS Service URLs to confirm they are running: MLflow, Optuna, Prefect Server, Evidently, Grafana
  13. Run make deploy-model to train an XGBoostChurnModel churn model and upload it to the MLflow Model Registry with the staging alias.
    1. Confirm it was created and aliased by visiting the Model Registry within the MLflow UI
    2. Note the following:
      1. Two versions of the model are visible in the registry, evaluated using the training and holdout datasets (X_train and X_test, respectively)
      2. The data used to train the staging model was logged as the artifact reference_data.csv in its experiment run
  14. Deploy the churn_prediction_pipeline Prefect Flow to your Prefect Server using GitHub Actions
    1. Commit your cloned repo (including {REPO_DIR}/.github/workflows/deploy-prefect.yml updated with the generated PREFECT_API_URL)
    2. Log in to your GitHub account, navigate to your committed repo project, and create the following Repository Secrets ↗ (used by deploy-prefect.yml):
      1. AWS_ACCOUNT_ID
      2. AWS_ACCESS_KEY_ID
      3. AWS_SECRET_ACCESS_KEY
      4. AWS_REGION
    3. Navigate to the GitHub project's Actions tab, select the workflow Build and Deploy Prefect Flow to ECR, and click the green "Run workflow" button to deploy the Prefect flow
      1. Confirm it was deployed successfully by visiting the "Deployments" section of the Prefect UI
  15. Confirm your email subscription to the pipeline SNS topic
    1. Navigate to the inbox of the email address you configured in stg.tfvars and look for an email with the subject AWS Notification - Subscription Confirmation.
    2. Open the email and click the Confirm Subscription link within.
    3. Verify you see a green message confirming your subscription.

Platform ECS Services

Once the make apply command completes successfully, you should see output similar to the following, providing URLs to each of the created tools:

🎉 All systems go! 🎉

MLflow, Optuna, Prefect, Evidently, and Grafana UI URLs
-------------------------------------------------------

🧪 MLflow UI: http://your-project-id-alb-123456789.us-east-2.elb.amazonaws.com:5000
🔍 Optuna UI: http://your-project-id-alb-123456789.us-east-2.elb.amazonaws.com:8080
⚙️ Prefect UI: http://your-project-id-alb-123456789.us-east-2.elb.amazonaws.com:4200
📈 Evidently UI: http://your-project-id-alb-123456789.us-east-2.elb.amazonaws.com:8000
📈 Grafana UI: http://your-project-id-alb-123456789.us-east-2.elb.amazonaws.com:3000

Clicking on each URL should render each tool's UI successfully in your browser (the Terraform command includes invoking a script that polls the services' URLs until they return successful responses).

If any of the URLs return an error (e.g. 503 Service Unavailable), investigate the root cause by logging into the AWS Elastic Container Service (ECS) console and inspecting the logs of the ECS Task that is failing.

If all the services started successfully, your ECS Task list should look similar to this screenshot:

Platform ECS Tasks Screenshot

These URLs were also written to the {REPO_DIR}/.env file for future retrieval and export to your shell environment when needed.

OPTUNA_DB_CONN_URL=postgresql+psycopg2://USERNAME:PASSWORD@your-project-id-postgres.abcdefghijk.us-east-2.rds.amazonaws.com:5432/optuna_db
MLFLOW_TRACKING_URI=http://your-project-id-alb-123456789.us-east-2.elb.amazonaws.com:5000
PREFECT_API_URL=http://your-project-id-alb-123456789.us-east-2.elb.amazonaws.com:4200/api
EVIDENTLY_UI_URL=http://your-project-id-alb-123456789.us-east-2.elb.amazonaws.com:8000
PREFECT_UI_URL=http://your-project-id-alb-123456789.us-east-2.elb.amazonaws.com:4200
GRAFANA_UI_URL=http://your-project-id-alb-123456789.us-east-2.elb.amazonaws.com:3000
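Once exported, these variables let you drive the tools from a local Python session. A small sketch, assuming the python-dotenv package is available to load the generated file:

# Load the generated .env and query the Model Registry from a local session;
# assumes the python-dotenv package is installed.
import os
from dotenv import load_dotenv
import mlflow
from mlflow.tracking import MlflowClient

load_dotenv()  # reads MLFLOW_TRACKING_URI and friends from {REPO_DIR}/.env
mlflow.set_tracking_uri(os.environ["MLFLOW_TRACKING_URI"])

client = MlflowClient()
for model in client.search_registered_models():
    print(model.name, list(model.aliases))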

The following sections give a brief overview of the tool features made available in this project:

MLflow Tracking Server & Model Registry

  • Lists model experiment runs that track model metrics and parameters used
  • Captures details of each experiment run, including model type and training dataset used
  • Automatically creates images to aid evaluation (e.g. confusion matrix, SHAP summary plot)
  • Stores models in Model Registry for future use (e.g. loaded by Model Evaluation Pipeline on file drop)

MLflow Experiments
MLflow Model Details MLflow Artifacts MLflow Confusion Matrix MLflow Registry

Optuna Hyperparameter Tuning Dashboard

Gain insight into Optuna hyperparameter tuning trials to narrow parameter search spaces and more quickly find optimal parameters.

Optuna Dashboard Screenshots
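For orientation, a minimal Optuna study of the kind this dashboard visualizes might look like the following; the objective function and search space are placeholders, not the project's actual tuning code:

# Minimal Optuna study sketch; objective and search space are placeholders.
# OPTUNA_DB_CONN_URL comes from the generated .env file.
import os
import optuna

def objective(trial: optuna.Trial) -> float:
    # e.g., suggest XGBoost parameters and return a cross-validated F1 score
    max_depth = trial.suggest_int("max_depth", 2, 10)
    learning_rate = trial.suggest_float("learning_rate", 1e-3, 0.3, log=True)
    return 1.0 / (max_depth * learning_rate)  # stand-in for a real metric

study = optuna.create_study(
    direction="maximize",
    storage=os.environ["OPTUNA_DB_CONN_URL"],  # persists trials for the dashboard
    study_name="churn-model-tuning",
    load_if_exists=True,
)
study.optimize(objective, n_trials=50)
print(study.best_params)

Persisting the study to the Postgres storage URL is what makes the trials visible in the Optuna UI.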

Prefect Orchestration Server and Worker Service

View completed, running, and failed model evaluation runs to monitor pipeline health and address any unexpected issues.

Prefect Dashboard Prefect Runs Prefect Run Detail Prefect Flows Prefect Deployments Prefect Work Pools

Evidently Non-Time-Series Dashboard and Reports UI

Assess dataset drift and model performance for each new churn data drop to decide whether model retraining is needed.

Evidently Data Drift Summary Evidently Data Drift Detailed Evidently Model Performance Evidently Confusion Matrix Evidently Confusion Matrix By Label

Grafana Time-Series Dashboard UI

Provides a pre-created dashboard plotting model data drift and performance metrics over time to distinguish anomalies from signals suggesting model development is needed.

Grafana Latest Prediction Scores Grafana Drift Summary Metrics Grafana Metric Drift p-Values

How to Upload Data

  1. Navigate to {REPO_DIR} (and run cd code/orchestration && pipenv shell if you haven't already)

  2. You can process the labeled Customer Churn data in one of two ways:

    1. Run make simulate-file-drops from {REPO_DIR} to run upload_simulation_script.py, which uploads each file in the data folder (except customer_churn_0.csv) to the S3 File Drop input folder
    2. Manually upload files from the {REPO_DIR}/data folder into the S3 bucket {PROJECT_ID}/data/input folder
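For reference, the manual upload in option 2 can also be scripted with boto3; a minimal sketch (the bucket name is illustrative, substitute your project_id):

# Upload one churn data file into the S3 File Drop input folder;
# the bucket name is illustrative (use your project_id).
import boto3

s3 = boto3.client("s3")
s3.upload_file(
    Filename="data/customer_churn_1.csv",
    Bucket="your-project-id",
    Key="data/input/customer_churn_1.csv",
)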

How to Evaluate Data Drift & Model

Once your files have completed processing (visible in the Prefect UI, or when they appear in the S3 data/processed/ folder), you can evaluate their data in two ways:

  1. Navigate to the Evidently UI to view detailed data drift metrics and prediction scores for each file
  2. Navigate to the Grafana UI and view the pre-built "Customer Churn Model Evaluation" dashboard to view how the drift metrics and prediction scores have behaved over time

Data Drift & Prediction Score Email Alerts

The pipeline will send an email to the address configured within stg.tfvars in each of the following scenarios:

Data Drift Alert

Sent if Evidently finds that more than 50% of the columns in the new customer dataset have drifted from the reference dataset: Data Drift Email Alert Example

Prediction Score Alert

Sent if Evidently reports any of the observed prediction scores drop below 70%:

  • F1 Score
  • Precision
  • Recall
  • Accuracy

Prediction Score Email Alert Example
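A sketch of how these threshold checks could be expressed is below; the metric values, score names, and SNS topic ARN are assumptions for illustration, and the pipeline's actual logic may differ:

# Hypothetical threshold checks; the score names and topic ARN are
# assumptions, not the pipeline's actual implementation.
import boto3

DRIFT_SHARE_THRESHOLD = 0.5  # alert if more than 50% of columns drifted
SCORE_THRESHOLD = 0.7        # alert if any prediction score drops below 70%

def check_and_alert(drift_share: float, scores: dict[str, float]) -> None:
    sns = boto3.client("sns")
    topic_arn = "arn:aws:sns:us-east-2:123456789012:churn-alerts"  # placeholder ARN
    if drift_share > DRIFT_SHARE_THRESHOLD:
        sns.publish(
            TopicArn=topic_arn,
            Subject="Data drift alert",
            Message=f"{drift_share:.0%} of columns drifted from reference data",
        )
    failing = {name: value for name, value in scores.items() if value < SCORE_THRESHOLD}
    if failing:
        sns.publish(
            TopicArn=topic_arn,
            Subject="Prediction score alert",
            Message=f"Scores below threshold: {failing}",
        )

# Example values as they might be extracted from an Evidently report:
check_and_alert(0.6, {"f1": 0.65, "precision": 0.72, "recall": 0.68, "accuracy": 0.74})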

Unit Test Examples

Example unit tests can be found within the code/orchestration/tests/unit/ folder for select Prefect @task functions of churn_prediction_pipeline.py.

The tests use unittest.TestCase along with unittest.mock.patch and unittest.mock.MagicMock to build reusable fixture code that overrides ("patches") class object references with mock objects.

├── code
│   ├── orchestration
│   │   ├── tests
│   │   │   ├── unit
│   │   │   │   ├── test_fetch_model.py
│   │   │   │   ├── test_generate_predictions.py
│   │   │   │   ├── test_move_to_folder.py
│   │   │   │   ├── test_prepare_dataset.py
│   │   │   │   └── test_validate_file_input.py
│   │   └── churn_prediction_pipeline.py
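As a flavor of the patching approach, a hypothetical test might look like this; the patched attribute, task name, and assertions are assumptions, not the repo's actual tests:

# Hypothetical unit test showing the patch/MagicMock pattern; the patched
# attribute and task behavior are assumptions, not the repo's actual tests.
import unittest
from unittest.mock import MagicMock, patch

import churn_prediction_pipeline  # module under test

class TestFetchModel(unittest.TestCase):
    @patch("churn_prediction_pipeline.mlflow")
    def test_fetch_model_loads_staging_alias(self, mock_mlflow):
        fake_model = MagicMock()
        mock_mlflow.pyfunc.load_model.return_value = fake_model

        # Prefect @task objects expose the undecorated function via .fn
        result = churn_prediction_pipeline.fetch_model.fn()

        mock_mlflow.pyfunc.load_model.assert_called_once()
        self.assertIs(result, fake_model)

if __name__ == "__main__":
    unittest.main()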

Integration Test Examples

Example integration tests can be found within the code/orchestration/tests/integration/ folder for the validate_file_input @task function of churn_prediction_pipeline.py.

To verify that the function correctly reads files from S3, the testcontainers.localstack module ↗ was used to dynamically create a LocalStack ↗ container that serves as a mock S3 endpoint for the s3_client calls made by the validate_file_input function.

├── code
│   ├── orchestration
│   │   ├── tests
│   │   │   ├── integration
│   │   │   │   └── test_validate_file_input.py
│   │   └── churn_prediction_pipeline.py
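A minimal sketch of that LocalStack pattern is below; the bucket, key, and assertions are illustrative rather than the repo's actual test:

# Sketch of a LocalStack-backed integration test; bucket, key, and
# assertions are illustrative, not the repo's actual test.
import boto3
from testcontainers.localstack import LocalStackContainer

def test_validate_file_input_reads_from_s3():
    with LocalStackContainer(image="localstack/localstack:4.7.0") as localstack:
        s3 = boto3.client(
            "s3",
            endpoint_url=localstack.get_url(),
            aws_access_key_id="test",
            aws_secret_access_key="test",
            region_name="us-east-1",
        )
        s3.create_bucket(Bucket="your-project-id")
        s3.put_object(
            Bucket="your-project-id",
            Key="data/input/customer_churn_1.csv",
            Body=b"col_a,col_b\n1,2\n",
        )
        # The real test would point validate_file_input's s3_client at this
        # endpoint and assert on its return value / file movement.
        obj = s3.get_object(Bucket="your-project-id", Key="data/input/customer_churn_1.csv")
        assert obj["Body"].read().startswith(b"col_a")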

Pre-Commit Hooks

How to Activate Pre-Commit Hooks

The following steps are required to activate pre-commit hooks for this repository:

  1. Navigate to {REPO_DIR}/code/orchestration/ and run pipenv shell if you haven't already
  2. Navigate to {REPO_DIR} and run pre-commit install
  3. Ensure your Docker Engine is running (needed for LocalStack-based integration tests)
  4. Run make quality to generate the required .pre-commit-config.yaml file and execute the hooks

Generating .pre-commit-config.yaml is needed to inject the absolute path of the code/orchestration/modeling module folder for pylint (a future improvement is to use a relative path instead). For this reason, .pre-commit-config.yaml is listed in .gitignore so the cleartext absolute path is not committed if you publish your repo publicly.

Hooks List

The following hooks are used to maintain notebook and module code quality and execute tests prior to committing files to Git:

  • nbqa-pylint
  • nbqa-flake8
  • nbqa-black
  • nbqa-isort
  • trailing-whitespace
  • end-of-file-fixer
  • check-yaml
  • check-added-large-files
  • isort
  • black
  • pylint
  • pytest-check

Makefile Targets

The following list describes the make targets available to accelerate platform deployment, development, and testing:

  • test: Runs all unit and integration tests defined within the code/orchestration and code/s3_to_prefect_lambda folders
  • quality: Runs pre-commit run --all-files (see Pre-Commit Hooks)
  • commit: Stages all changed files, prompts for a commit message, and attempts to commit the files (barring pre-commit errors)
  • plan: Runs terraform plan --var-file=vars/stg.tfvars from the infrastructure directory
  • apply: Runs terraform apply --var-file=vars/stg.tfvars --auto-approve and outputs an emoji-filled message with UI URLs upon successful deploy and ECS Task activation
  • destroy: Runs terraform destroy -var-file=vars/stg.tfvars --auto-approve
  • disable-lambda: Facilitates local dev/testing by disabling notification of the s3_to_prefect Lambda function so files aren't automatically picked up by the deployed service. Lets you drop file(s) manually in S3 and run the pipeline locally when you're ready (see the process-test-data target below and the sketch after this list).
  • enable-lambda: Re-enables the s3_to_prefect Lambda notification to resume creating new Prefect flow runs on S3 file drops
  • deploy-model:
    • Executes churn_model_training.py to train and deploy two models to the MLflow Registry (evaluated on training and holdout data, respectively).
    • The second model is assigned the staging alias so the Prefect pipeline can fetch the latest staging model without code changes.
    • Note: Hyperparameter tuning is NOT performed with this target, to accelerate model deployment. See the log-model-nopromote target if hyperparameter tuning is desired.
  • log-model-nopromote:
    • Executes churn_model_training.py to train and deploy two models to MLflow without executing promotion steps (e.g. does not apply the staging alias).
    • Used to develop and optimize model performance prior to making it available for stakeholder use.
    • Hyperparameter tuning notes:
      • Optuna Bayesian hyperparameter tuning is performed with this target, with the number of trials currently set to 50.
      • Modify churn_model_training.py to adjust the number of trials if desired. Optuna recommends executing at least 20-30 trials to accumulate enough prior runs to optimize parameters.
      • Allow approximately 10 minutes for the trials to complete (actual time varies based on local machine performance and network connectivity).
      • Suggested optimization approach:
        • Use the Optuna UI to correlate parameter ranges with higher F1 scores
        • Use those insights to narrow the parameter search space used in churn_model_training.py
        • Once you've chosen the optimal parameters, update the best_params_to_date variable in churn_model_training.py so they are used in subsequent executions of the deploy-model target
  • process-test-data: Manually invokes the flow after running the disable-lambda target. Upload customer_churn_1.csv into the S3 data/input/ folder before use. Runs python churn_prediction_pipeline.py your-project-id data/input/customer_churn_1.csv and instantiates an ephemeral local Prefect Server to execute the flow.
  • simulate-file-drops: Runs upload_simulation_script.py to automatically upload each non-training data file in the data/ folder to the S3 File Drop input folder.
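One way the disable-lambda toggling could be implemented with boto3 is sketched below; whether the Makefile uses this exact mechanism is an assumption:

# One possible implementation of disable-lambda: clear the bucket's
# notification configuration so file drops no longer invoke the Lambda.
# Whether the Makefile uses this exact mechanism is an assumption.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_notification_configuration(
    Bucket="your-project-id",
    NotificationConfiguration={},  # an empty config removes all notifications
)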

CI-CD Implementation

  • GitHub Actions ↗ was used to execute the following Continuous Integration and Continuous Delivery (CI/CD) process.
  • See .github/workflows/deploy-prefect.yml for details.
flowchart TD
    A[Commit changes to code/orchestration/ tree]
    B[Manually run workflow from GitHub UI]
    C[Initialize ubuntu-latest Runner VM]
    D[Checkout code]
    E[Set up Python]
    F[Install Pipenv]
    G[Install code/orchestration/ Pipfile dependencies]
    H[Configure AWS credentials]
    I[Run unit and integration tests]
    J[Log in to Amazon ECR]
    K[Construct Prefect Flow Docker image name & tag]
    L[Inject Docker image name & tag into Prefect Flow YAML]
    M[Display YAML in GitHub Actions log for verification]
    N[Install Prefect]
    O[Build Docker container & deploy to Prefect Server]
    A-->C;
    B-->C;
    C-->D-->E-->F-->G-->H-->I-->J-->K-->L-->M-->N-->O;

DataTalksClub MLOps Zoomcamp Evaluation Criteria

This project earned the highest-tier score (top 10 of 183 participants ↗) in peer-reviewed project assessment.

Source: https://github.com/DataTalksClub/mlops-zoomcamp/tree/main/07-project

Problem description

Target: The problem is well described and it's clear what problem the project solves

See Problem Statement section.

Cloud

Target: The project is developed on the cloud and IaC tools are used for provisioning the infrastructure

Experiment tracking and model registry

Target: Both experiment tracking and model registry are used

See MLflow Tracking Server & Model Registry section for screenshots of experiments tracked and model stored in registry.

Workflow orchestration

Target: Fully deployed workflow

See Prefect Orchestration Server and Worker Service section for screenshots of the fully deployed workflow within the Prefect UI and examples of workflow executions ("runs").

Model deployment

Target: The model deployment code is containerized and could be deployed to cloud or special tools for model deployment are used

See the orchestration and s3_to_prefect_lambda folders in Project Folders & Files for how the model deployment code was containerized and deployed to the cloud.

Model monitoring

Target: Comprehensive model monitoring that sends alerts or runs a conditional workflow (e.g. retraining, generating debugging dashboard, switching to a different model) if the defined metrics threshold is violated

See Data Drift & Prediction Score Email Alerts section for examples of email alerts that are sent when the majority of a new customer data file's columns drift from reference data or when model prediction scores drop below the pre-defined threshold.

Reproducibility

Target: Instructions are clear, it's easy to run the code, and it works. The versions for all the dependencies are specified.

Best practices

Target: There are unit tests (1 point)

See Unit Test Examples section for summary of unit tests that were implemented.

Target: There is an integration test (1 point)

See Integration Test Examples section for summary of integration tests that were implemented.

Target: Linter and/or code formatter are used (1 point)

See Hooks List section to see which linter and code formatters were used.

Target: There's a Makefile (1 point)

See Makefile Targets section for list of Makefile targets that were implemented.

Target: There are pre-commit hooks (1 point)

See Pre-Commit Hooks section to see which hooks were used.

Target: There's a CI/CD pipeline (2 points)

See CI-CD Implementation section for summary of how CI/CD was implemented.
