feat(examples): Add customer churn prediction ML example #561
Drowser2430 wants to merge 2 commits into promptdriven:main
Pull request overview
Adds a new ML/data-science example demonstrating Prompt-Driven Development (PDD) end-to-end for customer churn prediction using a scikit-learn LogisticRegression pipeline.
Changes:
- Introduces a churn prediction module (`train` + `predict`) plus a runnable demo script.
- Adds a pytest-based unit test suite for the example.
- Adds an accompanying PDD prompt and updates the examples documentation.
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| `pdd-contribution-Drowser2430.zip` | Adds a zipped contribution bundle (currently includes build/test artifacts and duplicates). |
| `examples/customer_churn.py` | New churn training/prediction module using sklearn Pipeline + ColumnTransformer. |
| `examples/example_customer_churn.py` | New runnable demo generating synthetic data and printing evaluation + predictions. |
| `examples/test_customer_churn.py` | New pytest suite validating train/predict behavior and edge cases. |
| `examples/customer_churn_python.prompt` | New PDD prompt describing the churn module requirements. |
| `examples/README.md` | Replaces the examples index with churn-specific documentation (needs restructuring). |
| PDD Concept | Implementation |
|---|---|
| Prompt as source of truth | `prompts/customer_churn_python.prompt` |
| Code generated from prompt | `customer_churn.py` |
| Usage example | `example_customer_churn.py` |
This README points to prompts/customer_churn_python.prompt, but the prompt file added by this PR is examples/customer_churn_python.prompt (no examples/prompts/ directory). Please fix the documented path (or move the prompt file) so the README reflects the actual layout.
    import pytest
    import pandas as pd
    import numpy as np
    from customer_churn import train, predict
Tests import from customer_churn import train, predict, which depends on how pytest is invoked and the working directory/PYTHONPATH. Given the intended layout under examples/customer_churn/, please ensure the test import path matches the final structure so pytest can be run as documented (and without relying on implicit cwd behavior).
    - from customer_churn import train, predict
    + from examples.customer_churn import train, predict
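A related option, sketched here as an assumption and not as something this PR contains, is a `conftest.py` placed next to the tests that puts the example directory on `sys.path` explicitly, so the plain `from customer_churn import train, predict` import no longer depends on where pytest is started. The helper name `ensure_on_path` is hypothetical.

```python
# Sketch of a conftest.py helper (hypothetical; not part of this PR).
import sys
from pathlib import Path


def ensure_on_path(directory: Path) -> str:
    """Prepend `directory` to sys.path once and return the resolved path."""
    resolved = str(directory.resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved


# In conftest.py this would be called at module level, for example:
#     ensure_on_path(Path(__file__).parent)
# pytest loads conftest.py files while collecting a directory, so the path
# is in place before the test module's imports run.
```

Whether this or the package-style import is preferable depends on the final directory layout, which the PR leaves open.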
    examples/customer_churn/
    ├── prompts/
    │   └── customer_churn_python.prompt   # PDD prompt (source of truth)
    ├── customer_churn.py                  # Generated module
    ├── example_customer_churn.py          # Runnable demo
The documented file tree assumes an examples/customer_churn/ folder, but this PR currently adds the churn files directly under examples/. Please either move the files into the documented directory structure or update the tree and commands accordingly.
    Create a Python module that trains a binary classification model to predict
    customer churn. The module should:

    1. Accept a dataset (as a pandas DataFrame or CSV path) with customer features
The prompt says the module should accept a dataset as a DataFrame or CSV path, but train() in this PR only accepts pd.DataFrame. Since the prompt is treated as source-of-truth, either update this line or implement CSV-path support.
    - 1. Accept a dataset (as a pandas DataFrame or CSV path) with customer features
    + 1. Accept a dataset as a pandas DataFrame with customer features
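If the other route is taken and CSV-path support is implemented instead, the input handling could look roughly like the sketch below. This is an illustration under stated assumptions: `resolve_dataset` and its `reader` parameter are hypothetical names, and how `train()` would call it is not shown in the PR.

```python
import os
from typing import Any, Callable, Optional


def resolve_dataset(data: Any, reader: Optional[Callable[[str], Any]] = None) -> Any:
    """Return `data` unchanged unless it is a path, in which case load it.

    `reader` defaults to pandas.read_csv. It is injectable (an assumption of
    this sketch) so the path handling can be exercised without pandas.
    """
    if isinstance(data, (str, os.PathLike)):
        if reader is None:
            import pandas as pd  # imported lazily; only needed for CSV paths

            reader = pd.read_csv
        return reader(os.fspath(data))
    # Anything that is not a path (e.g. a DataFrame) is passed through as-is.
    return data
```

With something like this in place, `train()` could call `resolve_dataset` on its dataset argument and keep the prompt wording unchanged.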
    # Customer Churn Prediction — PDD Example

    This directory contains examples that demonstrate comparisons between using Cursor and Prompt-Driven Development (PDD) for various programming tasks. These examples serve as practical illustrations of how PDD can be used to generate and modify code, via the pdd sync command, and how it compares to traditional development approaches.
    This example demonstrates a complete **Prompt-Driven Development** workflow for a real-world machine learning use case: **predicting customer churn** using logistic regression.

    ## Getting Started
    It is a companion to the core `hello` and `factorial_calculator` examples, showing PDD applied to a **data science / ML context** — a domain not previously covered in the official examples.
examples/README.md has been replaced with churn-specific documentation, which removes the overview/index for all other example projects under examples/. Please restore the examples index README and move the churn docs into a dedicated examples/customer_churn/README.md (then link to it from the main examples README).
    """
    Customer Churn Prediction Module
    Generated via PDD (Prompt-Driven Development) workflow.
    Prompt: prompts/customer_churn_python.prompt
The module docstring says Prompt: prompts/customer_churn_python.prompt, but the prompt file added in this PR is examples/customer_churn_python.prompt (and there is no examples/prompts/ folder). Update the reference so the source-of-truth prompt path is correct after the final directory layout is decided.
    - Prompt: prompts/customer_churn_python.prompt
    + Prompt: examples/customer_churn_python.prompt
    categorical_transformer = Pipeline(steps=[
        ("imputer", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore", sparse_output=False))
OneHotEncoder(..., sparse_output=False) requires scikit-learn >= 1.2; the README currently installs scikit-learn without a minimum version. Either document the minimum required scikit-learn version for this example or use an encoder argument compatible with older versions to avoid runtime failures for users.
    - ("onehot", OneHotEncoder(handle_unknown="ignore", sparse_output=False))
    + ("onehot", OneHotEncoder(handle_unknown="ignore", sparse=False))
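Note that the `sparse` keyword was deprecated in scikit-learn 1.2 and removed in 1.4, so the suggested replacement would itself fail on current releases. If supporting both old and new versions matters, one possible approach is to pick the keyword by inspecting the encoder's signature. The sketch below assumes that approach; `dense_output_kwargs` is a hypothetical helper, not part of the PR or of scikit-learn.

```python
import inspect


def dense_output_kwargs(encoder_cls) -> dict:
    """Pick the keyword that requests dense output for this encoder class."""
    params = inspect.signature(encoder_cls).parameters
    # scikit-learn 1.2 introduced `sparse_output`; older releases only
    # accept `sparse`. Prefer the new name whenever it is available.
    if "sparse_output" in params:
        return {"sparse_output": False}
    return {"sparse": False}


# Intended usage with scikit-learn (not executed here):
#     OneHotEncoder(handle_unknown="ignore", **dense_output_kwargs(OneHotEncoder))
```

The simpler alternative named in the comment, documenting `scikit-learn>=1.2` as the minimum version, avoids the indirection entirely.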
    from customer_churn import train, predict
This example script imports from customer_churn import train, predict, which only works if customer_churn.py is on the Python path (e.g., running from the same directory). This conflicts with the README’s cd examples/customer_churn instructions (directory doesn’t exist in this PR). Please align the import with the final folder structure (e.g., move files under examples/customer_churn/ and keep relative execution consistent, or adjust the import/package layout accordingly).
    - from customer_churn import train, predict
    + import sys
    + from pathlib import Path
    + try:
    +     from customer_churn import train, predict
    + except ImportError:
    +     # Fall back to the package-style import by adding the repository
    +     # root (parent of `examples/`) to sys.path. With the root on the
    +     # path, the module is reachable as `examples.customer_churn`.
    +     repo_root = Path(__file__).resolve().parents[1]
    +     if str(repo_root) not in sys.path:
    +         sys.path.insert(0, str(repo_root))
    +     from examples.customer_churn import train, predict
Adds a complete customer churn prediction ML example using sklearn LogisticRegression.
Files added:
Note: Files should be organized under examples/customer_churn/ — happy to restructure if needed.
This adds PDD's first data science/ML example, demonstrating the full PDD workflow on a real-world use case. Related to my application for the AI Engineer role.