Merged
Commits
69 commits
6c3b74c
commit initial project state
May 9, 2025
133b663
update step params and annotations
May 11, 2025
c282b90
update pipeline params
May 11, 2025
b70774b
add unique names to data artifacts, and fetch run id from step context
May 11, 2025
15e6930
fix pipeline protected attributes and target column
May 11, 2025
6850c0c
update compliance dir
May 11, 2025
d4880fb
update yaml configs
May 11, 2025
534ba1b
update constants, run.py and README
May 11, 2025
b73013f
add model definition to utils
May 11, 2025
e4f8607
delete preview md doc
May 11, 2025
b9cc7b3
small fixes for training pipeline
May 11, 2025
831c44b
fetch necessary params for training pipeline
May 12, 2025
078b973
name and store annotated types in constants
May 12, 2025
39e34fe
refactor and add additional artifacts
May 12, 2025
e68ac77
add deploymeny.yaml config
May 12, 2025
0556236
refactor and complete modal deployment template
May 12, 2025
441026c
add app dir for modal deployment
May 12, 2025
2bf328f
remove app.py
May 12, 2025
3f6fe14
delete auto-generated artifacts and refactor
May 12, 2025
2c52e85
refactor: storage consolidation, unified configuration, maximum zenml…
May 13, 2025
b2ee313
cleanup compliance dir and update jinja template
May 13, 2025
0cfc9fc
save approval record to modal, not locally
May 13, 2025
2b7cc53
add diagrams and detailed pipeline explanations
May 13, 2025
dc2d3a9
fix modal deployment: use add local file for constants.py, and add Ap…
May 13, 2025
b1cded9
convert UUID to strings, and update README
May 13, 2025
431c56b
remove modal card from metadata logged to zenml
May 13, 2025
69d09d7
remove unused deps
May 13, 2025
3f408d4
refactor and simplify feature engineering pipeline for demo purposes
May 14, 2025
7d16f4a
replace crypto dataset with better suited credit_scoring dataset and …
May 14, 2025
b35aac9
add qms templates, and releases dir per run for compliance docs
May 16, 2025
cf96b26
update utils and add incidents, preprocess, risk_dashboard streamlit …
May 16, 2025
392632f
refactor and integrate sbom, completed annex iv for demo, and improve…
May 16, 2025
2e25997
Add COMPLIANCE.md file, and assets dir for pipeline diagrams
May 16, 2025
0d9ab10
update diagrams and prettify jinja template
May 16, 2025
684e0e1
organize and refactor
May 19, 2025
1b9c251
update annex iv template and sample inputs
May 19, 2025
85e8d01
save whylogs htmlstring visualization along with release artifacts
May 19, 2025
4ffc4f4
update README
May 19, 2025
6008cdc
Add 'credit-scorer/' from commit '6c3b74c063e60cd456b7ab6168107abc657…
May 19, 2025
ff0c15b
implement PR suggestions and additional dashboard improvements
May 20, 2025
a388c3c
update README and add typehints to template.py
May 20, 2025
396126c
Merge commit 'a388c3cb635e4e5c72a349105430d7d86b932fb1' into import/c…
May 20, 2025
b4b81da
cleanup and trim blank rows from csv and xlsx files
May 20, 2025
84f9c54
fix broken links
May 20, 2025
a708b2c
run format/lint scripts
May 20, 2025
132f2a8
add credit-scorer to zenml-projects README.md
May 20, 2025
88a2f71
remove diagrams.md reference
May 20, 2025
6bb6c92
remove diagrams.md reference from README.md
May 20, 2025
4c0dc06
update streamlit_app to use shared compliance_dashboard html
May 21, 2025
21ce7b0
update utils
May 21, 2025
8f0a125
modify model from GradientBooster to LightGBM and update steps/pipelines
May 21, 2025
c395ea5
update requirements.txt and refactor code
May 21, 2025
5f0b9ec
add generate_dashboard step for compliane coverage html artifact
May 21, 2025
9ed7683
run format/lint scripts
May 21, 2025
eadb783
run format/lint scripts
May 21, 2025
960656a
add docs/releases to typos.toml ignore list
May 21, 2025
f614dcc
implement suggestions from PR review and simplify project
May 22, 2025
f8beeb2
run format/lint scripts
May 22, 2025
b5c8c02
fix broken links
May 22, 2025
9bb2937
improve README clarity, and small update to approve_deployment step
May 22, 2025
3eae010
run format script
May 22, 2025
d4dde3d
fix broken markdown link
May 22, 2025
9099a4f
add slack alerter for approval step
May 22, 2025
b88d784
uncomment code and cleanup approve step
May 22, 2025
e99a509
run format script
May 22, 2025
2b6894b
make slack message neater
May 22, 2025
d091258
remove extra html file
May 22, 2025
f8d2c2d
reorder imports, and add missing utils/visualizations directory
May 23, 2025
2a77a2d
run format script
May 23, 2025
24 changes: 18 additions & 6 deletions credit-scorer/README.md
@@ -1,6 +1,6 @@
# Credit Scoring EU AI Act Demo

Automatically generate complete EU AI Act compliance documentation with minimal manual effort for credit scoring models.
An end‑to‑end credit‑scoring workflow that automatically generates the technical evidence required by the [EU AI Act](https://www.zenml.io/blog/understanding-the-ai-act-february-2025-updates-and-implications).

<div align="center"> <img src="assets/compliance-dashboard.png" alt="Compliance Dashboard" width="800" /> </div>

@@ -15,7 +15,7 @@ Financial institutions must comply with the EU AI Act for any high‑risk AI sys

## 🔍 Data Overview

This project uses a credit scoring dataset based on the Home Credit Default Risk data. The raw dataset contains potentially sensitive attributes such as `CODE_GENDER`, `DAYS_BIRTH`, `NAME_EDUCATION_TYPE`, `NAME_FAMILY_STATUS`, and `NAME_HOUSING_TYPE`, which can be filtered using the pipeline's `sensitive_attributes` parameter to comply with fairness requirements.
This project leverages the [Home Credit Default Risk dataset provided by the Home Credit Group](https://www.kaggle.com/c/home-credit-default-risk/overview). The raw dataset contains potentially sensitive attributes such as `CODE_GENDER`, `DAYS_BIRTH`, `NAME_EDUCATION_TYPE`, `NAME_FAMILY_STATUS`, and `NAME_HOUSING_TYPE`, which can be filtered using the pipeline's `sensitive_attributes` parameter to comply with fairness requirements.

Key fields used for modeling:

@@ -46,7 +46,7 @@ The system implements three main pipelines that map directly to EU AI Act requir
| **[Training](src/pipelines/training.py)** | **Train** → LightGBM w/ class‑imbalance handling 🎯<br>**Evaluate** → Accuracy, AUC, fairness analysis ⚖️<br>**Assess** → Risk scoring & model registry 📋 | Arts 9, 11, 15 |
| **[Deployment](src/pipelines/deployment.py)** | **Approve** → Human oversight gate 🙋‍♂️<br>**Deploy** → Modal API deployment 🚀<br>**Monitor** → SBOM + post‑market tracking 📈 | Arts 14, 17, 18 |

Each pipeline run automatically versions all inputs/outputs, generates profiling reports, creates risk assessments, produces SBOM, and compiles complete Annex IV technical documentation.
Each pipeline run automatically versions all inputs/outputs, generates profiling reports, creates risk assessments, produces a [Software Bill of Materials (SBOM)](https://www.cisa.gov/sbom), and compiles complete Annex IV technical documentation.
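
The SBOM step itself is not shown in this diff. As a rough illustration of the idea (a hypothetical sketch, not the project's actual implementation), an inventory of installed dependencies in a CycloneDX-like shape can be assembled from standard package metadata:

```python
import json
from importlib.metadata import distributions


def build_sbom() -> dict:
    """Collect installed packages into a minimal CycloneDX-style component list."""
    components = [
        {
            "type": "library",
            "name": dist.metadata["Name"],
            "version": dist.version,
        }
        for dist in distributions()
        if dist.metadata["Name"]  # skip entries with broken metadata
    ]
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "components": sorted(components, key=lambda c: c["name"].lower()),
    }


if __name__ == "__main__":
    # Print the first components of the generated inventory
    print(json.dumps(build_sbom(), indent=2)[:300])
```

A real SBOM step would typically use a dedicated generator (e.g. a CycloneDX tooling library) and attach the result as a versioned pipeline artifact.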

## 🛠️ Project Structure

@@ -95,14 +95,26 @@ pip install -r requirements.txt
zenml init
```

3. Install [WhyLogs integration](https://docs.zenml.io/stacks/stack-components/data-validators/whylogs):
3. Install [WhyLogs integration](https://docs.zenml.io/stacks/stack-components/data-validators/whylogs) for data profiling:

```bash
zenml integration install whylogs -y
zenml integration install whylogs
zenml data-validator register whylogs_data_validator --flavor=whylogs
zenml stack update <STACK_NAME> -dv whylogs_data_validator
```

4. Install [Slack integration](https://docs.zenml.io/stacks/stack-components/alerters/slack) for deployment approval gate and incident reporting:

```bash
zenml integration install slack
zenml secret create slack_token --oauth_token=<SLACK_TOKEN>
zenml alerter register slack_alerter \
--flavor=slack \
--slack_token={{slack_token.oauth_token}} \
--slack_channel_id=<SLACK_CHANNEL_ID>
zenml stack update <STACK_NAME> -al slack_alerter
```
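
The Modal API reports incidents to the same channel with severity-aware formatting. As a hedged sketch (the function name and incident fields are illustrative, not the project's code), that formatting logic could look like:

```python
def format_incident_message(incident: dict) -> str:
    """Format an incident dict into a Slack-ready message (illustrative only)."""
    # High/critical incidents get distinctive emoji; anything else a generic warning
    emoji = {"high": "🔴", "critical": "🚨"}.get(incident["severity"], "⚠️")
    lines = [
        f"{emoji} *Incident:* {incident['description']}",
        f">*Severity:* {incident['severity']}",
        f">*Model Version:* {incident['model_version']}",
        f">*Time:* {incident['timestamp']}",
    ]
    return "\n".join(lines)
```

The resulting string would then be posted via the registered alerter (or, from outside a ZenML stack, via a direct call to Slack's `chat.postMessage` API).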

## 📊 Running Pipelines

### Basic Commands
@@ -134,7 +146,7 @@ To run the dashboard:
python run_dashboard.py
```

> **Note:** All compliance artifacts are also directly accessible through the ZenML dashboard. The Streamlit dashboard is provided as a convenient additional interface for browsing compliance information interactively.
> **Note:** All compliance artifacts are also directly accessible through the ZenML dashboard. The Streamlit dashboard is provided as a convenient additional interface for browsing compliance information locally and offline.

### 🔧 Configuration

@@ -45,4 +45,4 @@ This documentation supports compliance with the EU AI Act, particularly:
- **Article 15**: Accuracy, Robustness and Cybersecurity
- **Article 16**: Post-market monitoring

For a complete mapping of pipeline steps to EU AI Act articles, refer to the [pipeline_to_articles.md](../../pipeline_to_articles.md) file.
For a complete mapping of pipeline steps to EU AI Act articles, refer to the [pipeline_to_articles.md](../../guides/pipeline_to_articles.md) file.
99 changes: 90 additions & 9 deletions credit-scorer/modal_app/modal_deployment.py
@@ -68,10 +68,6 @@ def create_modal_app(python_version: str = "3.12.9"):
"src/constants/annotations.py",
remote_path="/root/src/constants/annotations.py",
)
.add_local_file(
"src/utils/incidents.py",
remote_path="/root/src/utils/incidents.py",
)
.add_local_file(
"src/utils/storage.py",
remote_path="/root/src/utils/storage.py",
@@ -133,13 +129,98 @@ def _load_pipeline() -> Any:

def _report_incident(incident_data: dict, model_checksum: str) -> dict:
"""Report incidents to compliance team and log them (Article 18)."""
from src.utils.incidents import create_incident_report
import json
from datetime import datetime
from pathlib import Path

import requests
from src.constants.config import SlackConfig as SC

# Format incident report
incident = {
"incident_id": f"incident_{datetime.now().isoformat().replace(':', '-')}",
"timestamp": datetime.now().isoformat(),
"model_name": "credit_scoring_model",
"model_version": model_checksum,
"severity": incident_data.get("severity", "medium"),
"description": incident_data.get(
"description", "Unspecified incident"
),
"source": "modal_api",
"data": incident_data,
}

try:
# 1. Append to local log (if accessible)
incident_log_path = "docs/risk/incident_log.json"
try:
existing = []
if Path(incident_log_path).exists():
try:
with open(incident_log_path, "r") as f:
existing = json.load(f)
except json.JSONDecodeError:
existing = []

existing.append(incident)
with open(incident_log_path, "w") as f:
json.dump(existing, f, indent=2)
except Exception as e:
logger.warning(f"Could not write to local incident log: {e}")

# 2. Direct Slack notification for high/critical severity (not using ZenML)
if incident["severity"] in ("high", "critical"):
try:
slack_token = os.getenv("SLACK_BOT_TOKEN")
slack_channel = os.getenv("SLACK_CHANNEL_ID", SC.CHANNEL_ID)

if slack_token and slack_channel:
emoji = {"high": "🔴", "critical": "🚨"}[
incident["severity"]
]
message = (
f"{emoji} *Incident from Modal API:* {incident['description']}\n"
f">*Severity:* {incident['severity']}\n"
f">*Source:* {incident['source']}\n"
f">*Model Version:* {incident['model_version']}\n"
f">*Time:* {incident['timestamp']}\n"
f">*ID:* {incident['incident_id']}"
)

# Direct Slack API call
response = requests.post(
"https://slack.com/api/chat.postMessage",
headers={"Authorization": f"Bearer {slack_token}"},
json={
"channel": slack_channel,
"text": message,
"username": "Modal Incident Bot",
},
)

if response.status_code == 200:
incident["slack_notified"] = True
logger.info("Slack notification sent successfully")
else:
logger.warning(
f"Slack notification failed: {response.text}"
)
else:
logger.info(
"Slack credentials not available, skipping notification"
)
except Exception as e:
logger.warning(f"Failed to send Slack notification: {e}")

# Add deployment context
incident_data["source"] = "modal_api"
return {
"status": "reported",
"incident_id": incident["incident_id"],
"slack_notified": incident.get("slack_notified", False),
}

# Create the incident report
return create_incident_report(incident_data, model_checksum[:8])
except Exception as e:
logger.error(f"Error reporting incident: {e}")
return {"status": "error", "message": str(e)}


def _monitor_data_drift() -> dict:
2 changes: 1 addition & 1 deletion credit-scorer/run.py
@@ -83,7 +83,7 @@
@click.option(
"--auto-approve",
is_flag=True,
default=True,
default=False,
help="Auto-approve deployment (for CI/CD pipelines).",
)
@click.option(
2 changes: 1 addition & 1 deletion credit-scorer/src/configs/deployment.yaml
@@ -40,6 +40,6 @@ steps:
approve_deployment:
parameters:
approval_thresholds:
accuracy: 0.80
accuracy: 0.70
bias_disparity: 0.20
risk_score: 0.40
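
These thresholds gate the human approval step. A minimal sketch of how such a gate might evaluate metrics (illustrative only; `check_approval` is a hypothetical helper, not the project's `approve_deployment` implementation, and it assumes accuracy is a minimum while bias disparity and risk score are maxima):

```python
def check_approval(metrics: dict, thresholds: dict) -> tuple[bool, list[str]]:
    """Return (approved, reasons). Approval requires accuracy at or above its
    threshold, and bias disparity and risk score at or below theirs."""
    reasons = []
    if metrics["accuracy"] < thresholds["accuracy"]:
        reasons.append(
            f"accuracy {metrics['accuracy']:.2f} below {thresholds['accuracy']:.2f}"
        )
    if metrics["bias_disparity"] > thresholds["bias_disparity"]:
        reasons.append(
            f"bias disparity {metrics['bias_disparity']:.2f} above "
            f"{thresholds['bias_disparity']:.2f}"
        )
    if metrics["risk_score"] > thresholds["risk_score"]:
        reasons.append(
            f"risk score {metrics['risk_score']:.2f} above {thresholds['risk_score']:.2f}"
        )
    return (not reasons, reasons)
```

In the actual pipeline, a failing check would surface the reasons to the reviewer (for example via the Slack approval gate) rather than silently blocking.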
121 changes: 0 additions & 121 deletions credit-scorer/src/constants.py

This file was deleted.

8 changes: 4 additions & 4 deletions credit-scorer/src/constants/config.py
@@ -66,11 +66,11 @@ def get_volume_metadata(cls) -> Dict[str, str]:
}


class Incident:
"""Incident reporting configuration."""
class SlackConfig:
"""Slack configuration parameters."""

SLACK_CHANNEL = "#credit-scoring-alerts"
SLACK_BOT_TOKEN = os.getenv("SLACK_BOT_TOKEN")
CHANNEL_ID = os.getenv("SLACK_CHANNEL_ID")
BOT_TOKEN = os.getenv("SLACK_BOT_TOKEN")


# Initialize required directories at module import