Commit 67e4e70

Update documentation and plotting code
1 parent cbf94d2 commit 67e4e70

6 files changed, +161 -18 lines changed

README.md

Lines changed: 5 additions & 0 deletions
@@ -6,6 +6,8 @@ Docs: https://rollingstorms.github.io/opproplot
 
 ![Opproplot hero](docs/assets/opproplot_hero.png)
 
+<u>OP</u>erating <u>PRO</u>file <u>PLOT</u> ← Opproplot spelled out.
+
 **What is an Operating Profile Plot?**
 
 An Operating Profile Plot (Opproplot) is a unified visualization for binary classifiers that shows how a model behaves across every possible decision threshold. It combines:
@@ -15,6 +17,8 @@ An Operating Profile Plot (Opproplot) is a unified visualization for binary clas
 
 This creates a complete operating profile of the model in a single view — letting you see where the model is confident, where the classes overlap, and how performance changes as you move the threshold.
 
+It is a compact, multidimensional readout of model behavior: score distribution by class plus operating curves (TPR/FPR/accuracy) on the same axis. Comparing profiles across models or datasets shows whether a model separates classes cleanly, where overlap drives errors, and how threshold choices shift business metrics.
+
 Rather than switching between ROC curves, PR curves, histograms, and calibration plots, Opproplot places the score distribution and the operating characteristics on the same axis, making it easy to:
 - identify thresholds with optimal trade-offs
 - diagnose where errors occur in score space
@@ -77,6 +81,7 @@ operating_profile_plot(y_test, y_score, bins=30)
 - Package code lives in `src/opproplot`.
 - Tests live in `tests/`.
 - Documentation for GitHub Pages lives in `docs/` (see below).
+- Regenerate doc images with `python scripts/generate_docs_images.py` (requires numpy, matplotlib, scikit-learn).
 
 ## Documentation site
 

docs/api.md

Lines changed: 19 additions & 5 deletions
@@ -11,10 +11,6 @@ profile = compute_operating_profile(y_true, y_score, bins=40, score_range=(0, 1)
 - `y_score`: array-like of shape (n_samples,), predicted scores or probabilities.
 - `bins`: number of score bins (default 40).
 - `score_range`: tuple or None. If None, uses min/max of scores.
-- `show_key`: display combined legend for bars and lines (default True).
-- `key_location`: `"inside"` (axis legend) or `"outside"` (fig-level, right dock).
-- `show_grid`: draw a background grid on the metric axis (default False).
-- `grid_kwargs`: dict passed to `ax_metric.grid`, e.g., `{"alpha": 0.2, "linestyle": "--"}`.
 
 Returns an `OperatingProfile` dataclass with:
 - `edges`, `mids`, `pos_hist`, `neg_hist`, `tpr`, `fpr`, `accuracy`.
@@ -23,10 +19,28 @@ Returns an `OperatingProfile` dataclass with:
 
 ```python
 from opproplot import operating_profile_plot
-fig, ax_hist, ax_metric = operating_profile_plot(y_true, y_score, bins=30, show_accuracy=True)
+fig, ax_hist, ax_metric = operating_profile_plot(
+    y_true,
+    y_score,
+    bins=30,
+    show_accuracy=True,
+    show_key=True,
+    key_location="inside",
+    show_grid=False,
+    title=None,
+)
 ```
 
+- `y_true`: array-like of shape (n_samples,), binary labels.
+- `y_score`: array-like of shape (n_samples,), predicted scores or probabilities.
+- `bins`: number of score bins (default 40).
+- `score_range`: tuple or None. If None, uses min/max of scores.
 - `show_accuracy`: include the dashed accuracy curve (default True).
+- `show_key`: display combined legend for bars and lines (default True).
+- `key_location`: `"inside"` (axis legend) or `"outside"` (fig-level, right dock).
+- `show_grid`: draw a background grid on the metric axis (default False).
+- `grid_kwargs`: dict passed to `ax_metric.grid`, e.g., `{"alpha": 0.2, "linestyle": "--"}`.
+- `title`: optional title string; defaults to "Opproplot: Operating Profile".
 - `ax`: optional Matplotlib axis to draw on; otherwise creates a new figure.
 
 Returns `(fig, ax_hist, ax_metric)` for further styling or saving.
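
Note on docs/api.md: the `OperatingProfile` fields it lists (`edges`, `mids`, `pos_hist`, `neg_hist`, `tpr`, `fpr`, `accuracy`) can be illustrated with a dependency-free sketch. This is not the package's implementation. `ProfileSketch` and `sketch_operating_profile` are hypothetical names, and the conventions used here are assumptions: predict positive when the score is at or above the threshold, take thresholds at bin midpoints (as docs/index.md describes), and leave out-of-range scores out of the histograms.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class ProfileSketch:
    """Hypothetical stand-in for the OperatingProfile dataclass."""
    edges: List[float]
    mids: List[float]
    pos_hist: List[int]
    neg_hist: List[int]
    tpr: List[float]
    fpr: List[float]
    accuracy: List[float]


def sketch_operating_profile(
    y_true: Sequence[int],
    y_score: Sequence[float],
    bins: int = 40,
    score_range: Optional[Tuple[float, float]] = None,
) -> ProfileSketch:
    # Assumption: like the documented API, fall back to the min/max of the scores.
    lo, hi = score_range if score_range is not None else (min(y_score), max(y_score))
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if hi <= lo or n_pos == 0 or n_neg == 0:
        raise ValueError("need a non-empty score range and both classes")

    width = (hi - lo) / bins
    edges = [lo + i * width for i in range(bins + 1)]
    mids = [(a + b) / 2 for a, b in zip(edges[:-1], edges[1:])]

    # Per-class histograms; the last bin is closed on the right.
    pos_hist, neg_hist = [0] * bins, [0] * bins
    for y, s in zip(y_true, y_score):
        if lo <= s <= hi:
            idx = min(int((s - lo) / width), bins - 1)
            (pos_hist if y == 1 else neg_hist)[idx] += 1

    # Operating curves: predict positive when score >= threshold (assumed convention).
    tpr, fpr, accuracy = [], [], []
    for t in mids:
        tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(y_true, y_score) if y != 1 and s >= t)
        tn = n_neg - fp
        tpr.append(tp / n_pos)
        fpr.append(fp / n_neg)
        accuracy.append((tp + tn) / len(y_true))

    return ProfileSketch(edges, mids, pos_hist, neg_hist, tpr, fpr, accuracy)


profile = sketch_operating_profile(
    [0, 0, 1, 1], [0.1, 0.4, 0.6, 0.9], bins=2, score_range=(0, 1)
)
print(profile.mids, profile.tpr, profile.fpr, profile.accuracy)
```

Each curve has one value per bin, so the histograms and the operating curves can share the same x positions (`mids`).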

docs/examples.md

Lines changed: 10 additions & 12 deletions
@@ -2,25 +2,23 @@
 
 Use these patterns to compare models and datasets.
 
-## Breast cancer (scikit-learn)
+## Clear separation (breast cancer, scikit-learn)
 
 - Load `sklearn.datasets.load_breast_cancer`.
 - Train a logistic regression or gradient boosting model.
 - Plot the operating profile on the test split to inspect separability.
+- Interpretation: distributions are well separated; TPR stays high while FPR stays low across much of the threshold range.
 
-## Fraud-like imbalance
+## Ambiguous scores (overlapping normals)
 
-- Simulate or load an imbalanced dataset.
-- Compare a calibrated model vs an overconfident one.
-- Observe how class imbalance alters histogram heights and accuracy peaks.
+- Simulate scores from two overlapping normal distributions with similar means/variance.
+- Expect intertwined histograms and TPR/FPR curves that cross more frequently.
+- Interpretation: thresholds are fragile; small shifts move a lot of examples between classes.
 
-## Good vs bad model
+## Bumpy distributions (mixed pockets)
 
-- Train two models on the same data.
-- Plot both operating profiles side by side.
-- Look for:
-  - Separation of score distributions.
-  - Lower FPR for the same TPR.
-  - Stability of accuracy across thresholds.
+- Build a model that produces multi-modal scores (e.g., mixture components or segment-specific calibrations).
+- Look for “bumps” in the histogram and corresponding inflections in TPR/FPR.
+- Interpretation: localized score clusters may indicate subpopulations; thresholding there can create sharp metric changes.
 
 Swap in your own datasets; the plotting API stays the same.
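
Note on docs/examples.md: the "thresholds are fragile" interpretation in the overlapping-normals example can be checked in closed form, without simulating. The sketch below uses only the standard library. `flip_fraction` and all distribution parameters are hypothetical, chosen for illustration; nothing here comes from the package. It computes the share of examples whose predicted class changes when the threshold moves from 0.50 to 0.55.

```python
from statistics import NormalDist


def flip_fraction(
    neg: NormalDist,
    pos: NormalDist,
    t_old: float,
    t_new: float,
    pos_rate: float = 0.5,
) -> float:
    """Share of examples whose predicted class changes when the threshold moves."""
    lo, hi = sorted((t_old, t_new))
    # An example flips exactly when its score lies between the two thresholds.
    neg_mass = neg.cdf(hi) - neg.cdf(lo)
    pos_mass = pos.cdf(hi) - pos.cdf(lo)
    return (1 - pos_rate) * neg_mass + pos_rate * pos_mass


# Hypothetical class-conditional score distributions (illustration only).
separated = flip_fraction(NormalDist(0.2, 0.1), NormalDist(0.8, 0.1), 0.50, 0.55)
overlapping = flip_fraction(NormalDist(0.45, 0.15), NormalDist(0.55, 0.15), 0.50, 0.55)

print(f"well separated: {separated:.2%} of examples change class")
print(f"overlapping:    {overlapping:.2%} of examples change class")
```

Under these assumed parameters the overlapping case moves on the order of a tenth of all examples, while the well-separated case moves well under 1%. That is the pattern the steep, closely spaced TPR/FPR curves would show on the plot.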

docs/index.md

Lines changed: 4 additions & 0 deletions
@@ -1,5 +1,7 @@
 # Opproplot
 
+<u>OP</u>erating <u>PRO</u>file <u>PLOT</u>
+
 A compact operating profile plot for binary classifiers: stacked score histograms by class plus TPR/FPR/Accuracy curves at bin-midpoint thresholds. One view to understand every possible cutoff.
 
 ![Opproplot hero](assets/opproplot_hero.png)
@@ -13,6 +15,8 @@ An Operating Profile Plot (Opproplot) is a unified visualization for binary clas
 
 This creates a complete operating profile of the model in a single view — letting you see where the model is confident, where the classes overlap, and how performance changes as you move the threshold.
 
+It is a compact, multidimensional readout of model behavior: score distribution by class plus operating curves (TPR/FPR/accuracy) on the same axis. Comparing profiles across models or datasets shows whether a model separates classes cleanly, where overlap drives errors, and how threshold choices shift business metrics.
+
 Rather than switching between ROC curves, PR curves, histograms, and calibration plots, Opproplot places the score distribution and the operating characteristics on the same axis, making it easy to:
 - identify thresholds with optimal trade-offs
 - diagnose where errors occur in score space

scripts/generate_docs_images.py

Lines changed: 117 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,117 @@
1+
"""
2+
Generate documentation images for Opproplot.
3+
4+
Creates:
5+
- docs/assets/opproplot_hero.png
6+
- docs/assets/opproplot_example.png
7+
- docs/assets/opproplot_breast_cancer.png
8+
"""
9+
10+
import os
11+
from pathlib import Path
12+
13+
import matplotlib
14+
15+
matplotlib.use("Agg")
16+
import matplotlib.pyplot as plt # noqa: E402
17+
import numpy as np # noqa: E402
18+
from sklearn.datasets import load_breast_cancer # noqa: E402
19+
from sklearn.linear_model import LogisticRegression # noqa: E402
20+
from sklearn.model_selection import train_test_split # noqa: E402
21+
22+
from opproplot import operating_profile_plot # noqa: E402
23+
24+
25+
ASSETS_DIR = Path("docs/assets")
26+
27+
28+
def _ensure_assets_dir() -> None:
29+
ASSETS_DIR.mkdir(parents=True, exist_ok=True)
30+
31+
32+
def generate_hero() -> None:
33+
rng = np.random.default_rng(2)
34+
y_true = rng.integers(0, 2, size=4000)
35+
scores = rng.normal(loc=y_true * 0.7 + 0.08, scale=0.3, size=4000)
36+
scores = np.clip(scores, 0, 1)
37+
38+
fig, ax_hist, ax_metric = operating_profile_plot(
39+
y_true,
40+
scores,
41+
bins=24,
42+
show_accuracy=True,
43+
show_key=True,
44+
key_location="outside",
45+
show_grid=False,
46+
title="Operating Profile Plot",
47+
)
48+
49+
# Minimal styling for hero
50+
for ax in (ax_hist, ax_metric):
51+
ax.set_xlabel("")
52+
ax.set_ylabel("")
53+
ax.tick_params(labelbottom=False, labelleft=False, labelright=False, length=0)
54+
for spine in ax.spines.values():
55+
spine.set_visible(False)
56+
57+
fig.set_size_inches(4.6, 2.4)
58+
fig.tight_layout(pad=0.4)
59+
fig.savefig(ASSETS_DIR / "opproplot_hero.png", dpi=220, transparent=True, bbox_inches="tight")
60+
plt.close(fig)
61+
62+
63+
def generate_simulated_example() -> None:
64+
rng = np.random.default_rng(0)
65+
y_true = rng.integers(0, 2, size=5000)
66+
scores = rng.random(size=5000)
67+
68+
fig, _, _ = operating_profile_plot(
69+
y_true,
70+
scores,
71+
bins=30,
72+
show_accuracy=True,
73+
show_key=True,
74+
key_location="inside",
75+
show_grid=False,
76+
title="Opproplot: Operating Profile",
77+
)
78+
fig.tight_layout()
79+
fig.savefig(ASSETS_DIR / "opproplot_example.png", dpi=200)
80+
plt.close(fig)
81+
82+
83+
def generate_breast_cancer() -> None:
84+
data = load_breast_cancer()
85+
X_train, X_test, y_train, y_test = train_test_split(
86+
data.data, data.target, test_size=0.25, random_state=0, stratify=data.target
87+
)
88+
clf = LogisticRegression(max_iter=1000)
89+
clf.fit(X_train, y_train)
90+
y_score = clf.predict_proba(X_test)[:, 1]
91+
92+
fig, ax_hist, _ = operating_profile_plot(
93+
y_test,
94+
y_score,
95+
bins=30,
96+
show_accuracy=True,
97+
show_key=True,
98+
key_location="inside",
99+
show_grid=False,
100+
title="Breast cancer classifier: operating profile",
101+
)
102+
ax_hist.set_title("Breast cancer classifier: operating profile", fontsize=11)
103+
fig.tight_layout()
104+
fig.savefig(ASSETS_DIR / "opproplot_breast_cancer.png", dpi=200)
105+
plt.close(fig)
106+
107+
108+
def main() -> None:
109+
_ensure_assets_dir()
110+
generate_hero()
111+
generate_simulated_example()
112+
generate_breast_cancer()
113+
print("Generated docs images in docs/assets/")
114+
115+
116+
if __name__ == "__main__":
117+
main()
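
Note on `generate_hero`: the scores are drawn from normals centred at 0.08 (label 0) and 0.78 (label 1) with scale 0.3, then clipped to [0, 1]. The standard-library sketch below estimates how much probability mass the clip moves onto the two endpoints. It is a closed-form approximation of the expected shares, not a reproduction of the seeded NumPy draw, and the variable names are made up for this note.

```python
from statistics import NormalDist

SCALE = 0.3
neg = NormalDist(mu=0.08, sigma=SCALE)        # y_true == 0 -> loc = 0.08
pos = NormalDist(mu=0.7 + 0.08, sigma=SCALE)  # y_true == 1 -> loc = 0.78

# np.clip(scores, 0, 1) maps everything below 0 to 0 and everything above 1 to 1.
neg_at_zero = neg.cdf(0.0)
neg_at_one = 1.0 - neg.cdf(1.0)
pos_at_zero = pos.cdf(0.0)
pos_at_one = 1.0 - pos.cdf(1.0)

# rng.integers(0, 2, ...) draws labels 0 and 1 with equal probability.
clipped_share = 0.5 * (neg_at_zero + neg_at_one) + 0.5 * (pos_at_zero + pos_at_one)

print(f"negatives clipped to 0: {neg_at_zero:.1%}")
print(f"positives clipped to 1: {pos_at_one:.1%}")
print(f"all scores sitting on an endpoint: {clipped_share:.1%}")
```

A sizeable share of each class lands exactly on 0 or 1, so the first and last histogram bins of the hero image should be expected to spike. That is a property of the synthetic data, not of the plot.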

src/opproplot/plotting.py

Lines changed: 6 additions & 1 deletion
@@ -17,6 +17,7 @@ def operating_profile_plot(
     key_location: str = "inside",
     show_grid: bool = False,
     grid_kwargs: Optional[dict] = None,
+    title: Optional[str] = None,
     ax: Optional[plt.Axes] = None,
 ):
     """
@@ -44,6 +45,8 @@ def operating_profile_plot(
     grid_kwargs : dict or None, default=None
         Passed to `ax_metric.grid`; useful keys include `alpha`, `color`,
         `linestyle`, and `linewidth`.
+    title : str or None, default=None
+        Title for the histogram axis. If None, uses "Opproplot: Operating Profile".
     ax : matplotlib.axes.Axes or None, default=None
         Axis to plot on. If None, a new figure and axis are created.
 
@@ -122,7 +125,9 @@ def operating_profile_plot(
     if show_grid:
         ax_metric.grid(True, **(grid_kwargs or {"alpha": 0.2, "linestyle": "--"}))
 
-    ax_hist.set_title("Opproplot: Operating Profile")
+    if title is None:
+        title = "Opproplot: Operating Profile"
+    ax_hist.set_title(title)
 
     fig.tight_layout()
     return fig, ax_hist, ax_metric
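
Note on the `title` change: the diff tests `title is None` and does not use `title or default`. The sketch below isolates that choice. `resolve_title` and `resolve_title_truthy` are hypothetical helpers written for this note, not functions in `opproplot`.

```python
from typing import Optional

DEFAULT_TITLE = "Opproplot: Operating Profile"


def resolve_title(title: Optional[str] = None) -> str:
    """Same rule as the diff: only None falls back to the default."""
    if title is None:
        return DEFAULT_TITLE
    return title


def resolve_title_truthy(title: Optional[str] = None) -> str:
    """Alternative shown for contrast: any falsy value falls back to the default."""
    return title or DEFAULT_TITLE


print(repr(resolve_title()))            # default title
print(repr(resolve_title("")))          # empty string is kept as-is
print(repr(resolve_title_truthy("")))   # empty string is replaced by the default
```

With the `is None` check, a caller who passes `title=""` keeps the empty string, which is a plausible way to suppress the title. The truthiness version would silently replace it with the default.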
