Commit b7b4681

Move roadmap into docs site
1 parent 4f5b3d5 commit b7b4681

2 files changed: +52 -2 lines changed

docs/_quarto.yml

Lines changed: 3 additions & 0 deletions
@@ -6,6 +6,7 @@ project:
 - benchmarks.qmd
 - optimizers.qmd
 - estimators.qmd
+- roadmap.qmd
 - notebooks.qmd
 - reference/*.qmd
 - notebooks/example.ipynb
@@ -29,6 +30,8 @@ website:
   text: Benchmarks
 - href: estimators.qmd
   text: Estimators
+- href: roadmap.qmd
+  text: Roadmap
 - href: reference/index.qmd
   text: API
 - href: notebooks.qmd
Lines changed: 49 additions & 2 deletions
@@ -1,7 +1,31 @@
+---
+title: Roadmap
+---
+
 # Econometrics and Supervised Learning Roadmap
 
 This document collects proposed functionality expansions for `pyensmallen`, based on the existing notebooks and current API surface.
 
+## Current Status
+
+Implemented on `master`:
+
+- Estimator classes for `LinearRegression`, `LogisticRegression`, and `PoissonRegression`
+- fitted attributes including `coef_`, `intercept_`, covariance estimates, confidence intervals, and `summary()`
+- classical and robust sandwich standard errors for unregularized OLS, logit, and Poisson
+- exact L1 and L2 regularization for the core estimator classes via backend solver switching
+- Quarto docs, generated API reference, benchmark page, and executed notebook pages
+- macOS wheel repair for vendored BLAS linkage and post-patch ad-hoc codesigning
+
+Still outstanding from the original roadmap:
+
+- true separable-objective and mini-batch training support
+- productized JAX objective bridge
+- richer inference utilities beyond the current robust covariance path
+- workflow-level evaluation and model-selection helpers
+- formula and DataFrame ergonomics
+- additional estimator classes beyond the current linear / logit / Poisson set
+
 ## First Tranche
 
 The first set of items to prioritize:
@@ -10,12 +34,15 @@ The first set of items to prioritize:
 2. First-class regularization support
 3. Proper stochastic / mini-batch training support
 
-These are the highest-leverage additions for making `pyensmallen` useful beyond optimizer demos and low-level objective wrappers.
+The first two are now in place. The remaining item in this tranche is proper stochastic / mini-batch training support.
 
 ## Full Proposal List
 
 ### 1. Estimator classes for common supervised models
 
+Status:
+Partially complete. `LinearRegression`, `LogisticRegression`, and `PoissonRegression` now exist. Multinomial and other nonlinear estimators remain open.
+
 Add estimator APIs for standard econometrics and ML models:
 
 - `LinearRegression`
@@ -41,6 +68,9 @@ The current API is objective-first. Real workflows usually want model objects, n
 
 ### 2. First-class regularization support
 
+Status:
+Partially complete. Exact L1 and L2 support is implemented for the core estimator classes. Mixed elastic net, regularization paths, and CV selection remain open.
+
 Add penalized estimation support across core models:
 
 - L1
@@ -56,6 +86,9 @@ This is central to both supervised learning and modern econometrics, especially
 
 ### 3. Productized JAX bridge
 
+Status:
+Not started as library surface. The notebook pattern exists, but there is still no supported wrapper API.
+
 Turn the current notebook pattern into a supported API:
 
 - `JaxObjective`
@@ -74,6 +107,9 @@ The multinomial logit notebook already shows this is useful. It should be librar
 
 ### 4. Proper stochastic / mini-batch training support
 
+Status:
+Not started. This remains the next major ML-side gap.
+
 Expose true separable-objective support for first-order optimizers:
 
 - mini-batch iteration
@@ -94,6 +130,9 @@ The Adam-family bindings exist, but the current wrapper behaves like full-batch
 
 ### 5. Inference utilities beyond point estimation
 
+Status:
+Partially complete. Classical and robust sandwich covariance are available for unregularized OLS, logit, and Poisson. Clustered, HAC, tests, marginal effects, and bootstrap helpers remain open.
+
 Expand the econometrics side with reusable inference tools:
 
 - sandwich covariance
@@ -110,6 +149,9 @@ The package already goes in this direction for GMM. Extending it to MLE models w
 
 ### 6. Model selection and evaluation tools
 
+Status:
+Not started as library functionality.
+
 Add workflow-level evaluation and tuning utilities:
 
 - train / validation splitting
@@ -132,6 +174,9 @@ Several notebooks currently hand-roll comparison and tuning logic that should li
 
 ### 7. Higher-level causal and panel estimators
 
+Status:
+Still mostly out of scope for this repo; the sibling `synthlearners` repository remains the main home for panel estimators.
+
 Potential estimator layer additions include:
 
 - `SyntheticControl`
@@ -147,6 +192,9 @@ This is a natural applied econometrics extension, though a substantial part of t
147192

148193
### 8. Formula and DataFrame ergonomics
149194

195+
Status:
196+
Not started.
197+
150198
Improve usability for empirical workflows:
151199

152200
- formula interface
@@ -177,4 +225,3 @@ Current working assumption:
 
 - `pyensmallen` should focus on optimization primitives, reusable objectives, supervised estimators, autodiff integration, and inference utilities.
 - `synthlearners` should remain the home for most panel and synthetic-control estimators, while depending on `pyensmallen` where useful.
-
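
The roadmap in this diff lists classical and robust sandwich standard errors as implemented for unregularized OLS. As a rough illustration of what those two quantities are, here is a minimal pure-Python sketch for a one-regressor model using the HC0 sandwich form. It is not the `pyensmallen` implementation and does not call its API (the estimator signatures are not shown in this diff); the function names `matmul2` and `ols_with_se` and the simulated data are hypothetical.

```python
import math
import random


def matmul2(A, B):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def ols_with_se(x, y):
    """Fit y = a + b*x by OLS; return (coef, classical SEs, HC0 robust SEs)."""
    n = len(x)
    sx = sum(x)
    sxx = sum(u * u for u in x)
    det = n * sxx - sx * sx
    # "Bread": (X'X)^-1 for the design matrix X = [1, x].
    bread = [[sxx / det, -sx / det], [-sx / det, n / det]]
    # Coefficients: (X'X)^-1 X'y.
    sy = sum(y)
    sxy = sum(u * v for u, v in zip(x, y))
    a = bread[0][0] * sy + bread[0][1] * sxy
    b = bread[1][0] * sy + bread[1][1] * sxy
    resid = [v - (a + b * u) for u, v in zip(x, y)]
    # Classical covariance: s^2 (X'X)^-1, assuming constant error variance.
    s2 = sum(e * e for e in resid) / (n - 2)
    se_classical = [math.sqrt(s2 * bread[j][j]) for j in range(2)]
    # "Meat": X' diag(e_i^2) X, which lets the error variance vary with x.
    m00 = sum(e * e for e in resid)
    m01 = sum(e * e * u for e, u in zip(resid, x))
    m11 = sum(e * e * u * u for e, u in zip(resid, x))
    meat = [[m00, m01], [m01, m11]]
    # Sandwich: bread @ meat @ bread.
    cov_robust = matmul2(matmul2(bread, meat), bread)
    se_robust = [math.sqrt(cov_robust[j][j]) for j in range(2)]
    return [a, b], se_classical, se_robust


# Simulated data (illustrative only): the noise spread grows with |x|.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(2000)]
y = [1.0 + 2.0 * u + rng.gauss(0.0, 0.5 + abs(u)) for u in x]

coef, se_classical, se_robust = ols_with_se(x, y)
print("coef:", coef)
print("classical SE:", se_classical)
print("robust SE:", se_robust)
```

With heteroskedastic noise of this kind, the robust slope standard error should generally come out larger than the classical one, which is the practical reason to report both.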

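The outstanding roadmap item on separable-objective and mini-batch training concerns objectives that are a sum of per-observation terms, so that each gradient step can use a random subset of the data. The sketch below shows that idea on a least-squares objective in pure Python. It uses plain SGD rather than the Adam-family optimizers the roadmap mentions, and it is not a proposed `pyensmallen` API; `minibatch_sgd`, `squared_error_grad`, and the simulated data are hypothetical.

```python
import random


def minibatch_sgd(grad_i, n_obs, theta0, lr=0.05, batch_size=32, epochs=50, seed=0):
    """Minimize (1/n) * sum_i f_i(theta) with shuffled mini-batches.

    grad_i(theta, i) must return the gradient of the i-th term as a list.
    """
    rng = random.Random(seed)
    theta = list(theta0)
    order = list(range(n_obs))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the data in a fresh random order each epoch
        for start in range(0, n_obs, batch_size):
            batch = order[start:start + batch_size]
            # Average the per-observation gradients over this batch only.
            g = [0.0] * len(theta)
            for i in batch:
                for j, gij in enumerate(grad_i(theta, i)):
                    g[j] += gij
            theta = [t - lr * gj / len(batch) for t, gj in zip(theta, g)]
    return theta


# Illustrative data: y = 1 + 2x + small noise.
data_rng = random.Random(1)
xs = [data_rng.gauss(0.0, 1.0) for _ in range(256)]
ys = [1.0 + 2.0 * u + data_rng.gauss(0.0, 0.1) for u in xs]


def squared_error_grad(theta, i):
    """Gradient of 0.5 * (theta[0] + theta[1] * x_i - y_i)^2 for observation i."""
    r = theta[0] + theta[1] * xs[i] - ys[i]
    return [r, r * xs[i]]


theta_hat = minibatch_sgd(squared_error_grad, len(xs), [0.0, 0.0])
print(theta_hat)
```

The key design point is that the optimizer only ever asks for the gradient of individual terms, which is what distinguishes a separable objective from a single full-batch objective function.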