Bug Report & Proposed Fix
Component: river/ensemble/stacking.py
Class: StackingClassifier
Library: River
Summary
This report proposes API consistency and feature safety fixes for StackingClassifier. The current implementation contains issues that may lead to broken method chaining, silent feature corruption, and inconsistent capability reporting.
These fixes are backward compatible and improve robustness for streaming ensemble learning.
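For context, a minimal construction of the class under discussion. The model choices are illustrative; the constructor arguments follow the attributes referenced throughout this report (meta_classifier, include_features):

```python
from river import ensemble, linear_model, tree

model = ensemble.StackingClassifier(
    [linear_model.LogisticRegression(), tree.HoeffdingTreeClassifier()],
    meta_classifier=linear_model.LogisticRegression(),
)
```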
Issue 1: learn_one Does Not Return self
Problem
River estimators are expected to return self from learn_one to support method chaining and API consistency.
The current implementation of learn_one ends without a return statement, so the method returns None.
Impact
Breaks usage patterns such as:
```python
model.learn_one(x, y).predict_proba_one(x)
```

Fix
Add a return statement at the end of learn_one:
```python
self.meta_classifier.learn_one(oof, y)
return self
```

Issue 2: Feature Name Collision When include_features=True
Problem
Original features are merged directly into the meta-feature dictionary:
```python
if self.include_features:
    oof.update(x)
```

If an input feature name matches a stacking feature (e.g. "oof_0_True"), it will silently overwrite the prediction feature.
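A minimal sketch of the failure mode, using a contrived raw feature whose name collides with a stacking key:

```python
oof = {"oof_0_True": 0.87}  # meta-feature: base model 0's probability for class True
x = {"oof_0_True": 42.0}    # unlucky raw input feature with the same name

oof.update(x)
print(oof)  # {'oof_0_True': 42.0} -> the prediction feature is gone
```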
Impact
Silent corruption of the meta-model's training data, leading to degraded or unstable performance.
Fix
Namespace original features to prevent collision:
```python
if self.include_features:
    oof.update({f"orig_{k}": v for k, v in x.items()})
```
Issue 3: Meta-Feature Space Drift with New Classes
Problem
Base classifiers may not output probabilities for unseen classes. When a new class appears later in the stream, new meta-features are introduced dynamically.
Impact
A non-stationary feature space for the meta-classifier may slow convergence and introduce instability.
Suggested Improvement
Ensure probabilities for all known classes are included, defaulting to 0.0 when absent. This may require standardized class tracking across River classifiers.
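A hedged sketch of one way to do this; the classes_seen set and stable_oof helper are hypothetical, not part of River's API:

```python
classes_seen = set()  # hypothetical running record of labels observed so far

def stable_oof(base_models, x, y=None):
    """Build meta-features with one entry per (model, known class) pair."""
    if y is not None:
        classes_seen.add(y)
    oof = {}
    for i, clf in enumerate(base_models):
        y_pred = clf.predict_proba_one(x)
        for c in classes_seen:
            # Default to 0.0 so existing keys never disappear between updates
            oof[f"oof_{i}_{c}"] = y_pred.get(c, 0.0)
    return oof
```

This keeps the meta-feature space stable in one direction: once a key is introduced, it appears in every subsequent update, so the meta-classifier never sees a feature vanish.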
Issue 4: _multiclass Property May Misrepresent Capability
Problem
The current implementation relies only on the meta-classifier to determine multiclass capability:
```python
@property
def _multiclass(self):
    return self.meta_classifier._multiclass
```

This ignores the capabilities of the base models.
Impact
The property may advertise the ensemble as multiclass-capable even though some base models are binary-only, or vice versa.
Fix
Use both base and meta models to determine capability:
```python
@property
def _multiclass(self):
    return (
        all(getattr(model, "_multiclass", False) for model in self)
        and getattr(self.meta_classifier, "_multiclass", False)
    )
```
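Note that getattr with a False default is deliberately conservative: a model that does not declare _multiclass is treated as binary-only. A toy illustration with hypothetical stand-in classes:

```python
class MultiOK:
    _multiclass = True

class Undeclared:  # no _multiclass attribute at all
    pass

models = [MultiOK(), Undeclared()]
print(all(getattr(m, "_multiclass", False) for m in models))  # False
```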
Proposed Patch (Combined)

```python
def learn_one(self, x, y):
    oof = {}
    for i, clf in enumerate(self):
        # Collect each base model's predictions before it trains on (x, y)
        y_pred = clf.predict_proba_one(x)
        for k, p in y_pred.items():
            oof[f"oof_{i}_{k}"] = p
        clf.learn_one(x, y)
    if self.include_features:
        # Namespace raw features so they cannot clobber stacking keys
        oof.update({f"orig_{k}": v for k, v in x.items()})
    self.meta_classifier.learn_one(oof, y)
    return self

@property
def _multiclass(self):
    return (
        all(getattr(model, "_multiclass", False) for model in self)
        and getattr(self.meta_classifier, "_multiclass", False)
    )
```
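With the patch applied, a quick end-to-end check; the dataset and model choices are illustrative, and any River binary classification stream would do:

```python
from river import datasets, ensemble, linear_model, naive_bayes

model = ensemble.StackingClassifier(
    [linear_model.LogisticRegression(), naive_bayes.GaussianNB()],
    meta_classifier=linear_model.LogisticRegression(),
)

for x, y in datasets.Phishing():
    # Chaining works because learn_one now returns self
    proba = model.learn_one(x, y).predict_proba_one(x)
```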
Benefits of This Fix

✓ Restores River API consistency
✓ Prevents silent feature overwrites
✓ Improves stability in streaming classification
✓ More accurate capability reporting
These changes improve reliability while keeping the behavior aligned with River's online learning design principles.