Summary
I would like to add a decision_function() method to LightGBM's scikit-learn API (LGBMClassifier) to expose the model's raw scores (margin / logit). This would allow scikit-learn's CalibratedClassifierCV to use raw scores as calibration inputs instead of falling back to the probabilities produced by predict_proba().
Motivation
In practical applications, it is common to calibrate the posterior probabilities output by LightGBM classification models (probability calibration), e.g., for risk scoring, threshold-based decision making, and cost-sensitive classification.
scikit-learn’s CalibratedClassifierCV requires a continuous “score” from the base estimator as input to the calibrator. According to the documentation, calibration is based on the estimator’s decision_function() output if it exists; otherwise, it uses predict_proba().
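To illustrate this preference concretely, here is a small sklearn-only sketch using a hypothetical toy classifier that records which scoring method the calibrator invokes during fit (the class and the module-level CALLS list are illustrative, not part of any library):

```python
# Sketch: CalibratedClassifierCV uses decision_function() when the base
# estimator defines it, rather than predict_proba().
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.calibration import CalibratedClassifierCV

CALLS = []  # records which method the calibrator actually calls


class RecordingClassifier(BaseEstimator, ClassifierMixin):
    """Toy classifier exposing both scoring methods and logging their use."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        return self

    def decision_function(self, X):
        CALLS.append("decision_function")
        return X[:, 0]  # arbitrary monotone score

    def predict_proba(self, X):
        CALLS.append("predict_proba")
        p = 1.0 / (1.0 + np.exp(-X[:, 0]))
        return np.column_stack([1.0 - p, p])


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] > 0).astype(int)

CalibratedClassifierCV(RecordingClassifier(), method="sigmoid", cv=2).fit(X, y)
# CALLS now contains only "decision_function" entries: the calibrator
# preferred the raw score over predict_proba().
```

Because LGBMClassifier lacks decision_function(), this preference can never kick in for LightGBM today.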
Currently, LGBMClassifier does not implement decision_function(). As a result, when users pass an LGBMClassifier into CalibratedClassifierCV, the calibrator can only use predict_proba() outputs.
However, for LightGBM, predict_proba() returns probabilities obtained after applying a sigmoid (binary classification) or softmax (multiclass classification) transformation to the raw scores (margins). Applying CalibratedClassifierCV(method="sigmoid") on top of these outputs effectively fits another monotonic mapping (a sigmoid calibrator) on already-probabilistic outputs. In practice, this often pushes predicted probabilities further toward the center (i.e., becoming more conservative / “compressed”), which can negatively affect both calibration quality and discriminative power.
Even with method="isotonic", learning an additional monotonic mapping in probability space is less natural than calibrating in raw-margin space. A more principled approach is to let the calibrator learn its mapping (sigmoid or isotonic) directly from the raw scores (margin/logit), producing more reliable probabilities.
Therefore, adding decision_function() to LGBMClassifier to return raw scores would significantly improve compatibility with scikit-learn’s calibration tooling and the quality of calibration.
Description
Add a decision_function(X) method to LGBMClassifier that returns the model's raw scores.
Proposed behavior:
- Binary classification: return the raw margin (logit) for each sample, with shape (n_samples,)
- Multiclass classification: return raw scores with shape (n_samples, n_classes)
The implementation can directly call LightGBM's prediction interface with raw_score=True, e.g. predict(X, raw_score=True) (or an equivalent path), ensuring the output is a raw score rather than a probability.
References
- scikit-learn CalibratedClassifierCV documentation (decision_function preference):
  https://scikit-learn.org/stable/modules/generated/sklearn.calibration.CalibratedClassifierCV.html
  "The calibration is based on the decision_function method of the estimator if it exists, else on predict_proba."