@dimoibiehg

…i-class classification (based on ip200/venn-abers github repository implementation)

Description

Implement the Venn-Abers calibration method for both binary and multi-class classification, based on the ip200/venn-abers repository.

Type of change

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How Has This Been Tested?

Test cases with complete coverage are implemented in mapie/tests/test_venn_abers_calibration.py

Checklist

  • I have read the contributing guidelines
  • I have updated the HISTORY.rst and AUTHORS.rst files
  • Linting passes successfully: make lint
  • Typing passes successfully: make type-check
  • Unit tests pass successfully: make tests
  • Coverage is 100%: make coverage
  • When updating documentation: doc builds successfully and without warnings: make doc
  • When updating documentation: code examples in doc run successfully: make doctest

@dimoibiehg
Author

The result of the make type-check command on my machine is as follows (Ubuntu 24.04, Python 3.12 and 3.11):

mypy mapie
mapie/tests/test_non_regression.py:205: note: By default the bodies of untyped functions are not checked, consider using --check-untyped-defs  [annotation-unchecked]
mapie/tests/test_non_regression.py:304: note: By default the bodies of untyped functions are not checked, consider using --check-untyped-defs  [annotation-unchecked]
mapie/tests/test_non_regression.py:674: note: By default the bodies of untyped functions are not checked, consider using --check-untyped-defs  [annotation-unchecked]
Success: no issues found in 60 source files

I will review the server logs and get back to you.

@allglc
Collaborator

allglc commented Oct 17, 2025

Thanks for the PR @dimoibiehg !
We'll review it as soon as possible.

@Valentin-Laurent
Collaborator

Adding a bit of context here: #736

Contributor

Copilot AI left a comment

Pull Request Overview

Implements Venn–Abers calibration for binary and multi-class classification, adding a new calibrator class, core VA algorithms, and extensive tests.

  • Adds VennAbersCalibrator to mapie.calibration with prefit, inductive (IVAP), and cross (CVAP) modes.
  • Introduces core Venn–Abers implementations (binary, CV, multiclass) in mapie/_venn_abers.py.
  • Adds comprehensive unit tests and updates HISTORY and AUTHORS.
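For readers unfamiliar with the inductive (IVAP) mode mentioned above, here is a minimal sketch of the underlying idea, not the code added in this PR. It assumes scikit-learn's IsotonicRegression, refits it naively for each hypothetical label of the test point, and merges the two bounds with the commonly used formula p1 / (1 - p0 + p1). The helper names naive_ivap and merge_bounds are hypothetical.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression


def naive_ivap(cal_scores, cal_labels, test_score):
    """Return (p0, p1) for one test score by refitting isotonic regression twice."""
    bounds = []
    for assumed_label in (0, 1):
        # Append the test score to the calibration set with a hypothetical label
        scores = np.append(cal_scores, test_score)
        labels = np.append(cal_labels, assumed_label)
        iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
        iso.fit(scores, labels)
        # The fitted value at the test score is the bound for this assumed label
        bounds.append(float(iso.predict([test_score])[0]))
    return bounds[0], bounds[1]


def merge_bounds(p0, p1):
    """Collapse the interval [p0, p1] into a single probability estimate."""
    return p1 / (1.0 - p0 + p1)


# Synthetic calibration scores whose labels follow the score (illustrative data)
rng = np.random.default_rng(0)
cal_scores = rng.uniform(size=200)
cal_labels = (rng.uniform(size=200) < cal_scores).astype(int)

p0, p1 = naive_ivap(cal_scores, cal_labels, test_score=0.7)
p = merge_bounds(p0, p1)
```

The width p1 - p0 gives a rough sense of how uncertain the calibrated probability is. Production implementations avoid the two refits per test point by precomputing the isotonic fits.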

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 14 comments.

File | Description
--- | ---
mapie/calibration.py | Adds VennAbersCalibrator with fit/predict_proba/predict logic and pipeline handling.
mapie/_venn_abers.py | New internal module implementing VA core algorithms (binary, CV, multiclass) and helper utilities.
mapie/tests/test_venn_abers_calibration.py | Extensive tests covering modes, estimators, pipelines, weights, precision, and error cases.
HISTORY.rst | Notes the introduction of Venn–Abers calibrator.
AUTHORS.rst | Adds contributor.
mapie/utils.py | Minor whitespace cleanup.

Comment on lines 901 to 905
# Validate inputs
X, y = indexable(X, y)
y = _check_y(y)
sample_weight, X, y = _check_null_weight(sample_weight, X, y)
# Handle categorical features

Copilot AI Oct 20, 2025

indexable is used but never imported in this module, which will lead to a NameError at runtime. Import it with from sklearn.utils import indexable near the other imports.

Author

That is not correct: indexable is already imported from sklearn.utils.validation.
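As a quick check of the two import paths discussed here, the sketch below shows that sklearn.utils re-exports the function defined in sklearn.utils.validation, so either import satisfies the call in fit. The toy inputs are illustrative only.

```python
import sklearn.utils
import sklearn.utils.validation

# Both paths resolve to the very same function object
same_object = sklearn.utils.indexable is sklearn.utils.validation.indexable

# indexable also checks that all inputs have a consistent length
X, y = sklearn.utils.validation.indexable([[1, 2], [3, 4]], [0, 1])
```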

Comment on lines +912 to +918
if isinstance(last_estimator, Pipeline):
    # Separate transformers and final estimator
    transformers = self.estimator[:-1]  # all steps except last
    last_estimator = self.estimator[-1]  # usually a classifier

    X_processed = transformers.fit_transform(X, y)
    self.transformers_ = transformers

Copilot AI Oct 20, 2025

Fitting the pipeline transformers on the full X before any inductive/CV split causes data leakage; in prefit mode it also decouples the transformer state from the estimator that was trained on a different transformer fit. Pass the full Pipeline to the underlying calibrator so each fold/split fits transformations on training data, and avoid refitting transformers in prefit mode.
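To make the leakage concern concrete, here is a small sketch using plain scikit-learn rather than the PR's classes. It compares a scaler fitted on all rows with one fitted only on the training split as part of a Pipeline. The dataset and the 50/50 split are arbitrary assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_cal, y_train, y_cal = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Leaky: scaler statistics are computed from every row, calibration rows included
leaky_scaler = StandardScaler().fit(X)

# Leak-free: the whole pipeline only ever sees the training split
pipe = make_pipeline(StandardScaler(), LogisticRegression())
pipe.fit(X_train, y_train)
clean_scaler = pipe[0]  # integer indexing returns the fitted step

# The two scalers learned different statistics, so downstream features differ
statistics_differ = not np.allclose(leaky_scaler.mean_, clean_scaler.mean_)
```

Passing the full Pipeline to whatever performs the split keeps the transformer fit inside each training fold, which is what the comment above recommends.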

Author

In prefit mode the estimator is already fitted; however, the transformers might not be!

Comment on lines 1009 to 1021
if (self.transformers_ is not None):
    X_processed = self.transformers_.transform(X)
else:
    X_processed = X
# Prefit mode: use fitted estimator to get probabilities, then calibrate
if cv == "prefit":
    if self.single_estimator_ is None:
        raise RuntimeError(
            "single_estimator_ should not be None in prefit mode"
        )

    p_test_pred = self.single_estimator_.predict_proba(X_processed)

Copilot AI Oct 20, 2025

In prefit mode with Pipeline estimators, using a transformer refit on calibration data (self.transformers_) to transform X for a classifier trained with a different transformer fit can produce feature mismatch. For prefit, call the fitted pipeline directly (no manual transform) or reuse the original fitted transformer from the pipeline instead of refitting it during fit.

Suggested change
- if (self.transformers_ is not None):
-     X_processed = self.transformers_.transform(X)
- else:
-     X_processed = X
- # Prefit mode: use fitted estimator to get probabilities, then calibrate
- if cv == "prefit":
-     if self.single_estimator_ is None:
-         raise RuntimeError(
-             "single_estimator_ should not be None in prefit mode"
-         )
-     p_test_pred = self.single_estimator_.predict_proba(X_processed)
+ if cv == "prefit":
+     if self.single_estimator_ is None:
+         raise RuntimeError(
+             "single_estimator_ should not be None in prefit mode"
+         )
+     # If estimator is a pipeline, call it directly on X
+     if isinstance(self.single_estimator_, Pipeline):
+         p_test_pred = self.single_estimator_.predict_proba(X)
+     else:
+         if (self.transformers_ is not None):
+             X_processed = self.transformers_.transform(X)
+         else:
+             X_processed = X
+         p_test_pred = self.single_estimator_.predict_proba(X_processed)
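The mismatch described in this comment can be reproduced with plain scikit-learn. The sketch below is an illustration under assumed data, not the PR's code: it compares calling a prefit pipeline directly, reusing its fitted transformer steps, and refitting a clone of those steps on calibration data.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, n_features=5, random_state=0)
X_train, y_train = X[:200], y[:200]
X_cal = X[200:]

# A pipeline fitted ahead of time, as a user would supply in prefit mode
prefit_pipe = make_pipeline(StandardScaler(), LogisticRegression())
prefit_pipe.fit(X_train, y_train)

# Consistent: the pipeline applies the transformer state it was trained with
p_direct = prefit_pipe.predict_proba(X_cal)

# Equivalent manual route that reuses the already fitted transformer steps
p_manual = prefit_pipe[-1].predict_proba(prefit_pipe[:-1].transform(X_cal))

# Inconsistent: transformer steps refitted on calibration data, classifier unchanged
refit_steps = clone(prefit_pipe[:-1]).fit(X_cal)
p_mismatch = prefit_pipe[-1].predict_proba(refit_steps.transform(X_cal))
```

Note that slicing a Pipeline shares the underlying step objects, so calling fit_transform on self.estimator[:-1] would also overwrite the state of the user's prefit transformers. The clone here avoids that side effect.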

@Valentin-Laurent
Collaborator

Thank you @dimoibiehg for getting back to us :)

I just launched a Copilot review, but you can ignore it for now

@allglc changed the title from "feat: Implement VannAbers calibration method for both binary and mult…" to "FEAT: Implement VennAbers calibration method for both binary and mult…" on Oct 29, 2025
@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (5ee9406) to head (3c64f79).
⚠️ Report is 14 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff             @@
##            master      #777     +/-   ##
===========================================
  Coverage   100.00%   100.00%             
===========================================
  Files           56        58      +2     
  Lines         6325      7894   +1569     
  Branches       360       434     +74     
===========================================
+ Hits          6325      7894   +1569     

@allglc
Collaborator

allglc commented Oct 29, 2025

Hello @dimoibiehg,

I started reviewing your code. To make it clearer, could you explain what you did?

From what I understand, you started from this repository, with most of the code living in _venn_abers.py, except VennAbersCalibrator, which has moved to calibration.py. You also added the option to handle sample_weights, which is nice.

I noticed that you removed the cv_ensemble option from VennAbersCV and handle it another way. Could you clarify, please?

Finally, what is your progress on this PR? What remains to be done?

Thanks a lot
