[bugfix] support negative-only pointwise reranker data and fix evaluation metrics#8503

Open
yinpu wants to merge 5 commits into modelscope:main from yinpu:fix/reranker-pointwise-negative-only

yinpu commented Apr 2, 2026

PR type

  • Bug Fix
  • New Feature
  • Document Updates
  • More Models or Datasets Support

PR information

This PR fixes several reranker evaluation and data handling issues around pointwise training.

Background

The current reranker pipeline has a few gaps in pointwise evaluation/training:

  • negative-only pointwise samples are dropped by the data collator, although they are valid
    for BCE-style pointwise reranker training
  • evaluation metrics can regress when query group sizes are not inferred correctly
  • metric gathering/publishing is not stable when group_sizes needs to be carried through
    evaluation

These issues can lead to incomplete training data usage and incorrect reranker metrics during
evaluation.
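
To illustrate why negative-only samples remain usable for BCE-style pointwise training, here is a minimal sketch in plain Python. The function `bce_with_logits` is a hypothetical stand-in written for this note, not the loss implementation used in this repository. Each (query, document) pair contributes its own loss term, so a query whose documents are all labeled 0 still produces a well-defined, non-zero loss.

```python
import math


def bce_with_logits(logits, labels):
    """Mean binary cross-entropy over independent (query, document) pairs.

    Hypothetical helper for illustration only; not the repository's loss.
    """
    total = 0.0
    for z, y in zip(logits, labels):
        # Numerically stable form: max(z, 0) - z * y + log(1 + exp(-|z|))
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


# A query group with no positive document: every label is 0.
negative_only_loss = bce_with_logits([-2.0, 0.5, 1.5], [0.0, 0.0, 0.0])
print(negative_only_loss)
```

The loss pushes each logit toward a low score independently of the others, which is why such a sample carries a training signal without any positive in the same group.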

Changes

This PR includes the following fixes:

  • support negative-only samples in the pointwise reranker data collator
  • preserve group_sizes for pointwise reranker evaluation so query boundaries are explicit
  • fix reranker evaluation behavior when query groups have variable sizes
  • fix reranker metric gather logic to correctly handle (labels, group_sizes) tuples
  • fix reranker metric publishing in evaluation loops while preserving the metric key prefix
  • add regression tests covering collator behavior, metric calculation, and trainer evaluation
    flow
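
As a rough sketch of what explicit query boundaries buy during evaluation, the snippet below splits flat score and label lists by `group_sizes` and computes mean reciprocal rank per query. The helper names (`split_by_group_sizes`, `mean_reciprocal_rank`) and the choice to skip groups without a positive are assumptions made for this example; the actual logic in swift/metrics/reranker.py may differ.

```python
from itertools import accumulate


def split_by_group_sizes(values, group_sizes):
    """Cut a flat list into per-query chunks using explicit group sizes."""
    if sum(group_sizes) != len(values):
        raise ValueError("group_sizes must sum to the number of values")
    ends = list(accumulate(group_sizes))
    starts = [0] + ends[:-1]
    return [values[s:e] for s, e in zip(starts, ends)]


def mean_reciprocal_rank(scores, labels, group_sizes):
    """MRR over query groups; groups without a positive are skipped (assumption)."""
    reciprocal_ranks = []
    score_groups = split_by_group_sizes(scores, group_sizes)
    label_groups = split_by_group_sizes(labels, group_sizes)
    for group_scores, group_labels in zip(score_groups, label_groups):
        # Document indices sorted from highest to lowest score.
        order = sorted(range(len(group_scores)),
                       key=lambda i: group_scores[i], reverse=True)
        rank = next((pos for pos, i in enumerate(order, start=1)
                     if group_labels[i] == 1), None)
        if rank is not None:
            reciprocal_ranks.append(1.0 / rank)
    return sum(reciprocal_ranks) / len(reciprocal_ranks) if reciprocal_ranks else 0.0


# Two queries of different sizes (2 and 3 documents).
mrr = mean_reciprocal_rank(
    scores=[0.2, 0.9, 0.1, 0.8, 0.3],
    labels=[1, 0, 0, 1, 0],
    group_sizes=[2, 3],
)
print(mrr)
```

With variable group sizes, any fixed-size reshape of the flat arrays would mix documents from different queries, which is the kind of error that explicit `group_sizes` avoids.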

Affected areas

  • swift/template/base.py
  • swift/metrics/reranker.py
  • swift/trainers/reranker_trainer.py

Test coverage

Added/updated tests:

  • tests/train/test_reranker_collator.py
  • tests/train/test_reranker_metrics.py
  • tests/train/test_reranker_trainer.py

Covered cases include:

  • pointwise reranker supports negative-only samples
  • pointwise reranker supports positive-only samples
  • listwise reranker behavior remains unchanged for negative-only data
  • group_sizes is only emitted when needed
  • metric calculation works with explicit query group boundaries
  • trainer evaluation keeps metric_key_prefix
  • gather logic preserves tuple labels for reranker metrics
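
The last case can be pictured with a small sketch. It assumes each evaluation step yields a `(labels, group_sizes)` tuple and that gathering should concatenate the two components separately rather than flattening them together. `gather_label_tuples` is a hypothetical name for this example and not the trainer's real method.

```python
def gather_label_tuples(batches):
    """Concatenate per-batch (labels, group_sizes) tuples component-wise.

    Hypothetical helper for illustration; not the trainer's gather code.
    """
    all_labels, all_group_sizes = [], []
    for batch_labels, batch_group_sizes in batches:
        all_labels.extend(batch_labels)
        all_group_sizes.extend(batch_group_sizes)
    # The tuple stays consistent only if the sizes still cover every label.
    if sum(all_group_sizes) != len(all_labels):
        raise ValueError("group_sizes no longer match the gathered labels")
    return all_labels, all_group_sizes


gathered = gather_label_tuples([
    ([1, 0, 0], [3]),        # batch 1: one query with 3 documents
    ([0, 0, 1, 0], [2, 2]),  # batch 2: two queries with 2 documents each
])
print(gathered)
```

Keeping the two components separate means the metric code downstream can still recover query boundaries after results from several batches are merged.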

Compatibility / Risk

This PR only changes reranker-specific collation and evaluation paths.

Main risk:

  • unintended behavior change for existing reranker evaluation flows

Mitigation:

  • regression tests were added for the affected pointwise/listwise collator and trainer paths

Experiment results

Unit tests added for reranker collator, metrics, and trainer regression cases.

gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces support for pointwise reranking and improves query boundary tracking using a new group_sizes attribute. Key changes include updating the reranker data collator to handle negative-only samples for pointwise loss, enhancing RerankerMetrics to calculate classification metrics (accuracy, precision, recall, and F1), and modifying the trainer to propagate group_sizes through the evaluation loop. Additionally, new test suites were added to verify the collator, metrics, and trainer logic. I have no feedback to provide.
