@Jintao-Huang
Collaborator

No description provided.

@gemini-code-assist
Contributor

Summary of Changes

Hello @Jintao-Huang, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a bug in the calculation of reranker metrics by refining how logits are processed and selected before evaluation. The changes streamline the logic for handling padding_free templates within the preprocessing function and adjust the metric calculation method to align with these updates, ensuring that the correct token scores are consistently used for accurate reranker performance assessment.

Highlights

  • Reranker Logits Preprocessing: The _preprocess_generative_reranker_logits method has been updated to correctly apply the selection of the last token's logits (logits[:, -1]) when the padding_free template is active, centralizing this logic.
  • Metric Calculation Refinement: The calculate_metric function's logic for extracting logits has been refined. It now relies on the preprocessing step for padding_free scenarios and correctly applies the last valid token selection for non-padding-free cases, ensuring accurate metric computation.
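
To make the two selection modes in the highlights concrete, here is a minimal sketch in plain Python lists rather than tensors. The helper names `last_valid_index` and `select_last_token` are hypothetical and are not taken from the PR; the sketch assumes that, without an attention mask, the last position is taken for every row (mirroring `logits[:, -1]`), and that with a mask the last position marked valid is taken instead.

```python
from typing import List, Optional, Sequence


def last_valid_index(mask_row: Sequence[int]) -> int:
    """Return the index of the last position whose mask value is truthy."""
    for i in range(len(mask_row) - 1, -1, -1):
        if mask_row[i]:
            return i
    raise ValueError("attention mask row has no valid tokens")


def select_last_token(
    rows: Sequence[Sequence[float]],
    attention_mask: Optional[Sequence[Sequence[int]]] = None,
) -> List[float]:
    """Pick one per-position value per row (hypothetical helper).

    Without a mask, take the final position of each row, which mirrors
    logits[:, -1]. With a mask, take the last position marked valid, so
    padding positions are never selected.
    """
    if attention_mask is None:
        return [row[-1] for row in rows]
    return [row[last_valid_index(mask)] for row, mask in zip(rows, attention_mask)]
```

With a right-padded batch, `select_last_token([[0.1, 0.2, 0.3]], [[1, 1, 0]])` picks the value at index 1 rather than the padded final position.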


Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors the reranker metric calculation by moving the padding_free logic into the _preprocess_generative_reranker_logits function. While this is a good structural change, the implementation in _preprocess_generative_reranker_logits introduces a potential runtime error by not correctly handling unexpected logit shapes. I've provided a suggestion to fix this issue while retaining the intended refactoring.

@Jintao-Huang Jintao-Huang changed the title [bugfix] fix reranker metrics [bugfix] refactor reranker Jan 9, 2026
@Jintao-Huang Jintao-Huang changed the title [bugfix] refactor reranker [reranker] refactor reranker Jan 9, 2026
@Jintao-Huang
Collaborator Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors the handling of generative rerankers by centralizing the logit processing logic into a new utility function, get_generative_reranker_logits. This significantly simplifies the code in the trainer, loss functions, and inference engine, reducing duplication and improving maintainability.

The changes are well-structured, but I've identified a critical issue in the inference engine where the new utility function is called with incorrect arguments, which will lead to a runtime error. I've also suggested a minor improvement for error handling in the new utility function to make it more robust.

Overall, this is a great refactoring. Once the identified issues are addressed, the code will be much cleaner.
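
For context, a generative reranker is commonly scored by comparing the logits of a positive and a negative token at the selected position. The sketch below illustrates that idea only; it is an assumption about the general technique, not the PR's actual `get_generative_reranker_logits` implementation, whose signature is not shown in this thread. The function name `reranker_score_sketch` is hypothetical.

```python
import math
from typing import Sequence


def reranker_score_sketch(
    last_token_logits: Sequence[float],
    positive_token_id: int,
    negative_token_id: int,
) -> float:
    """Map one vocabulary-sized logit vector to a relevance score in (0, 1).

    Assumption: relevance is the two-way softmax probability of the
    positive token, which equals sigmoid(positive - negative).
    """
    diff = last_token_logits[positive_token_id] - last_token_logits[negative_token_id]
    # Numerically stable sigmoid: never exponentiate a large positive number.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)
```

Keeping this in one helper is the kind of centralization the review describes: the trainer, loss, and inference paths can then share a single definition of the score.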

Comment on lines +542 to +543
positive_token_id = tokenizer.convert_tokens_to_ids(positive_token)
negative_token_id = tokenizer.convert_tokens_to_ids(negative_token)
Contributor

medium

The tokenizer.convert_tokens_to_ids call can fail if the positive or negative tokens (e.g., 'yes'/'no') are not present in the tokenizer's vocabulary. The previous implementation handled this with a try...except block. It's good practice to add similar error handling here to provide a more informative error message to the user.

Suggested change
- positive_token_id = tokenizer.convert_tokens_to_ids(positive_token)
- negative_token_id = tokenizer.convert_tokens_to_ids(negative_token)
+ try:
+     positive_token_id = tokenizer.convert_tokens_to_ids(positive_token)
+     negative_token_id = tokenizer.convert_tokens_to_ids(negative_token)
+ except Exception as e:
+     raise ValueError(
+         f"Failed to convert reranker tokens '{positive_token}' or '{negative_token}' to IDs. "
+         f"Please check if these tokens exist in the tokenizer vocabulary. Error: {e}"
+     ) from e
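
One caveat worth checking against the tokenizer in use: many Hugging Face tokenizers return the unknown-token ID (or `None`) for a token that is missing from the vocabulary instead of raising, in which case a `try...except` alone would not catch the problem. The following is a minimal sketch of an explicit check under that assumption. `StubTokenizer` and `resolve_token_id` are hypothetical names used only for illustration and are not part of the PR.

```python
from typing import Dict, Optional


class StubTokenizer:
    """Hypothetical stand-in exposing only the two members the check needs."""

    unk_token_id: Optional[int] = 0

    def __init__(self, vocab: Dict[str, int]) -> None:
        self._vocab = vocab

    def convert_tokens_to_ids(self, token: str) -> Optional[int]:
        # Assumed behavior: unknown tokens map to unk_token_id, no exception.
        return self._vocab.get(token, self.unk_token_id)


def resolve_token_id(tokenizer, token: str) -> int:
    """Return the ID for `token`, raising if it resolves to None or the unk ID."""
    token_id = tokenizer.convert_tokens_to_ids(token)
    unk_id = getattr(tokenizer, "unk_token_id", None)
    if token_id is None or (unk_id is not None and token_id == unk_id):
        raise ValueError(
            f"Reranker token '{token}' was not found in the tokenizer vocabulary."
        )
    return token_id
```

Usage would be `positive_token_id = resolve_token_id(tokenizer, positive_token)`, and likewise for the negative token, so a misconfigured token fails loudly instead of silently scoring the unknown token.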

@Jintao-Huang Jintao-Huang merged commit 528e0c6 into modelscope:main Jan 9, 2026
2 of 3 checks passed
3 participants