
Conversation

@zhangbilang
Collaborator

No description provided.

@gemini-code-assist (bot) left a comment


Hello @SmudgedWings, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

Hello, Gemini here! I've reviewed this pull request from @SmudgedWings. The primary goal of this PR, based on the title, is to introduce support for the Llava-LHT model within the framework, with a focus on enabling testing for quantization methods like AWQ and potentially MME-pretrain evaluations. The changes involve adding a new model implementation specifically tailored for Llava-LHT and integrating it into the existing model handling and evaluation infrastructure. Additionally, some runtime dependencies have been updated.

Highlights

  • Add Llava-LHT Model Support: Introduces a new model class LlavaLHT and its evaluation counterpart LlavaLHTEval to integrate the Llava-LHT vision-language model into the framework.
  • Handle Llava-LHT Specifics: Implements custom methods in LlavaLHT to correctly load the model, identify its vision and language components, and adapt its forward and generate methods to work around specific behaviors (e.g., handling cache_position, input_ids/inputs, and attention_mask). This ensures compatibility with the framework's processing and evaluation pipelines.
  • Update Dependencies: Increments the minimum required PyTorch version to 2.2.0 and adjusts the listing of the lmms-eval dependency in the runtime requirements file.
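The forward/generate workarounds mentioned above boil down to dropping or renaming kwargs that the Llava-LHT model does not accept. A minimal, self-contained sketch of that pattern (DummyVLM and the specific kwargs handled are illustrative assumptions, not the PR's actual code):

```python
import functools

class DummyVLM:
    """Stand-in for a vision-language model whose forward() signature
    predates newer kwargs such as cache_position."""
    def forward(self, input_ids, attention_mask=None):
        return {"input_ids": input_ids, "attention_mask": attention_mask}

def patch_forward(model):
    """Monkey patch model.forward so unsupported kwargs are dropped
    before the original method is called."""
    original = model.forward

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        kwargs.pop("cache_position", None)  # not accepted by this model
        return original(*args, **kwargs)

    model.forward = wrapper

vlm = DummyVLM()
patch_forward(vlm)
out = vlm.forward(input_ids=[1, 2, 3], cache_position=0)
print(out["input_ids"])  # [1, 2, 3]
```

The same technique extends to generate and prepare_inputs_for_generation; the trade-offs of patching in place are discussed in the review comment below.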

Changelog

Click here to see the changelog
  • llmc/models/__init__.py
    • Added import for the new LlavaLHT model class (line 13).
  • llmc/models/llava_lht.py
    • Added a new file containing the LlavaLHT and LlavaLHTEval classes.
    • Implemented LlavaLHT inheriting from Llama, with custom methods (build_model, get_extra_rot_module_besides_embed_layers, find_blocks, get_layernorms_in_block, get_subsets_in_block) to integrate the Llava-LHT architecture.
    • Added workarounds for vlm_model.forward, vlm_model.generate, and vlm_model.prepare_inputs_for_generation to handle specific model behaviors.
    • Implemented LlavaLHTEval inheriting from lmms_eval.models.llava.Llava to handle evaluation specifics for Llava-LHT.
  • requirements/runtime.txt
    • Updated the minimum required PyTorch version from 2.1.0 to 2.2.0 (line 1).
    • Added lmms-eval to the main dependency list (line 7).
    • Removed a duplicate or misplaced entry for lmms-eval (line 34).
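Schematically (not a literal unified diff; the line numbers come from the changelog entries above), the requirements/runtime.txt change amounts to:

```diff
-torch>=2.1.0     # line 1: minimum PyTorch bumped
+torch>=2.2.0
+lmms-eval        # line 7: added to the main dependency list
-lmms-eval        # line 34: duplicate/misplaced entry removed
```
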
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature              | Command             | Description
Code Review          | /gemini review      | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary     | Provides a summary of the current pull request in its current state.
Comment              | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments.
Help                 | /gemini help        | Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder at the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


Image meets the text,
Llava sees and understands,
New model arrives.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces support for the llava-lht model, including its evaluation counterpart. The changes involve adding a new model class LlavaLHT and an evaluation class LlavaLHTEval, along with updating the model registry and runtime requirements. The implementation includes custom handling for the llava-lht model's specific behaviors, such as parameter naming and generation arguments. Overall, the code adds necessary components for integrating this new model.

Summary of Findings

  • Monkey Patching: The implementation uses monkey patching to modify core model methods (forward, generate, prepare_inputs_for_generation). This approach can be fragile and less maintainable compared to using hooks or wrapper classes.
  • Device Mapping Logic: The device assignment logic in LlavaLHTEval.__init__ appears complex and could potentially be simplified for clarity and robustness.
  • Disallowing **kwargs: The LlavaLHTEval constructor strictly disallows any extra keyword arguments, which might limit future extensibility.
  • DEEPSPEED Configuration Requirement: A specific requirement for DEEPSPEED (zero stage 0) is mentioned in a comment, but the reason is not explained.
  • Missing Docstrings: The newly added classes and methods lack docstrings, which reduces code maintainability and understanding.
  • Redundant Logging: There are two identical logger.info calls logging self.vlm_model in LlavaLHT.build_model. (Note: This is a low severity issue and no specific comment was added per review settings).
  • Empty build_tokenizer: The LlavaLHT.build_tokenizer method is empty (pass). While the tokenizer is built in build_model, having an empty method might be slightly confusing without a comment explaining why it's empty. (Note: This is a low severity issue and no specific comment was added per review settings).
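As an illustration of the wrapper-class alternative raised in the first finding, a thin delegating wrapper can adapt call signatures without mutating the model in place. All names below are hypothetical; the real model and its generate signature differ:

```python
class GenerateCompatWrapper:
    """Adapts call signatures for a wrapped model instead of monkey
    patching its methods (hypothetical sketch, not the PR's code)."""

    def __init__(self, model):
        self._model = model

    def generate(self, *args, **kwargs):
        # Map 'inputs' to the keyword the underlying model expects.
        if "inputs" in kwargs and "input_ids" not in kwargs:
            kwargs["input_ids"] = kwargs.pop("inputs")
        kwargs.pop("cache_position", None)  # unsupported downstream
        return self._model.generate(*args, **kwargs)

    def __getattr__(self, name):
        # Delegate every other attribute lookup to the wrapped model.
        return getattr(self._model, name)

class DummyModel:
    def generate(self, input_ids, max_new_tokens=1):
        return list(input_ids) + [0] * max_new_tokens

wrapped = GenerateCompatWrapper(DummyModel())
result = wrapped.generate(inputs=[1, 2], max_new_tokens=2, cache_position=0)
print(result)  # [1, 2, 0, 0]
```

The wrapper keeps the compatibility shims in one auditable place, whereas monkey patching leaves the patched instance looking like an unmodified model.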

Merge Readiness

The pull request introduces support for a new model and its evaluation. The core logic seems to handle the specific requirements of the llava-lht model. However, there are several medium severity issues related to maintainability (monkey patching, complex device logic, missing docstrings) and potential extensibility (**kwargs assertion). I recommend addressing these issues before merging to ensure the code is more robust, easier to understand, and maintainable in the long term. I am unable to approve this pull request; please have other reviewers review and approve this code before merging.

@llmc-reviewer llmc-reviewer merged commit 671c271 into ModelTC:main May 22, 2025
2 checks passed