support llava-lht(test:quantize-awq, MME-pretrain) #379
Conversation
Hello @SmudgedWings, I'm Gemini Code Assist[^1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
Hello, Gemini here! I've reviewed this pull request from @SmudgedWings. The primary goal of this PR, based on the title, is to introduce support for the Llava-LHT model within the framework, with a focus on enabling testing for quantization methods like AWQ and potentially MME-pretrain evaluations. The changes involve adding a new model implementation specifically tailored for Llava-LHT and integrating it into the existing model handling and evaluation infrastructure. Additionally, some runtime dependencies have been updated.
Highlights
- Add Llava-LHT Model Support: Introduces a new model class `LlavaLHT` and its evaluation counterpart `LlavaLHTEval` to integrate the Llava-LHT vision-language model into the framework.
- Handle Llava-LHT Specifics: Implements custom methods in `LlavaLHT` to correctly load the model, identify its vision and language components, and adapt its forward and generate methods to work around specific behaviors (e.g., handling `cache_position`, `input_ids`/`inputs`, and `attention_mask`). This ensures compatibility with the framework's processing and evaluation pipelines. A hypothetical skeleton of such a class follows this list.
- Update Dependencies: Increments the minimum required PyTorch version to 2.2.0 and adjusts the listing of the `lmms-eval` dependency in the runtime requirements file.
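To make the highlights concrete, here is a minimal, hypothetical skeleton of what such a model class could look like. Everything below is an assumption for illustration: the base-class relationship, the `vlm_model`/`model` attribute layout, and the use of `AutoModelForCausalLM` stand in for whatever the PR actually does in `llmc/models/llava_lht.py`.

```python
# Hypothetical sketch only; the attribute paths (vlm_model.model.layers,
# etc.) and loader calls are assumptions, not the PR's code.
from loguru import logger
from transformers import AutoModelForCausalLM, AutoTokenizer


class LlavaLHT:  # in the PR this inherits from llmc's Llama class
    def __init__(self, model_path, torch_dtype="auto"):
        self.model_path = model_path
        self.torch_dtype = torch_dtype
        self.build_model()

    def build_model(self):
        # Load the full vision-language model; the tokenizer is created
        # here as well, which is why build_tokenizer below stays empty.
        self.tokenizer = AutoTokenizer.from_pretrained(self.model_path)
        self.vlm_model = AutoModelForCausalLM.from_pretrained(
            self.model_path, torch_dtype=self.torch_dtype
        )
        # Quantization passes only touch the language tower.
        self.model = self.vlm_model.model
        logger.info(self.vlm_model)

    def build_tokenizer(self):
        pass  # already handled in build_model

    def find_blocks(self):
        # The transformer blocks the per-block quantization loop visits.
        self.blocks = self.model.layers

    def get_layernorms_in_block(self, block):
        # Norms that schemes like AWQ pair with the block's linear layers.
        return {
            "input_layernorm": block.input_layernorm,
            "post_attention_layernorm": block.post_attention_layernorm,
        }
```

The `get_subsets_in_block` and `get_extra_rot_module_besides_embed_layers` hooks mentioned in the changelog below would follow the same pattern.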
Changelog
- `llmc/models/__init__.py`
  - Added an import for the new `LlavaLHT` model class (line 13).
- `llmc/models/llava_lht.py`
  - Added a new file containing the `LlavaLHT` and `LlavaLHTEval` classes.
  - Implemented `LlavaLHT`, inheriting from `Llama`, with custom methods (`build_model`, `get_extra_rot_module_besides_embed_layers`, `find_blocks`, `get_layernorms_in_block`, `get_subsets_in_block`) to integrate the Llava-LHT architecture.
  - Added workarounds for `vlm_model.forward`, `vlm_model.generate`, and `vlm_model.prepare_inputs_for_generation` to handle specific model behaviors (a sketch of this pattern follows the changelog).
  - Implemented `LlavaLHTEval`, inheriting from `lmms_eval.models.llava.Llava`, to handle evaluation specifics for Llava-LHT.
- `requirements/runtime.txt`
  - Updated the minimum required PyTorch version from 2.1.0 to 2.2.0 (line 1).
  - Added `lmms-eval` to the main dependency list (line 7).
  - Removed a duplicate or misplaced entry for `lmms-eval` (line 34).
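The workaround bullets above describe monkey patches applied to the loaded model. Here is a hedged sketch of that pattern, under the assumption (taken from the review summary, not the diff) that llava-lht's `forward` rejects `cache_position` and its `generate` expects `inputs` rather than `input_ids`; the function name `patch_vlm_model` is invented for illustration.

```python
# Hypothetical monkey-patch sketch; signature quirks are assumptions
# based on the review summary, not llava-lht's actual API.
import functools

import torch


def patch_vlm_model(vlm_model):
    original_forward = vlm_model.forward
    original_generate = vlm_model.generate

    @functools.wraps(original_forward)
    def patched_forward(*args, **kwargs):
        # Strip an argument the model's forward does not accept before
        # delegating to the original implementation.
        kwargs.pop("cache_position", None)
        return original_forward(*args, **kwargs)

    @functools.wraps(original_generate)
    def patched_generate(*args, **kwargs):
        # Translate the framework's `input_ids` into the model's expected
        # `inputs`, and synthesize a default attention mask if missing.
        if "input_ids" in kwargs and "inputs" not in kwargs:
            kwargs["inputs"] = kwargs.pop("input_ids")
        if kwargs.get("attention_mask") is None and "inputs" in kwargs:
            kwargs["attention_mask"] = torch.ones_like(kwargs["inputs"])
        return original_generate(*args, **kwargs)

    vlm_model.forward = patched_forward
    vlm_model.generate = patched_generate
    return vlm_model
```

As the review below notes, reassigning methods like this is fragile; a wrapper-based alternative is sketched in the findings section.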
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP 8 for Python) by adding files to a .gemini/ folder at the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its suggestions are incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Image meets the text,
Llava sees and understands,
New model arrives.
Footnotes

[^1]: Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.
Code Review
This pull request introduces support for the llava-lht model, including its evaluation counterpart. The changes involve adding a new model class `LlavaLHT` and an evaluation class `LlavaLHTEval`, along with updating the model registry and runtime requirements. The implementation includes custom handling for the llava-lht model's specific behaviors, such as parameter naming and generation arguments. Overall, the code adds the necessary components for integrating this new model.
Summary of Findings
- Monkey Patching: The implementation uses monkey patching to modify core model methods (`forward`, `generate`, `prepare_inputs_for_generation`). This approach can be fragile and less maintainable than using hooks or wrapper classes; a wrapper-based alternative is sketched after this list.
- Device Mapping Logic: The device assignment logic in `LlavaLHTEval.__init__` appears complex and could be simplified for clarity and robustness.
- Disallowing `**kwargs`: The `LlavaLHTEval` constructor strictly disallows any extra keyword arguments, which might limit future extensibility.
- DeepSpeed Configuration Requirement: A specific requirement for DeepSpeed (ZeRO stage 0) is mentioned in a comment, but the reason is not explained.
- Missing Docstrings: The newly added classes and methods lack docstrings, which reduces code maintainability and understanding.
- Redundant Logging: There are two identical `logger.info` calls logging `self.vlm_model` in `LlavaLHT.build_model`. (Low severity; no specific comment was added per review settings.)
- Empty `build_tokenizer`: The `LlavaLHT.build_tokenizer` method is empty (`pass`). While the tokenizer is built in `build_model`, an empty method may be confusing without a comment explaining why. (Low severity; no specific comment was added per review settings.)
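As the first finding suggests, a thin wrapper class could replace the monkey patches. Below is a minimal sketch of that alternative; the adapter class name and the call-convention details (`cache_position`, `inputs` vs. `input_ids`) are assumptions carried over from the review summary, not code from the PR.

```python
# Hypothetical wrapper-based alternative to monkey patching: delegate to
# the loaded model from a small adapter instead of reassigning its methods.
import torch
from torch import nn


class LlavaLHTAdapter(nn.Module):
    """Adapts llava-lht's assumed call conventions to the framework's."""

    def __init__(self, vlm_model: nn.Module):
        super().__init__()
        self.vlm_model = vlm_model

    def forward(self, *args, **kwargs):
        kwargs.pop("cache_position", None)  # assumed unsupported argument
        return self.vlm_model(*args, **kwargs)

    @torch.no_grad()
    def generate(self, input_ids, attention_mask=None, **kwargs):
        if attention_mask is None:
            attention_mask = torch.ones_like(input_ids)
        # Assumed rename: llava-lht's generate takes `inputs`.
        return self.vlm_model.generate(
            inputs=input_ids, attention_mask=attention_mask, **kwargs
        )
```

Because the adapter delegates rather than reassigns, the original model's methods stay intact and cannot be silently shadowed by later upstream changes, which addresses the fragility concern.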
Merge Readiness
The pull request introduces support for a new model and its evaluation. The core logic seems to handle the specific requirements of the llava-lht model. However, there are several medium severity issues related to maintainability (monkey patching, complex device logic, missing docstrings) and potential extensibility (**kwargs assertion). I recommend addressing these issues before merging to ensure the code is more robust, easier to understand, and maintainable in the long term. I am unable to approve this pull request; please have other reviewers review and approve this code before merging.