Conversation

@zhangbilang (Collaborator)
llava has been renamed to llava-hf, and llava-lht is now renamed to llava.

@gemini-code-assist (bot) left a comment

Hello @SmudgedWings, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

Hello team,

Gemini here, providing a summary of this pull request to help everyone quickly understand the changes and context.

The primary goal of this PR, as indicated by the title and description, is to swap the names of the two existing LLaVA model implementations within the codebase. Specifically, the implementation previously named llava (which used the Hugging Face transformers library) has been renamed to llava-hf, and the implementation previously named llava-lht (which used the original llava library from liuhaotian) has been renamed to llava.

This change involves renaming the corresponding Python files and updating class names and import paths throughout the relevant modules.

Highlights

  • Class Renaming: Within the code, the main model class previously named LlavaLHT is now Llava, and the evaluation class LlavaLHTEval is now LlavaEval. The classes for the Hugging Face implementation retain their LlavaHf and LlavaHfEval names but are now located in the llava_hf.py file.
  • Import Updates: Import statements in llmc/models/__init__.py and within the model files themselves have been updated to reflect the new file and class names.
  • Code Adjustments in Renamed llava.py: Minor code adjustments were made in the file now named llmc/models/llava.py (previously llava_lht.py) to align with the structure of the original LLaVA library, including changes to model loading, attribute access (e.g., vision_projector.linear_2 to vision_projector[2]), and the removal of the batch_process method.
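The projector-access change in the last highlight can be illustrated with a torch-free sketch of the two access patterns. All class and variable names below are illustrative stand-ins, not the actual llmc code:

```python
class Linear:
    """Stand-in for nn.Linear."""

class HfStyleProjector:
    # HF-style implementation: the projector exposes named submodules,
    # so the last layer is reached as vision_projector.linear_2.
    def __init__(self):
        self.linear_1 = Linear()
        self.linear_2 = Linear()

# Original (liuhaotian) implementation: the projector is built as an
# nn.Sequential, so the last layer is reached by index: vision_projector[2].
lht_style_projector = [Linear(), "activation", Linear()]

assert isinstance(HfStyleProjector().linear_2, Linear)
assert isinstance(lht_style_projector[2], Linear)
```

Both expressions resolve to the same logical layer; only the container type differs, which is why the rename required the `vision_projector.linear_2` → `vision_projector[2]` change.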

Changelog

  • llmc/models/__init__.py
    • Updated import on line 13 to reference llava_hf instead of llava_lht.
  • llmc/models/llava.py
    • This file now contains the code previously found in llmc/models/llava_lht.py.
    • Class LlavaLHT renamed to Llava (line 29).
    • Updated imports to use components from the original llava library (lines 18-25).
    • Model loading logic updated to use llava.model.builder.load_pretrained_model (lines 47-56).
    • Added compatibility wrappers for forward, generate, and prepare_inputs_for_generation methods (lines 58-88).
    • Changed eval_name from 'LlavaLHTEval' to 'LlavaEval' (line 90).
    • Updated access to vision_model and vision_projector attributes (lines 93-94).
    • Removed is_video_model from pruning_config (line 98).
    • Removed the batch_process method.
    • Updated access in get_extra_rot_module_besides_embed_layers from vision_projector.linear_2 to vision_projector[2] (line 108).
    • Updated access in find_blocks for vision modality (line 114).
    • Class LlavaLHTEval renamed to LlavaEval and now inherits from lmms_eval.models.llava.Llava (aliased as LLaVA) (line 181).
  • llmc/models/llava_hf.py
    • This is a new file containing the code previously found in llmc/models/llava.py.
    • Class Llava renamed to LlavaHf (line 19).
    • Class LlavaHfEval remains LlavaHfEval and inherits from lmms_eval.models.llava_hf.LlavaHf (aliased as LlavaHF) (line 177).
  • llmc/models/llava_lht.py
    • This file has been removed.
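The compatibility wrappers mentioned in the changelog can be sketched as thin pass-through methods, so code written against the HF-style model interface keeps working when the underlying model is the original LLaVA. This is a minimal sketch under that assumption; class and attribute names are illustrative, not the exact llmc code:

```python
class OriginalLlavaModel:
    """Stand-in for the model returned by load_pretrained_model."""
    def forward(self, input_ids, images=None):
        return ("forward", input_ids, images)

    def generate(self, input_ids, **kwargs):
        return ("generate", input_ids, kwargs)

class Llava:
    def __init__(self, vlm_model):
        self.vlm_model = vlm_model

    def forward(self, input_ids, images=None):
        # Delegate to the wrapped original-LLaVA model.
        return self.vlm_model.forward(input_ids, images=images)

    def generate(self, input_ids, **kwargs):
        return self.vlm_model.generate(input_ids, **kwargs)

    def prepare_inputs_for_generation(self, input_ids, **kwargs):
        # Minimal HF-style signature; the real wrapper maps these
        # inputs onto whatever the original model expects.
        return {"input_ids": input_ids, **kwargs}

model = Llava(OriginalLlavaModel())
assert model.forward([1, 2])[0] == "forward"
assert model.generate([1, 2], max_new_tokens=8)[0] == "generate"
```

The wrapper pattern keeps the rename transparent to callers: downstream llmc code invokes the same three methods regardless of which LLaVA backend is loaded.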

A model name, a simple swap,
From LHT to just Llava, no flap.
Hugging Face gets its own tag,
No more confusion, no more lag.


@gemini-code-assist (bot) left a comment

Code Review

This pull request effectively refactors the LLaVA model implementations by renaming llava to llava-hf (for the Hugging Face Transformers-based version) and llava-lht to llava (for the version based on Liu HaoTian's LLaVA repository). The changes are quite extensive and appear to correctly separate and update the two model variants.

The new llava.py (formerly llava_lht.py) now integrates more directly with the LLaVA library, including monkey-patching for compatibility and an updated LlavaEval class. The new llava_hf.py correctly encapsulates the previous Hugging Face-based LLaVA model and its evaluation class.

Overall, the renaming and restructuring are clear and consistent. I've identified a couple of areas for minor improvement, primarily a recurring typo related to _world_size.

Summary of Findings

  • Variable Naming Inconsistency (_word_size vs _world_size): In llmc/models/llava.py (new line 299) and llmc/models/llava_hf.py (lines 257, 262), self._word_size is used where self._world_size is expected and used in other parts of the same conditional blocks. This should be standardized to self._world_size for consistency and correctness. (Severity: medium, comments provided)
  • Clarity of build_tokenizer Override: In llmc/models/llava.py, build_tokenizer(self): pass overrides the base class method. While self.tokenizer is set in build_model, a brief comment explaining why this override is necessary or preferred (e.g., tokenizer is fully handled by load_pretrained_model) would improve code clarity. (Severity: low, not commented due to review settings)
  • Unusually Long Timeout for InitProcessGroupKwargs: In llmc/models/llava.py, InitProcessGroupKwargs(timeout=timedelta(weeks=52)) specifies an extremely long timeout. While this prevents premature timeouts, a comment explaining this choice or using a more conventional large value might be clearer. (Severity: low, not commented due to review settings)
  • Consistency in pruning_config: The pruning_config in llava.py (LHT version) omits 'is_video_model', while llava_hf.py includes 'is_video_model': False. Ensuring consistency or commenting on this difference might be beneficial if both models have similar video handling capabilities (or lack thereof) in this framework. (Severity: low, not commented due to review settings)
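The first finding can be summarized as a sketch: every branch should assign the same attribute name, self._world_size, so downstream code that reads _world_size never hits an AttributeError. The class and constructor arguments below are illustrative, not the actual llmc evaluation wrapper:

```python
class EvalWrapper:
    def __init__(self, num_processes=1, distributed=False):
        if distributed and num_processes > 1:
            self._rank = 0  # illustrative; the real code reads accelerator state
            self._world_size = num_processes
        else:
            self._rank = 0
            self._world_size = 1  # previously mistyped as self._word_size

w = EvalWrapper()
assert w._world_size == 1
assert not hasattr(w, "_word_size")
```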

Merge Readiness

The pull request makes significant and positive structural changes by renaming and reorganizing the LLaVA model implementations. The core logic appears to be preserved and updated appropriately for each variant.

However, there are a few instances of a typo (_word_size instead of _world_size) that should be corrected for consistency and to prevent potential runtime issues. These are marked with medium severity.

I recommend addressing these naming inconsistencies before merging. Once these are resolved, the PR should be in good shape. As an AI, I am not authorized to approve pull requests; please ensure further review and approval by authorized team members.

Code context:

        )
        logger.info(f'Using {accelerator.num_processes} devices with tensor parallelism')
        self._rank = 0
        self._word_size = 1

Severity: medium

There seems to be a slight inconsistency in variable naming here. In other parts of this conditional logic (e.g., new lines 295 and 304), self._world_size is used, which is the standard term. For consistency, and to avoid a potential AttributeError if other code expects _world_size, should this be self._world_size = 1 as well?

Suggested change:
-        self._word_size = 1
+        self._world_size = 1

Code context:

            f'Using {accelerator.num_processes} devices with pipeline parallelism'
        )
        self._rank = 0
        self._word_size = 1

Severity: medium

It appears there's a typo here. The standard attribute name is _world_size, which is used on line 251 (self._world_size = self.accelerator.num_processes). For consistency, and to prevent potential issues if other parts of the system expect _world_size, should this be self._world_size = 1?

Suggested change:
-        self._word_size = 1
+        self._world_size = 1

Code context:

        logger.info(f'Using single device: {self._device}')
        self.model.to(self._device)
        self._rank = 0
        self._word_size = 1

Severity: medium

Similar to the comment on line 257, this attribute appears to be mistyped. It should likely be self._world_size = 1 to align with the standard naming convention and its usage elsewhere (e.g., line 251).

Suggested change:
-        self._word_size = 1
+        self._world_size = 1

@llmc-reviewer llmc-reviewer merged commit 4683d0c into ModelTC:main May 23, 2025
2 checks passed
