llava renamed to llava-hf, and llava-lht renamed to llava #381
Conversation
Hello @SmudgedWings, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
Hello team,
Gemini here, providing a summary of this pull request to help everyone quickly understand the changes and context.
The primary goal of this PR, as indicated by the title and description, is to swap the names of the two existing LLaVA model implementations within the codebase. Specifically, the implementation previously named llava (which used the Hugging Face transformers library) has been renamed to llava-hf, and the implementation previously named llava-lht (which used the original llava library from liuhaotian) has been renamed to llava.
This change involves renaming the corresponding Python files and updating class names and import paths throughout the relevant modules.
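The net effect of the swap can be summarized as a mapping from old model names to new ones. The sketch below is illustrative only (the mapping is inferred from the PR summary; `migrate_model_name` is a hypothetical helper, not code from this repository):

```python
# Hypothetical sketch of the rename at the config level. The mapping is taken
# from the PR summary; nothing here is actual llmc code.
OLD_TO_NEW = {
    "llava": "llava-hf",    # HF-backed implementation gains an explicit suffix
    "llava-lht": "llava",   # original-repo implementation takes the bare name
}

def migrate_model_name(name: str) -> str:
    """Map a pre-PR model name to its post-PR equivalent; others pass through."""
    return OLD_TO_NEW.get(name, name)
```

A config written for the old names could be migrated mechanically with a helper like this.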
Highlights
- Class Renaming: The main model class previously named `LlavaLHT` is now `Llava`, and the evaluation class `LlavaLHTEval` is now `LlavaEval`. The classes for the Hugging Face implementation keep their `LlavaHf` and `LlavaHfEval` names but now live in the `llava_hf.py` file.
- Import Updates: Import statements in `llmc/models/__init__.py` and within the model files themselves have been updated to reflect the new file and class names.
- Code Adjustments in the Renamed `llava.py`: Minor code adjustments were made in the file now named `llmc/models/llava.py` (previously `llava_lht.py`) to align with the structure of the original LLaVA library, including changes to model loading, attribute access (e.g., `vision_projector.linear_2` to `vision_projector[2]`), and the removal of the `batch_process` method.
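The projector change in the last highlight reflects a structural difference between the two backends: the Hugging Face projector exposes its layers as named attributes, while the original LLaVA builder returns a `Sequential`-style container whose layers are reached by index. A torch-free sketch of the two access patterns (all class names here are illustrative, not from either codebase):

```python
# Torch-free sketch: the HF-style projector names its submodules, so code
# reaches the last linear as `.linear_2`; the liuhaotian-style projector is a
# Sequential-like container, so the same layer is reached as `[2]`.
class HfStyleProjector:
    def __init__(self):
        self.linear_1 = "Linear"   # placeholder for an nn.Linear
        self.act = "GELU"          # placeholder for an nn.GELU
        self.linear_2 = "Linear"   # reached as projector.linear_2

class SequentialStyleProjector:
    def __init__(self):
        # layers in order, like nn.Sequential(Linear, GELU, Linear)
        self.layers = ["Linear", "GELU", "Linear"]

    def __getitem__(self, idx):
        return self.layers[idx]    # reached as projector[2]

def last_linear(projector):
    # The PR's change amounts to swapping the first access pattern
    # for the second in the file now named llava.py.
    if hasattr(projector, "linear_2"):
        return projector.linear_2
    return projector[2]
```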
Changelog
- `llmc/models/__init__.py`
  - Updated the import on line 13 to reference `llava_hf` instead of `llava_lht`.
- `llmc/models/llava.py`
  - This file now contains the code previously found in `llmc/models/llava_lht.py`.
  - Class `LlavaLHT` renamed to `Llava` (line 29).
  - Updated imports to use components from the original `llava` library (lines 18-25).
  - Model loading logic updated to use `llava.model.builder.load_pretrained_model` (lines 47-56).
  - Added compatibility wrappers for the `forward`, `generate`, and `prepare_inputs_for_generation` methods (lines 58-88).
  - Changed `eval_name` from `'LlavaLHTEval'` to `'LlavaEval'` (line 90).
  - Updated access to the `vision_model` and `vision_projector` attributes (lines 93-94).
  - Removed `is_video_model` from `pruning_config` (line 98).
  - Removed the `batch_process` method.
  - Updated access in `get_extra_rot_module_besides_embed_layers` from `vision_projector.linear_2` to `vision_projector[2]` (line 108).
  - Updated access in `find_blocks` for the vision modality (line 114).
  - Class `LlavaLHTEval` renamed to `LlavaEval`; it now inherits from `lmms_eval.models.llava.Llava` (aliased as `LLaVA`) (line 181).
- `llmc/models/llava_hf.py`
  - This is a new file containing the code previously found in `llmc/models/llava.py`.
  - Class `Llava` renamed to `LlavaHf` (line 19).
  - Class `LlavaHfEval` keeps its name and inherits from `lmms_eval.models.llava_hf.LlavaHf` (aliased as `LlavaHF`) (line 177).
- `llmc/models/llava_lht.py`
  - This file has been removed.
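The "compatibility wrappers" entry in the changelog describes a common delegation pattern: the renamed class forwards `forward`, `generate`, and `prepare_inputs_for_generation` to the wrapped liuhaotian model so callers see the usual HF-like surface. A hedged sketch of that idea (the names `CompatLlava` and `inner` are illustrative, not the PR's actual code):

```python
# Hedged sketch of the delegation pattern the changelog describes. The wrapped
# object stands in for the model returned by load_pretrained_model.
class CompatLlava:
    def __init__(self, inner):
        self.inner = inner  # the underlying liuhaotian-style model

    def forward(self, *args, **kwargs):
        # Delegate straight to the underlying implementation.
        return self.inner.forward(*args, **kwargs)

    def generate(self, *args, **kwargs):
        return self.inner.generate(*args, **kwargs)

    def prepare_inputs_for_generation(self, *args, **kwargs):
        return self.inner.prepare_inputs_for_generation(*args, **kwargs)
```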
A model name, a simple swap,
From LHT to just Llava, no flap.
Hugging Face gets its own tag,
No more confusion, no more lag.
Code Review
This pull request effectively refactors the LLaVA model implementations by renaming llava to llava-hf (for the Hugging Face Transformers-based version) and llava-lht to llava (for the version based on Liu HaoTian's LLaVA repository). The changes are quite extensive and appear to correctly separate and update the two model variants.
The new llava.py (formerly llava_lht.py) now integrates more directly with the LLaVA library, including monkey-patching for compatibility and an updated LlavaEval class. The new llava_hf.py correctly encapsulates the previous Hugging Face-based LLaVA model and its evaluation class.
Overall, the renaming and restructuring are clear and consistent. I've identified a couple of areas for minor improvement, primarily a recurring typo related to `_world_size`.
Summary of Findings
- Variable Naming Inconsistency (`_word_size` vs. `_world_size`): In `llmc/models/llava.py` (new line 299) and `llmc/models/llava_hf.py` (lines 257, 262), `self._word_size` is used where `self._world_size` is expected and used elsewhere in the same conditional blocks. This should be standardized to `self._world_size` for consistency and correctness. (Severity: medium, comments provided)
- Clarity of the `build_tokenizer` Override: In `llmc/models/llava.py`, `build_tokenizer(self): pass` overrides the base class method. While `self.tokenizer` is set in `build_model`, a brief comment explaining why this override is necessary or preferred (e.g., the tokenizer is fully handled by `load_pretrained_model`) would improve code clarity. (Severity: low, not commented due to review settings)
- Unusually Long Timeout for `InitProcessGroupKwargs`: In `llmc/models/llava.py`, `InitProcessGroupKwargs(timeout=timedelta(weeks=52))` specifies an extremely long timeout. While this prevents premature timeouts, a comment explaining the choice, or a more conventional large value, would be clearer. (Severity: low, not commented due to review settings)
- Consistency in `pruning_config`: The `pruning_config` in `llava.py` (LHT version) omits `'is_video_model'`, while `llava_hf.py` includes `'is_video_model': False`. Ensuring consistency, or commenting on the difference, would be beneficial if both models have similar video handling capabilities (or lack thereof) in this framework. (Severity: low, not commented due to review settings)
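The first finding is worth stressing because in Python the typo fails silently at assignment time: writing `self._word_size` simply creates a new attribute, and the error only surfaces later when `_world_size` is read. An illustrative repro (class and method names are hypothetical, not from the PR):

```python
# Illustrative repro of why the `_word_size` typo is more than cosmetic:
# assigning to a misspelled attribute silently creates a NEW attribute, so a
# later read of the intended `_world_size` raises AttributeError.
class EvalHarness:
    def setup_single_device(self):
        self._rank = 0
        self._word_size = 1      # typo: does NOT set self._world_size

    @property
    def world_size(self):
        return self._world_size  # AttributeError if only the typo'd line ran
```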
Merge Readiness
The pull request makes significant and positive structural changes by renaming and reorganizing the LLaVA model implementations. The core logic appears to be preserved and updated appropriately for each variant.
However, there are a few instances of a typo (_word_size instead of _world_size) that should be corrected for consistency and to prevent potential runtime issues. These are marked with medium severity.
I recommend addressing these naming inconsistencies before merging. Once these are resolved, the PR should be in good shape. As an AI, I am not authorized to approve pull requests; please ensure further review and approval by authorized team members.
```python
)
logger.info(f'Using {accelerator.num_processes} devices with tensor parallelism')
self._rank = 0
self._word_size = 1
```
There seems to be a slight inconsistency in variable naming here. In other parts of this conditional logic (e.g., new lines 295 and 304), `self._world_size` is used, which is the standard term. For consistency, and to avoid a potential `AttributeError` if other code expects `_world_size`, should this be `self._world_size = 1` as well?
```diff
- self._word_size = 1
+ self._world_size = 1
```
```python
    f'Using {accelerator.num_processes} devices with pipeline parallelism'
)
self._rank = 0
self._word_size = 1
```
It appears there's a typo here. The standard attribute name is `_world_size`, which is used on line 251 (`self._world_size = self.accelerator.num_processes`). For consistency, and to prevent potential issues if other parts of the system expect `_world_size`, should this be `self._world_size = 1`?
```diff
- self._word_size = 1
+ self._world_size = 1
```
```python
logger.info(f'Using single device: {self._device}')
self.model.to(self._device)
self._rank = 0
self._word_size = 1
```
`llava` has been renamed to `llava-hf`, and `llava-lht` is now renamed to `llava`.