Commit 57afa0b

use AutoModel
1 parent 2730bca commit 57afa0b

File tree

2 files changed: +4 −4 lines changed

recipes/quickstart/finetuning/finetune_vision_model.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 ## Fine-Tuning Meta Llama Multi Modal Models recipe
 This recipe steps you through how to finetune a Llama 3.2 vision model on the OCR VQA task using the [OCRVQA](https://huggingface.co/datasets/HuggingFaceM4/the_cauldron/viewer/ocrvqa?row=0) dataset.
 
-**Disclaimer**: As our vision models already have a very good OCR ability, here we just use the OCRVQA dataset to demonstrate the steps needed for fine-tuning our vision models.
+**Disclaimer**: As our vision models already have a very good OCR ability, here we just use the OCRVQA dataset only for demonstration purposes of the required steps for fine-tuning our vision models with llama-recipes.
 
 ### Fine-tuning steps
 
src/llama_recipes/finetuning.py

Lines changed: 3 additions & 3 deletions
@@ -20,9 +20,9 @@
     AutoConfig,
     AutoTokenizer,
     BitsAndBytesConfig,
-    LlamaForCausalLM,
     AutoProcessor,
-    MllamaForConditionalGeneration
+    MllamaForConditionalGeneration,
+    AutoModel,
 )
 from transformers.models.llama.modeling_llama import LlamaDecoderLayer
 from transformers.models.mllama.modeling_mllama import MllamaSelfAttentionDecoderLayer,MllamaCrossAttentionDecoderLayer,MllamaVisionEncoderLayer
@@ -134,7 +134,7 @@ def main(**kwargs):
         processor.tokenizer.padding_side='right'
     elif config.model_type == "llama":
         is_vision = False
-        model = LlamaForCausalLM.from_pretrained(
+        model = AutoModel.from_pretrained(
             train_config.model_name,
             quantization_config=bnb_config,
             use_cache=use_cache,
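
For context, here is a minimal sketch of the text-only loading path after this change. The checkpoint id and the BitsAndBytesConfig values below are illustrative placeholders, not taken from the repo; in finetuning.py the real values come from train_config and the surrounding setup code. The point of the change is that AutoModel.from_pretrained resolves the model class from the checkpoint's config instead of importing LlamaForCausalLM directly.

```python
# A minimal sketch of the loading path this commit touches; the model name
# and quantization settings are illustrative, not taken from the repo.
import torch
from transformers import AutoConfig, AutoModel, BitsAndBytesConfig

model_name = "meta-llama/Llama-3.1-8B"  # placeholder checkpoint id
config = AutoConfig.from_pretrained(model_name)

# Optional 4-bit quantization, standing in for the bnb_config argument in the diff.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

if config.model_type == "llama":
    # AutoModel picks the concrete model class from the checkpoint's config,
    # so no architecture-specific import is needed here.
    model = AutoModel.from_pretrained(
        model_name,
        quantization_config=bnb_config,
        use_cache=False,
        torch_dtype=torch.bfloat16,
    )
```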
