Hi, when I run
```bash
# Image Understanding
CUDA_VISIBLE_DEVICES=0 python llavamini/eval/run_llava_mini.py \
    --model-path ICTNLP/llava-mini-llama-3.1-8b \
    --image-file llavamini/serve/examples/baby_cake.png \
    --conv-mode llava_llama_3_1 --model-name "llava-mini" \
    --query "What's the text on the cake?"
```
I got the following error:
```
Some weights of the model checkpoint at ICTNLP/llava-mini-llama-3.1-8b were not used when initializing LlavaMiniLlamaForCausalLM:
['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding',
 'model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight',
 'model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight',
 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.bias',
 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.weight',
 ...
```
It seems ICTNLP/llava-mini-llama-3.1-8b has additional parameters with a doubled prefix of the form 'model.vision_tower.vision_tower...'. Why is this the case, and is it a problem?
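For reference, here is a minimal sketch of how I would inspect the checkpoint keys directly to confirm the doubled prefix comes from the checkpoint itself rather than from the loading code. The index filename assumes the standard Hugging Face sharded-safetensors layout, which may not match this repo's actual file listing:

```python
import json
from huggingface_hub import hf_hub_download

REPO_ID = "ICTNLP/llava-mini-llama-3.1-8b"

# Filename assumes the standard sharded-safetensors convention; the repo
# may instead ship "pytorch_model.bin.index.json" -- check its file list.
index_path = hf_hub_download(REPO_ID, "model.safetensors.index.json")

with open(index_path) as f:
    weight_map = json.load(f)["weight_map"]

# Print every checkpoint key containing "vision_tower" to see whether the
# doubled "model.vision_tower.vision_tower." prefix is stored in the
# checkpoint itself.
for key in sorted(weight_map):
    if "vision_tower" in key:
        print(key)
```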