@@ -37,19 +37,19 @@ git clone https://huggingface.co/openai/clip-vit-large-patch14-336
2. Install the required Python packages:

```sh
- pip install -r tools/llava/requirements.txt
+ pip install -r tools/mtmd/requirements.txt
```

3. Use `llava_surgery.py` to split the LLaVA model into its LLaMA and multimodal projector constituents:

```sh
- python ./tools/llava/llava_surgery.py -m ../llava-v1.5-7b
+ python ./tools/mtmd/llava_surgery.py -m ../llava-v1.5-7b
```

4. Use `convert_image_encoder_to_gguf.py` to convert the LLaVA image encoder to GGUF:

```sh
- python ./tools/llava/convert_image_encoder_to_gguf.py -m ../clip-vit-large-patch14-336 --llava-projector ../llava-v1.5-7b/llava.projector --output-dir ../llava-v1.5-7b
+ python ./tools/mtmd/convert_image_encoder_to_gguf.py -m ../clip-vit-large-patch14-336 --llava-projector ../llava-v1.5-7b/llava.projector --output-dir ../llava-v1.5-7b
```

5. Use `examples/convert_legacy_llama.py` to convert the LLaMA part of LLaVA to GGUF:
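The conversion command itself sits outside this hunk and is not touched by the rename; as a reminder, a minimal sketch (assuming the `../llava-v1.5-7b` checkout used above) could look like:

```sh
# hypothetical invocation; exact flags may differ in your checkout
python ./examples/convert_legacy_llama.py ../llava-v1.5-7b --skip-unknown
```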
@@ -69,12 +69,12 @@ git clone https://huggingface.co/liuhaotian/llava-v1.6-vicuna-7b
2) Install the required Python packages:

```sh
- pip install -r tools/llava/requirements.txt
+ pip install -r tools/mtmd/requirements.txt
```

3) Use `llava_surgery_v2.py`, which also supports llava-1.5 variants, both PyTorch and safetensors models:
```console
- python tools/llava/llava_surgery_v2.py -C -m ../llava-v1.6-vicuna-7b/
+ python tools/mtmd/llava_surgery_v2.py -C -m ../llava-v1.6-vicuna-7b/
```
- you will find a `llava.projector` and a `llava.clip` file in your model directory

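Before moving on, it is worth confirming that the surgery actually produced those two files; a quick check (assuming the `../llava-v1.6-vicuna-7b` path used above) might be:

```console
# list the artifacts split out by llava_surgery_v2.py
ls -lh ../llava-v1.6-vicuna-7b/llava.projector ../llava-v1.6-vicuna-7b/llava.clip
```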
@@ -88,7 +88,7 @@ curl -s -q https://huggingface.co/cmp-nct/llava-1.6-gguf/raw/main/config_vit.jso

5) Create the visual GGUF model:
```console
- python ./tools/llava/convert_image_encoder_to_gguf.py -m vit --llava-projector vit/llava.projector --output-dir vit --clip-model-is-vision
+ python ./tools/mtmd/convert_image_encoder_to_gguf.py -m vit --llava-projector vit/llava.projector --output-dir vit --clip-model-is-vision
```
- This is similar to llava-1.5; the difference is that we tell the encoder that we are working with the pure vision model part of CLIP.

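With both the language-model GGUF and the visual GGUF in place, the pieces can be tried together. A hedged usage sketch follows; the `llama-mtmd-cli` binary name, the `-m`/`--mmproj`/`--image`/`-p` flags, and the output file names are assumptions based on the current tools/mtmd layout, not something this diff establishes:

```console
# assumed paths and binary name; adjust to your build and output directories
./build/bin/llama-mtmd-cli \
    -m ../llava-v1.6-vicuna-7b/ggml-model-f16.gguf \
    --mmproj vit/mmproj-model-f16.gguf \
    --image path/to/an/image.jpg \
    -p "Describe the image."
```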