Commit 0ef6b8d

add readme
1 parent 2ea7f57 commit 0ef6b8d

2 files changed: 8 additions & 3 deletions

recipes/quickstart/finetuning/finetune_vision_model.md

Lines changed: 2 additions & 0 deletions
````diff
@@ -22,6 +22,8 @@ For **LoRA finetuning with FSDP**, we can run the following code:
 
 For more details about the finetuning configurations, please read the [finetuning readme](./README.md).
 
+For more details about local inference with the fine-tuned checkpoint, please read the [Inference with FSDP checkpoints section](https://github.com/meta-llama/llama-recipes/tree/main/recipes/quickstart/inference/local_inference#inference-with-fsdp-checkpoints) to learn how to convert the FSDP weights into a consolidated Hugging Face formatted model for local inference.
+
 ### How to use a custom dataset to fine-tune vision model
 
 In order to use a custom dataset, please follow the steps below:
````
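The paragraph added above points to consolidating FSDP checkpoint shards into a Hugging Face format model before running local inference. As a rough illustration only, that conversion is typically driven by the checkpoint converter script in llama-recipes; the script path and flag names below are assumptions for this sketch, not taken from this commit, so check the linked inference README for the exact invocation.

```
# Hypothetical sketch: consolidate FSDP shards into a Hugging Face format checkpoint.
# Script location and flag names are assumptions; see the linked inference README.
python recipes/quickstart/inference/local_inference/checkpoint_converter_fsdp_hf.py \
    --fsdp_checkpoint_path PATH/to/FSDP/checkpoints \
    --consolidated_model_path PATH/to/save/consolidated/model \
    --HF_model_path_or_name meta-llama/Llama-3.2-11B-Vision-Instruct
```

The consolidated output directory can then be passed as the model path to the local inference scripts.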

recipes/quickstart/inference/local_inference/README.md

Lines changed: 6 additions & 3 deletions
````diff
@@ -1,11 +1,14 @@
 # Local Inference
 
+## Hugging Face setup
+**Important Note**: Before running inference, you'll need your Hugging Face access token, which you can get from your Settings page [here](https://huggingface.co/settings/tokens). Then run `huggingface-cli login` and paste the token to complete the login, so that the scripts can download Hugging Face models if needed.
+
 ## Multimodal Inference
-For Multi-Modal inference we have added [multi_modal_infer.py](multi_modal_infer.py) which uses the transformers library
+For Multi-Modal inference, we have added [multi_modal_infer.py](multi_modal_infer.py), which uses the transformers library.
 
-The way to run this would be
+The way to run this would be:
 ```
-python multi_modal_infer.py --image_path "./resources/image.jpg" --prompt_text "Describe this image" --temperature 0.5 --top_p 0.8 --model_name "meta-llama/Llama-3.2-11B-Vision-Instruct"
+python multi_modal_infer.py --image_path PATH_TO_IMAGE --prompt_text "Describe this image" --temperature 0.5 --top_p 0.8 --model_name "meta-llama/Llama-3.2-11B-Vision-Instruct"
 ```
 
 ## Text-only Inference
````
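The Hugging Face setup note added above relies on `huggingface-cli login`. As a small, hedged illustration: the `--token` flag is a standard huggingface_hub CLI option rather than something introduced by this commit, and the `HF_TOKEN` variable name here is only an example.

```
# Interactive login: prompts for the access token from https://huggingface.co/settings/tokens
huggingface-cli login

# Non-interactive alternative, assuming the token has been exported as HF_TOKEN
huggingface-cli login --token "$HF_TOKEN"
```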
