
Commit d290678

Improve discoverability of 3.2 recipes
1 parent 4e59dbd commit d290678

2 files changed: +12 -12 lines changed


README.md

Lines changed: 8 additions & 12 deletions
@@ -2,21 +2,17 @@
 <!-- markdown-link-check-disable -->
 The 'llama-recipes' repository is a companion to the [Meta Llama](https://github.com/meta-llama/llama-models) models. We support the latest version, [Llama 3.2 Vision](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/MODEL_CARD_VISION.md) and [Llama 3.2 Text](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/MODEL_CARD.md), in this repository. This repository contains example scripts and notebooks to get started with the models in a variety of use-cases, including fine-tuning for domain adaptation and building LLM-based applications with Llama and other tools in the LLM ecosystem. The examples here use Llama locally, in the cloud, and on-prem.
 
+> [!TIP]
+> Get started with Llama 3.2 with these new recipes:
+> * [Finetune Llama 3.2 Vision](https://github.com/meta-llama/llama-recipes/blob/main/recipes/quickstart/finetuning/finetune_vision_model.md)
+> * [Multimodal Inference with Llama 3.2 Vision](https://github.com/meta-llama/llama-recipes/blob/main/recipes/quickstart/inference/local_inference/README.md#multimodal-inference)
+> * [Inference on Llama Guard 1B + Multimodal inference on Llama Guard 11B-Vision](https://github.com/meta-llama/llama-recipes/blob/main/recipes/responsible_ai/llama_guard/llama_guard_text_and_vision_inference.ipynb)
+
+
 <!-- markdown-link-check-enable -->
-> [!IMPORTANT]
+> [!NOTE]
 > Llama 3.2 follows the same prompt template as Llama 3.1, with a new special token `<|image|>` representing the input image for the multimodal models.
 >
-> | Token | Description |
-> |---|---|
-> `<\|begin_of_text\|>` | Specifies the start of the prompt. |
-> `<\|image\|>` | Represents the image tokens passed as an input to Llama. |
-> `<\|eot_id\|>` | This token signifies the end of a turn i.e. the end of the model's interaction either with the user or tool executor. |
-> `<\|eom_id\|>` | End of Message. A message represents a possible stopping point where the model can inform the execution environment that a tool call needs to be made. |
-> `<\|python_tag\|>` | A special tag used in the model’s response to signify a tool call. |
-> `<\|finetune_right_pad_id\|>` | Used for padding text sequences in a batch to the same length. |
-> `<\|start_header_id\|>{role}<\|end_header_id\|>` | These tokens enclose the role for a particular message. The possible roles can be: system, user, assistant and ipython. |
-> `<\|end_of_text\|>` | This is equivalent to the EOS token. For multiturn-conversations it's usually unused, this token is expected to be generated only by the base models. |
->
 > More details on the prompt templates for image reasoning, tool-calling and code interpreter can be found [on the documentation website](https://llama.meta.com/docs/model-cards-and-prompt-formats/llama3_2).
 
 
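The [!NOTE] block above introduces the `<|image|>` special token used by the multimodal models. As a quick illustration (not part of this commit), the following minimal Python sketch renders a chat prompt with the Hugging Face processor's chat template so you can see where `<|image|>` is placed; it assumes transformers >= 4.45 and access to the gated Llama 3.2 Vision checkpoint.

```python
# Illustrative sketch only: render a Llama 3.2 Vision chat prompt to see where
# the <|image|> token lands. Assumes transformers >= 4.45 and that you have been
# granted access to the gated meta-llama checkpoint on Hugging Face.
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("meta-llama/Llama-3.2-11B-Vision-Instruct")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},                                # rendered as <|image|>
            {"type": "text", "text": "Describe this image."},
        ],
    }
]

# Returns the prompt string, e.g.
# <|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n<|image|>Describe this image.<|eot_id|>...
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
print(prompt)
```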

recipes/quickstart/inference/local_inference/README.md

Lines changed: 4 additions & 0 deletions
@@ -1,13 +1,17 @@
 # Local Inference
 
+## Multimodal Inference
 For Multi-Modal inference we have added [multi_modal_infer.py](multi_modal_infer.py) which uses the transformers library
 
 The way to run this would be
 ```
 python multi_modal_infer.py --image_path "./resources/image.jpg" --prompt_text "Describe this image" --temperature 0.5 --top_p 0.8 --model_name "meta-llama/Llama-3.2-11B-Vision-Instruct"
 ```
 
+## Text-only Inference
 For local inference we have provided an [inference script](inference.py). Depending on the type of finetuning performed during training the [inference script](inference.py) takes different arguments.
+
+
 To finetune all model parameters the output dir of the training has to be given as --model_name argument.
 In the case of a parameter efficient method like lora the base model has to be given as --model_name and the output dir of the training has to be given as --peft_model argument.
 Additionally, a prompt for the model in the form of a text file has to be provided. The prompt file can either be piped through standard input or given as --prompt_file parameter.
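The text-only section above distinguishes two cases: after full-parameter finetuning, --model_name points at the training output directory, while after a parameter-efficient run such as LoRA, --model_name is the base model and --peft_model is the adapter output directory. The following minimal Python sketch (not part of this commit) shows roughly how those arguments map onto transformers and peft loading; the paths are placeholders and the actual internals of inference.py may differ.

```python
# Illustrative sketch of how --model_name / --peft_model map onto model loading.
# Placeholder paths; not the actual implementation of inference.py.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

model_name = "path/to/training/output"   # --model_name: full-finetune output dir,
                                         # or the base model in the LoRA case
peft_model = "path/to/peft/output"       # --peft_model: only used for LoRA/PEFT runs

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# In the parameter-efficient case the adapter weights are applied on top of the base model.
model = PeftModel.from_pretrained(model, peft_model)

# --prompt_file (or stdin): a plain-text prompt for the model.
with open("prompt.txt") as f:
    prompt = f.read()

inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```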
