# Llama Recipes: Examples to get started using the Llama models from Meta
<!-- markdown-link-check-disable -->
The 'llama-recipes' repository is a companion to the [Meta Llama](https://github.com/meta-llama/llama-models) models. We support the latest models, [Llama 3.2 Vision](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/MODEL_CARD_VISION.md) and [Llama 3.2 Text](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/MODEL_CARD.md), in this repository. It contains example scripts and notebooks to get started with the models in a variety of use-cases, including fine-tuning for domain adaptation and building LLM-based applications with Llama and other tools in the LLM ecosystem. The examples show how to run Llama locally, in the cloud, and on-prem.
> [!TIP]
> Get started with Llama 3.2 with these new recipes:
> * [Multimodal Inference with Llama 3.2 Vision](https://github.com/meta-llama/llama-recipes/blob/main/recipes/quickstart/inference/local_inference/README.md#multimodal-inference)
> * [Inference on Llama Guard 1B + Multimodal inference on Llama Guard 11B-Vision](https://github.com/meta-llama/llama-recipes/blob/main/recipes/responsible_ai/llama_guard/llama_guard_text_and_vision_inference.ipynb)
<!-- markdown-link-check-enable -->
> [!NOTE]
> Llama 3.2 follows the same prompt template as Llama 3.1, with a new special token `<|image|>` representing the input image for the multimodal models.
>
> More details on the prompt templates for image reasoning, tool-calling and code interpreter can be found [on the documentation website](https://llama.meta.com/docs/model-cards-and-prompt-formats/llama3_2).
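>
> For illustration only (the documentation page linked above is authoritative), a single-turn multimodal prompt roughly takes this shape, with the `<|image|>` token placed ahead of the user's text:
>
> ```
> <|begin_of_text|><|start_header_id|>user<|end_header_id|>
>
> <|image|>Describe this image in two sentences<|eot_id|><|start_header_id|>assistant<|end_header_id|>
> ```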
## Table of Contents
- [Llama Recipes: Examples to get started using the Llama models from Meta](#llama-recipes-examples-to-get-started-using-the-llama-models-from-meta)
  - [Table of Contents](#table-of-contents)
  - [Getting Started](#getting-started)
    - [Prerequisites](#prerequisites)
To use the sensitive topics safety checker install with:
```
pip install llama-recipes[auditnlg]
```
Some recipes require langchain. To install the packages, follow the recipe description or install with:
```
pip install llama-recipes[langchain]
```
Optional dependencies can also be combined with [option1,option2].
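For example, the two extras above can be installed in a single step (shown here only as an illustration of the `[option1,option2]` syntax):
```
pip install llama-recipes[auditnlg,langchain]
```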
### Getting the Llama models
You can find Llama models on the Hugging Face hub [here](https://huggingface.co/meta-llama), **where models with `hf` in the name are already converted to Hugging Face checkpoints, so no further conversion is needed**. The conversion step below is only for the original model weights from Meta that are also hosted on the Hugging Face model hub.
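As an illustrative sketch (not part of the original text): once you have been granted access to a gated Llama checkpoint on Hugging Face, it can be fetched with the Hugging Face CLI. The model ID and target directory below are placeholders.

```bash
# Authenticate once with a Hugging Face access token, then download an example checkpoint.
huggingface-cli login
huggingface-cli download meta-llama/Llama-3.2-11B-Vision-Instruct --local-dir ./Llama-3.2-11B-Vision-Instruct
```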
#### Model conversion to Hugging Face
If you have the model checkpoints downloaded from the Meta website, you can convert them to the Hugging Face format with:
```bash
## Install Hugging Face Transformers from source
pip freeze | grep transformers ## verify it is version 4.45.0 or higher
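## Illustrative continuation (a sketch, not a verified transcript): install transformers from
## source and run its Llama conversion script; the paths and --model_size value are placeholders.
git clone git@github.com:huggingface/transformers.git
cd transformers
pip install protobuf
python src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /path/to/downloaded/llama/weights --model_size 3B --output_dir /output/path
```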
Please read [CONTRIBUTING.md](CONTRIBUTING.md) for details on our code of conduct.
## License
<!-- markdown-link-check-disable -->
See the License file for Meta Llama 3.2 [here](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE) and Acceptable Use Policy [here](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/USE_POLICY.md)
See the License file for Meta Llama 3.1 [here](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE) and Acceptable Use Policy [here](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/USE_POLICY.md)
See the License file for Meta Llama 3 [here](https://github.com/meta-llama/llama-models/blob/main/models/llama3/LICENSE) and Acceptable Use Policy [here](https://github.com/meta-llama/llama-models/blob/main/models/llama3/USE_POLICY.md)
description = "Llama-recipes is a companion project to the Llama 2 model. It's goal is to provide examples to quickly get started with fine-tuning for domain adaptation and how to run inference for the fine-tuned models."
13
+
description = "Llama-recipes is a companion project to the Llama models. It's goal is to provide examples to quickly get started with fine-tuning for domain adaptation and how to run inference for the fine-tuned models."
This recipe steps you through how to finetune a Llama 3.2 vision model on the OCR VQA task using the [OCRVQA](https://huggingface.co/datasets/HuggingFaceM4/the_cauldron/viewer/ocrvqa?row=0) dataset.
**Disclaimer**: Since our vision models already have very good OCR ability, we use the OCRVQA dataset here only to demonstrate the steps required to fine-tune our vision models with llama-recipes.
### Fine-tuning steps
We created an example script, [ocrvqa_dataset.py](./datasets/ocrvqa_dataset.py), that loads the OCRVQA dataset with a `get_custom_dataset` function and provides an `OCRVQADataCollator` class to process the image dataset.
For **full finetuning with FSDP**, we can run the following code:
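The invocation below is a representative sketch (assuming the `recipes/quickstart/finetuning/finetuning.py` entry point, the `meta-llama/Llama-3.2-11B-Vision-Instruct` checkpoint, and four GPUs; adjust names, paths, and hyperparameters to your setup):

```bash
torchrun --nnodes 1 --nproc_per_node 4 recipes/quickstart/finetuning/finetuning.py \
  --enable_fsdp --lr 1e-5 --num_epochs 3 --batch_size_training 2 \
  --model_name meta-llama/Llama-3.2-11B-Vision-Instruct \
  --dist_checkpoint_root_folder ./finetuned_model --dist_checkpoint_folder fine-tuned \
  --use_fast_kernels \
  --dataset "custom_dataset" --custom_dataset.test_split "test" \
  --custom_dataset.file "recipes/quickstart/finetuning/datasets/ocrvqa_dataset.py" \
  --run_validation True --batching_strategy padding
```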
**Note**: `--batching_strategy padding` is needed because the vision model will not work with the `packing` method.
For more details about the finetuning configurations, please read the [finetuning readme](./README.md).
### How to use a custom dataset to fine-tune the vision model
In order to use a custom dataset, please follow the steps below:
1. Create a new dataset Python file under the `recipes/quickstart/finetuning/datasets` folder.
2. In this Python file, you need to define a `get_custom_dataset(dataset_config, processor, split, split_ratio=0.9)` function that handles the data loading.
3. In this Python file, you need to define a `get_data_collator(processor)` function that returns a custom data collator that can be used by the PyTorch DataLoader.
4. This custom data collator class must have a `__call__(self, samples)` function that converts the image and text samples into the actual inputs the vision model expects.
5. Run the `torchrun` command from the section above, changing `--custom_dataset.file` to point to the new dataset Python file and adjusting the learning rate accordingly, as sketched below.
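For step 5, a hypothetical adjustment might look like the following, where `my_vision_dataset.py` and the learning rate are placeholders rather than files or values from the recipe:

```bash
torchrun --nnodes 1 --nproc_per_node 4 recipes/quickstart/finetuning/finetuning.py \
  --enable_fsdp --lr 5e-6 --num_epochs 3 --batch_size_training 2 \
  --model_name meta-llama/Llama-3.2-11B-Vision-Instruct \
  --dataset "custom_dataset" \
  --custom_dataset.file "recipes/quickstart/finetuning/datasets/my_vision_dataset.py" \
  --run_validation True --batching_strategy padding
```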