
Commit b273a75

* Add new readmes
* Move prompt-engineering notebook to folder
* Delete unnecessary samsum file
1 parent 0634c43 commit b273a75

4 files changed: +38 −3 lines changed


recipes/inference/local_inference/README.md

Lines changed: 3 additions & 3 deletions

@@ -61,7 +61,7 @@ python inference.py --model_name <training_config.output_dir> --peft_model <trai
 
 ```
 
-## Loading back FSDP checkpoints
+## Inference with FSDP checkpoints
 
 In case you have fine-tuned your model with pure FSDP and saved the checkpoints with "SHARDED_STATE_DICT" as shown [here](../../../src/llama_recipes/configs/fsdp.py), you can use this converter script to convert the FSDP sharded checkpoints into HuggingFace checkpoints. This enables you to use the inference script normally as mentioned above.
 
 **To convert the checkpoint use the following command**:
@@ -82,6 +82,6 @@ By default, training parameters are saved in `train_params.yaml` in the path wher
 Then run inference using:
 
 ```bash
-python inference.py --model_name <training_config.output_dir> --prompt_file <test_prompt_file>
+python inference.py --model_name <training_config.output_dir> --prompt_file <test_prompt_file>
 
-```
+```
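Concretely, the conversion-then-inference flow described in the renamed section looks roughly like the sketch below. The module path and flag names (`checkpoint_converter_fsdp_hf`, `--fsdp_checkpoint_path`, `--consolidated_model_path`, `--HF_model_path_or_name`) follow the llama-recipes layout but are assumptions here, not confirmed by this diff; all paths are hypothetical placeholders.

```bash
# Sketch of converting FSDP sharded checkpoints to HuggingFace format and
# running inference on the result. Module path and flag names are assumed
# from the llama-recipes layout; all paths are hypothetical placeholders.
python -m llama_recipes.inference.checkpoint_converter_fsdp_hf \
    --fsdp_checkpoint_path /path/to/fsdp/sharded/checkpoints \
    --consolidated_model_path /path/to/save/hf/checkpoints \
    --HF_model_path_or_name meta-llama/Meta-Llama-3-8B

# The converted checkpoint can then be used with the inference script as usual.
python inference.py \
    --model_name /path/to/save/hf/checkpoints \
    --prompt_file /path/to/test_prompt.txt
```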

recipes/quickstart/README.md

Lines changed: 28 additions & 0 deletions

@@ -0,0 +1,28 @@
+## Llama-Recipes Quickstart
+
+If you are new to developing with Meta Llama models, this is where you should start. This folder contains introductory-level notebooks covering different techniques for working with Meta Llama.
+
+* The [Running_Llama3_Anywhere](./Running_Llama3_Anywhere/) notebooks demonstrate how to run Llama inference on Linux, Mac and Windows platforms using the appropriate tooling.
+* The [Prompt_Engineering_with_Llama_3](./prompt_engineering/Prompt_Engineering_with_Llama_3.ipynb) notebook showcases the various ways to elicit appropriate outputs from Llama. Take this notebook for a spin to get a feel for how Llama responds to different inputs and generation parameters.
+* The [inference](./inference/) folder contains scripts to deploy Llama for inference on server and mobile. See also [vllm](../3p_integrations/vllm/) and [tgi](../3p_integrations/tgi/) for hosting Llama on open-source model servers.
+* The [RAG](./RAG/) folder contains a simple Retrieval-Augmented Generation application built with Llama 3.
+* The [finetuning](./finetuning/) folder contains resources to help you finetune Llama 3 on your custom datasets, for both single- and multi-GPU setups. The scripts use the native llama-recipes finetuning code found in [finetuning.py](../../src/llama_recipes/finetuning.py), which supports these features:
+
+| Feature                                        | Supported |
+| ---------------------------------------------- | --------- |
+| HF support for finetuning                      | ✅ |
+| Deferred initialization (meta init)            | ✅ |
+| HF support for inference                       | ✅ |
+| Low CPU mode for multi GPU                     | ✅ |
+| Mixed precision                                | ✅ |
+| Single node quantization                       | ✅ |
+| Flash attention                                | ✅ |
+| PEFT                                           | ✅ |
+| Activation checkpointing FSDP                  | ✅ |
+| Hybrid Sharded Data Parallel (HSDP)            | ✅ |
+| Dataset packing & padding                      | ✅ |
+| BF16 optimizer (pure BF16)                     | ✅ |
+| Gradient accumulation                          | ✅ |
+| CPU offloading                                 | ✅ |
+| FSDP checkpoint conversion to HF for inference | ✅ |
+| W&B experiment tracker                         | ✅ |
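To give a feel for the finetuning entry point the table refers to, here is a minimal single-GPU LoRA run. The flags (`--use_peft`, `--peft_method`, `--quantization`) are assumptions based on the llama-recipes finetuning CLI and may differ between versions; the model name and output path are placeholders.

```bash
# Minimal single-GPU PEFT (LoRA) finetuning sketch. Flag names are assumed
# from the llama-recipes finetuning CLI; model name and output path are
# hypothetical placeholders.
python -m llama_recipes.finetuning \
    --use_peft --peft_method lora --quantization \
    --model_name meta-llama/Meta-Llama-3-8B \
    --output_dir /path/to/save/peft/model
```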
recipes/quickstart/inference/README.md

Lines changed: 7 additions & 0 deletions

@@ -0,0 +1,7 @@
+## Quickstart > Inference
+
+This folder contains scripts to get you started with inference on Meta Llama models.
+
+* [code_llama](./code_llama/) contains scripts for tasks relating to code generation using CodeLlama
+* [local_inference](./local_inference/) contains scripts for memory-efficient inference on servers and local machines
+* [mobile_inference](./mobile_inference/) has scripts using MLC to serve Llama on Android (h/t to OctoAI for the contribution!)
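As a usage example for the local_inference entry, the `--model_name` / `--prompt_file` pattern mirrors the local_inference README diffed above; the model name and prompt file here are hypothetical placeholders.

```bash
# Hypothetical model and prompt file; the --model_name / --prompt_file
# pattern follows the local_inference README shown earlier in this commit.
python local_inference/inference.py \
    --model_name meta-llama/Meta-Llama-3-8B \
    --prompt_file my_test_prompt.txt
```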
