
Commit 2ba5de7

rasbt, awaelchli, and carmocca committed
New README (#1099)
Co-authored-by: awaelchli <[email protected]> Co-authored-by: Carlos Mocholí <[email protected]>
1 parent e2111af commit 2ba5de7

File tree

7 files changed: +329 -110 lines changed


README.md

Lines changed: 256 additions & 110 deletions
Large diffs are not rendered by default.

images/1.webp

45.1 KB

images/2.webp

22.7 KB

images/3.webp

19.4 KB

images/4.webp

24.3 KB

tutorials/download_model_weights.md

Lines changed: 26 additions & 0 deletions
@@ -2,6 +2,32 @@
LitGPT supports a variety of LLM architectures with publicly available weights. You can download model weights and access a list of supported models using the LitGPT `download.py` script.

| Model | Model size | Reference |
|----------------------------------------------|------------------------------------------|------------------------------------------------------------------------------------------------------------------------------|
| Code Llama by Meta AI | 7B, 13B, 34B, 70B | [Rozière et al. 2023](https://arxiv.org/abs/2308.12950) |
| Dolly by Databricks | 3B, 7B, 12B | [Conover et al. 2023](https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm) |
| Falcon by TII UAE | 7B, 40B, 180B | [TII 2023](https://falconllm.tii.ae) |
| FreeWilly2 (Stable Beluga 2) by Stability AI | 70B | [Stability AI 2023](https://stability.ai/blog/stable-beluga-large-instruction-fine-tuned-models) |
| Function Calling Llama 2 by Trelis | 7B | [Trelis et al. 2023](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-v2) |
| Gemma by Google | 2B, 7B | [Google Team, Google Deepmind](https://storage.googleapis.com/deepmind-media/gemma/gemma-report.pdf) |
| Llama 2 by Meta AI | 7B, 13B, 70B | [Touvron et al. 2023](https://arxiv.org/abs/2307.09288) |
| LongChat by LMSYS | 7B, 13B | [LongChat Team 2023](https://lmsys.org/blog/2023-06-29-longchat/) |
| Mistral and Mixtral by Mistral AI | 7B | [Mistral website](https://mistral.ai/) |
| Nous-Hermes by NousResearch | 7B, 13B, 70B | [Org page](https://huggingface.co/NousResearch) |
| OpenLLaMA by OpenLM Research | 3B, 7B, 13B | [Geng & Liu 2023](https://github.com/openlm-research/open_llama) |
| Phi by Microsoft Research | 1.3B, 2.7B | [Li et al. 2023](https://arxiv.org/abs/2309.05463) |
| Platypus by Lee et al. | 7B, 13B, 70B | [Lee, Hunter, and Ruiz 2023](https://arxiv.org/abs/2308.07317) |
| Pythia by EleutherAI | {14,31,70,160,410}M, {1,1.4,2.8,6.9,12}B | [Biderman et al. 2023](https://arxiv.org/abs/2304.01373) |
| RedPajama-INCITE by Together | 3B, 7B | [Together 2023](https://together.ai/blog/redpajama-models-v1) |
| StableCode by Stability AI | 3B | [Stability AI 2023](https://stability.ai/blog/stablecode-llm-generative-ai-coding) |
| StableLM by Stability AI | 3B, 7B | [Stability AI 2023](https://github.com/Stability-AI/StableLM) |
| StableLM Zephyr by Stability AI | 3B | [Stability AI 2023](https://stability.ai/blog/stablecode-llm-generative-ai-coding) |
| TinyLlama by Zhang et al. | 1.1B | [Zhang et al. 2023](https://github.com/jzhang38/TinyLlama) |
| Vicuna by LMSYS | 7B, 13B, 33B | [Li et al. 2023](https://lmsys.org/blog/2023-03-30-vicuna/) |
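For illustration, here is a minimal sketch of a download command. The `litgpt download` entry point, the `--repo_id` flag, and the TinyLlama repo ID are assumptions based on the CLI pattern used elsewhere in this commit; verify the exact interface against the `download.py` script or its `--help` output.

```bash
# Download the weights and tokenizer for one of the models listed above.
# The entry point, flag name, and repo ID are assumptions; check --help.
litgpt download --repo_id TinyLlama/TinyLlama-1.1B-Chat-v1.0
```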
&nbsp;

## General Instructions

tutorials/finetune.md

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
# Finetuning

We provide simple training scripts (`litgpt/finetune/*.py`) that instruction-tune a pretrained model on the [Alpaca](https://github.com/tatsu-lab/stanford_alpaca) dataset.
For example, you can use

LoRA ([Hu et al. 2021](https://arxiv.org/abs/2106.09685)):

```bash
litgpt finetune lora
```

or Adapter ([Zhang et al. 2023](https://arxiv.org/abs/2303.16199)):

```bash
litgpt finetune adapter
```

or Adapter v2 ([Gao et al. 2023](https://arxiv.org/abs/2304.15010)):

```bash
litgpt finetune adapter_v2
```

Finetuning requires at least one GPU with approximately 12 GB of memory (e.g., an RTX 3060).

It is expected that you have downloaded the pretrained weights as described above.
More details about each finetuning method and how you can apply it to your own data can be found in our technical how-to guides.

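As a concrete end-to-end sketch, the commands below chain a weight download with a LoRA run on the default Alpaca data. The `--checkpoint_dir` flag, the `checkpoints/` output layout, and the repo ID are assumptions based on common LitGPT usage rather than part of this commit; consult `litgpt finetune lora --help` for the exact interface.

```bash
# 1. Download a small base model (see tutorials/download_model_weights.md).
#    The repo ID below is only an example.
litgpt download --repo_id TinyLlama/TinyLlama-1.1B-Chat-v1.0

# 2. Finetune it with LoRA on the default Alpaca dataset.
#    --checkpoint_dir is assumed to point at the downloaded weights.
litgpt finetune lora --checkpoint_dir checkpoints/TinyLlama/TinyLlama-1.1B-Chat-v1.0
```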
### Finetuning how-to guides

These technical tutorials illustrate how to run the finetuning code.

- [Full-parameter finetuning](funetuning.md)
- [Finetune with Adapters](finetune_adapter.md)
- [Finetune with LoRA or QLoRA](finetune_lora.md)

&nbsp;

### Understanding finetuning -- conceptual tutorials

Looking for conceptual tutorials and explanations? We have some additional articles below:

- [Understanding Parameter-Efficient Finetuning of Large Language Models: From Prefix Tuning to LLaMA-Adapters](https://lightning.ai/pages/community/article/understanding-llama-adapters/)
- [Parameter-Efficient LLM Finetuning With Low-Rank Adaptation (LoRA)](https://lightning.ai/pages/community/tutorial/lora-llm/)
