This repository was archived by the owner on Sep 10, 2025. It is now read-only.

Commit 1e2eec0

Merge branch 'main' into lessw2020/demo_metrics
2 parents 285860e + 7ad9ba2 commit 1e2eec0

8 files changed: +153 −194 lines changed

README.md

Lines changed: 39 additions & 47 deletions
@@ -25,6 +25,7 @@ torchchat is a small codebase showcasing the ability to run large language model

## Highlights
+- [[New!!] Multimodal Support for Llama 3.2 11B](docs/multimodal.md)
- Command line interaction with popular LLMs such as Llama 3, Llama 2, Stories, Mistral and more
- PyTorch-native execution with performance
- Supports popular hardware and OS
@@ -37,6 +38,38 @@ torchchat is a small codebase showcasing the ability to run large language model
- Multiple execution modes including: Python (Eager, Compile) or Native (AOT Inductor (AOTI), ExecuTorch)


+## Models
+
+The following models are supported by torchchat and have associated aliases.
+
+| Model | Mobile Friendly | Notes |
+|------------------|---|---------------------|
+|[meta-llama/Meta-Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct)||Tuned for `chat`. Alias to `llama3.2-3b`.|
+|[meta-llama/Meta-Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B)||Best for `generate`. Alias to `llama3.2-3b-base`.|
+|[meta-llama/Llama-Guard-3-1B](https://huggingface.co/meta-llama/Llama-Guard-3-1B)||Tuned for classification. Alias to `llama3-1b-guard`.|
+|[meta-llama/Meta-Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct)||Tuned for `chat`. Alias to `llama3.2-1b`.|
+|[meta-llama/Meta-Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B)||Best for `generate`. Alias to `llama3.2-1b-base`.|
+|[meta-llama/Llama-3.2-11B-Vision-Instruct](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision-Instruct)||Multimodal (Image + Text). Tuned for `chat`. Alias to `llama3.2-11B`.|
+|[meta-llama/Llama-3.2-11B-Vision](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision)||Multimodal (Image + Text). Tuned for `generate`. Alias to `llama3.2-11B-base`.|
+|[meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct)||Tuned for `chat`. Alias to `llama3.1`.|
+|[meta-llama/Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B)||Best for `generate`. Alias to `llama3.1-base`.|
+|[meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)||Tuned for `chat`. Alias to `llama3`.|
+|[meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)||Best for `generate`. Alias to `llama3-base`.|
+|[meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)||Tuned for `chat`. Alias to `llama2`.|
+|[meta-llama/Llama-2-13b-chat-hf](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf)||Tuned for `chat`. Alias to `llama2-13b-chat`.|
+|[meta-llama/Llama-2-70b-chat-hf](https://huggingface.co/meta-llama/Llama-2-70b-chat-hf)||Tuned for `chat`. Alias to `llama2-70b-chat`.|
+|[meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf)||Best for `generate`. Alias to `llama2-base`.|
+|[meta-llama/CodeLlama-7b-Python-hf](https://huggingface.co/meta-llama/CodeLlama-7b-Python-hf)||Tuned for Python and `generate`. Alias to `codellama`.|
+|[meta-llama/CodeLlama-34b-Python-hf](https://huggingface.co/meta-llama/CodeLlama-34b-Python-hf)||Tuned for Python and `generate`. Alias to `codellama-34b`.|
+|[mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)||Best for `generate`. Alias to `mistral-7b-v01-base`.|
+|[mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1)||Tuned for `chat`. Alias to `mistral-7b-v01-instruct`.|
+|[mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)||Tuned for `chat`. Alias to `mistral`.|
+|[tinyllamas/stories15M](https://huggingface.co/karpathy/tinyllamas/tree/main)||Toy model for `generate`. Alias to `stories15M`.|
+|[tinyllamas/stories42M](https://huggingface.co/karpathy/tinyllamas/tree/main)||Toy model for `generate`. Alias to `stories42M`.|
+|[tinyllamas/stories110M](https://huggingface.co/karpathy/tinyllamas/tree/main)||Toy model for `generate`. Alias to `stories110M`.|
+|[openlm-research/open_llama_7b](https://huggingface.co/openlm-research/open_llama_7b)||Best for `generate`. Alias to `open-llama`.|
+
## Installation
The following steps require that you have [Python 3.10](https://www.python.org/downloads/release/python-3100/) installed.

@@ -105,7 +138,6 @@ __Evaluation__ (eval)
* This command tests model fidelity via EleutherAI's [lm_evaluation_harness](https://github.com/EleutherAI/lm-evaluation-harness).
* More information is provided in the [Evaluation](https://github.com/pytorch/torchchat?tab=readme-ov-file#eval) section.

-
## Download Weights
Most models use Hugging Face as the distribution channel, so you will need to create a Hugging Face account.
Create a Hugging Face user access token [as documented here](https://huggingface.co/docs/hub/en/security-tokens) with the `write` role.
@@ -118,9 +150,13 @@ Log into Hugging Face:
huggingface-cli login
```

-Once this is done, torchchat will be able to download model artifacts from
-Hugging Face.
+Take a look at the available models:

+```bash
+python3 torchchat.py list
+```
+
+Then download one for testing (this README uses llama3.1):
```
python3 torchchat.py download llama3.1
```
@@ -134,12 +170,6 @@ python3 torchchat.py download llama3.1
<details>
<summary>Additional Model Inventory Management Commands</summary>

-### List
-This subcommand shows the available models
-```bash
-python3 torchchat.py list
-```
-
### Where
This subcommand shows the location of a particular model.
```bash
@@ -511,44 +541,6 @@ the same way you would to generate:
python3 torchchat.py eval llama3.1 --pte-path llama3.1.pte --limit 5
```

-
-
-## Models
-
-The following models are supported by torchchat and have associated aliases.
-
-| Model | Mobile Friendly | Notes |
-|------------------|---|---------------------|
-|[meta-llama/Meta-Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct)||Tuned for `chat`. Alias to `llama3.2-3b`.|
-|[meta-llama/Meta-Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B)||Best for `generate`. Alias to `llama3.2-3b-base`.|
-|[meta-llama/Llama-Guard-3-1B](https://huggingface.co/meta-llama/Llama-Guard-3-1B)||Tuned for classification. Alias to `llama3-1b-guard`.|
-|[meta-llama/Meta-Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct)||Tuned for `chat`. Alias to `llama3.2-1b`.|
-|[meta-llama/Meta-Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B)||Best for `generate`. Alias to `llama3.2-1b-base`.|
-|[meta-llama/Llama-3.2-11B-Vision-Instruct](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision-Instruct)||Multimodal (Image + Text). Tuned for `chat`. Alias to `llama3.2-11B`.|
-|[meta-llama/Llama-3.2-11B-Vision](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision)||Multimodal (Image + Text). Tuned for `generate`. Alias to `llama3.2-11B-base`.|
-|[meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct)||Tuned for `chat`. Alias to `llama3.1`.|
-|[meta-llama/Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B)||Best for `generate`. Alias to `llama3.1-base`.|
-|[meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)||Tuned for `chat`. Alias to `llama3`.|
-|[meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)||Best for `generate`. Alias to `llama3-base`.|
-|[meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)||Tuned for `chat`. Alias to `llama2`.|
-|[meta-llama/Llama-2-13b-chat-hf](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf)||Tuned for `chat`. Alias to `llama2-13b-chat`.|
-|[meta-llama/Llama-2-70b-chat-hf](https://huggingface.co/meta-llama/Llama-2-70b-chat-hf)||Tuned for `chat`. Alias to `llama2-70b-chat`.|
-|[meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf)||Best for `generate`. Alias to `llama2-base`.|
-|[meta-llama/CodeLlama-7b-Python-hf](https://huggingface.co/meta-llama/CodeLlama-7b-Python-hf)||Tuned for Python and `generate`. Alias to `codellama`.|
-|[meta-llama/CodeLlama-34b-Python-hf](https://huggingface.co/meta-llama/CodeLlama-34b-Python-hf)||Tuned for Python and `generate`. Alias to `codellama-34b`.|
-|[mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)||Best for `generate`. Alias to `mistral-7b-v01-base`.|
-|[mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1)||Tuned for `chat`. Alias to `mistral-7b-v01-instruct`.|
-|[mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)||Tuned for `chat`. Alias to `mistral`.|
-|[tinyllamas/stories15M](https://huggingface.co/karpathy/tinyllamas/tree/main)||Toy model for `generate`. Alias to `stories15M`.|
-|[tinyllamas/stories42M](https://huggingface.co/karpathy/tinyllamas/tree/main)||Toy model for `generate`. Alias to `stories42M`.|
-|[tinyllamas/stories110M](https://huggingface.co/karpathy/tinyllamas/tree/main)||Toy model for `generate`. Alias to `stories110M`.|
-|[openlm-research/open_llama_7b](https://huggingface.co/openlm-research/open_llama_7b)||Best for `generate`. Alias to `open-llama`.|
-
-While we describe how to use torchchat using the popular llama3 model,
-you can perform the example commands with any of these models.
-
## Design Principles

torchchat embodies PyTorch’s design philosophy [details](https://pytorch.org/docs/stable/community/design.html), especially "usability over everything else".

docs/multimodal.md

Lines changed: 37 additions & 4 deletions
@@ -2,7 +2,7 @@

Released on September 25th, 2024, **Llama3.2 11B Vision** is torchchat's first multimodal model.

-This page goes over the different commands you can run with LLama 3.2 11B Vision.
+This page goes over the different commands you can run with Llama 3.2 11B Vision.

## Model Access

@@ -44,7 +44,42 @@ python3 torchchat.py server llama3.2-11B

In another terminal, query the server using `curl`. This query might take a few minutes to respond.

-**We are currently debugging the server integration and will have updated examples shortly.**
+<details>
+<summary>Example Query</summary>
+
+Setting `stream` to "true" in the request emits a response in chunks. If `stream` is unset or not "true", then the client will await the full response from the server.
+
+**Example Input + Output**
+
+```
+curl http://127.0.0.1:5000/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "llama3.2",
+    "messages": [
+      {
+        "role": "user",
+        "content": [
+          {
+            "type": "text",
+            "text": "What'\''s in this image?"
+          },
+          {
+            "type": "image_url",
"image_url": "data:image/jpeg;base64,iVBORw0KGgoAAAANSUhEUgAAAG0AAABmCAYAAADBPx+VAAAACXBIWXMAAAsTAAALEwEAmpwYAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAA3VSURBVHgB7Z27r0zdG8fX743i1bi1ikMoFMQloXRpKFFIqI7LH4BEQ+NWIkjQuSWCRIEoULk0gsK1kCBI0IhrQVT7tz/7zZo888yz1r7MnDl7z5xvsjkzs2fP3uu71nNfa7lkAsm7d++Sffv2JbNmzUqcc8m0adOSzZs3Z+/XES4ZckAWJEGWPiCxjsQNLWmQsWjRIpMseaxcuTKpG/7HP27I8P79e7dq1ars/yL4/v27S0ejqwv+cUOGEGGpKHR37tzJCEpHV9tnT58+dXXCJDdECBE2Ojrqjh071hpNECjx4cMHVycM1Uhbv359B2F79+51586daxN/+pyRkRFXKyRDAqxEp4yMlDDzXG1NPnnyJKkThoK0VFd1ELZu3TrzXKxKfW7dMBQ6bcuWLW2v0VlHjx41z717927ba22U9APcw7Nnz1oGEPeL3m3p2mTAYYnFmMOMXybPPXv2bNIPpFZr1NHn4HMw0KRBjg9NuRw95s8PEcz/6DZELQd/09C9QGq5RsmSRybqkwHGjh07OsJSsYYm3ijPpyHzoiacg35MLdDSIS/O1yM778jOTwYUkKNHWUzUWaOsylE00MyI0fcnOwIdjvtNdW/HZwNLGg+sR1kMepSNJXmIwxBZiG8tDTpEZzKg0GItNsosY8USkxDhD0Rinuiko2gfL/RbiD2LZAjU9zKQJj8RDR0vJBR1/Phx9+PHj9Z7REF4nTZkxzX4LCXHrV271qXkBAPGfP/atWvu/PnzHe4C97F48eIsRLZ9+3a3f/9+87dwP1JxaF7/3r17ba+5l4EcaVo0lj3SBq5kGTJSQmLWMjgYNei2GPT1MuMqGTDEFHzeQSP2wi/jGnkmPJ/nhccs44jvDAxpVcxnq0F6eT8h4ni/iIWpR5lPyA6ETkNXoSukvpJAD3AsXLiwpZs49+fPn5ke4j10TqYvegSfn0OnafC+Tv9ooA/JPkgQysqQNBzagXY55nO/oa1F7qvIPWkRL12WRpMWUvpVDYmxAPehxWSe8ZEXL20sadYIozfmNch4QJPAfeJgW3rNsnzphBKNJM2KKODo1rVOMRYik5ETy3ix4qWNI81qAAirizgMIc+yhTytx0JWZuNI03qsrgWlGtwjoS9XwgUhWGyhUaRZZQNNIEwCiXD16tXcAHUs79co0vSD8rrJCIW98pzvxpAWyyo3HYwqS0+H0BjStClcZJT5coMm6D2LOF8TolGJtK9fvyZpyiC5ePFi9nc/oJU4eiEP0jVoAnHa9wyJycITMP78+eMeP37sXrx44d6+fdt6f82aNdkx1pg9e3Zb5W+RSRE+n+VjksQWifvVaTKFhn5O8my63K8Qabdv33b379/PiAP//vuvW7BggZszZ072/+TJk91YgkafPn166zXB1rQHFvouAWHq9z3SEevSUerqCn2/dDCeta2jxYbr69evk4MHDyY7d+7MjhMnTiTPnz9Pfv/+nfQT2ggpO2dMF8cghuoM7Ygj5iWCqRlGFml0QC/ftGmTmzt3rmsaKDsgBSPh0/8yPeLLBihLkOKJc0jp8H8vUzcxIA1k6QJ/c78tWEyj5P3o4u9+jywNPdJi5rAH9x0KHcl4Hg570eQp3+vHXGyrmEeigzQsQsjavXt38ujRo44LQuDDhw+TW7duRS1HGgMxhNXHgflaNTOsHyKvHK5Ijo2jbFjJBQK9YwFd6RVMzfgRBmEfP37suBBm/p49e1qjEP2mwTViNRo0VJWH1deMXcNK08uUjVUu7s/zRaL+oLNxz1bpANco4npUgX4G2eFbpDFyQoQxojBCpEGSytmOH8qrH5Q9vuzD6ofQylkCUmh8DBAr+q8JCyV
NtWQIidKQE9wNtLSQnS4jDSsxNHogzFuQBw4cyM61UKVsjfr3ooBkPSqqQHesUPWVtzi9/vQi1T+rJj7WiTz4Pt/l3LxUkr5P2VYZaZ4URpsE+st/dujQoaBBYokbrz/8TJNQYLSonrPS9kUaSkPeZyj1AWSj+d+VBoy1pIWVNed8P0Ll/ee5HdGRhrHhR5GGN0r4LGZBaj8oFDJitBTJzIZgFcmU0Y8ytWMZMzJOaXUSrUs5RxKnrxmbb5YXO9VGUhtpXldhEUogFr3IzIsvlpmdosVcGVGXFWp2oU9kLFL3dEkSz6NHEY1sjSRdIuDFWEhd8KxFqsRi1uM/nz9/zpxnwlESONdg6dKlbsaMGS4EHFHtjFIDHwKOo46l4TxSuxgDzi+rE2jg+BaFruOX4HXa0Nnf1lwAPufZeF8/r6zD97WK2qFnGjBxTw5qNGPxT+5T/r7/7RawFC3j4vTp09koCxkeHjqbHJqArmH5UrFKKksnxrK7FuRIs8STfBZv+luugXZ2pR/pP9Ois4z+TiMzUUkUjD0iEi1fzX8GmXyuxUBRcaUfykV0YZnlJGKQpOiGB76x5GeWkWWJc3mOrK6S7xdND+W5N6XyaRgtWJFe13GkaZnKOsYqGdOVVVbGupsyA/l7emTLHi7vwTdirNEt0qxnzAvBFcnQF16xh/TMpUuXHDowhlA9vQVraQhkudRdzOnK+04ZSP3DUhVSP61YsaLtd/ks7ZgtPcXqPqEafHkdqa84X6aCeL7YWlv6edGFHb+ZFICPlljHhg0bKuk0CSvVznWsotRu433alNdFrqG45ejoaPCaUkWERpLXjzFL2Rpllp7PJU2a/v7Ab8N05/9t27Z16KUqoFGsxnI9EosS2niSYg9SpU6B4JgTrvVW1flt1sT+0ADIJU2maXzcUTraGCRaL1Wp9rUMk16PMom8QhruxzvZIegJjFU7LLCePfS8uaQdPny4jTTL0dbee5mYokQsXTIWNY46kuMbnt8Kmec+LGWtOVIl9cT1rCB0V8WqkjAsRwta93TbwNYoGKsUSChN44lgBNCoHLHzquYKrU6qZ8lolCIN0Rh6cP0Q3U6I6IXILYOQI513hJaSKAorFpuHXJNfVlpRtmYBk1Su1obZr5dnKAO+L10Hrj3WZW+E3qh6IszE37F6EB+68mGpvKm4eb9bFrlzrok7fvr0Kfv727dvWRmdVTJHw0qiiCUSZ6wCK+7XL/AcsgNyL74DQQ730sv78Su7+t/A36MdY0sW5o40ahslXr58aZ5HtZB8GH64m9EmMZ7FpYw4T6QnrZfgenrhFxaSiSGXtPnz57e9TkNZLvTjeqhr734CNtrK41L40sUQckmj1lGKQ0rC37x544r8eNXRpnVE3ZZY7zXo8NomiO0ZUCj2uHz58rbXoZ6gc0uA+F6ZeKS/jhRDUq8MKrTho9fEkihMmhxtBI1DxKFY9XLpVcSkfoi8JGnToZO5sU5aiDQIW716ddt7ZLYtMQlhECdBGXZZMWldY5BHm5xgAroWj4C0hbYkSc/jBmggIrXJWlZM6pSETsEPGqZOndr2uuuR5rF169a2HoHPdurUKZM4CO1WTPqaDaAd+GFGKdIQkxAn9RuEWcTRyN2KSUgiSgF5aWzPTeA/lN5rZubMmR2bE4SIC4nJoltgAV/dVefZm72AtctUCJU2CMJ327hxY9t7EHbkyJFseq+EJSY16RPo3Dkq1kkr7+q0bNmyDuLQcZBEPYmHVdOBiJyIlrRDq41YPWfXOxUysi5fvtyaj+2BpcnsUV/oSoEMOk2CQGlr4ckhBwaetBhjCwH0ZHtJROPJkyc7UjcYLDjmrH7ADTEBXFfOYmB0k9oYBOjJ8b4aOYSe7QkKcYhFlq3QYLQhSidNmtS2RATwy8YOM3EQJsUjKiaWZ+vZToUQgzhkHXudb/PW5YMHD9yZM2faPsMwoc7RciYJXbGuBqJ1UIGKKLv915jsvgtJxCZDubdXr16
5mzdvtr1Hz5LONA8jrUwKPqsmVesKa49S3Q4WxmRPUEYdTjgiUcfUwLx589ySJUva3oMkP6IYddq6HMS4o55xBJBUeRjzfa4Zdeg56QZ43LhxoyPo7Lf1kNt7oO8wWAbNwaYjIv5lhyS7kRf96dvm5Jah8vfvX3flyhX35cuX6HfzFHOToS1H4BenCaHvO8pr8iDuwoUL7tevX+b5ZdbBair0xkFIlFDlW4ZknEClsp/TzXyAKVOmmHWFVSbDNw1l1+4f90U6IY/q4V27dpnE9bJ+v87QEydjqx/UamVVPRG+mwkNTYN+9tjkwzEx+atCm/X9WvWtDtAb68Wy9LXa1UmvCDDIpPkyOQ5ZwSzJ4jMrvFcr0rSjOUh+GcT4LSg5ugkW1Io0/SCDQBojh0hPlaJdah+tkVYrnTZowP8iq1F1TgMBBauufyB33x1v+NWFYmT5KmppgHC+NkAgbmRkpD3yn9QIseXymoTQFGQmIOKTxiZIWpvAatenVqRVXf2nTrAWMsPnKrMZHz6bJq5jvce6QK8J1cQNgKxlJapMPdZSR64/UivS9NztpkVEdKcrs5alhhWP9NeqlfWopzhZScI6QxseegZRGeg5a8C3Re1Mfl1ScP36ddcUaMuv24iOJtz7sbUjTS4qBvKmstYJoUauiuD3k5qhyr7QdUHMeCgLa1Ear9NquemdXgmum4fvJ6w1lqsuDhNrg1qSpleJK7K3TF0Q2jSd94uSZ60kK1e3qyVpQK6PVWXp2/FC3mp6jBhKKOiY2h3gtUV64TWM6wDETRPLDfSakXmH3w8g9Jlug8ZtTt4kVF0kLUYYmCCtD/DrQ5YhMGbA9L3ucdjh0y8kOHW5gU/VEEmJTcL4Pz/f7mgoAbYkAAAAAElFTkSuQmCC"
+          }
+        ]
+      }
+    ],
+    "max_tokens": 300
+  }'
+```
+
+```
+{"id": "chatcmpl-cb7b39af-a22e-4f71-94a8-17753fa0d00c", "choices": [{"message": {"role": "assistant", "content": "The image depicts a simple black and white cartoon-style drawing of an animal face. It features a profile view, complete with two ears, expressive eyes, and a partial snout. The animal looks to the left, with its eye and mouth implied, suggesting that the drawn face might belong to a rabbit, dog, or pig. The graphic face has a bold black outline and a smaller, solid black nose. A small circle, forming part of the face, has a white background with two black quirkly short and long curved lines forming an outline of what was likely a mouth, complete with two teeth. The presence of the curve lines give the impression that the animal is smiling or speaking. Grey and black shadows behind the right ear and mouth suggest that this face is looking left and upwards. Given the prominent outline of the head and the outline of the nose, it appears that the depicted face is most likely from the side profile of a pig, although the ears make it seem like a dog and the shape of the nose makes it seem like a rabbit. Overall, it seems that this image, possibly part of a character illustration, is conveying a playful or expressive mood through its design and positioning."}, "finish_reason": "stop"}], "created": 1727487574, "model": "llama3.2", "system_fingerprint": "cpu_torch.float16", "object": "chat.completion"}
+```
+
+</details>
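The same request can also be issued from Python rather than curl. The sketch below assumes the server started above is listening on 127.0.0.1:5000; the `build_image_query` and `post_query` helpers are illustrative names, not part of torchchat, and mirror the payload shape of the curl example.

```python
import base64
import json
import urllib.request


def build_image_query(image_bytes: bytes, question: str, stream: bool = False) -> dict:
    """Build an OpenAI-style chat completion payload with an inline base64 image."""
    data_url = "data:image/jpeg;base64," + base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": "llama3.2",
        "stream": stream,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": data_url},
                ],
            }
        ],
        "max_tokens": 300,
    }


def post_query(payload: dict, url: str = "http://127.0.0.1:5000/v1/chat/completions") -> dict:
    """POST the payload to the torchchat server and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example usage (requires a running server and a local image file):
#   payload = build_image_query(open("image.jpg", "rb").read(), "What's in this image?")
#   reply = post_query(payload)
#   print(reply["choices"][0]["message"]["content"])
```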

## Browser

@@ -58,8 +93,6 @@ First, follow the steps in the Server section above to start a local server. The
streamlit run torchchat/usages/browser.py
```

-**We are currently debugging the browser integration and will have updated examples shortly.**
-
---

# Future Work

install/install_requirements.sh

Lines changed: 4 additions & 6 deletions
@@ -52,6 +52,9 @@ PYTORCH_NIGHTLY_VERSION=dev20240901
# Nightly version for torchvision
VISION_NIGHTLY_VERSION=dev20240901

+# Nightly version for torchtune
+TUNE_NIGHTLY_VERSION=dev20240928
+
# Uninstall triton, as nightly will depend on pytorch-triton, which is one and the same
(
  set -x
@@ -72,6 +75,7 @@ fi
REQUIREMENTS_TO_INSTALL=(
  torch=="2.5.0.${PYTORCH_NIGHTLY_VERSION}"
  torchvision=="0.20.0.${VISION_NIGHTLY_VERSION}"
+  torchtune=="0.3.0.${TUNE_NIGHTLY_VERSION}"
)

# Install the requirements. --extra-index-url tells pip to look for package
@@ -87,12 +91,6 @@ REQUIREMENTS_TO_INSTALL=(
  $PIP_EXECUTABLE install torchao=="0.5.0"
)

-# Rely on the latest torchtune for flamingo support
-(
-  set -x
-  $PIP_EXECUTABLE install -I git+https://github.com/pytorch/torchtune.git@d002d45e3ec700fa770d9dcc61b02c59e2507bf6
-)
-
if [[ -x "$(command -v nvidia-smi)" ]]; then
(
  set -x
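For reference, with the pins in this change the requirement strings expand as follows. This is a sketch of the expansion only; the actual script hands these strings to `$PIP_EXECUTABLE` with the nightly `--extra-index-url` configured elsewhere in the file.

```shell
# Nightly pins as set in install_requirements.sh after this commit
PYTORCH_NIGHTLY_VERSION=dev20240901
VISION_NIGHTLY_VERSION=dev20240901
TUNE_NIGHTLY_VERSION=dev20240928

# Requirement strings handed to pip
REQUIREMENTS_TO_INSTALL=(
  torch=="2.5.0.${PYTORCH_NIGHTLY_VERSION}"
  torchvision=="0.20.0.${VISION_NIGHTLY_VERSION}"
  torchtune=="0.3.0.${TUNE_NIGHTLY_VERSION}"
)

echo "${REQUIREMENTS_TO_INSTALL[@]}"
# prints: torch==2.5.0.dev20240901 torchvision==0.20.0.dev20240901 torchtune==0.3.0.dev20240928
```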
