Commit a1219ae

Update document
1 parent 1343862 commit a1219ae

3 files changed: +33 -19 lines changed

runtime/ggma/examples/generate_text/tinyllama.md renamed to runtime/ggma/examples/generate_text/README.md

Lines changed: 31 additions & 17 deletions
@@ -1,4 +1,4 @@
-# TinyLlama Example Documentation
+# TinyLlama Example
 
 This document provides a step‑by‑step guide for generating and processing a text generation model.
 
@@ -32,34 +32,37 @@ This document provides a step‑by‑step guide for generating and processing a
 
 ## Generating Model Files
 
-Run the provided scripts to create the prefill and decode Circle model files:
+1. Run the provided scripts to create the prefill and decode Circle model files:
 
 ```bash
-python prefill.py # Generates tinyllama.prefill.circle
-python decode.py # Generates tinyllama.decode.circle
+python prefill.py # Generates prefill.circle
+python decode.py # Generates decode_.circle
 ```
 
 You can verify the generated files:
 
 ```bash
 ls -lh *.circle
 # Expected output:
-# -rw-rw-r-- 1 gyu gyu 18M Nov 14 14:09 tinyllama.decode.circle
-# -rw-rw-r-- 1 gyu gyu 18M Nov 14 14:09 tinyllama.prefill.circle
+# -rw-rw-r-- 1 gyu gyu 18M Nov 14 14:09 decode_.circle
+# -rw-rw-r-- 1 gyu gyu 18M Nov 14 14:09 prefill.circle
 ```
 
-## Full Processing Pipeline
+2. Update tinyllama.decode.circle
 
-The following pipeline shows how to chain several tools to transform the model:
+Add [tools/o2o](https://github.com/Samsung/ONE/pull/16233) to PATH.
 
 ```bash
-with.py tinyllama.decode.circle |
-fuse.attention.py \
-fuse.bmm_lhs_const.py | reshape.fc_weight.py | \
+export PATH=../../../../tools/o2o:$PATH
+```
+
+Then, run the following:
+
+```bash
+fuse.attention.py < decode_.circle | \
+fuse.bmm_lhs_const.py | \
 reshape.io.py input --by_shape [1,16,30,4] [1,16,32,4] | \
 transpose.io.kvcache.py | \
-remove.io.py output --keep_by_id 0 | \
-select.op.py --by_id 0-181 | \
 gc.py | \
 retype.input_ids.py > decode.circle
 ```
@@ -68,17 +71,28 @@ retype.input_ids.py > decode.circle
 
 | Tool | Purpose |
 |------|---------|
-| `with.py` | Reads the Circle model from stdin and writes it to stdout. |
 | `fuse.attention.py` | Fuses attention‑related operators for optimization. |
 | `fuse.bmm_lhs_const.py` | Fuses constant left‑hand side matrices in batch matrix multiplication. |
-| `reshape.fc_weight.py` | Reshapes fully‑connected layer weights. |
 | `reshape.io.py input --by_shape [...]` | Reshapes input tensors to the specified shapes. |
 | `transpose.io.kvcache.py` | Transposes the KV‑cache tensors. |
-| `remove.io.py output --keep_by_id 0` | Keeps only the output tensor with ID 0, removing the rest. |
-| `select.op.py --by_id 0-181` | Selects operators with IDs from 0 to 181. |
 | `gc.py` | Performs garbage collection, removing unused tensors and operators. |
 | `retype.input_ids.py` | Changes the data type of input IDs as needed. |
 | `> decode.circle` | Saves the final processed model to `decode.circle`. |
 
 
 Feel free to adjust the pipeline arguments (e.g., shapes, IDs) to suit your specific model configuration.
+
+
+3. Merge prefill and decode circle into 1 circle
+
+```bash
+merge.circles.py prefill.circle decode.circle > tinyllama.circle
+```
+
+```
+ls -lh *.circle
+-rw-rw-r-- 1 gyu gyu 18M Nov 21 17:43 decode.circle
+-rw-rw-r-- 1 gyu gyu 18M Nov 21 17:43 decode_.circle
+-rw-rw-r-- 1 gyu gyu 18M Nov 18 17:35 prefill.circle
+-rw-rw-r-- 1 gyu gyu 18M Nov 21 17:43 tinyllama.circle
+```
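
The decode pipeline added to the README in step 2 can also be driven from Python instead of a shell one-liner. The following is a minimal sketch, not part of this commit. It assumes the `tools/o2o` scripts are on `PATH` and behave as stdin-to-stdout filters, as the shell pipeline implies. `DECODE_STAGES` and `run_pipeline` are hypothetical names introduced for illustration; the stage arguments are copied from the README as-is.

```python
import subprocess
from typing import List, Sequence

# Stages copied from the README pipeline in this commit. The tools come from
# tools/o2o and are assumed (not verified here) to be available on PATH.
DECODE_STAGES: List[List[str]] = [
    ["fuse.attention.py"],
    ["fuse.bmm_lhs_const.py"],
    ["reshape.io.py", "input", "--by_shape", "[1,16,30,4]", "[1,16,32,4]"],
    ["transpose.io.kvcache.py"],
    ["gc.py"],
    ["retype.input_ids.py"],
]


def run_pipeline(stages: Sequence[Sequence[str]], src: str, dst: str) -> None:
    """Hypothetical helper: pipe the file `src` through every stage into `dst`."""
    if not stages:
        raise ValueError("at least one stage is required")
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        procs = []
        prev = fin
        for i, argv in enumerate(stages):
            is_last = i == len(stages) - 1
            proc = subprocess.Popen(
                list(argv),
                stdin=prev,
                stdout=fout if is_last else subprocess.PIPE,
            )
            if prev is not fin:
                # Drop our copy of the pipe so only the child processes hold it.
                prev.close()
            prev = proc.stdout
            procs.append(proc)
        # Wait for every stage and fail loudly on the first non-zero exit code.
        for argv, proc in zip(stages, procs):
            if proc.wait() != 0:
                raise RuntimeError(f"{argv[0]} exited with status {proc.returncode}")
```

Under these assumptions, `run_pipeline(DECODE_STAGES, "decode_.circle", "decode.circle")` would correspond to the shell pipeline in step 2. Unlike the plain shell pipe, it checks the exit status of every stage rather than only the last one.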

runtime/ggma/examples/generate_text/decode.py

Lines changed: 1 addition & 1 deletion
@@ -65,4 +65,4 @@
 model = AutoModelForCausalLM.from_pretrained(model_name)
 model.eval()
 circle_model = tico.convert(model, captured_input)
-circle_model.save(f"tinyllama.decode.circle")
+circle_model.save(f"decode_.circle")

runtime/ggma/examples/generate_text/prefill.py

Lines changed: 1 addition & 1 deletion
@@ -72,4 +72,4 @@
 model = AutoModelForCausalLM.from_pretrained(model_name)
 model.eval()
 circle_model = tico.convert(model, captured_input)
-circle_model.save(f"tinyllama.prefill.circle")
+circle_model.save(f"prefill.circle")
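
Both scripts now write under new names (`prefill.circle`, `decode_.circle`), so a working directory may still hold files produced before this commit. The sketch below is one possible way to spot that. It is not part of the commit, and `check_outputs`, `NEW_NAMES` and `OLD_NAMES` are hypothetical names; only the file names themselves come from the diff.

```python
from pathlib import Path
from typing import Dict, List

# File names taken from this commit: the new outputs of prefill.py and
# decode.py, and the names they replaced.
NEW_NAMES = ["prefill.circle", "decode_.circle"]
OLD_NAMES = ["tinyllama.prefill.circle", "tinyllama.decode.circle"]


def check_outputs(directory: str = ".") -> Dict[str, List[str]]:
    """Hypothetical helper: report missing new files and leftover old ones."""
    root = Path(directory)
    return {
        # New-style outputs that the scripts should have produced but did not.
        "missing": [name for name in NEW_NAMES if not (root / name).is_file()],
        # Old-style outputs that later steps of the README no longer read.
        "stale": [name for name in OLD_NAMES if (root / name).is_file()],
    }
```

Calling `check_outputs()` after running `prefill.py` and `decode.py` should return empty lists for both keys in a clean directory; a non-empty `stale` list suggests files left over from the previous naming scheme.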
