Commit e1d1b3b

Sanggyu Lee (glistening) authored and committed
[ggma] Add gyu (ggma yielding utility) tool
Implement gyu CLI tool to automate GGMA model package creation:

- Merge prefill.py and decode.py into unified export.py
- Create modular gyu tool structure:
  - gyu/init.py: Set up venv, install deps (CPU-only torch), clone TICO, extract o2o tools
  - gyu/import.py: Download complete model from HuggingFace
  - gyu/export.py: Run conversion pipeline and create .ggma package
  - gyu/common.py: Shared utilities and constants
  - gyu/clean.py: Remove build directory
  - gyu/gyu: Bash wrapper to dispatch commands

Documentation:

- Rename README.md → DEVELOPER.md (technical guide)
- Add USER.md (user-facing guide)
1 parent fd78f13 commit e1d1b3b
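The `gyu/gyu` wrapper named in the commit message is a bash script that dispatches subcommands (`init`, `import`, `export`, `clean`) to the matching `gyu/*.py` script. A minimal Python sketch of that dispatch pattern (illustrative only; the real wrapper is bash, and the command names come from the commit message):

```python
import subprocess
import sys

# Subcommands listed in the commit message; each maps to a script in gyu/.
COMMANDS = {"init", "import", "export", "clean"}

def dispatch(argv):
    """Forward `gyu <cmd> ...` to the matching gyu/<cmd>.py script."""
    if not argv or argv[0] not in COMMANDS:
        print("usage: gyu {init|import|export|clean} [args...]")
        return 2  # bad usage, mirroring a shell wrapper's exit code
    cmd, rest = argv[0], argv[1:]
    return subprocess.call([sys.executable, f"gyu/{cmd}.py", *rest])
```

Dispatching through `subprocess` rather than a Python `import` also sidesteps the fact that `import.py` is not an importable module name.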

File tree: 14 files changed, +573 −177 lines
Lines changed: 38 additions & 33 deletions
@@ -1,6 +1,6 @@
-# TinyLlama Text Generation Example
+# TinyLlama Text Generation Developer Guide
 
-This document provides a step-by-step guide for generating and processing a TinyLlama text generation model.
+This document provides a detailed technical guide for generating, processing, and optimizing the TinyLlama text-generation model. For basic usage, see [USER.md](USER.md).
 
 
 ## Summary
@@ -12,42 +12,47 @@ This document provides a step-by-step guide for generating and processing a
 
 ### 1. Python virtual environment
 ```bash
-cd runtime/ggma/examples/generate_text/
-python3 -m venv _
-source _/bin/activate
+$ cd runtime/ggma/examples/generate_text/
+$ python3 -m venv _
+$ source _/bin/activate
 ```
 
 ### 2. Install required Python packages
 ```bash
-pip install -r requirements.txt
+$ pip install -r tinyllama/tinyllama.requirements
 ```
 
-### 3. Install TICO (Torch IR to Circle ONE)
+### 3. Clone and Install TICO
 ```bash
-# Clone the repository
-git clone https://github.com/Samsung/TICO.git
-# Install it in editable mode
-pip install -e TICO
+$ git clone --depth 1 https://github.com/Samsung/TICO.git
+$ cd TICO
+$ git fetch origin pull/418/head:pr-418
+$ git checkout pr-418
+$ cd ..
+$ pip install -r TICO/requirements.txt
+$ pip install -e TICO --extra-index-url https://download.pytorch.org/whl/cpu
 ```
 
 ### 4. Get [o2o](https://github.com/Samsung/ONE/pull/16233) in PATH
 *Requires the GitHub CLI (`gh`).*
 ```bash
-gh pr checkout 16233
-export PATH=../../../../tools/o2o:$PATH
+$ gh pr checkout 16233
+$ export PATH=../../../../tools/o2o:$PATH
 ```
 
+
+
 ## Generating Model Files
 
 ### 1. Create the prefill and decode Circle model files
 ```bash
-python prefill.py   # Generates prefill.circle
-python decode.py    # Generates decode_.circle
+$ python tinyllama/tinyllama.py --mode prefill   # Generates prefill.circle
+$ python tinyllama/tinyllama.py --mode decode    # Generates decode_.circle
 ```
 
 Verify the generated files:
 ```bash
-ls -lh *.circle
+$ ls -lh *.circle
 # -rw-rw-r-- 1 gyu gyu 18M Nov 14 14:09 decode_.circle
 # -rw-rw-r-- 1 gyu gyu 18M Nov 14 14:09 prefill.circle
 ```
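The diff above replaces the separate `prefill.py` and `decode.py` with a single `tinyllama/tinyllama.py` driven by a `--mode` switch. A sketch of what that argument surface might look like (hypothetical; the actual script runs the TICO export for the selected graph):

```python
import argparse

def parse_mode(argv):
    """Parse the --mode switch selecting the prefill or decode export."""
    parser = argparse.ArgumentParser(
        description="Export TinyLlama Circle models")
    parser.add_argument("--mode", choices=["prefill", "decode"], required=True,
                        help="which graph to export")
    return parser.parse_args(argv).mode

# In the real script, the chosen mode would drive the export that writes
# prefill.circle or decode_.circle.
```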
@@ -57,7 +62,7 @@ Fuse attention and normalize KV-cache inputs for the decode model.
 
 ```bash
 # Fuse attention and reshape KV-cache for the decode model
-fuse.attention.py < decode_.circle \
+$ fuse.attention.py < decode_.circle \
   | fuse.bmm_lhs_const.py \
   | reshape.io.py input --by_shape [1,16,30,4] [1,16,32,4] \
   | transpose.io.kvcache.py > decode.circle
@@ -67,14 +72,14 @@ fuse.attention.py < decode_.circle \
 Merge the models, retype input IDs, and clean up.
 
 ```bash
-merge.circles.py prefill.circle decode.circle \
+$ merge.circles.py prefill.circle decode.circle \
   | downcast.input_ids.py \
   | gc.py > model.circle
 ```
 
 Verify final model files:
 ```bash
-ls -l {decode,prefill,model}.circle
+$ ls -l {decode,prefill,model}.circle
 # -rw-rw-r-- 1 gyu gyu 18594868 Nov 22 17:26 decode.circle
 # -rw-rw-r-- 1 gyu gyu 18642052 Nov 22 07:53 prefill.circle
 # -rw-rw-r-- 1 gyu gyu 18629520 Nov 22 17:28 model.circle
@@ -84,19 +89,19 @@ ls -l {decode,prefill,model}.circle
 
 1. Create the package root directory and move `model.circle` there:
 ```bash
-cd runtime/ggma/examples/generate_text
-mkdir tinyllama
-mv model.circle tinyllama/
+$ cd runtime/ggma/examples/generate_text
+$ mkdir tinyllama
+$ mv model.circle tinyllama/
 ```
 
 2. Copy the tokenizer files (replace `{your_snapshot}` with the actual snapshot hash):
 ```bash
-cp -L ~/.cache/huggingface/hub/models--Maykeye--TinyLLama-v0/snapshots/{your_snapshot}/tokenizer.* tinyllama/
-cp -L ~/.cache/huggingface/hub/models--Maykeye--TinyLLama-v0/snapshots/{your_snapshot}/config.json tinyllama/
+$ cp -L ~/.cache/huggingface/hub/models--Maykeye--TinyLLama-v0/snapshots/{your_snapshot}/tokenizer.* tinyllama/
+$ cp -L ~/.cache/huggingface/hub/models--Maykeye--TinyLLama-v0/snapshots/{your_snapshot}/config.json tinyllama/
 ```
 
 ```bash
-tree tinyllama/
+$ tree tinyllama/
 tinyllama/
 ├── model.circle
 ├── tokenizer.json
@@ -106,20 +111,20 @@ tinyllama/
 ## Build and run `ggma_run`
 
 ```bash
-make -j$(nproc)
-make install
+$ make -j$(nproc)
+$ make install
 ```
 
 Check version:
 ```bash
-Product/out/bin/ggma_run --version
-# ggma_run v0.1.0 (nnfw runtime: v1.31.0)
+$ Product/out/bin/ggma_run --version
+ggma_run v0.1.0 (nnfw runtime: v1.31.0)
 ```
 
 Run the model:
 ```bash
-Product/out/bin/ggma_run tinyllama
-# prompt: Lily picked up a flower.
-# generated: { 1100, 7899, 289, 826, 351, 600, 2439, 288, 266, 3653, 31843, 1100, 7899, 289, 1261, 291, 5869, 291, 1261, 31843, 1100, 7899 }
-# detokenized: She liked to play with her friends in the park. She liked to run and jump and run. She liked
+$ Product/out/bin/ggma_run tinyllama
+prompt: Lily picked up a flower.
+generated: { 1100, 7899, 289, 826, 351, 600, 2439, 288, 266, 3653, 31843, 1100, 7899, 289, 1261, 291, 5869, 291, 1261, 31843, 1100, 7899 }
+detokenized: She liked to play with her friends in the park. She liked to run and jump and run. She liked
 ```
Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@
+# TinyLlama Text Generation User Guide
+
+This guide shows how to create a GGMA package for the TinyLlama model using the `gyu` (GGMA Yielding Utility) tool.
+
+## Quick Start
+
+### 1. Initialize environment (one-time setup)
+
+```bash
+$ gyu/gyu init
+```
+
+The Python environment (`venv`) and the o2o tools are created:
+```bash
+$ ls -ld o2o venv
+drwxrwxr-x 2 gyu gyu 4096 Nov 24 09:44 o2o
+drwxrwxr-x 6 gyu gyu 4096 Nov 24 09:42 venv
+```
+
+> **Note**: The `o2o` directory will be removed once [PR #13689](https://github.com/Samsung/ONE/pull/13689) is merged.
+
+### 2. Import model from HuggingFace
+
+```bash
+$ gyu/gyu import Maykeye/TinyLLama-v0 -r tinyllama/tinyllama.requirements
+```
+
+The HuggingFace model is downloaded to `build/tinyllama-v0/`:
+```
+build
+└── tinyllama-v0
+    ├── backup
+    ├── config.json
+    ├── demo.py
+    ├── generation_config.json
+    ├── model.onnx
+    ├── model.safetensors
+    ├── pytorch_model.bin
+    ├── README.md
+    ├── special_tokens_map.json
+    ├── tokenizer_config.json
+    ├── tokenizer.json
+    ├── tokenizer.model
+    ├── train.ipynb
+    └── valid.py
+```
+
+### 3. Export to GGMA package
+
+```bash
+$ gyu/gyu export -s tinyllama/tinyllama.py -p tinyllama/tinyllama.pipeline
+```
+
+The GGMA package is generated in `build/out/`:
+```
+build/out/
+├── config.json
+├── model.circle
+├── tokenizer.json
+└── tokenizer.model
+```
+
+## Build ggma_run
+
+```bash
+# From ONE root directory
+$ make -j$(nproc)
+$ make install
+```
+
+For detailed build instructions, see the [ONE Runtime Build Guide](https://github.com/Samsung/ONE/blob/master/docs/runtime/README.md).
+
+Confirm that `ggma_run` is built and show its version:
+```bash
+$ Product/out/bin/ggma_run --version
+ggma_run v0.1.0 (nnfw runtime: v1.31.0)
+```
+
+Execute the GGMA package (default prompt) to see a sample output:
+```bash
+$ Product/out/bin/ggma_run build/out
+prompt: Lily picked up a flower.
+generated: { 1100, 7899, 289, 826, 351, 600, 2439, 288, 266, 3653, 31843, 1100, 7899, 289, 1261, 291, 5869, 291, 1261, 31843, 1100, 7899 }
+detokenized: She liked to play with her friends in the park. She liked to run and jump and run. She liked
+```
+
+For detailed run instructions, see the [ggma_run guide](https://github.com/Samsung/ONE/blob/master/runtime/tests/tools/ggma_run/README.md).
+
+
+For developers who want to understand what happens under the hood, see [DEVELOPER.md](DEVELOPER.md).
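The three `gyu` steps in the Quick Start can also be driven from a script. A sketch (assumes the working directory and paths from this guide; the `runner` hook is ours, added only to make the sequence testable):

```python
import subprocess

# The exact commands from the Quick Start, in order.
STEPS = [
    "gyu/gyu init",
    "gyu/gyu import Maykeye/TinyLLama-v0 -r tinyllama/tinyllama.requirements",
    "gyu/gyu export -s tinyllama/tinyllama.py -p tinyllama/tinyllama.pipeline",
]

def build_package(steps=STEPS, runner=None):
    """Run each step in order; check=True stops at the first failure."""
    run = runner or (lambda cmd: subprocess.run(cmd, shell=True, check=True))
    for cmd in steps:
        run(cmd)
```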

runtime/ggma/examples/generate_text/decode.py

Lines changed: 0 additions & 68 deletions
This file was deleted.
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+#!/usr/bin/env python3
+import shutil
+import os
+
+import argparse
+from common import VENV_DIR
+
+
+def main():
+    parser = argparse.ArgumentParser(description="Clean build artifacts")
+    parser.add_argument("--all",
+                        action="store_true",
+                        help="Remove all generated files including venv, TICO, and o2o")
+    args = parser.parse_args()
+
+    # Always remove build directory
+    build_dir = "build"
+    if os.path.exists(build_dir):
+        print(f"Removing {build_dir} directory...")
+        shutil.rmtree(build_dir)
+    else:
+        print(f"{build_dir} directory does not exist.")
+
+    if args.all:
+        dirs_to_remove = ["TICO", "o2o", VENV_DIR]
+        for d in dirs_to_remove:
+            if os.path.exists(d):
+                print(f"Removing {d} directory...")
+                shutil.rmtree(d)
+        print("Full clean complete.")
+    else:
+        print("Clean complete.")
+
+
+if __name__ == "__main__":
+    main()
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+import subprocess
+
+# Constants
+VENV_DIR = "venv"
+PR_WORKTREE = "_pr_16233"
+PR_BRANCH = "pr-16233"
+PR_REF = "refs/pull/16233/head"
+
+
+def run_command(cmd, cwd=None, env=None, check=True):
+    print(f"Running: {cmd}")
+    subprocess.run(cmd, shell=True, cwd=cwd, env=env, check=check)
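`run_command` above wraps `subprocess.run` with `shell=True` and `check=True`, so a failing shell command raises `subprocess.CalledProcessError` rather than being silently ignored. A small demonstration (reproducing the helper so the snippet is self-contained):

```python
import subprocess

def run_command(cmd, cwd=None, env=None, check=True):
    # Same shape as gyu/common.py: echo the command, then run it via the shell.
    print(f"Running: {cmd}")
    subprocess.run(cmd, shell=True, cwd=cwd, env=env, check=check)

# A failing step aborts the pipeline instead of continuing with stale state:
try:
    run_command("exit 3")
except subprocess.CalledProcessError as e:
    print(f"step failed with exit code {e.returncode}")
```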
