
Commit edf7864

Sanggyu Lee (glistening) authored and committed

[ggma] Add gyu (ggma yielding utility) tool
Implement gyu CLI tool to automate GGMA model package creation:
- Merge prefill.py and decode.py into unified export.py
- Create modular gyu tool structure:
  - gyu/init.py: Set up venv, install deps (CPU-only torch), clone TICO, extract o2o tools
  - gyu/import.py: Download complete model from HuggingFace
  - gyu/export.py: Run conversion pipeline and create .ggma package
  - gyu/common.py: Shared utilities and constants
  - gyu/clean.py: Remove build directory
  - gyu/gyu: Bash wrapper to dispatch commands

Documentation:
- Rename README.md → DEVELOPER.md (technical guide)
- Add USER.md (user-facing guide)
1 parent fd78f13 commit edf7864

File tree

14 files changed: +545 −154 lines changed

runtime/ggma/examples/generate_text/README.md renamed to runtime/ggma/examples/generate_text/DEVELOPER.md

Lines changed: 15 additions & 10 deletions
````diff
@@ -1,6 +1,6 @@
-# TinyLlama Text Generation Example
+# TinyLlama Text Generation Developer Guide
 
-This document provides a step-by-step guide for generating and processing a TinyLlama text-generation model.
+This document provides a detailed technical guide for generating, processing, and optimizing the TinyLlama text-generation model. For basic usage, see [USER.md](USER.md).
 
 ## Summary
 
@@ -19,15 +19,18 @@ source _/bin/activate
 
 ### 2. Install required Python packages
 ```bash
-pip install -r requirements.txt
+pip install -r tinyllama/tinyllama.requirements
 ```
 
-### 3. Install TICO (Torch IR to Circle ONE)
+### 3. Clone and Install TICO
 ```bash
-# Clone the repository
-git clone https://github.com/Samsung/TICO.git
-# Install it in editable mode
-pip install -e TICO
+git clone --depth 1 https://github.com/Samsung/TICO.git
+cd TICO
+git fetch origin pull/418/head:pr-418
+git checkout pr-418
+cd ..
+pip install -r TICO/requirements.txt
+pip install -e TICO --extra-index-url https://download.pytorch.org/whl/cpu
 ```
 
 ### 4. Get [o2o](https://github.com/Samsung/ONE/pull/16233) in PATH
@@ -37,12 +40,14 @@ gh pr checkout 16233
 export PATH=../../../../tools/o2o:$PATH
 ```
 
+
+
 ## Generating Model Files
 
 ### 1. Create the prefill and decode Circle model files
 ```bash
-python prefill.py # Generates prefill.circle
-python decode.py # Generates decode_.circle
+python tinyllama/tinyllama.py --mode prefill # Generates prefill.circle
+python tinyllama/tinyllama.py --mode decode # Generates decode_.circle
 ```
 
 Verify the generated files:
````
runtime/ggma/examples/generate_text/USER.md (new file)

Lines changed: 85 additions & 0 deletions

# TinyLlama Text Generation User Guide

This guide shows how to create a GGMA package for the TinyLlama model using the `gyu` (GGMA Yielding Utility) tool.

## Quick Start

### 1. Initialize environment (one-time setup)

(Optional) Clean up everything:
```bash
gyu/gyu clean --all
```

Initialize:
```bash
gyu/gyu init
```

Result:
```
drwxrwxr-x 2 gyu gyu 4096 Nov 24 09:44 o2o
drwxrwxr-x 6 gyu gyu 4096 Nov 24 09:42 venv
```

### 2. Import model from HuggingFace

```bash
gyu/gyu import Maykeye/TinyLLama-v0 -r tinyllama/tinyllama.requirements
```

Result:
```
build
└── tinyllama-v0
    ├── backup
    ├── config.json
    ├── demo.py
    ├── generation_config.json
    ├── model.onnx
    ├── model.safetensors
    ├── pytorch_model.bin
    ├── README.md
    ├── special_tokens_map.json
    ├── tokenizer_config.json
    ├── tokenizer.json
    ├── tokenizer.model
    ├── train.ipynb
    └── valid.py
```

### 3. Export to GGMA package

```bash
gyu/gyu export -s tinyllama/tinyllama.py -p tinyllama/tinyllama.pipeline
```

Result:
```
build/out/
├── config.json
├── model.circle
├── tokenizer.json
└── tokenizer.model
```

## Build ggma_run

```bash
# From ONE root directory
make -j$(nproc)
make install
```

For detailed build instructions, see the [ONE Runtime Build Guide](https://github.com/Samsung/ONE/blob/master/docs/runtime/README.md).

## Run the GGMA package

```bash
Product/out/bin/ggma_run build/out
```

For detailed run instructions, see the [ggma_run guide](https://github.com/Samsung/ONE/blob/master/runtime/tests/tools/ggma_run/README.md).

For developers who want to understand what happens under the hood, see [DEVELOPER.md](DEVELOPER.md).

runtime/ggma/examples/generate_text/decode.py

Lines changed: 0 additions & 68 deletions
This file was deleted.
runtime/ggma/examples/generate_text/gyu/clean.py (new file)

Lines changed: 36 additions & 0 deletions

```python
#!/usr/bin/env python3
import shutil
import os
import argparse

from common import VENV_DIR


def main():
    parser = argparse.ArgumentParser(description="Clean build artifacts")
    parser.add_argument("--all",
                        action="store_true",
                        help="Remove all generated files including venv, TICO, and o2o")
    args = parser.parse_args()

    # Always remove build directory
    build_dir = "build"
    if os.path.exists(build_dir):
        print(f"Removing {build_dir} directory...")
        shutil.rmtree(build_dir)
    else:
        print(f"{build_dir} directory does not exist.")

    if args.all:
        dirs_to_remove = ["TICO", "o2o", VENV_DIR]
        for d in dirs_to_remove:
            if os.path.exists(d):
                print(f"Removing {d} directory...")
                shutil.rmtree(d)
        print("Full clean complete.")
    else:
        print("Clean complete.")


if __name__ == "__main__":
    main()
```
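The remove-if-exists pattern used by clean.py can be exercised in isolation. `remove_dir_if_exists` below is a hypothetical helper distilled from the script (not part of the tool), run against a throwaway temp directory:

```python
import os
import shutil
import tempfile


def remove_dir_if_exists(path):
    """Remove a directory tree, tolerating its absence (as gyu clean does)."""
    if os.path.exists(path):
        shutil.rmtree(path)
        return True
    return False


# Exercise against a throwaway directory
work = tempfile.mkdtemp()
target = os.path.join(work, "build")
os.makedirs(target)

print(remove_dir_if_exists(target))  # directory existed and was removed
print(remove_dir_if_exists(target))  # second call is a no-op
shutil.rmtree(work)
```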
runtime/ggma/examples/generate_text/gyu/common.py (new file)

Lines changed: 12 additions & 0 deletions

```python
import subprocess

# Constants
VENV_DIR = "venv"
PR_WORKTREE = "_pr_16233"
PR_BRANCH = "pr-16233"
PR_REF = "refs/pull/16233/head"


def run_command(cmd, cwd=None, env=None, check=True):
    print(f"Running: {cmd}")
    subprocess.run(cmd, shell=True, cwd=cwd, env=env, check=check)
```
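A quick check of how `run_command` behaves. This sketch re-declares it locally and returns the `CompletedProcess` (a small addition over the original, which discards it) so the result can be inspected; with `check=True`, a non-zero exit status raises `CalledProcessError`:

```python
import subprocess


def run_command(cmd, cwd=None, env=None, check=True):
    # Same shape as gyu/common.py: echo the command, then run it via the shell.
    print(f"Running: {cmd}")
    return subprocess.run(cmd, shell=True, cwd=cwd, env=env, check=check)


result = run_command("echo hello")
print(result.returncode)  # 0 on success

# With check=True, a failing command raises CalledProcessError
try:
    run_command("exit 3")
except subprocess.CalledProcessError as e:
    print(f"command failed with status {e.returncode}")
```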
runtime/ggma/examples/generate_text/gyu/export.py (new file)

Lines changed: 124 additions & 0 deletions

```python
#!/usr/bin/env python3
import argparse
import shutil
import json
import yaml
import os

from common import VENV_DIR, run_command


def main():
    parser = argparse.ArgumentParser(description="Export model to GGMA package")
    parser.add_argument("-s",
                        "--script",
                        required=True,
                        help="Export script to use (e.g., tinyllama.py)")
    parser.add_argument("-p",
                        "--pipeline",
                        required=True,
                        help="Pipeline configuration file (e.g., tinyllama.pipeline)")
    args = parser.parse_args()

    export_script_name = args.script
    pipeline_config_path = os.path.abspath(args.pipeline)

    # Change to build directory
    build_dir = "build"
    if not os.path.exists(build_dir):
        print(f"Error: {build_dir} directory does not exist. Run 'gyu import' first.")
        return
    os.chdir(build_dir)

    # Load pipeline configuration
    if not os.path.exists(pipeline_config_path):
        print(f"Error: Pipeline configuration file {pipeline_config_path} not found.")
        return
    with open(pipeline_config_path, "r") as f:
        pipeline_config = yaml.safe_load(f)

    # Find model directory by config.json
    model_dir = None
    model_id = None
    for d in os.listdir("."):
        config_path = os.path.join(d, "config.json")
        if os.path.isdir(d) and os.path.exists(config_path):
            model_dir = d
            # Read model ID from config.json
            with open(config_path, "r") as f:
                config = json.load(f)
            model_id = config.get("_name_or_path", d)
            print(f"Using local model directory: {model_dir}")
            print(f"Model ID from config: {model_id}")
            break

    if not model_dir:
        raise ValueError("No local model directory found (directory with config.json)")

    # Add o2o tools to PATH
    env = os.environ.copy()
    o2o_path = os.path.abspath("../o2o")
    env["PATH"] = f"{o2o_path}:{env['PATH']}"

    python_bin = os.path.join("..", VENV_DIR, "bin", "python3")
    export_script = os.path.join("..", export_script_name)

    # 1. Generate prefill and decode circles
    print(f"Running {export_script_name} (prefill)...")
    run_command(f"{python_bin} {export_script} --mode prefill --model {model_dir}",
                env=env)

    print(f"Running {export_script_name} (decode)...")
    run_command(f"{python_bin} {export_script} --mode decode --model {model_dir}",
                env=env)

    # Helper to run a pipeline step from the config
    def run_pipeline_step(step_name):
        if step_name in pipeline_config:
            print(f"Running {step_name} pipeline...")
            cmd = pipeline_config[step_name]
            # YAML block scalars (|) preserve newlines; join them so a command
            # split across lines (e.g. with trailing pipes) runs as a single
            # shell command line.
            cmd = cmd.replace("\n", " ")
            run_command(cmd, env=env)

    # 2. Pipeline (decode)
    run_pipeline_step("decode")

    # 3. Merge
    run_pipeline_step("merge")

    # 4. Create package directory and copy files
    # Find source directory with tokenizer.json
    source_dir = None
    for d in os.listdir("."):
        if os.path.isdir(d) and os.path.exists(os.path.join(d, "tokenizer.json")):
            source_dir = d
            break

    if source_dir:
        package_dir = "out"
        print(f"Creating package directory {package_dir}...")
        os.makedirs(package_dir, exist_ok=True)

        # Copy tokenizer and config files
        for filename in ["tokenizer.json", "tokenizer.model", "config.json"]:
            src = os.path.join(source_dir, filename)
            if os.path.exists(src):
                shutil.copy2(src, package_dir)

        # Move model.circle
        print(f"Moving model.circle to {package_dir}...")
        shutil.move("model.circle", os.path.join(package_dir, "model.circle"))
    else:
        print("Warning: Could not find source directory (directory with "
              "tokenizer.json). Leaving model.circle in current dir.")


if __name__ == "__main__":
    main()
```
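The `run_pipeline_step` helper flattens YAML block-scalar values before handing them to the shell. A stdlib-only sketch of that behavior; `circle-opt` is a hypothetical command here, since the actual tinyllama.pipeline contents are not part of this commit:

```python
# What yaml.safe_load returns for a pipeline step written with a YAML block
# scalar (|): the newlines inside the value are preserved, e.g.
#
#   decode: |
#     circle-opt decode_.circle
#       --fold-constants
#
decode_step = "circle-opt decode_.circle\n  --fold-constants\n"

# export.py flattens the value so the shell receives one command line
cmd = decode_step.replace("\n", " ")
print(cmd)
```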
runtime/ggma/examples/generate_text/gyu/gyu (new file)

Lines changed: 14 additions & 0 deletions

```bash
#!/bin/bash
SCRIPT_DIR=$(dirname "$0")
COMMAND="$1"
shift # Remove command from arguments

if [ "$COMMAND" == "init" ]; then
    python3 "$SCRIPT_DIR/$COMMAND.py" "$@"
else
    if [ ! -f "venv/bin/python3" ]; then
        echo "Error: Environment not initialized. Run 'gyu init' first."
        exit 1
    fi
    venv/bin/python3 "$SCRIPT_DIR/$COMMAND.py" "$@"
fi
```
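The wrapper's dispatch pattern (`<tool> <command> <args...>` runs `<script-dir>/<command>.py` with the remaining arguments) can be tried standalone. This sketch substitutes a stub `clean.py` in a temp directory for the real gyu scripts:

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)

# Stub "clean.py" standing in for a real gyu subcommand
cat > "$tmp/clean.py" <<'EOF'
import sys
print("clean called with:", sys.argv[1:])
EOF

COMMAND="clean"   # would be "$1" in the real wrapper
out=$(python3 "$tmp/$COMMAND.py" --all)
echo "$out"
rm -rf "$tmp"
```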
