Update TinyLlama example README and rename generated Circle files

glistening · glistening · commit a1219aef24cc · 2025-11-21T18:48:05.000+09:00
diff --git a/runtime/ggma/examples/generate_text/README.md b/runtime/ggma/examples/generate_text/README.md
@@ -1,4 +1,4 @@
-# TinyLlama Example Documentation
+# TinyLlama Example
 
 This document provides a step‑by‑step guide for generating and processing a text generation model.
 
@@ -32,34 +32,37 @@ This document provides a step‑by‑step guide for generating and processing a
 
 ## Generating Model Files
 
-Run the provided scripts to create the prefill and decode Circle model files:
+1. Run the provided scripts to create the prefill and decode Circle model files:
 
 ```bash
-python prefill.py   # Generates tinyllama.prefill.circle
-python decode.py    # Generates tinyllama.decode.circle
+python prefill.py   # Generates prefill.circle
+python decode.py    # Generates decode_.circle
 ```
 
 You can verify the generated files:
 
 ```bash
 ls -lh *.circle
 # Expected output:
-# -rw-rw-r-- 1 gyu gyu 18M Nov 14 14:09 tinyllama.decode.circle
-# -rw-rw-r-- 1 gyu gyu 18M Nov 14 14:09 tinyllama.prefill.circle
+# -rw-rw-r-- 1 gyu gyu 18M Nov 14 14:09 decode_.circle
+# -rw-rw-r-- 1 gyu gyu 18M Nov 14 14:09 prefill.circle
 ```
 
-## Full Processing Pipeline
+2. Transform decode_.circle into decode.circle
 
-The following pipeline shows how to chain several tools to transform the model:
+Add [tools/o2o](https://github.com/Samsung/ONE/pull/16233) to PATH.
 
 ```bash
-with.py tinyllama.decode.circle |
-fuse.attention.py \
-fuse.bmm_lhs_const.py | reshape.fc_weight.py | \
+export PATH=../../../../tools/o2o:$PATH
+```
+
+Then, run the following:
+
+```bash
+fuse.attention.py < decode_.circle | \
+fuse.bmm_lhs_const.py | \
 reshape.io.py input --by_shape [1,16,30,4] [1,16,32,4] | \
 transpose.io.kvcache.py | \
-remove.io.py output --keep_by_id 0 | \
-select.op.py --by_id 0-181 | \
 gc.py | \
 retype.input_ids.py > decode.circle
 ```
@@ -68,17 +71,28 @@ retype.input_ids.py > decode.circle
 
 | Tool | Purpose |
 |------|---------|
-| `with.py` | Reads the Circle model from stdin and writes it to stdout. |
 | `fuse.attention.py` | Fuses attention‑related operators for optimization. |
 | `fuse.bmm_lhs_const.py` | Fuses constant left‑hand side matrices in batch matrix multiplication. |
-| `reshape.fc_weight.py` | Reshapes fully‑connected layer weights. |
 | `reshape.io.py input --by_shape [...]` | Reshapes input tensors to the specified shapes. |
 | `transpose.io.kvcache.py` | Transposes the KV‑cache tensors. |
-| `remove.io.py output --keep_by_id 0` | Keeps only the output tensor with ID 0, removing the rest. |
-| `select.op.py --by_id 0-181` | Selects operators with IDs from 0 to 181. |
 | `gc.py` | Performs garbage collection, removing unused tensors and operators. |
 | `retype.input_ids.py` | Changes the data type of input IDs as needed. |
 | `> decode.circle` | Saves the final processed model to `decode.circle`. |
 
 
 Feel free to adjust the pipeline arguments (e.g., shapes, IDs) to suit your specific model configuration.
+
+
+3. Merge the prefill and decode Circle models into a single Circle file
+
+```bash
+merge.circles.py prefill.circle decode.circle > tinyllama.circle
+```
+
+```bash
+ls -lh *.circle
+# -rw-rw-r-- 1 gyu gyu 18M Nov 21 17:43 decode.circle
+# -rw-rw-r-- 1 gyu gyu 18M Nov 21 17:43 decode_.circle
+# -rw-rw-r-- 1 gyu gyu 18M Nov 18 17:35 prefill.circle
+# -rw-rw-r-- 1 gyu gyu 18M Nov 21 17:43 tinyllama.circle
+```
diff --git a/runtime/ggma/examples/generate_text/decode.py b/runtime/ggma/examples/generate_text/decode.py
@@ -65,4 +65,4 @@
 model = AutoModelForCausalLM.from_pretrained(model_name)
 model.eval()
 circle_model = tico.convert(model, captured_input)
-circle_model.save(f"tinyllama.decode.circle")
+circle_model.save(f"decode_.circle")
diff --git a/runtime/ggma/examples/generate_text/prefill.py b/runtime/ggma/examples/generate_text/prefill.py
@@ -72,4 +72,4 @@
 model = AutoModelForCausalLM.from_pretrained(model_name)
 model.eval()
 circle_model = tico.convert(model, captured_input)
-circle_model.save(f"tinyllama.prefill.circle")
+circle_model.save(f"prefill.circle")
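
For anyone who wants to script step 2 of the updated README, the decode pipeline could be assembled in Python roughly as follows. This is only a sketch and is not part of this change: `DECODE_STAGES` and `build_pipeline` are hypothetical names, and it assumes, as the README pipeline implies, that every tools/o2o script reads a Circle model from stdin and writes the result to stdout. It only builds the shell command string; it does not run the tools.

```python
import shlex

# Stages copied from the README pipeline in this change. Assumption: each
# tool reads a Circle model on stdin and writes the transformed model to stdout.
DECODE_STAGES = [
    ["fuse.attention.py"],
    ["fuse.bmm_lhs_const.py"],
    ["reshape.io.py", "input", "--by_shape", "[1,16,30,4]", "[1,16,32,4]"],
    ["transpose.io.kvcache.py"],
    ["gc.py"],
    ["retype.input_ids.py"],
]


def build_pipeline(stages, src, dst):
    """Return a shell command that pipes `src` through every stage into `dst`."""
    if not stages:
        raise ValueError("at least one stage is required")
    # shlex.join quotes arguments such as [1,16,30,4] so the shell does not
    # treat the brackets as a glob pattern.
    commands = [shlex.join(stage) for stage in stages]
    # The first stage reads the source file through stdin redirection.
    commands[0] += " < " + shlex.quote(src)
    # The last stage writes the result through stdout redirection.
    commands[-1] += " > " + shlex.quote(dst)
    return " | ".join(commands)


if __name__ == "__main__":
    print(build_pipeline(DECODE_STAGES, "decode_.circle", "decode.circle"))
```

Keeping the stages in a list makes it easy to drop or reorder a tool when adapting the pipeline to a different model; the printed command can be pasted into a shell once tools/o2o is on `PATH`.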