# Eagle3 for Llama3
## Introduction
This document provides a step-by-step guide to reproducing the training process described in the EAGLE3 paper, using the script `examples/run_llama3_eagle3_sgl_online.sh`. We will walk through the script and explain each key step along the way.
## Workflow
### Step 1. Prepare environment

We suggest using a virtual environment to make sure that all the dependencies can be correctly installed. If you want to use `python>=3.12`, please set `export SETUPTOOLS_USE_DISTUTILS=local`.

```shell
uv venv --python 3.11
source .venv/bin/activate
cd PATH-TO-SpecForge
uv pip install -r requirements.txt
uv pip install -v .
```

After completing these steps, you can check whether the installation succeeded by running the following command. It should complete without errors.

```shell
python -c "import specforge"
```
### Step 2. Prepare Model & Dataset
Next, we can start preparing the model and dataset. First, use these commands to download the model and the dataset.
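The concrete download commands live in the example script and are not reproduced here. As a rough, hypothetical sketch (the repo IDs and `--local-dir` paths below are illustrative assumptions, not the script's actual arguments), downloading with `huggingface-cli` could look like:

```shell
# Hypothetical sketch: fetch the target model and a source dataset from Hugging Face.
# Repo IDs and local paths are illustrative placeholders.
huggingface-cli download meta-llama/Llama-3.1-8B-Instruct --local-dir ./models/Llama-3.1-8B-Instruct
huggingface-cli download --repo-type dataset HuggingFaceH4/ultrachat_200k --local-dir ./data/ultrachat
```

Note that `meta-llama/Llama-3.1-8B-Instruct` is a gated repository, so you may need to run `huggingface-cli login` first.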
After completing these steps, you can review the error entries in `error.jsonl`. Most of them will likely be `request timeout` errors. You can then decide whether you want to regenerate those samples. In my case, I chose not to, so I simply deleted `error.jsonl` before uploading to Hugging Face.
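The review-and-cleanup step can be sketched roughly like this (the `request timeout` string and the one-JSON-object-per-line layout of `error.jsonl` are assumptions, not taken from the script):

```shell
# Hypothetical sketch: inspect the error log, then drop it before uploading.
if [ -f error.jsonl ]; then
    # How many of the failed samples were request timeouts?
    grep -c "request timeout" error.jsonl
    # Choosing not to regenerate: simply remove the error log.
    rm error.jsonl
fi
```

The cleaned dataset can then be pushed to the Hub, for example with `huggingface-cli upload`.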
Alternatively, for `meta-llama/Llama-3.1-8B-Instruct`, you can use the dataset we generated: [zhuyksir/Ultrachat-Sharegpt-Llama3.1-8B](https://huggingface.co/datasets/zhuyksir/Ultrachat-Sharegpt-Llama3.1-8B).

Second, we need to pre-build the cache for training.

- Red text indicates tokens where `loss_mask == 0` (typically user input and the system prompt). Since the goal is to train the draft model only on the target model’s output, user text must be masked out. In other words, only tokens generated by the target model should contribute to the loss.
- You might see this warning: `WARNING: No assistant response spans found in the conversation text.` This occurs when, during data generation, an error causes a sample to contain only user inputs without any assistant responses. You can safely ignore this warning; the loss mask for such samples is set entirely to zero.
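As a plain-Python illustration (not SpecForge's actual implementation), the masking works by zeroing per-token losses wherever `loss_mask == 0` and averaging only over the remaining tokens:

```python
# Toy per-token losses for one training sample (values are made up).
token_losses = [0.9, 1.2, 0.3, 0.5, 0.7]
# 0 = user/system-prompt tokens (masked out), 1 = assistant tokens.
loss_mask = [0, 0, 1, 1, 1]

# Only assistant tokens contribute to the loss.
masked_sum = sum(l * m for l, m in zip(token_losses, loss_mask))
num_kept = max(sum(loss_mask), 1)  # guard: an all-zero mask contributes no loss
loss = masked_sum / num_kept
print(loss)  # 0.5
```

This also shows why the warning above is harmless: a sample whose mask is entirely zero simply contributes nothing to the loss.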
For `Llama3.1-8B`, we add a system prompt to all training data, following the approach used in the official repository. Consequently, when benchmarking, we should also include this system prompt to obtain the full accept length. Please uncomment the corresponding line and add the system prompt.