More Info:

- https://github.com/ggml-org/llama.cpp/pull/14644
- https://github.com/ggml-org/llama.cpp/pull/14771

## Parameters

The diffusion CLI supports various parameters to control the generation process:

### Core Diffusion Parameters

- `--diffusion-steps`: Number of diffusion steps (default: 256)
- `--diffusion-algorithm`: Algorithm for token selection
  - `0`: ORIGIN - Tokens are generated in a purely random order (https://arxiv.org/abs/2107.03006)
  - More documentation here: https://github.com/DreamLM/Dream
- `--diffusion-visual`: Enable live visualization during generation
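
The random-order selection behind algorithm `0` (ORIGIN) can be sketched as follows. This is an illustrative Python sketch of the general idea only, not llama.cpp's implementation; the equal-sized chunking of positions across steps is an assumption made for the example.

```python
import random

def origin_unmask_order(seq_len, steps, rng=random.Random(0)):
    """Sketch of the ORIGIN strategy: every masked position is given a
    purely random reveal order, then positions are unmasked in roughly
    equal-sized chunks over the requested number of steps."""
    order = list(range(seq_len))
    rng.shuffle(order)                     # random reveal order over all positions
    per_step = max(1, seq_len // steps)    # illustrative: even split across steps
    return [order[i:i + per_step] for i in range(0, seq_len, per_step)]

schedule = origin_unmask_order(seq_len=8, steps=4)
# every position is revealed exactly once across all steps
assert sorted(p for step in schedule for p in step) == list(range(8))
```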

### Scheduling Parameters

Choose one of the following scheduling methods:

**Timestep-based scheduling:**

- `--diffusion-eps`: Epsilon value for timestep scheduling (e.g., 0.001)

**Block-based scheduling:**

- `--diffusion-block-length`: Block size for block-based scheduling (e.g., 32)
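
The difference between the two scheduling styles can be sketched as follows. The linear spacing of timesteps from 1.0 down to the epsilon value, and the left-to-right ordering of blocks, are simplifying assumptions for illustration; the schedules the CLI actually builds may differ.

```python
def timestep_schedule(steps, eps):
    # Timestep-based: a sequence of timesteps from 1.0 down to eps.
    # Linear spacing is an assumption made for this sketch.
    return [1.0 - (1.0 - eps) * i / (steps - 1) for i in range(steps)]

def block_schedule(seq_len, block_length):
    # Block-based: the sequence is denoised one contiguous block at a time,
    # left to right (ordering assumed for illustration).
    return [(start, min(start + block_length, seq_len))
            for start in range(0, seq_len, block_length)]

ts = timestep_schedule(steps=5, eps=0.001)
assert ts[0] == 1.0 and abs(ts[-1] - 0.001) < 1e-9
assert block_schedule(64, 32) == [(0, 32), (32, 64)]
```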

### Sampling Parameters

- `--temp`: Temperature for sampling (0.0 = greedy/deterministic, higher = more random)
- `--top-k`: Top-k filtering for sampling
- `--top-p`: Top-p (nucleus) filtering for sampling
- `--seed`: Random seed for reproducibility
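
How these knobs interact can be sketched with the standard temperature / top-k / top-p sampling recipe below. This is a sketch of the common technique, not llama.cpp's sampler code.

```python
import math, random

def sample_token(logits, temp=1.0, top_k=0, top_p=1.0, rng=random.Random(0)):
    """Illustrative token sampling: temperature scaling, then top-k,
    then top-p (nucleus) filtering. A sketch of the standard recipe."""
    if temp <= 0.0:                       # temp 0 => greedy / deterministic
        return max(range(len(logits)), key=lambda i: logits[i])
    ranked = sorted(range(len(logits)), key=lambda i: -logits[i])
    if top_k > 0:
        ranked = ranked[:top_k]           # keep only the k most likely tokens
    scaled = [logits[i] / temp for i in ranked]
    m = max(scaled)                       # subtract max for numerical stability
    probs = [math.exp(s - m) for s in scaled]
    total = sum(probs)
    probs = [p / total for p in probs]
    cum, kept = 0.0, []
    for idx, p in zip(ranked, probs):     # top-p: smallest set covering mass p
        kept.append((idx, p))
        cum += p
        if cum >= top_p:
            break
    r = rng.random() * cum                # sample within the kept mass
    for idx, p in kept:
        r -= p
        if r <= 0:
            return idx
    return kept[-1][0]

assert sample_token([0.1, 5.0, 0.2], temp=0.0) == 1   # greedy picks the argmax
```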

### Model Parameters

- `-m`: Path to the GGUF model file
- `-p`: Input prompt text
- `-ub`: Maximum sequence length (ubatch size)
- `-c`: Context size
- `-b`: Batch size

### Examples

#### Dream architecture:

```
llama-diffusion-cli -m dream7b.gguf -p "write code to train MNIST in pytorch" -ub 512 --diffusion-eps 0.001 --diffusion-algorithm 3 --diffusion-steps 256 --diffusion-visual
```

#### LLaDA architecture:

```
llama-diffusion-cli -m llada-8b.gguf -p "write code to train MNIST in pytorch" -ub 512 --diffusion-block-length 32 --diffusion-steps 256 --diffusion-visual
```