
Commit fa98763

Author Yinan committed: Update PIM_README.md
1 parent 822a534

File tree

1 file changed: +27 -4 lines


PIM_README.md

Lines changed: 27 additions & 4 deletions
@@ -15,7 +15,30 @@ make LLAMA_PIM=1
 Prepare your model files as the original README.md shows. A 4-bit-quantized model in gguf format is preferred.
 
 ```
-./llama-cli -m /mnt/LLM-models/chinese-alpaca-2-7b/gguf/chinese-alpaca-7b_q4_0.gguf --temp 0 -t 1 --no-warmup -p "列举5个北京经典美食。只列举名字,不要介绍。"
+./llama-cli -m /mnt/LLM-models/chinese-alpaca-2-7b/gguf/chinese-alpaca-7b_q4_0.gguf \
+    --temp 0 -t 1 --no-warmup -p "列举5个北京经典美食。只列举名字,不要介绍。"
+```
+
+Which may output:
+```shell
+...
+sampler seed: 4294967295
+sampler params:
+    repeat_last_n = 64, repeat_penalty = 1.000, frequency_penalty = 0.000, presence_penalty = 0.000
+    top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.000
+    mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
+sampler chain: logits -> logit-bias -> penalties -> greedy
+generate: n_ctx = 4096, n_batch = 2048, n_predict = -1, n_keep = 1
+
+列举5个北京经典美食。只列举名字,不要介绍。1. 烤鸭 2. 炸酱面 3. 豆汁 4. 羊蝎子 5. 驴打滚 [end of text]
+
+
+llama_perf_sampler_print: sampling time = 1.02 ms / 49 runs (0.02 ms per token, 47804.88 tokens per second)
+llama_perf_context_print: load time = 4097.04 ms
+llama_perf_context_print: prompt eval time = 2966.36 ms / 16 tokens (185.40 ms per token, 5.39 tokens per second)
+llama_perf_context_print: eval time = 12105.60 ms / 32 runs (378.30 ms per token, 2.64 tokens per second)
+llama_perf_context_print: total time = 16206.10 ms / 48 tokens
+
 ```
 
 ## 3. llama-ts for tensor test
@@ -40,8 +63,8 @@ There are several macros defined in `include/llama.h` that control the behavior
 
 ```c++
 #ifdef PIM_KERNEL
-#define NR_DPUS 64
-#define NR_LAYER 2
+#define NR_DPUS 64   // Number of DPUs to execute the kernel
+#define NR_LAYER 2   // Number of transformer layers to offload
 #define DPU_BINARY "./dpu/gemv_dpu"
 ...
 #endif // PIM_KERNEL
@@ -53,4 +76,4 @@ The PIM binary `dpu/gemv_dpu` is built from `dpu/dpu_main.c` by typing:
 cd dpu
 ./pim_build.sh
 ```
-So check `dpu/dpu_main.c` to find out how the kernel is implemented.
+Check `dpu/dpu_main.c` to find out how the kernel is implemented.
