
Commit ca93cf7

Merge pull request #702 from Azure-Tang/update-readme
[UPDATE] Update documents.
2 parents 3ebe17e + c05ebb7

File tree

2 files changed: +12 -10 lines


doc/en/fp8_kernel.md

Lines changed: 11 additions & 9 deletions
```diff
@@ -10,15 +10,17 @@ The DeepSeek-AI team provides FP8 safetensors for DeepSeek-R1/V3 models. We achi
 So those who are persuing the best performance can use the FP8 linear kernel for DeepSeek-V3/R1.
 
 ## Key Features
-✅ Hybrid Precision Architecture (FP8 + GGML)
+
+✅ Hybrid Precision Architecture (FP8 + GGML)<br>
 ✅ Memory Optimization (~19GB VRAM usage)
 
 ## Quick Start
 ### Using Pre-Merged Weights
 
-Pre-merged weights are available on Hugging Face:
-[KVCache-ai/DeepSeek-V3-GGML-FP8-Hybrid](https://huggingface.co/KVCache-ai/DeepSeek-V3)
+Pre-merged weights are available on Hugging Face:<br>
+[KVCache-ai/DeepSeek-V3-GGML-FP8-Hybrid](https://huggingface.co/KVCache-ai/DeepSeek-V3)<br>
 [KVCache-ai/DeepSeek-R1-GGML-FP8-Hybrid](https://huggingface.co/KVCache-ai/DeepSeek-R1)
+
 > Please confirm the weights are fully uploaded before downloading. The large file size may extend Hugging Face upload time.
 
```

````diff
@@ -32,12 +34,12 @@ pip install -U huggingface_hub
 huggingface-cli download --resume-download KVCache-ai/DeepSeek-V3-GGML-FP8-Hybrid --local-dir <local_dir>
 ```
 ### Using merge scripts
-If you got local DeepSeek-R1/V3 fp8 safetensors and q4km gguf weights, you can merge them using the following scripts.
+If you got local DeepSeek-R1/V3 fp8 safetensors and gguf weights(eg.q4km), you can merge them using the following scripts.
 
 ```shell
-python convert_model.py \
+python merge_tensors/merge_safetensor_gguf.py \
     --safetensor_path <fp8_safetensor_path> \
-    --gguf_path <q4km_gguf_folder_path> \
+    --gguf_path <gguf_folder_path> \
     --output_path <merged_output_path>
 ```
````
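As a usage sketch (not part of the repository or this diff), it can help to sanity-check the two input locations before invoking the merge script; the helper name and the assumption that both inputs are folders of weight shards are mine:

```python
from pathlib import Path

def check_merge_inputs(safetensor_dir: str, gguf_dir: str) -> list:
    """Collect human-readable problems with the merge inputs (empty list = looks usable)."""
    problems = []
    # The fp8 side should contain at least one .safetensors shard.
    if not list(Path(safetensor_dir).glob("*.safetensors")):
        problems.append("no .safetensors files under " + safetensor_dir)
    # The quantized side should contain at least one .gguf file.
    if not list(Path(gguf_dir).glob("*.gguf")):
        problems.append("no .gguf files under " + gguf_dir)
    return problems
```

A non-empty return value means one of the `--safetensor_path` / `--gguf_path` arguments would point at an empty or wrong folder.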

```diff
@@ -60,15 +62,15 @@ python ktransformers/local_chat.py \
 
 ## Notes
 
-⚠️ Hardware Requirements
+⚠️ Hardware Requirements<br>
 * Recommended minimum 19GB available VRAM for FP8 kernel.
 * Requires GPU with FP8 support (e.g., 4090)
 
 ⏳ First-Run Optimization
 JIT compilation causes longer initial execution (subsequent runs retain optimized speed).
 
-🔄 Temporary Interface
+🔄 Temporary Interface<br>
 Current weight loading implementation is provisional - will be refined in future versions
 
-📁 Path Specification
+📁 Path Specification<br>
 Despite hybrid quantization, merged weights are stored as .safetensors - pass the containing folder path to `--gguf_path`
```
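The FP8 hardware requirement in the notes above can be checked programmatically. As a hedged sketch (the helper is mine, not part of ktransformers): native FP8 tensor-core support arrived with NVIDIA compute capability 8.9 (Ada, e.g. the RTX 4090) and 9.0 (Hopper), so a `(major, minor)` tuple such as the one returned by `torch.cuda.get_device_capability()` can be compared directly:

```python
def supports_fp8(compute_capability) -> bool:
    """True if a (major, minor) CUDA compute capability has FP8 tensor cores.

    FP8 support starts at 8.9 (Ada, e.g. RTX 4090) and 9.0 (Hopper);
    Python tuple comparison covers both cases in one check.
    """
    return tuple(compute_capability) >= (8, 9)
```

With PyTorch installed this would be called as `supports_fp8(torch.cuda.get_device_capability())`.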

doc/en/install.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -121,7 +121,7 @@ We provide a simple command-line local chat Python script that you can run for t
 mkdir DeepSeek-V2-Lite-Chat-GGUF
 cd DeepSeek-V2-Lite-Chat-GGUF
 
-wget https://huggingface.co/mzwing/DeepSeek-V2-Lite-Chat-GGUF/resolve/main/DeepSeek-V2-Lite-Chat.Q4_K_M.gguf -O DeepSeek-V2-Lite-Chat.Q4_K_M.gguf
+wget https://huggingface.co/mradermacher/DeepSeek-V2-Lite-GGUF/resolve/main/DeepSeek-V2-Lite.Q4_K_M.gguf -O DeepSeek-V2-Lite-Chat.Q4_K_M.gguf
 
 cd .. # Move to repo's root dir
```
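A quick integrity check after a download like the one above: GGUF files begin with the 4-byte ASCII magic `GGUF`, so a truncated or HTML-error-page download is easy to catch. A minimal sketch (the helper name is mine, not part of the docs):

```python
def looks_like_gguf(path: str) -> bool:
    # GGUF containers start with the ASCII magic "GGUF" (4 bytes);
    # an HTML error page or a zero-byte file will fail this check.
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"
```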
