Commit 6eee1a8

fix(docs): readme
1 parent 030ebaf commit 6eee1a8

File tree

1 file changed: 0 additions, 35 deletions

README.md

Lines changed: 0 additions & 35 deletions
````diff
@@ -31,42 +31,7 @@ Add proguard rule if it's enabled in project (android/app/proguard-rules.pro):
 
 You can search HuggingFace for available models (Keyword: [`GGUF`](https://huggingface.co/search/full-text?q=GGUF&type=model)).
 
-<<<<<<< Updated upstream
-For create a GGUF model manually, for example in Llama 2:
-
-Download the Llama 2 model
-
-1. Request access from [here](https://ai.meta.com/llama)
-2. Download the model from HuggingFace [here](https://huggingface.co/meta-llama/Llama-2-7b-chat) (`Llama-2-7b-chat`)
-
-Convert the model to ggml format
-
-```bash
-# Start with submodule in this repo (or you can clone the repo https://github.com/ggerganov/llama.cpp.git)
-yarn && yarn bootstrap
-cd llama.cpp
-
-# install Python dependencies
-python3 -m pip install -r requirements.txt
-
-# Move the Llama model weights to the models folder
-mv <path to Llama-2-7b-chat> ./models/7B
-
-# convert the 7B model to ggml FP16 format
-python3 convert.py models/7B/ --outtype f16
-
-# Build the quantize tool
-make quantize
-
-# quantize the model to 2-bits (using q2_k method)
-./quantize ./models/7B/ggml-model-f16.gguf ./models/7B/ggml-model-q2_k.gguf q2_k
-
-# quantize the model to 4-bits (using q4_0 method)
-./quantize ./models/7B/ggml-model-f16.gguf ./models/7B/ggml-model-q4_0.gguf q4_0
-```
-=======
 For get a GGUF model or quantize manually, see [`Prepare and Quantize`](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#prepare-and-quantize) section in llama.cpp.
->>>>>>> Stashed changes
 
 ## Usage
 
````
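
The retained README line points readers at HuggingFace GGUF models and llama.cpp's [`Prepare and Quantize`](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#prepare-and-quantize) guide instead of the removed manual walkthrough. A minimal sketch of that path, assuming the `huggingface_hub` CLI is installed; the repository and file names below are illustrative placeholders, not part of this commit:

```bash
# Illustrative only: download a prebuilt, already-quantized GGUF file
# rather than converting and quantizing the weights by hand.
pip install -U "huggingface_hub[cli]"

# Repo and file names are examples -- any model tagged `GGUF` on HuggingFace works.
huggingface-cli download TheBloke/Llama-2-7B-Chat-GGUF \
  llama-2-7b-chat.Q4_K_M.gguf --local-dir ./models
```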
