@@ -6,7 +6,7 @@ The motivation for having this is that the conversion process can often be an
iterative process, where the original model is inspected, converted, updates
made to llama.cpp, converted again, etc. Once the model has been converted it
needs to be verified against the original model, and then optionally quantified,
- and is some cases perplexity checked of the quantized model. And finally the
+ and in some cases perplexity checked of the quantized model. And finally the
model/models need to the ggml-org on Hugging Face. This tool/example tries to
help with this process.
@@ -62,7 +62,7 @@ Command line arguments take precedence over environment variables when both are
In cases where the transformer implementation for the model has not been released
yet it is possible to set the environment variable `UNRELEASED_MODEL_NAME` which
- will the cause the transformer implementation to be loaded explicitely and not
+ will then cause the transformer implementation to be loaded explicitely and not
use AutoModelForCausalLM:
```
export UNRELEASED_MODEL_NAME=SomeNewModel
@@ -87,7 +87,7 @@ from the converted model.
# Or using command line argument
(venv) $ make causal-run-original-model MODEL_PATH=~/work/ai/models/some_model
```
- This command will save two file to the `data` directory, one is a binary file
+ This command will save two files to the `data` directory, one is a binary file
containing logits which will be used for comparison with the converted model
later, and the other is a text file which allows for manual visual inspection.
@@ -128,11 +128,11 @@ Quantized model saved to: /path/to/quantized/model-Q8_0.gguf
Export the quantized model path to QUANTIZED_MODEL variable in your environment
```
This will show the path to the quantized model in the terminal, which can then
- be used set the `QUANTIZED_MODEL` environment variable:
+ be used to set the `QUANTIZED_MODEL` environment variable:
``` console
export QUANTIZED_MODEL=/path/to/quantized/model-Q8_0.gguf
```
- The the quantized model can be run using the following command:
+ Then the quantized model can be run using the following command:
``` console
(venv) $ make causal-run-quantized-model
```
@@ -229,11 +229,11 @@ Quantized model saved to: /path/to/quantized/model-Q8_0.gguf
Export the quantized model path to QUANTIZED_EMBEDDING_MODEL variable in your environment
```
This will show the path to the quantized model in the terminal, which can then
- be used set the `QUANTIZED_EMBEDDING_MODEL` environment variable:
+ be used to set the `QUANTIZED_EMBEDDING_MODEL` environment variable:
``` console
export QUANTIZED_EMBEDDING_MODEL=/path/to/quantized/model-Q8_0.gguf
```
- The the quantized model can be run using the following command:
+ Then the quantized model can be run using the following command:
``` console
(venv) $ make embedding-run-quantized-model
```
@@ -246,7 +246,7 @@ token/logits file:
``` console
(venv) $ make perplexity-run QUANTIZED_MODEL=~/path/to/quantized/model.gguf
```
- This will use the wikitext dataset to run the perplexity evaluation and and
+ This will use the wikitext dataset to run the perplexity evaluation and
output the perplexity score to the terminal. This value can then be compared
with the perplexity score of the unquantized model.