*[redmond-puffin-13b](https://huggingface.co/TheBloke/Redmond-Puffin-13B-GGUF) from config.json (q4_K_S version works faster)
-*[mistral-7b-instruct](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF) from config.json (q4_K_S version works faster)
+*[mistral-7b-instruct-v0.1](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF) from config.json (q4_K_S version works faster)
+*[Mistral-Nemo-Instruct-2407-GGUF](https://huggingface.co/ZeroWw/Mistral-Nemo-Instruct-2407-GGUF) from config.json (specific quants with output and embed tensors quantized to f16, q5 is the smallest)
+*[redmond-puffin-13b (previously recommended)](https://huggingface.co/TheBloke/Redmond-Puffin-13B-GGUF) from config.json (q4_K_S version works faster)
 
 ### Additional notes
 
+* Vulkan experimental build used [this PR](https://github.com/ggerganov/llama.cpp/pull/2059)
 * ggmlv3 version is very old and almost deprecated for now, as almost no new models are using the old format
"prompt": "### SYSTEM: A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.\n\n### USER:",
-"reverse-prompt": "### USER:",
-"input_suffix": "### ASSISTANT:",
-"ctx-size": 4096,
-"preset": "Midnight Enigma"
+"ctx-size": 8192,
+"n_gpu_layers_vk": 6,
+"n_gpu_layers_clblast": 7,
+"samplers_sequence": "ts",
+"smoothing_factor": 0.1,
+"smoothing_curve": 1.1,
+"temp": 0.2,
+"dynatemp_range": 0.2,
+"group":
+{
+"prompt": "",
+"reverse-prompt": "\n[INST] ",
+"input_suffix": "[/INST]\n"
+}
 },
 "models/mistral-7b-instruct-v0.2.Q4_K_S.gguf":
 {
-"prompt": "Below is an instruction that describes a task. Write an accurate response that appropriately completes the request.\n\n### Instruction:\n",
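As a rough illustration of what the `group` entry added in the config.json hunk above seems intended to do, the following Python sketch assembles a chat transcript from the `prompt`, `reverse-prompt`, and `input_suffix` values. The helper name `build_prompt` and the concatenation order are assumptions made for illustration only; the application's actual prompt-assembly logic is not part of this diff.

```python
# Hypothetical helper: not taken from the repository, only a sketch of how
# the three "group" fields could wrap each user turn.
def build_prompt(group, turns):
    """Concatenate (user, assistant) turns using the group's delimiters."""
    text = group["prompt"]  # system/preamble text, empty for this model
    for user_msg, assistant_msg in turns:
        # Assumed order: reverse-prompt, then user text, then input_suffix.
        text += group["reverse-prompt"] + user_msg + group["input_suffix"]
        if assistant_msg is not None:
            text += assistant_msg  # an earlier model reply, if any
    return text


# Values copied from the added lines of the config.json hunk.
mistral_group = {
    "prompt": "",
    "reverse-prompt": "\n[INST] ",
    "input_suffix": "[/INST]\n",
}

example = build_prompt(mistral_group, [("Hello", "Hi!"), ("How are you?", None)])
print(repr(example))
```

Under these assumptions each user message ends up wrapped in `[INST] ... [/INST]`, which is the instruction format the Mistral instruct models expect, in place of the `### USER:` / `### ASSISTANT:` markers that the hunk removes.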