Commit 094caea

Merge branch 'ggerganov:master' into master
2 parents 63e60de + f4b2dcd commit 094caea

3 files changed, with 21 additions and 6 deletions

examples/main/README.md

Lines changed: 1 addition & 1 deletion
@@ -69,7 +69,7 @@ In this section, we cover the most commonly used options for running the `llama-
 - `-c N, --ctx-size N`: Set the size of the prompt context. The default is 512, but LLaMA models were built with a context of 2048, which will provide better results for longer input/inference.
 - `-mli, --multiline-input`: Allows you to write or paste multiple lines without ending each in '\'
 - `-t N, --threads N`: Set the number of threads to use during generation. For optimal performance, it is recommended to set this value to the number of physical CPU cores your system has.
-- `-ngl N, --n-gpu-layers N`: When compiled with GPU support, this option allows offloading some layers to the GPU for computation. Generally results in increased performance.
+- `-ngl N, --n-gpu-layers N`: When compiled with GPU support, this option allows offloading some layers to the GPU for computation. Generally results in increased performance.
 
 ## Input Prompts
 

ggml/src/ggml-vulkan.cpp

Lines changed: 19 additions & 4 deletions
@@ -1070,10 +1070,25 @@ static vk_buffer ggml_vk_create_buffer(vk_device& device, size_t size, vk::Memor
     try {
         buf->device_memory = device->device.allocateMemory({ mem_req.size, memory_type_index });
     } catch (const vk::SystemError& e) {
-        // Out of Host/Device memory, clean up buffer
-        device->device.destroyBuffer(buf->buffer);
-        buf->size = 0;
-        throw e;
+        if (buf->memory_property_flags != fallback_flags) {
+            // Try again with fallback flags
+            memory_type_index = find_properties(&mem_props, &mem_req, fallback_flags);
+            buf->memory_property_flags = fallback_flags;
+
+            try {
+                buf->device_memory = device->device.allocateMemory({ mem_req.size, memory_type_index });
+            }
+            catch (const vk::SystemError& e) {
+                device->device.destroyBuffer(buf->buffer);
+                buf->size = 0;
+                throw e;
+            }
+        } else {
+            // Out of Host/Device memory, clean up buffer
+            device->device.destroyBuffer(buf->buffer);
+            buf->size = 0;
+            throw e;
+        }
     }
     buf->ptr = nullptr;
 
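
The retry logic in the ggml-vulkan.cpp hunk can be illustrated in isolation. What follows is a minimal sketch, not the ggml implementation: `MemFlags`, `Allocation`, `allocate_with_fallback`, and the `try_alloc` callback are hypothetical names standing in for the Vulkan memory property flags, the buffer, and `allocateMemory`, and `std::runtime_error` stands in for `vk::SystemError`.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <stdexcept>

// Hypothetical stand-ins for illustration; these are not Vulkan or ggml types.
using MemFlags = std::uint32_t;
constexpr MemFlags kDeviceLocal = 1u << 0;
constexpr MemFlags kHostVisible = 1u << 1;

struct Allocation {
    MemFlags    flags; // flags the memory was actually allocated with
    std::size_t size;  // number of bytes allocated
};

// Tries `preferred` first and retries once with `fallback` if that throws.
// When both flag sets are identical there is nothing left to try, so the
// first error propagates to the caller.
Allocation allocate_with_fallback(
        const std::function<void(std::size_t, MemFlags)>& try_alloc,
        std::size_t size, MemFlags preferred, MemFlags fallback) {
    try {
        try_alloc(size, preferred);
        return Allocation{preferred, size};
    } catch (const std::runtime_error&) {
        if (preferred == fallback) {
            throw; // rethrow the original exception object
        }
    }
    // Second attempt; an exception thrown here propagates to the caller.
    try_alloc(size, fallback);
    return Allocation{fallback, size};
}
```

As in the hunk, the caller only sees an error once both attempts have failed. The sketch rethrows with a bare `throw;`, which preserves the original exception object, whereas the diff uses `throw e;`. It also omits the buffer cleanup (`destroyBuffer`, `buf->size = 0`) that the real code performs before rethrowing.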

scripts/sync-ggml.last

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-0d7ecbbe536dc84240f646e0ec0a712251377f34
+564f42082f858f9674b2a2e06e9e779d9ed2c754
