
Commit f8e735b (1 parent: 3404306)

ollama: enable flash attention and k/v cache quantization

File tree: 1 file changed (+2 −0 lines)

Formula/o/ollama.rb

Lines changed: 2 additions & 0 deletions
```diff
@@ -50,6 +50,8 @@ def install
     working_dir var
     log_path var/"log/ollama.log"
     error_log_path var/"log/ollama.log"
+    environment_variables OLLAMA_FLASH_ATTENTION: "1",
+                          OLLAMA_KV_CACHE_TYPE: "q8_0"
   end

   test do
```
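With this change, a server started via `brew services start ollama` picks up both variables automatically. For a manually launched server, the same configuration can be set in the environment; a minimal sketch, assuming `ollama` is installed and on `PATH`:

```shell
# These match the values the formula now injects via `environment_variables`.
export OLLAMA_FLASH_ATTENTION=1   # enable flash attention in the backend
export OLLAMA_KV_CACHE_TYPE=q8_0  # quantize the k/v cache to 8-bit, roughly
                                  # halving its memory use vs. the default f16

# Start the server with the settings applied (uncomment to run):
# ollama serve

echo "flash_attention=$OLLAMA_FLASH_ATTENTION kv_cache=$OLLAMA_KV_CACHE_TYPE"
```

Both variables are recognized by the Ollama server itself, so the formula change only affects how the `brew services` launchd/systemd unit starts the process, not the binary.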

0 commit comments