Closed
4 changes: 2 additions & 2 deletions Formula/l/llama.cpp.rb
@@ -3,8 +3,8 @@
homepage "https://github.com/ggerganov/llama.cpp"
# CMake uses Git to generate version information.
url "https://github.com/ggerganov/llama.cpp.git",
-      tag:      "b4034",
-      revision: "b8deef0ec0af5febac1d2cfd9119ff330ed0b762"
+      tag:      "b4056",
+      revision: "5b359bb1e3585de45bec79fd6c18934897662cdf"
license "MIT"
head "https://github.com/ggerganov/llama.cpp.git", branch: "master"

@@ -58,7 +58,7 @@
}
end

test do

Check failure on line 61 in Formula/l/llama.cpp.rb

GitHub Actions / macOS 13-arm64

`brew test --verbose llama.cpp` failed on macOS Ventura (13) on Apple Silicon!

llama_new_context_with_model: n_ctx_pre_seq (4096) > n_ctx_train (128) -- possible training context overflow
ggml_metal_init: allocating
ggml_metal_init: found device: Apple Paravirtual device
ggml_metal_init: picking default device: Apple Paravirtual device
ggml_metal_init: using embedded metal library
ggml_metal_init: error: Error Domain=MTLLibraryErrorDomain Code=3 "program_source:5225:15: error: zero-length arrays are not permitted in C++
    o4x4_t lo[D16/NL];
    ^~~~~~
program_source:5500:18: note: in instantiation of function template specialization 'kernel_flash_attn_ext_vec<half __attribute__((ext_vector_type(4))), metal::matrix<half, 4, 4, void>, metal::matrix<half, 4, 4, void>, metal::matrix<half, 4, 4, void>, float, half, half __attribute__((ext_vector_type(4))), metal::matrix<half, 4, 4, void>, metal::matrix<half, 4, 4, void>, metal::matrix<half, 4, 4, void>, 1, &dequantize_f16, metal::matrix<half, 4, 4, void>, 1, &dequantize_f16, 64, 1, 32>' requested here
    typedef decltype(kernel_flash_attn_ext_vec<FA_TYPES, half4x4, 1, dequantize_f16, half4x4, 1, dequantize_f16, 64>) flash_attn_ext_vec_t;
    ^
program_source:5266:19: error: zero-length arrays are not permitted in C++
    q4x4_t mq[D16/NL];
    ^~~~~~
" UserInfo={NSLocalizedDescription=program_source:5225:15: error: zero-length arrays are not permitted in C++
    o4x4_t lo[D16/NL];
    ^~~~~~
program_source:5500:18: note: in instantiation of function template specialization 'kernel_flash_attn_ext_vec<half __attribute__((ext_vector_type(4))), metal::matrix<half, 4, 4, void>, metal::matrix<half, 4, 4, void>, metal::matrix<half, 4, 4, void>, float, half, half __attribute__((ext_vector_type(4))), metal::matrix<half, 4, 4, void>, metal::matrix<half, 4, 4, void>, metal::matrix<half, 4, 4, void>, 1, &dequantize_f16, metal::matrix<half, 4, 4, void>, 1, &dequantize_f16, 64, 1, 32>' requested here
    typedef decltype(kernel_flash_attn_ext_vec<FA_TYPES, half4x4, 1, dequantize_f16, half4x4, 1, dequantize_f16, 64>) flash_attn_ext_vec_t;
    ^
program_source:5266:19: error: zero-length arrays are not permitted in C++
    q4x4_t mq[D16/NL];
    ^~~~~~
}
ggml_backend_metal_device_init: error: failed to allocate context
llama_new_context_with_model: failed to initialize Metal backend
common_init_from_params: failed to create context with model 'stories260K.gguf'
main: error: unable to load model
::error::llama.cpp: failed
An exception occurred within a child process:
  BuildError: Failed executing: /opt/homebrew/Cellar/llama.cpp/4056/bin/llama-cli --hf-repo ggml-org/tiny-llamas -m stories260K.gguf -n 400 -p I -ngl 0
/opt/homebrew/Library/Homebrew/formula.rb:3116:in `block in system'
/opt/homebrew/Library/Homebrew/formula.rb:3052:in `open'
/opt/homebrew/Library/Homebrew/formula.rb:3052:in `system'
/opt/homebrew/Library/Homebrew/vendor/bundle/ruby/3.3.0/gems/sorbet-runtime-0.5.11642/lib/types/private/methods/call_validation.rb:175:in `bind_call'
/opt/homebrew/Library/Homebrew/vendor/bundle/ruby/3.3.0/gems/sorbet-runtime-0.5.11642/lib/types/private/methods/call_validation.rb:175:in `validate_call_skip_block_type'
/opt/homebrew/Library/Homebrew/vendor/bundle/ruby/3.3.0/gems/sorbet-runtime-0.5.11642/lib/types/private/methods/call_validation.rb:117:in `block in create_validator_slow_skip_block_type'
/opt/homebrew/Library/Taps/homebrew/homebrew-core/Formula/l/llama.cpp.rb:61:in `block in <class:LlamaCpp>'
/opt/homebrew/Library/Homebrew/formula.rb:2849:in `block (3 levels) in run_test'
/opt/homebrew/Library/Homebrew/extend/kernel.rb:539:in `with_env'
/opt/homebrew/Library/Homebrew/formula.rb:2848:in `block (2 levels) in run_test'
/opt/homebrew/Library/Homebrew/formula.rb:1205:in `with_logging'
/opt/homebrew/Library/Homebrew/formula.rb:2847:in `block in run_test'
/opt/homebrew/Library/Homebrew/mktemp.rb:90:in `block in run'
/opt/homebrew/Library/Homebrew/mktemp.rb:90:in `chdir'
/opt/homebrew/Library/Homebrew/mktemp.rb:90:in `run'
/opt/homebrew/Library/Homebrew/vendor/bundle/r
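
Note on the failure above: both `zero-length arrays` errors point at an array bound of the form `D16/NL`. One plausible reading is that integer division truncates this bound to 0 for the specialization named in the log. The Ruby sketch below only illustrates that arithmetic. It assumes the template argument 64 from the log is the head size `D`, that `D16` means `D / 16`, and it uses a purely hypothetical `NL` of 8; none of these are confirmed by the log, and `array_bound` is an illustrative helper, not part of llama.cpp or Homebrew.

```ruby
# Sketch only: models a compile-time array bound D16/NL with integer division.
# Assumptions (not confirmed by the CI log or the llama.cpp source):
#   - D16 is D / 16, where D is the head size
#   - NL is a small positive constant; 8 is a hypothetical value
def array_bound(head_dim, nl)
  d16 = head_dim / 16 # Integer#/ discards the remainder for positive operands
  d16 / nl            # becomes 0 whenever nl > d16
end

[[64, 8], [128, 8], [256, 8]].each do |head_dim, nl|
  bound = array_bound(head_dim, nl)
  status = bound.zero? ? "zero-length array (rejected by the compiler)" : "ok"
  puts "D=#{head_dim} NL=#{nl} -> bound #{bound}: #{status}"
end
```

Under these assumptions only the `D=64` case yields a bound of 0, which would match the single specialization the compiler reports.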
system libexec/"test-sampling"
# The test below is flaky on slower hardware.
return if OS.mac? && Hardware::CPU.intel? && MacOS.version <= :monterey