|  | 
| 3 | 3 | The purpose of this example is to demonstrate minimal usage of llama.cpp for running models. | 
| 4 | 4 |  | 
| 5 | 5 | ```bash | 
| 6 |  | -./llama-run Meta-Llama-3.1-8B-Instruct.gguf | 
|  | 6 | +llama-run granite-code | 
|  | 7 | +... | 
|  | 8 | +``` | 
|  | 9 | +```bash | 
|  | 10 | +llama-run -h | 
|  | 11 | +Description: | 
|  | 12 | +  Runs a llm | 
|  | 13 | + | 
|  | 14 | +Usage: | 
|  | 15 | +  llama-run [options] model [prompt] | 
|  | 16 | + | 
|  | 17 | +Options: | 
|  | 18 | +  -c, --context-size <value> | 
|  | 19 | +      Context size (default: 2048) | 
|  | 20 | +  -n, --ngl <value> | 
|  | 21 | +      Number of GPU layers (default: 0) | 
|  | 22 | +  -h, --help | 
|  | 23 | +      Show help message | 
|  | 24 | + | 
|  | 25 | +Commands: | 
|  | 26 | +  model | 
|  | 27 | +      Model is a string with an optional prefix of | 
|  | 28 | +      huggingface:// (hf://), ollama://, https:// or file://. | 
|  | 29 | +      If no protocol is specified and a file exists in the specified | 
|  | 30 | +      path, file:// is assumed, otherwise if a file does not exist in | 
|  | 31 | +      the specified path, ollama:// is assumed. Models that are being | 
|  | 32 | +      pulled are downloaded with .partial extension while being | 
|  | 33 | +      downloaded and then renamed as the file without the .partial | 
|  | 34 | +      extension when complete. | 
|  | 35 | + | 
|  | 36 | +Examples: | 
|  | 37 | +  llama-run llama3 | 
|  | 38 | +  llama-run ollama://granite-code | 
|  | 39 | +  llama-run ollama://smollm:135m | 
|  | 40 | +  llama-run hf://QuantFactory/SmolLM-135M-GGUF/SmolLM-135M.Q2_K.gguf | 
|  | 41 | +  llama-run huggingface://bartowski/SmolLM-1.7B-Instruct-v0.2-GGUF/SmolLM-1.7B-Instruct-v0.2-IQ3_M.gguf | 
|  | 42 | +  llama-run https://example.com/some-file1.gguf | 
|  | 43 | +  llama-run some-file2.gguf | 
|  | 44 | +  llama-run file://some-file3.gguf | 
|  | 45 | +  llama-run --ngl 99 some-file4.gguf | 
|  | 46 | +  llama-run --ngl 99 some-file5.gguf Hello World | 
| 7 | 47 | ... | 
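The `model` entry in the help text describes two behaviours: how a model string is mapped to a protocol, and how a pulled model is written to a `.partial` file and renamed when the download completes. The following is a minimal sketch of those two rules in bash, not the actual llama-run implementation (which is part of llama.cpp). The function names `resolve_model_protocol` and `pull_with_partial` are hypothetical, and the idea of passing the download command as arguments is an assumption made only to keep the sketch self-contained.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of llama-run): print the protocol that the
# help text says a given model string resolves to.
resolve_model_protocol() {
  local model="$1"
  case "$model" in
    huggingface://*|hf://*) echo "huggingface" ;;
    ollama://*)             echo "ollama" ;;
    https://*)              echo "https" ;;
    file://*)               echo "file" ;;
    *)
      # No prefix: an existing path is treated as file://,
      # anything else falls back to ollama://.
      if [ -e "$model" ]; then
        echo "file"
      else
        echo "ollama"
      fi
      ;;
  esac
}

# Hypothetical helper: run a command that writes the model to stdout,
# store the output in "<dest>.partial", and rename it to "<dest>" only
# if the command succeeds.
pull_with_partial() {
  local dest="$1"
  shift
  local tmp="${dest}.partial"
  if "$@" > "$tmp"; then
    mv -- "$tmp" "$dest"
  else
    return 1  # the .partial file is left in place
  fi
}
```

Under these assumptions, `resolve_model_protocol hf://QuantFactory/SmolLM-135M-GGUF/SmolLM-135M.Q2_K.gguf` prints `huggingface`, while `resolve_model_protocol granite-code` prints `ollama` unless a file named `granite-code` exists in the current directory. Writing to a `.partial` name first means an interrupted download is never mistaken for a complete model file.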