Replies: 8 comments · 21 replies
-
How can I tell it to use my AMD GPU?
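A possible starting point, assuming the Flatpak build and that the AMD/ROCm add-on is published under the plugin ID below (that ID is an assumption; verify with flatpak search Alpaca):

    # Install the ROCm add-on for Alpaca's bundled Ollama instance
    # (plugin ID is an assumption; check: flatpak search Alpaca)
    flatpak install flathub com.jeffser.Alpaca.Plugins.AMD -y

    # Let the sandbox reach the GPU device nodes
    flatpak override --user --device=all com.jeffser.Alpaca

    # Many consumer RDNA cards need an HSA override before ROCm will use them,
    # e.g. gfx1031 (RX 6700 XT) reporting itself as gfx1030:
    flatpak override --user --env=HSA_OVERRIDE_GFX_VERSION=10.3.0 com.jeffser.Alpaca

After restarting Alpaca, the instance log should mention a ROCm/HIP device instead of CPU-only buffers.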
-
Is there a way to change the number of workers and threads, enable optimizations, etc.? I'm running on a 32-core hyperthreaded system and wondering whether it can be tuned manually to make full use of all CPU cores for every model.
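For a plain Ollama backend these knobs exist; whether Alpaca's managed instance surfaces them in its UI I can't confirm. A minimal sketch against the local API (the model tag is just an example):

    # Threads are a per-request model option in Ollama's API:
    curl http://127.0.0.1:11434/api/generate -d '{
      "model": "qwen2.5-coder:14b",
      "prompt": "Hello",
      "options": { "num_thread": 32 }
    }'

    # Concurrency is controlled server-side via environment variables:
    OLLAMA_NUM_PARALLEL=4 OLLAMA_MAX_LOADED_MODELS=2 ollama serve

Note that llama.cpp generally scales with physical cores, so on a 32-core machine 32 threads (not 64) is usually the sensible ceiling.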
-
In the model download window there are no quantized variants to be seen anymore, but there used to be. Is that a bug or a feature? It's as if .gguf download support is gone...
-
Why do I randomly see the app in the background-apps list, even though I didn't enable the option for that? (There's no portal description in the entry either.)
-
How do I set the context size for a local Ollama instance? The default is only 2048, while the underlying models support 32K or 64K contexts, but that has to be passed as a request parameter.
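Two documented ways to raise it on a stock Ollama instance (the model tag is an example):

    # Per request, as an option:
    curl http://127.0.0.1:11434/api/generate -d '{
      "model": "qwen2.5-coder:14b",
      "prompt": "Hello",
      "options": { "num_ctx": 32768 }
    }'

    # Or bake it into a derived model so every client gets it:
    printf 'FROM qwen2.5-coder:14b\nPARAMETER num_ctx 32768\n' > Modelfile
    ollama create qwen2.5-coder-32k -f Modelfile

Whether Alpaca itself exposes num_ctx in its UI may depend on the version.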
-
How do I use the GPU in Alpaca, please? This isn't explained on https://flathub.org/apps/com.jeffser.Alpaca. I ran these commands:

    flatpak install flathub com.jeffser.Alpaca -y
    flatpak install com.jeffser.Alpaca.Plugins.Ollama -y

and in Flatseal enabled GPU acceleration for Alpaca. Then I ran Alpaca and installed the Qwen2.5 Coder model. When I ask it a question, my CPU usage rises to 50% and it uses 10 GB of RAM, but the NVIDIA app shows my GPU at its usual 20% utilization. Output is extremely slow, so I think it's not using the GPU. What should I do? Here's my Alpaca log file for the instance:

print_info: ssm_d_inner = 0
print_info: ssm_d_state = 0
print_info: ssm_dt_rank = 0
print_info: ssm_dt_b_c_rms = 0
print_info: model type = 14B
print_info: model params = 14.77 B
print_info: general.name = Qwen2.5 Coder 14B Instruct
print_info: vocab type = BPE
print_info: n_vocab = 152064
print_info: n_merges = 151387
print_info: BOS token = 151643 '<|endoftext|>'
print_info: EOS token = 151645 '<|im_end|>'
print_info: EOT token = 151645 '<|im_end|>'
print_info: PAD token = 151643 '<|endoftext|>'
print_info: LF token = 198 'Ċ'
print_info: FIM PRE token = 151659 '<|fim_prefix|>'
print_info: FIM SUF token = 151661 '<|fim_suffix|>'
print_info: FIM MID token = 151660 '<|fim_middle|>'
print_info: FIM PAD token = 151662 '<|fim_pad|>'
print_info: FIM REP token = 151663 '<|repo_name|>'
print_info: FIM SEP token = 151664 '<|file_sep|>'
print_info: EOG token = 151643 '<|endoftext|>'
print_info: EOG token = 151645 '<|im_end|>'
print_info: EOG token = 151662 '<|fim_pad|>'
print_info: EOG token = 151663 '<|repo_name|>'
print_info: EOG token = 151664 '<|file_sep|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = false)
load_tensors: CPU model buffer size = 8566.04 MiB
llama_init_from_model: n_seq_max = 4
llama_init_from_model: n_ctx = 8192
llama_init_from_model: n_ctx_per_seq = 2048
llama_init_from_model: n_batch = 2048
llama_init_from_model: n_ubatch = 512
llama_init_from_model: flash_attn = 0
llama_init_from_model: freq_base = 1000000.0
llama_init_from_model: freq_scale = 1
llama_init_from_model: n_ctx_per_seq (2048) < n_ctx_train (32768) -- the full capacity of the model will not be utilized
llama_kv_cache_init: kv_size = 8192, offload = 1, type_k = 'f16', type_v = 'f16', n_layer = 48, can_shift = 1
llama_kv_cache_init: CPU KV buffer size = 1536.00 MiB
llama_init_from_model: KV self size = 1536.00 MiB, K (f16): 768.00 MiB, V (f16): 768.00 MiB
llama_init_from_model: CPU output buffer size = 2.40 MiB
llama_init_from_model: CPU compute buffer size = 696.01 MiB
llama_init_from_model: graph nodes = 1686
llama_init_from_model: graph splits = 1
time=2025-04-26T18:58:23.282+02:00 level=INFO source=server.go:619 msg="llama runner started in 7.03 seconds"
[GIN] 2025/04/26 - 18:58:40 | 200 | 25.040628849s | 127.0.0.1 | POST "/v1/chat/completions"
[GIN] 2025/04/26 - 18:58:47 | 200 | 31.493391109s | 127.0.0.1 | POST "/v1/chat/completions"
[GIN] 2025/04/26 - 19:00:50 | 200 | 822.202µs | 127.0.0.1 | GET "/api/tags"

And here's the output of nvidia-smi:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.120 Driver Version: 550.120 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3060 Off | 00000000:09:00.0 On | N/A |
| 0% 54C P8 16W / 170W | 827MiB / 12288MiB | 13% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 3315 G /usr/lib/xorg/Xorg 241MiB |
| 0 N/A N/A 3539 G /usr/bin/gnome-shell 184MiB |
| 0 N/A N/A 3777 G /usr/bin/ckb-next 2MiB |
| 0 N/A N/A 4417 G /usr/libexec/xdg-desktop-portal-gnome 63MiB |
| 0 N/A N/A 4463 G ...erProcess --variations-seed-version 45MiB |
| 0 N/A N/A 8671 G /app/lib/firefox/firefox 161MiB |
| 0 N/A N/A 283017 G /usr/bin/gnome-system-monitor 25MiB |
| 0 N/A N/A 289563 C+G /usr/bin/gjs 24MiB |
| 0 N/A N/A 289762 G ...erProcess --variations-seed-version 47MiB |
| 0 N/A N/A 295618 G /usr/bin/nvidia-settings 0MiB |
+-----------------------------------------------------------------------------------------+
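The log above already points at the answer: "load_tensors: CPU model buffer size = 8566.04 MiB" means all 8.5 GiB of weights went to system RAM, i.e. no CUDA backend was initialized (a GPU run would log a CUDA0 model buffer instead). Some hedged checks, assuming the Flatpak build:

    # 1. Make sure the sandbox can see the NVIDIA device nodes
    #    (this is roughly what Flatseal's GPU toggle grants):
    flatpak override --user --device=all com.jeffser.Alpaca
    flatpak run --command=sh com.jeffser.Alpaca -c 'ls /dev/nvidia*'

    # 2. The NVIDIA userspace driver extension installed for the Flatpak
    #    runtime must match the host driver (550.120 here):
    flatpak list | grep -i nvidia

If the device nodes and a matching GL.nvidia extension are both present but the log still shows only CPU buffers, the bundled Ollama plugin may simply lack a CUDA runner, which would be a question for the Alpaca issue tracker.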
-
I have some ideas for new features for the Alpaca app.
I hope you like my ideas and have enough time to code them.
-
I've got a weird problem with version 8.4.0. Whenever I try to add a new instance of DeepSeek or any other provider, I get a "Connection Failed." error. Furthermore, I'm not able to edit the existing Ollama (Managed) instance; the save button does not work. It also tries to create a new private key every time I launch the program. I installed the app from GNOME Software, the Flatpak version, on Fedora 43 Workstation. Below is the log from launching it from the terminal.


[attached terminal log failed to load]
-
Note
This discussion has been closed; for more information, read the new article.