-
As I understand it, OpenBLAS must be provided by the user and then linked to koboldcpp, but I was hoping to find some documentation on this process for Android. I have googled around and tried several Linux install methods, and none of them worked on my system. Does the dev team have a better resource available for Android users?

I am trying to get OpenBLAS working because I have been loving the 3B Red Pajama INCITE Chat model, but I am having a tough time whenever the prompt has to be reevaluated: my ~20 second responses jump to 60-80 seconds. I have already tried --highpriority and --smartcontext, and while these settings did improve the overall times, I still see spikes when the context allotment runs out. I understand that even with BLAS I will have spikes, but I am hoping it will shorten the context processing time significantly.

Alternatively, is it possible to lower context below 512? With these small models it wouldn't be much of a hit, and something like 256 would still leave me with enough context to keep the bot on track during a conversation. At the very least it's a better option than wiping the context on every new chat.
Replies: 3 comments
-
I found this: https://github.com/xianyi/OpenBLAS/wiki/How-to-build-OpenBLAS-for-Android, but for the life of me I cannot make it work. I'm assuming I need to do this on my PC, but I don't know where to run the export commands. I downloaded all the prerequisites, but I'm not sure how to use them, and my NDK folder is not called ndk-bundle, it's just ndk, so I'm not sure if I've done that right. I have tried every CLI available and export is not recognized as a command in any of them. I tried running the commands on my Android device, but it throws a gcc error. I'm clearly not doing this right, but I'm very interested in learning how to make OpenBLAS work.

OK, I installed WSL and I can export now, but I still cannot get the linked instructions to work: make reports that I am missing a target and a makefile. I cloned the repo and ran make from inside the folder, but I still get errors. I am also lacking gcc.
-
I have found a binary for ARMv7 on SourceForge, which I have extracted and renamed, but when running the linking command I receive 'ld.lld: error: unable to find library -lopenblas'. I have tried changing the folder name and placing the files directly in the koboldcpp directory, but neither works. I have also renamed the libopenblas.a file to libopenblas.lib and still receive the same error.
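For anyone hitting the same error, here is a rough Python sketch of how a Unix-style linker such as ld.lld resolves `-lopenblas`. It is only an illustration, not the linker's actual code: `resolve_library` is a hypothetical helper, and the search directories stand in for whatever is passed with `-L`. The point it shows is that only `libopenblas.so` and `libopenblas.a` are considered, so renaming the archive to `libopenblas.lib` cannot help, and a directory that is not on the `-L` search path is never looked at.

```python
from pathlib import Path
from typing import Iterable, Optional


def resolve_library(name: str, search_dirs: Iterable[str]) -> Optional[Path]:
    """Approximate how a Unix-style linker resolves -l<name>.

    For each search directory (the ones given with -L), in order, look for
    lib<name>.so and then lib<name>.a. Files with any other suffix, such as
    lib<name>.lib, are never considered.
    """
    for directory in search_dirs:
        for suffix in (".so", ".a"):
            candidate = Path(directory) / f"lib{name}{suffix}"
            if candidate.is_file():
                return candidate
    return None


if __name__ == "__main__":
    # Hypothetical search path: the current directory and a ./lib subfolder.
    found = resolve_library("openblas", [".", "./lib"])
    print(found if found else "unable to find library -lopenblas")
```

If this prints a path for the directory you pass to the linker with `-L`, the file name and location are at least plausible. The library still has to be built for the same architecture and ABI as the rest of the build, which a prebuilt ARMv7 archive may or may not be.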
-
Yes, you can reduce context further: the UI lets you override the limits by editing the number in the text box directly.

As for building OpenBLAS on Android, I have not tried it myself, but it should technically be possible, and I know of some people who have done it. However, you are unlikely to get much of a speedup on that kind of device.

Also, if you build it on your PC with the default settings, you will probably end up targeting the wrong (x86) architecture. You would have to cross-compile it for the device's ARM architecture.
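To make the architecture point concrete, here is a small Python sketch that reports which machine type the interpreter is running on and which generic OpenBLAS `TARGET` would roughly correspond to it. The mapping table and the `openblas_target` helper are assumptions made for illustration (ARMV7 and ARMV8 are OpenBLAS target names, but the right target for a particular phone may be a more specific one).

```python
import platform
from typing import Optional

# Assumed mapping from platform.machine() strings to generic OpenBLAS
# TARGET names. Anything not listed (for example x86_64 on a PC) maps to
# None, meaning a native build there would not produce ARM code.
MACHINE_TO_TARGET = {
    "armv7l": "ARMV7",
    "aarch64": "ARMV8",
    "arm64": "ARMV8",
}


def openblas_target(machine: str) -> Optional[str]:
    """Return the assumed generic ARM target for a machine string, if any."""
    return MACHINE_TO_TARGET.get(machine.lower())


if __name__ == "__main__":
    machine = platform.machine()
    target = openblas_target(machine)
    if target is None:
        print(f"{machine}: not an ARM machine, cross-compilation needed")
    else:
        print(f"{machine}: native build would target {target}")
```

Run on the PC (or inside WSL) it should take the first branch, and run on the Android device itself it should report an ARM target, which is why a plain `make` on the PC produces a library the phone cannot use.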