-
Hello, I was under the impression that running a horde worker let one offload part of a model to it, just like offloading some, but not all, layers to a local GPU. But the more I read, the more it seems that each worker's resources are self-contained, and that the only splitting one can do is of the prompt/context itself, which could be sent to different workers and collated back in the client. Please help me understand.

The modes I see in the docs are either remote access through the browser, which is basically the same as running locally except that the address belongs to a different machine running the server, or running a horde worker after getting a key from the site, where the locally running instance keeps polling the server for requests.

I do see mention of not using the API key for a locally running horde, but the farthest I was able to get was setting up a --nomodel instance on one machine and a --host instance (with a model) on another, then connecting the browser to the first one. It says it has no model and asks to connect to an "ai provider", which I point at the second machine. This works but, as I said above, it seems to use only the second machine, as self-contained resources. Granted, this sounds a little backward, since the machine with no model doesn't seem to have a way to tell the other one about its resources, but then again a horde worker only has two parameters for that, namely Gen.Length and Max Context.

What I'd like is for the koboldcpp instance that acts as the client to know, before it loads the model, what resources the horde worker offers, so it can decide how many layers to offload, just like it does with a local GPU. Is that even possible? Or, at least, could it find an appropriate n by trial and error, as the docs suggest, by sending n layers over the network until one works?

Another possibility is using this: https://github.com/db0/KoboldAI-Horde-Bridge

Any pointers in the right direction would be greatly appreciated. If this is not currently possible, tell me what the first few steps to add it to the code would be and I'll try to help implement it. It sounds like most of the infrastructure is already there.
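The trial-and-error idea above can be sketched as a search for the largest layer count that still fits. This is only an illustration of the search itself: `fits` is a hypothetical predicate standing in for "try loading with n layers offloaded and report whether it succeeded", and `fake_fits` with its memory figures is made up. Nothing here is a KoboldCpp or Horde API.

```python
from typing import Callable


def max_layers_that_fit(total_layers: int, fits: Callable[[int], bool]) -> int:
    """Return the largest n in [0, total_layers] for which fits(n) is True.

    Assumes fits is monotone (if n layers fit, any smaller count fits too)
    and that offloading 0 layers always works.
    """
    lo, hi = 0, total_layers
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if fits(mid):
            lo = mid              # mid layers fit, try more
        else:
            hi = mid - 1          # mid layers do not fit, try fewer
    return lo


# Hypothetical stand-in for a real load attempt: pretend every layer
# needs 300 MB and the remote device has 8000 MB free.
def fake_fits(n: int) -> bool:
    return n * 300 <= 8000


if __name__ == "__main__":
    print(max_layers_that_fit(43, fake_fits))
```

A binary search needs only about log2(total_layers) load attempts instead of one per candidate n, which matters if every attempt means reloading a model.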
-
I think you have a misunderstanding of what Horde is. Let's clarify some terminology.
yes
This is sort of deprecated. In the past, people used external horde worker scripts to run the horde worker as a separate program. These days, everyone uses the built-in KoboldCpp integrated horde worker, so no external scripts…