[feat] [doc improvement]: Where does context chat backend get its compute? #233

@tareko

Description

Describe the feature you'd like to request
I finally got Context Chat working, and it has tremendous potential. We run our Nextcloud server on a DigitalOcean droplet and do our computing on a separate Ollama server.

I looked at the configuration file /nc_app_context_chat_backend_data/config.yaml, but I don't have the skills to work out where the computation is actually happening. If I had to guess, I would say it is all happening locally: any request takes a very long time (up to 60 minutes) and doesn't seem to register on the GPU server at all. I can't tell for sure, and the documentation doesn't help much here.
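For context, here is the sort of thing I was expecting to find in that file — a sketch of what pointing the backend at a remote Ollama instance might look like. To be clear, every key name below is my own guess for illustration, not the actual config.yaml schema:

```yaml
# HYPOTHETICAL sketch only -- key names are illustrative guesses,
# not the real nc_app_context_chat_backend configuration schema.
llm:
  backend: ollama                      # guess: select a remote inference backend
  endpoint: http://gpu-server:11434    # guess: URL of the separate Ollama server
  model: llama3                        # guess: model name served by that host
embedding:
  backend: local                       # guess: embeddings might still run on the droplet
```

If something like this is (or could be) supported, documenting it would answer most of my questions below.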

Describe the solution you'd like

  1. Please take a moment to explain where the models are actually being computed.
  2. If, as I suspect, everything happens locally, a small section in the documentation explaining this would be useful. I'm happy to write it and submit a PR.
  3. Is it possible to use other computing resources? If so, how?
  4. If yes to the above, this should also be documented; again, I'm happy to write it and submit a PR.

Describe alternatives you've considered

Metadata

Assignees

No one assigned

    Labels

    documentation (Improvements or additions to documentation), enhancement (New feature or request)
