Currently, the RAG SDK only supports hosted models. It would be great if we could enable the use of local models, similar to web-llm. The only issue is that while these models are OpenAI-compatible, they don't expose an HTTP endpoint we can query. I believe we could write a simple HTTP server using Bun and start it only if the user decides to use one of the local LLMs.
For example:
- Start the model.
- Hook it up to a simple server.
- Generate a predefined URL, provide it to our base LLM client, and add it like any other model.
The rest should be straightforward.
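Here's a rough sketch of what that server could look like. The port, route, and `generateLocally` binding are all assumptions for illustration (none of this is tied to the SDK's actual internals); the response is shaped like a non-streaming OpenAI chat completion so any OpenAI-compatible client can talk to it:

```ts
// Minimal sketch: expose a local model behind an OpenAI-compatible endpoint with Bun.

type ChatMessage = { role: string; content: string };

// Hypothetical stand-in for whichever local-model binding ends up being used.
async function generateLocally(messages: ChatMessage[]): Promise<string> {
  const lastUser = messages.filter((m) => m.role === "user").at(-1);
  return `echo: ${lastUser?.content ?? ""}`; // placeholder completion
}

const server = Bun.serve({
  port: 8787, // assumed port; the SDK could pick a free one and hand the URL to the client
  async fetch(req) {
    const url = new URL(req.url);
    // Only the route the OpenAI-style client actually calls needs to exist.
    if (req.method === "POST" && url.pathname === "/v1/chat/completions") {
      const body = (await req.json()) as { model?: string; messages: ChatMessage[] };
      const content = await generateLocally(body.messages);
      // Shape the reply like a non-streaming OpenAI chat completion.
      return Response.json({
        id: `chatcmpl-${crypto.randomUUID()}`,
        object: "chat.completion",
        created: Math.floor(Date.now() / 1000),
        model: body.model ?? "local-model",
        choices: [
          { index: 0, message: { role: "assistant", content }, finish_reason: "stop" },
        ],
      });
    }
    return new Response("Not found", { status: 404 });
  },
});

console.log(`Local model proxy listening on http://localhost:${server.port}/v1`);
```

The base LLM client would then just point at the generated URL, e.g. with the `openai` package (the key is unused locally but the client requires one):

```ts
import OpenAI from "openai";

const client = new OpenAI({ baseURL: "http://localhost:8787/v1", apiKey: "local" });
```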