
Working with ollama on a local GPU #34

@colltoaction

Description

Hi! I'm currently looking to set up a local environment with Ollama, running CUDA on my GPU. I have a few models running successfully, including llama2 and nomic-embed-text.

I can run a chat session using the "LLM" mode, but I'm not able to use the other modes. Different errors show up, such as the following (a quick embedding-dimension check is sketched after the list):

  • "Query failed: shapes (7,1536) and (768,) not aligned: 1536 (dim 1) != 768 (dim 0)" (in "RAG" and "Graph-RAG")
  • "Query failed: 'str' object has no attribute 'get'" (in Hyper-RAG).
  • "Query failed: 'NoneType' object has no attribute 'get'" (in Hyper-RAG-Lite).

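The shapes error suggests the existing index was built with 1536-dimensional embeddings while nomic-embed-text returns 768-dimensional vectors, so the two don't line up. Here's a minimal check of what the embedding endpoint actually returns, assuming Ollama's OpenAI-compatible /v1/embeddings route at the baseUrl from the settings file below (requests is used just for illustration):

import requests

# Minimal sketch: ask Ollama for one embedding and check its length.
# Assumes the OpenAI-compatible /v1/embeddings route at the baseUrl
# from settings.json below; the model name matches that file.
BASE_URL = "http://host.docker.internal:11434/v1"

resp = requests.post(
    f"{BASE_URL}/embeddings",
    headers={"Authorization": "Bearer ollama"},
    json={"model": "nomic-embed-text", "input": "hello world"},
    timeout=30,
)
resp.raise_for_status()
embedding = resp.json()["data"][0]["embedding"]

# nomic-embed-text reports 768 here; if the stored index was built with a
# 1536-dim model, queries fail with the shapes error above.
print(len(embedding))
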
I'm using WSL2 on Windows and was able to get this far by mounting the following settings.json file in docker-compose.yaml (a quick check of the chat endpoint with these settings is shown right after the file):

{
    "apiKey": "ollama",
    "modelProvider": "settings.custom_api",
    "modelName": "llama2",
    "baseUrl": "http://host.docker.internal:11434/v1",
    "selectedDatabase": "",
    "maxTokens": 2000,
    "temperature": 0.7,
    "embeddingModel": "nomic-embed-text",
    "embeddingDim": 768
}

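This is the check I mean, assuming Ollama's OpenAI-compatible /v1/chat/completions route and reusing the model and base URL from the file above; it should return a normal completion if the chat side is healthy, which matches the working "LLM" mode:

import requests

# Minimal sketch: confirm the chat side of the config end to end.
# Assumes Ollama's OpenAI-compatible /v1/chat/completions route; the
# API key is ignored by Ollama but mirrors settings.json.
BASE_URL = "http://host.docker.internal:11434/v1"

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": "Bearer ollama"},
    json={
        "model": "llama2",
        "messages": [{"role": "user", "content": "Say hello in one word."}],
        "max_tokens": 2000,
        "temperature": 0.7,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
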
Is there anything I should check to set up a complete environment? If you don't have access to this environment, let me know what tests you need me to perform.

Thank you!
