
Working with ollama on a local GPU #34

@colltoaction

Description

Hi! I'm currently looking to set up a local environment with Ollama, running CUDA on my GPU. I have a few models running successfully, including llama2 and nomic-embed-text.

I can run a chat session using the "LLM" mode, but I'm not able to use the other modes. Different errors show up, such as the following (a quick embedding-dimension check is sketched after the list):

  • "Query failed: shapes (7,1536) and (768,) not aligned: 1536 (dim 1) != 768 (dim 0)" (in "RAG" and "Graph-RAG")
  • "Query failed: 'str' object has no attribute 'get'" (in Hyper-RAG).
  • "Query failed: 'NoneType' object has no attribute 'get'" (in Hyper-RAG-Lite).

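The shapes error suggests the existing index was built with 1536-dimensional embeddings while nomic-embed-text returns 768-dimensional vectors, so the two don't line up. Here's a minimal check of what the embedding endpoint actually returns, assuming Ollama's OpenAI-compatible /v1/embeddings route at the baseUrl from the settings file below (requests is used just for illustration):

import requests

# Minimal sketch: ask Ollama for one embedding and check its length.
# Assumes the OpenAI-compatible /v1/embeddings route at the baseUrl
# from settings.json below; the model name matches that file.
BASE_URL = "http://host.docker.internal:11434/v1"

resp = requests.post(
    f"{BASE_URL}/embeddings",
    headers={"Authorization": "Bearer ollama"},
    json={"model": "nomic-embed-text", "input": "hello world"},
    timeout=30,
)
resp.raise_for_status()
embedding = resp.json()["data"][0]["embedding"]

# nomic-embed-text reports 768 here; if the stored index was built with a
# 1536-dim model, queries fail with the shapes error above.
print(len(embedding))
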
I'm using WSL2 on Windows and was able to get this far by mounting the following settings.json file in docker-compose.yaml (a quick check of the chat endpoint with these settings is shown right after the file):

{
    "apiKey": "ollama",
    "modelProvider": "settings.custom_api",
    "modelName": "llama2",
    "baseUrl": "http://host.docker.internal:11434/v1",
    "selectedDatabase": "",
    "maxTokens": 2000,
    "temperature": 0.7,
    "embeddingModel": "nomic-embed-text",
    "embeddingDim": 768
}

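This is the check I mean, assuming Ollama's OpenAI-compatible /v1/chat/completions route and reusing the model and base URL from the file above; it should return a normal completion if the chat side is healthy, which matches the working "LLM" mode:

import requests

# Minimal sketch: confirm the chat side of the config end to end.
# Assumes Ollama's OpenAI-compatible /v1/chat/completions route; the
# API key is ignored by Ollama but mirrors settings.json.
BASE_URL = "http://host.docker.internal:11434/v1"

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": "Bearer ollama"},
    json={
        "model": "llama2",
        "messages": [{"role": "user", "content": "Say hello in one word."}],
        "max_tokens": 2000,
        "temperature": 0.7,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
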
Is there anything I should check to set up a complete environment? If you don't have access to this environment, let me know what tests you need me to perform.

Thank you!
