Error with Ollama #10
I am running the app with Docker on Ubuntu.
When I ask a question via the AI assistant, I always get the error below (in the Docker logs). I am using Ollama.
My Ollama settings from `.env`:
```
# --- Ollama (inactive — uncomment to use, and comment out Gemini above) ---
LLM_PROVIDER=ollama
OLLAMA_HOST=http://host.docker.internal:11434
# Multimodal (recommended)
# OLLAMA_MODEL=qwen3.5:9b
OLLAMA_MODEL=qwen3.5:4b
# OLLAMA_MODEL=gemma3:12b
# Text-only model
# OLLAMA_MODEL=kamekichi128/qwen3-4b-instruct-2507
# Enable thinking mode for Ollama (default: false — reduces latency for thinking models like qwen3.5)
OLLAMA_ENABLE_THINKING=false
```
Ollama works fine from the command line; it is installed natively, not in a Docker container. The backend is able to communicate with Ollama, but the error appears partway through, before the final answer is produced.
I have updated the backend Docker container with:

```yaml
extra_hosts:
  - "host.docker.internal:host-gateway"
```
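For context, the `extra_hosts` entry has to sit under the backend service in `docker-compose.yml` for the container to resolve `host.docker.internal` to the host. A minimal sketch of what that placement looks like — the service name is taken from the log prefix, the image name and `.env` path are placeholders:

```yaml
# docker-compose.yml (sketch; only the relevant keys shown)
services:
  nexusrag-backend:
    image: nexusrag/backend:latest   # placeholder image name
    env_file:
      - .env
    extra_hosts:
      # Maps host.docker.internal to the Docker host's gateway IP,
      # so OLLAMA_HOST=http://host.docker.internal:11434 is reachable
      # from inside the container.
      - "host.docker.internal:host-gateway"
```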
If I enable thinking, I can see the thinking details appear in the UI, but I never get an answer. The answer is always:
"Unable to generate a response."
Docker logs below:

```
OpenTelemetry is not enabled because it is missing from the config.
nexusrag-backend | INFO: 127.0.0.1:36146 - "GET /health HTTP/1.1" 200 OK
nexusrag-backend | INFO: 172.18.0.5:44628 - "POST /api/v1/rag/chat/1/stream HTTP/1.1" 200 OK
nexusrag-backend | INFO: 127.0.0.1:57512 - "GET /health HTTP/1.1" 200 OK
nexusrag-backend | INFO: 127.0.0.1:45768 - "GET /health HTTP/1.1" 200 OK
nexusrag-backend | INFO:httpx:HTTP Request: POST http://host.docker.internal:11434/api/chat "HTTP/1.1 500 Internal Server Error"
nexusrag-backend | ERROR:app.services.llm.ollama:Ollama streaming failed: EOF (status code: 500)
nexusrag-backend | Traceback (most recent call last):
nexusrag-backend |   File "/app/backend/app/services/llm/ollama.py", line 199, in astream
nexusrag-backend |     async for chunk in stream:
nexusrag-backend |   File "/usr/local/lib/python3.11/site-packages/ollama/_client.py", line 757, in inner
nexusrag-backend |     raise ResponseError(e.response.text, e.response.status_code) from None
nexusrag-backend | ollama._types.ResponseError: EOF (status code: 500)
nexusrag-backend | WARNING:app.api.chat_agent:Ollama produced no text and no tool call — fallback to auto-search
nexusrag-backend | INFO:app.services.embedder:Loading embedding model: BAAI/bge-m3
nexusrag-backend | INFO:sentence_transformers.SentenceTransformer:Use pytorch device_name: cpu
nexusrag-backend | INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: BAAI/bge-m3
nexusrag-backend | INFO:app.services.embedder:Embedding model loaded: BAAI/bge-m3 (dim=1024)
```
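The `EOF (status code: 500)` in the traceback means the Ollama server itself returned HTTP 500 on `/api/chat` and closed the connection before streaming any content; the backend then logs the "no text and no tool call" warning and falls back. A minimal sketch of that failure mode — the `ResponseError` stand-in and function names here are hypothetical re-creations for illustration, not the actual NexusRAG code:

```python
import asyncio


class ResponseError(Exception):
    """Stand-in for ollama._types.ResponseError (hypothetical re-creation)."""

    def __init__(self, error: str, status_code: int):
        super().__init__(f"{error} (status code: {status_code})")
        self.status_code = status_code


async def failing_stream():
    # Mimics the observed behaviour: the server answers the POST with
    # HTTP 500 and body "EOF" before producing any content chunk.
    raise ResponseError("EOF", 500)
    yield  # unreachable; makes this function an async generator


async def chat_with_fallback(stream) -> str:
    """Collect streamed text; fall back when the stream dies with no output."""
    parts: list[str] = []
    try:
        async for chunk in stream:
            parts.append(chunk)
    except ResponseError:
        if not parts:
            # Mirrors the logged warning: no text was produced, so the
            # UI ends up showing the generic fallback message.
            return "Unable to generate a response."
        raise
    return "".join(parts)


result = asyncio.run(chat_with_fallback(failing_stream()))
print(result)  # → Unable to generate a response.
```

Since the 500 originates on the Ollama side, checking the Ollama server's own logs on the host (for the same timestamp as the failed `/api/chat` call) should show why the runner died; a crash while loading or running the model is a common cause of this kind of mid-request EOF.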