Skip to content

fix: use native Ollama API endpoints instead of OpenAI-compatible routes #7071

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

roomote[bot]
Copy link

@roomote roomote bot commented Aug 14, 2025

This PR fixes issue #7070 where Ollama models were incorrectly using OpenAI-compatible routes instead of native Ollama API endpoints.

Problem

When using models like gpt-oss:120b with Ollama, the plugin was trying to get completions from OpenAI routes (/v1) instead of the native Ollama /api/chat endpoint.

Solution

  • Replaced the OpenAI client library with direct axios calls to Ollama's native API
  • Now using /api/chat endpoint for chat completions instead of /v1 OpenAI-compatible endpoint
  • Properly handle streaming responses from Ollama's native API format
  • Maintain backward compatibility with existing configurations

Changes Made

  1. Modified src/api/providers/ollama.ts:

    • Removed dependency on OpenAI client
    • Implemented direct HTTP calls to Ollama's /api/chat endpoint using axios
    • Added proper message format conversion from Anthropic to Ollama format
    • Implemented streaming response handling for Ollama's native format
    • Added proper error handling for Ollama-specific errors
  2. Updated tests:

    • src/api/providers/__tests__/ollama.spec.ts: Updated to mock axios instead of OpenAI client
    • src/api/providers/__tests__/ollama-timeout.spec.ts: Updated timeout tests to work with new implementation

Testing

  • All existing tests pass ✅
  • Timeout configuration tests pass ✅
  • Linting and type checking pass ✅

Fixes #7070


Important

This PR updates OllamaHandler to use native Ollama API endpoints with axios, replacing OpenAI-compatible routes, and updates tests to reflect these changes.

  • Behavior:
    • Replaces OpenAI-compatible routes with native Ollama API endpoints in OllamaHandler.
    • Uses /api/chat endpoint for chat completions.
    • Handles streaming responses and error scenarios specific to Ollama.
    • Maintains backward compatibility with existing configurations.
  • Implementation:
    • Removes OpenAI client dependency, uses axios for HTTP requests in ollama.ts.
    • Converts message formats from Anthropic to Ollama.
    • Implements error handling for connection and model not found errors.
  • Testing:
    • Updates ollama.spec.ts and ollama-timeout.spec.ts to mock axios and test new implementation.
    • Verifies timeout configurations and error handling.
    • Ensures all existing tests pass.

This description was created by Ellipsis for 14c33f8. You can customize this summary. It will automatically update as commits are pushed.

- Replace OpenAI client with direct axios calls to Ollama native API
- Use /api/chat endpoint instead of /v1 OpenAI-compatible endpoint
- Update tests to reflect the new native API implementation
- Maintain backward compatibility with existing configurations

Fixes #7070
@roomote roomote bot requested review from mrubens, cte and jr as code owners August 14, 2025 01:47
@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. bug Something isn't working labels Aug 14, 2025
}) as const,
)

let buffer = ""
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused variable 'buffer' declared here isn't used; please remove for clarity.

Suggested change
let buffer = ""

for (const matcherChunk of matcher.update(delta.content)) {
yield matcherChunk

// Convert Anthropic messages to Ollama format
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inline conversion of Anthropic messages to Ollama format (lines 82–117) could be extracted into a utility function for improved readability and maintainability.

This comment was generated because it violated a code review rule: irule_tTqpIuNs8DV0QFGj.

Copy link
Author

@roomote roomote bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewing my own code is like grading my own homework - I already know where I cut corners.

}) as const,
)

let buffer = ""
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this buffer variable intentional? It's declared but never used. Could we remove it to keep the code clean?

const ollamaMessages: OllamaMessage[] = [{ role: "system", content: systemPrompt }]

// Convert messages to Ollama format
for (const message of messages) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice we check for deepseek-r1 models and adjust temperature, but unlike the previous OpenAI implementation, we don't apply R1 format conversion to the messages. Is this intentional? DeepSeek R1 models might need special message formatting for optimal performance.

} else {
throw new Error(`Ollama completion error: ${error.message}`)
}
}
if (error instanceof Error) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error handling seems redundant - axios errors are already handled above. Could we remove this duplicate handler to simplify the error flow?

model: string
created_at: string
message: {
role: string
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using the more specific type "system" | "user" | "assistant" instead of string for better type safety?

apiKey: "ollama",
timeout: getApiRequestTimeout(),
})
this.baseUrl = this.options.ollamaBaseUrl || "http://localhost:11434"
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default Ollama URL appears in multiple places. Would it be cleaner to extract this to a constant like const DEFAULT_OLLAMA_BASE_URL = "http://localhost:11434"?

@hannesrudolph hannesrudolph added the Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. label Aug 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. size:XL This PR changes 500-999 lines, ignoring generated files.
Projects
Status: Triage
Development

Successfully merging this pull request may close these issues.

Wrong routes for Ollama models
2 participants