Conversation

@daniel-lxs daniel-lxs (Collaborator) commented Aug 15, 2025

Description

This PR fixes issue #7070 where Ollama models (like gpt-oss:120b) were incorrectly using OpenAI-compatible routes instead of native Ollama API endpoints.

Problem

When using models like gpt-oss:120b with Ollama, the plugin tried to get completions from the OpenAI-compatible routes (/v1) instead of the native Ollama API endpoints. The issue was reported by @LivioGama, who noted that it "Does not happen on Kilo Code".

Solution

After investigating, I discovered that Kilo-Org/kilocode had already solved this by using the official ollama npm package for native API access. This PR adapts their approach to our codebase.

Changes Made:

  1. Added the official ollama npm package (v0.5.17) as a dependency
  2. Created src/api/providers/native-ollama.ts - New handler using the native Ollama SDK (a minimal sketch follows this list)
  3. Updated src/api/index.ts - Switched to use NativeOllamaHandler for ollama provider
  4. Added comprehensive tests in src/api/providers/__tests__/native-ollama.spec.ts
  5. Maintained backward compatibility - Old OllamaHandler remains available but unused
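
For context, the ollama package talks to Ollama's native /api/chat endpoint directly. Below is a minimal sketch of the streaming path; the class here is illustrative, not the PR's actual NativeOllamaHandler, which extends BaseProvider and handles many more options:

```ts
import { Ollama, type Message } from "ollama"

// Minimal sketch only: the real handler lives in src/api/providers/native-ollama.ts.
class MinimalOllamaHandler {
	private client: Ollama

	constructor(baseUrl = "http://localhost:11434") {
		// Native API client; no /v1 OpenAI compatibility routes involved.
		this.client = new Ollama({ host: baseUrl })
	}

	async *createMessage(model: string, messages: Message[]) {
		const stream = await this.client.chat({ model, messages, stream: true })
		for await (const chunk of stream) {
			if (chunk.message?.content) {
				yield { type: "text" as const, text: chunk.message.content }
			}
			if (chunk.done) {
				// The native API reports token counts on the final chunk.
				yield {
					type: "usage" as const,
					inputTokens: chunk.prompt_eval_count ?? 0,
					outputTokens: chunk.eval_count ?? 0,
				}
			}
		}
	}
}
```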

Key Features:

  • Direct communication with Ollama's native API
  • Proper error handling for Ollama-specific scenarios (service not running, model not found); see the error-mapping sketch after this list
  • Support for streaming responses with token usage tracking
  • DeepSeek R1 reasoning detection support
  • No more OpenAI compatibility layer overhead
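
The Ollama-specific error handling called out above distinguishes a stopped service from a missing model. A hedged sketch of that mapping (the handler's exact messages may differ):

```ts
import { Ollama } from "ollama"

// Sketch: map the two most common Ollama failure modes to actionable errors.
async function chatOrExplain(client: Ollama, model: string, prompt: string) {
	try {
		return await client.chat({
			model,
			messages: [{ role: "user", content: prompt }],
			stream: false,
		})
	} catch (err: any) {
		// Connection refused usually means the Ollama service is not running.
		if (err?.code === "ECONNREFUSED" || err?.cause?.code === "ECONNREFUSED") {
			throw new Error("Ollama service is not running; start it with `ollama serve`.")
		}
		// ollama-js throws a ResponseError carrying status_code for HTTP errors;
		// 404 here means the model has not been pulled.
		if (err?.status_code === 404) {
			throw new Error(`Model "${model}" not found; run \`ollama pull ${model}\`.`)
		}
		throw err
	}
}
```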

Testing

  • ✅ All new unit tests pass
  • ✅ TypeScript compilation succeeds
  • ✅ Linting passes

Credits

This solution was inspired by Kilo-Org/kilocode's implementation.

Related

  • Issue #7070: Wrong routes for Ollama models

Breaking Changes

None - the change is transparent to users and maintains full backward compatibility.


Important

Replaces OpenAI compatibility layer with native Ollama API for Ollama models, introducing NativeOllamaHandler and adding comprehensive tests.

  • Behavior:
    • Replaces OpenAI compatibility layer with native Ollama API for Ollama models in src/api/index.ts.
    • Introduces NativeOllamaHandler in native-ollama.ts for direct API communication.
    • Maintains backward compatibility by keeping old OllamaHandler.
  • Testing:
    • Adds tests for NativeOllamaHandler in native-ollama.spec.ts.
    • Tests cover message streaming, prompt completion, and error handling.
  • Dependencies:
    • Adds ollama npm package (v0.5.17) to package.json.

This description was created by Ellipsis for 5c83d3a.

- Implements native Ollama API using the official ollama npm package
- Fixes issue #7070 where models like gpt-oss:120b failed with OpenAI routes
- Based on the approach successfully used by Kilo-Org/kilocode
- Maintains backward compatibility by keeping old handler available
- Adds comprehensive tests for the new implementation

Credits: Solution inspired by Kilo-Org/kilocode's implementation

Fixes #7070
@daniel-lxs daniel-lxs requested review from mrubens, cte and jr as code owners August 15, 2025 22:41
@daniel-lxs daniel-lxs self-assigned this Aug 15, 2025
@dosubot dosubot bot added the size:L (This PR changes 100-499 lines, ignoring generated files) and bug (Something isn't working) labels Aug 15, 2025
src/api/index.ts Outdated
@@ -13,7 +13,7 @@ import {
VertexHandler,
AnthropicVertexHandler,
OpenAiHandler,
OllamaHandler,
// OllamaHandler, // Replaced with NativeOllamaHandler

Remove the commented-out 'OllamaHandler' import if it’s no longer needed.

Suggested change
// OllamaHandler, // Replaced with NativeOllamaHandler

This comment was generated because it violated a code review rule: irule_Vw7dJWzvznOJagxS.

import { BaseProvider } from "./base-provider"
import type { ApiHandlerOptions } from "../../shared/api"
import { getOllamaModels } from "./fetchers/ollama"
import { getApiRequestTimeout } from "./utils/timeout-config"

Remove the unused import 'getApiRequestTimeout' to clean up the code.

Suggested change
import { getApiRequestTimeout } from "./utils/timeout-config"

@roomote roomote bot left a comment

Thank you for your contribution! I've reviewed the changes and found that the implementation is solid with good error handling and test coverage. I have some suggestions for improvement that could make the code even cleaner.

import { BaseProvider } from "./base-provider"
import type { ApiHandlerOptions } from "../../shared/api"
import { getOllamaModels } from "./fetchers/ollama"
import { getApiRequestTimeout } from "./utils/timeout-config"

Is this import needed? I don't see getApiRequestTimeout being used anywhere in this file. Could we remove it to keep the imports clean?

return ollamaMessages
}

const OLLAMA_TIMEOUT_MS = 3_600_000

This constant OLLAMA_TIMEOUT_MS is defined but never used. Is this intentional for future use, or should we either use it (perhaps pass it to the Ollama client configuration) or remove it?
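
If it is meant to be used, one option is to wire it into the client via the custom fetch that the ollama constructor accepts. A sketch, assuming Node 18+ for AbortSignal.timeout:

```ts
import { Ollama } from "ollama"

const OLLAMA_TIMEOUT_MS = 3_600_000

// Pass a fetch wrapper so every request to the native API carries the timeout.
const client = new Ollama({
	host: "http://localhost:11434",
	fetch: (input: RequestInfo | URL, init?: RequestInit) =>
		fetch(input, { ...init, signal: AbortSignal.timeout(OLLAMA_TIMEOUT_MS) }),
})
```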

}
}

async fetchModel() {

I notice the fetchModel() method fetches models but doesn't seem to be called anywhere in the codebase. Is this intentional for future use, or should it be integrated into the initialization flow to populate the models cache?
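
For what it's worth, the follow-up commit below does integrate it by calling fetchModel() from createMessage and completePrompt. A standalone sketch of that lazy-cache pattern using the ollama client's native list() call (the real handler uses the getOllamaModels fetcher instead; names here are illustrative):

```ts
import { Ollama } from "ollama"

// Sketch: populate a models cache lazily, so callers like createMessage can
// `await this.fetchModel()` before issuing a request.
class ModelAwareHandler {
	private client = new Ollama({ host: "http://localhost:11434" })
	private models = new Set<string>()

	constructor(private modelId: string) {}

	async fetchModel() {
		// list() hits Ollama's native /api/tags endpoint.
		const { models } = await this.client.list()
		this.models = new Set(models.map((m) => m.name))
		return { id: this.modelId, known: this.models.has(this.modelId) }
	}
}
```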

if (part.type === "image") {
// Handle base64 images only (Anthropic SDK uses base64)
if ("source" in part && part.source.type === "base64") {
return `data:${part.source.media_type};base64,${part.source.data}`

The image handling here returns a data URL string directly in the content field. Could we verify this is the correct format expected by Ollama's API for image inputs? The comment mentions base64 handling, but I want to make sure this aligns with Ollama's expectations.

expect(model.info).toBeDefined()
})
})
})

Great test coverage! Consider adding a few more edge case tests:

  • Image message conversion (testing the base64 image handling)
  • Tool message handling
  • Multiple concurrent requests
  • Network timeout scenarios

These would help ensure robustness in production scenarios.
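
As one concrete example, a hedged vitest sketch of a network-failure case; the handler's constructor options and the exact error message are assumptions about its API:

```ts
import { describe, it, expect, vi } from "vitest"
import { Ollama } from "ollama"
import { NativeOllamaHandler } from "../native-ollama"

vi.mock("ollama")

describe("NativeOllamaHandler network failures", () => {
	it("maps ECONNREFUSED to a friendly error", async () => {
		// Make the mocked client reject the way a down service would.
		vi.mocked(Ollama).mockImplementation(
			() =>
				({
					chat: vi
						.fn()
						.mockRejectedValue(Object.assign(new Error("fetch failed"), { code: "ECONNREFUSED" })),
				}) as unknown as Ollama,
		)
		const handler = new NativeOllamaHandler({ ollamaModelId: "gpt-oss:120b" })
		await expect(handler.completePrompt("hi")).rejects.toThrow(/Ollama/i)
	})
})
```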

- Remove unused imports (getApiRequestTimeout) and constants (OLLAMA_TIMEOUT_MS)
- Call fetchModel() in createMessage and completePrompt to load model info
- Fix image handling to use raw base64 strings instead of data URLs
- Remove commented-out OllamaHandler import
- Properly separate text and images in message conversion
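
For reference, Ollama's native chat API takes images as raw base64 strings in a per-message images array, with no data: URL prefix, which is what the image fix above switches to. A sketch of the separated conversion (the part type and helper name are illustrative):

```ts
import type { Message } from "ollama"

// Illustrative Anthropic-style content part shape.
type ContentPart =
	| { type: "text"; text: string }
	| { type: "image"; source: { type: "base64"; media_type: string; data: string } }

function toOllamaUserMessage(parts: ContentPart[]): Message {
	const texts: string[] = []
	const images: string[] = []
	for (const part of parts) {
		if (part.type === "text") {
			texts.push(part.text)
		} else if (part.source.type === "base64") {
			// Raw base64 only: no "data:<mime>;base64," prefix.
			images.push(part.source.data)
		}
	}
	return { role: "user", content: texts.join("\n"), images }
}
```
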
@daniel-lxs daniel-lxs moved this from Triage to PR [Needs Review] in Roo Code Roadmap Aug 16, 2025
@dosubot dosubot bot added the lgtm (This PR has been approved by a maintainer) label Aug 16, 2025
@mrubens mrubens merged commit f3864ff into main Aug 16, 2025
10 checks passed
@mrubens mrubens deleted the fix/ollama-native-api-7070 branch August 16, 2025 03:32
@github-project-automation github-project-automation bot moved this from PR [Needs Review] to Done in Roo Code Roadmap Aug 16, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Aug 16, 2025
fxcl added a commit to tameslabs/Roo-Cline that referenced this pull request Aug 16, 2025
* main: (70 commits)
  fix: use native Ollama API instead of OpenAI compatibility layer (RooCodeInc#7137)
  feat: add support for OpenAI gpt-5-chat-latest model (RooCodeInc#7058)
  Make enhance with task history default to true (RooCodeInc#7140)
  Bump cloud version to 0.16.0 (RooCodeInc#7135)
  Release: v1.51.0 (RooCodeInc#7130)
  Add an API for resuming tasks by ID (RooCodeInc#7122)
  Add support for task page event population (RooCodeInc#7117)
  fix: add type check before calling .match() on diffItem.content (RooCodeInc#6905) (RooCodeInc#6906)
  Fix: Enable save button for provider dropdown and checkbox changes (RooCodeInc#7113)
  fix: Use cline.cwd as primary source for workspace path in codebaseSearchTool (RooCodeInc#6902)
  Hotfix multiple folder workspace checkpoint (RooCodeInc#6903)
  fix: prevent XML entity decoding in diff tools (RooCodeInc#7107) (RooCodeInc#7108)
  Refactor task execution system: improve call stack management (RooCodeInc#7035)
  Changeset version bump (RooCodeInc#7104)
  feat(web): fill missing SEO-related values (RooCodeInc#7096)
  Update contributors list (RooCodeInc#6883)
  Release v3.25.15 (RooCodeInc#7103)
  fix: add /evals page to sitemap generation (RooCodeInc#7102)
  feat: implement sitemap generation in TypeScript and remove XML file (RooCodeInc#6206)
  fix: reset condensing state when switching tasks (RooCodeInc#6922)
  ...
@hannesrudolph hannesrudolph added the Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.) label Aug 16, 2025