diff --git a/OLLAMA_SETUP.md b/OLLAMA_SETUP.md
new file mode 100644
index 00000000000..6fc34c0cd6e
--- /dev/null
+++ b/OLLAMA_SETUP.md
@@ -0,0 +1,232 @@
+# Ollama Integration with Remix IDE
+
+This guide explains how to set up and use Ollama with Remix IDE for local, AI-powered code completion and assistance. Note the restrictions listed below.
+
+## Table of Contents
+- [What is Ollama?](#what-is-ollama)
+- [Restrictions](#restrictions)
+- [Installation](#installation)
+- [CORS Configuration](#cors-configuration)
+- [Model Download and Management](#model-download-and-management)
+- [Recommended Models](#recommended-models)
+- [Using Ollama in Remix IDE](#using-ollama-in-remix-ide)
+- [Troubleshooting](#troubleshooting)
+
+## What is Ollama?
+
+Ollama is a local AI model runner that lets you run large language models on your own machine. With Remix IDE's Ollama integration, you get:
+
+- **Privacy**: All processing happens locally on your machine
+- **No fees or rate limits**: No API usage costs or rate throttling
+- **Offline capability**: Works without an internet connection
+- **Code-optimized models**: Specialized models for coding tasks
+- **Fill-in-Middle (FIM) support**: Advanced code completion capabilities
+
+## Restrictions
+
+The current integration does not support agentic workflows. We strongly recommend running Ollama with hardware acceleration (e.g. a GPU) for the best experience. The following features are not available when using Ollama; please fall back to a remote provider to use them:
+- **Contract generation**
+- **Workspace Edits**
+
+## Installation
+
+### Step 1: Install Ollama
+
+**macOS:**
+```bash
+curl -fsSL https://ollama.ai/install.sh | sh
+```
+
+**Windows:**
+Download the installer from [ollama.ai](https://ollama.ai/download/windows)
+
+**Linux:**
+```bash
+curl -fsSL https://ollama.ai/install.sh | sh
+```
+
+### Step 2: Start the Ollama Service
+
+After installation, start the Ollama service:
+
+```bash
+ollama serve
+```
+
+The service runs on `http://localhost:11434` by default.
+
+## CORS Configuration
+
+To allow Remix IDE to communicate with Ollama, you need to configure CORS settings.
+See [Ollama CORS Settings](https://objectgraph.com/blog/ollama-cors/).
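+
+As a minimal sketch, Ollama reads the allowed origins from the `OLLAMA_ORIGINS` environment variable. The example below assumes you start Ollama manually with `ollama serve`; if Ollama runs as a system service, set the variable in that service's environment instead (see the linked guide):
+
+```bash
+# Allow the hosted Remix IDE origin to call the local Ollama API,
+# then start the service with that setting applied.
+export OLLAMA_ORIGINS="https://remix.ethereum.org"
+ollama serve
+```
+
+After changing `OLLAMA_ORIGINS`, restart Ollama so the new setting takes effect.
+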
+## Model Download and Management
+
+### Downloading Models
+
+Use the `ollama pull` command to download models:
+
+```bash
+# Download a specific model
+ollama pull qwen2.5-coder:14b
+
+# Download the latest version
+ollama pull codestral:latest
+```
+
+### Managing Models
+
+```bash
+# List installed models
+ollama list
+
+# Remove a model
+ollama rm model-name
+
+# Show model information (add --template to print the prompt template)
+ollama show codestral:latest
+
+# Update a model
+ollama pull codestral:latest
+```
+
+### Model Storage Locations
+
+Models are stored locally in:
+- **macOS:** `~/.ollama/models`
+- **Linux:** `~/.ollama/models`
+- **Windows:** `%USERPROFILE%\.ollama\models`
+
+## Recommended Models
+
+### For Code Completion (Fill-in-Middle Support)
+
+These models support advanced code completion with context awareness, as well as code explanation, debugging help, and general questions:
+
+#### **Codestral (Excellent for Code)**
+```bash
+ollama pull codestral:latest   # ~22GB, state-of-the-art code model
+```
+
+#### **Qwen Coder**
+```bash
+ollama pull qwen2.5-coder:14b
+ollama pull qwen2.5-coder:3b
+```
+
+#### **Code Gemma**
+```bash
+ollama pull codegemma:7b       # ~5GB, Google's code model
+ollama pull codegemma:2b       # ~2GB, lightweight option
+```
+
+### Model Size and Performance Guide
+
+| Model Size | RAM Required | Speed  | Quality | Use Case |
+|------------|--------------|--------|---------|----------|
+| 2B-3B      | 4GB+         | Fast   | Good    | Quick completions, low-end hardware |
+| 7B-8B      | 8GB+         | Medium | High    | **Recommended for most users** |
+| 13B-15B    | 16GB+        | Slower | Higher  | Development workstations |
+| 30B+       | 32GB+        | Slow   | Highest | High-end workstations only |
+
+## Using Ollama in Remix IDE
+
+### Step 1: Verify Ollama is Running
+
+Ensure Ollama is running and accessible:
+```bash
+curl http://localhost:11434/api/tags
+```
+
+### Step 2: Select Ollama in Remix IDE
+
+1. Open Remix IDE
+2. Navigate to the AI Assistant panel
+3. Click the provider selector (it shows the current provider, e.g. "MistralAI")
+4. Select "Ollama" from the dropdown
+5. Wait for the connection to establish
+
+### Step 3: Choose Your Model
+
+1. After selecting Ollama, a model dropdown will appear
+2. Select your preferred model from the list
+3. The selection will be saved for future sessions
+
+### Step 4: Start Using AI Features
+
+- **Code Completion**: Type code and get intelligent completions
+- **Code Explanation**: Ask questions about your code
+- **Error Help**: Get assistance with debugging
+- **Code Generation**: Generate code from natural language descriptions
+
+## Troubleshooting
+
+### Common Issues
+
+#### **"Ollama is not available" Error**
+
+1. Check if Ollama is running:
+   ```bash
+   curl http://localhost:11434/api/tags
+   ```
+
+2. Verify the CORS configuration:
+   ```bash
+   curl -H "Origin: https://remix.ethereum.org" http://localhost:11434/api/tags
+   ```
+
+3. Check if models are installed:
+   ```bash
+   ollama list
+   ```
+
+#### **No Models Available**
+
+Download at least one model:
+```bash
+ollama pull codestral:latest
+```
+
+#### **Connection Refused**
+
+1. Start the Ollama service:
+   ```bash
+   ollama serve
+   ```
+
+2. Check that it is running on the correct port:
+   ```bash
+   netstat -an | grep 11434
+   ```
+
+#### **Model Loading Slow**
+
+- Close other applications to free up RAM
+- Use smaller models (7B instead of 13B+)
+- Ensure sufficient disk space
+
+#### **CORS Errors in Browser Console**
+
+1. Verify `OLLAMA_ORIGINS` is set correctly
+2. Restart Ollama after changing CORS settings
+3. 
Clear browser cache and reload Remix IDE + +### Performance Optimization + +#### **Hardware Recommendations** + +- **Minimum**: 8GB RAM, integrated GPU +- **Recommended**: 16GB RAM, dedicated GPU with 8GB+ VRAM +- **Optimal**: 32GB RAM, RTX 4090 or similar + + +## Getting Help + +- **Ollama Documentation**: [https://ollama.ai/docs](https://ollama.ai/docs) +- **Remix IDE Documentation**: [https://remix-ide.readthedocs.io](https://remix-ide.readthedocs.io) +- **Community Support**: Remix IDE Discord/GitHub Issues +- **Model Hub**: [https://ollama.ai/library](https://ollama.ai/library) + +--- + +**Note**: This integration provides local AI capabilities for enhanced privacy and performance. Model quality and speed depend on your hardware specifications and chosen models. \ No newline at end of file diff --git a/apps/remix-ide/src/app/plugins/remixAIPlugin.tsx b/apps/remix-ide/src/app/plugins/remixAIPlugin.tsx index 56ddc22d894..e8c0969ec0c 100644 --- a/apps/remix-ide/src/app/plugins/remixAIPlugin.tsx +++ b/apps/remix-ide/src/app/plugins/remixAIPlugin.tsx @@ -1,6 +1,6 @@ import * as packageJson from '../../../../../package.json' import { Plugin } from '@remixproject/engine'; -import { IModel, RemoteInferencer, IRemoteModel, IParams, GenerationParams, AssistantParams, CodeExplainAgent, SecurityAgent, CompletionParams } from '@remix/remix-ai-core'; +import { IModel, RemoteInferencer, IRemoteModel, IParams, GenerationParams, AssistantParams, CodeExplainAgent, SecurityAgent, CompletionParams, OllamaInferencer, isOllamaAvailable, getBestAvailableModel } from '@remix/remix-ai-core'; import { CodeCompletionAgent, ContractAgent, workspaceAgent, IContextType } from '@remix/remix-ai-core'; import axios from 'axios'; import { endpointUrls } from "@remix-endpoints-helper" @@ -18,7 +18,7 @@ const profile = { "code_insertion", "error_explaining", "vulnerability_check", 'generate', "initialize", 'chatPipe', 'ProcessChatRequestBuffer', 'isChatRequestPending', 'resetChatRequestBuffer', 'setAssistantThrId', - 'getAssistantThrId', 'getAssistantProvider', 'setAssistantProvider'], + 'getAssistantThrId', 'getAssistantProvider', 'setAssistantProvider', 'setModel'], events: [], icon: 'assets/img/remix-logo-blue.png', description: 'RemixAI provides AI services to Remix IDE.', @@ -361,11 +361,79 @@ export class RemixAIPlugin extends Plugin { AssistantParams.threadId = '' } this.assistantProvider = provider + + // Switch back to remote inferencer for cloud providers -- important + if (this.remoteInferencer && this.remoteInferencer instanceof OllamaInferencer) { + this.remoteInferencer = new RemoteInferencer() + this.remoteInferencer.event.on('onInference', () => { + this.isInferencing = true + }) + this.remoteInferencer.event.on('onInferenceDone', () => { + this.isInferencing = false + }) + } + } else if (provider === 'ollama') { + const isAvailable = await isOllamaAvailable(); + if (!isAvailable) { + console.error('Ollama is not available. Please ensure Ollama is running.') + return + } + + const bestModel = await getBestAvailableModel(); + if (!bestModel) { + console.error('No Ollama models available. 
Please install a model first.') + return + } + + // Switch to Ollama inferencer + this.remoteInferencer = new OllamaInferencer(bestModel); + this.remoteInferencer.event.on('onInference', () => { + this.isInferencing = true + }) + this.remoteInferencer.event.on('onInferenceDone', () => { + this.isInferencing = false + }) + + if (this.assistantProvider !== provider){ + // clear the threadIds + this.assistantThreadId = '' + GenerationParams.threadId = '' + CompletionParams.threadId = '' + AssistantParams.threadId = '' + } + this.assistantProvider = provider + console.log(`Ollama provider set with model: ${bestModel}`) } else { console.error(`Unknown assistant provider: ${provider}`) } } + async setModel(modelName: string) { + if (this.assistantProvider === 'ollama' && this.remoteInferencer instanceof OllamaInferencer) { + try { + const isAvailable = await isOllamaAvailable(); + if (!isAvailable) { + console.error('Ollama is not available. Please ensure Ollama is running.') + return + } + + this.remoteInferencer = new OllamaInferencer(modelName); + this.remoteInferencer.event.on('onInference', () => { + this.isInferencing = true + }) + this.remoteInferencer.event.on('onInferenceDone', () => { + this.isInferencing = false + }) + + console.log(`Ollama model changed to: ${modelName}`) + } catch (error) { + console.error('Failed to set Ollama model:', error) + } + } else { + console.warn(`setModel is only supported for Ollama provider. Current provider: ${this.assistantProvider}`) + } + } + isChatRequestPending(){ return this.chatRequestBuffer != null } diff --git a/libs/remix-ai-core/src/agents/contractAgent.ts b/libs/remix-ai-core/src/agents/contractAgent.ts index 4a32b84086e..914b162a1d4 100644 --- a/libs/remix-ai-core/src/agents/contractAgent.ts +++ b/libs/remix-ai-core/src/agents/contractAgent.ts @@ -36,6 +36,7 @@ export class ContractAgent { async writeContracts(payload, userPrompt, statusCallback?: (status: string) => Promise) { await statusCallback?.('Getting current workspace info...') const currentWorkspace = await this.plugin.call('filePanel', 'getCurrentWorkspace') + console.log('AI generated result', payload) const writeAIResults = async (parsedResults) => { if (this.plugin.isOnDesktop) { @@ -78,14 +79,14 @@ export class ContractAgent { } return "Max attempts reached! Please try again with a different prompt." } - return "No payload, try again while considering changing the assistant provider to one of these choices ``" + return "No payload, try again while considering changing the assistant provider with the command `/setAssistant `" } await statusCallback?.('Processing generated files...') this.contracts = {} const parsedFiles = payload this.oldPayload = payload - this.generationThreadID = parsedFiles['threadID'] + this.generationThreadID = "" //parsedFiles['threadID'] this.workspaceName = parsedFiles['projectName'] this.nAttempts += 1 @@ -133,6 +134,7 @@ export class ContractAgent { await statusCallback?.('Finalizing workspace creation...') return result.compilationSucceeded ? 
await writeAIResults(parsedFiles) : await writeAIResults(parsedFiles) + "\n\n" + COMPILATION_WARNING_MESSAGE } catch (error) { + console.log('error - ', error) await statusCallback?.('Error occurred, cleaning up...') this.deleteWorkspace(this.workspaceName ) this.nAttempts = 0 diff --git a/libs/remix-ai-core/src/helpers/chatCommandParser.ts b/libs/remix-ai-core/src/helpers/chatCommandParser.ts index 4585e21878a..f72f0ae8636 100644 --- a/libs/remix-ai-core/src/helpers/chatCommandParser.ts +++ b/libs/remix-ai-core/src/helpers/chatCommandParser.ts @@ -1,4 +1,4 @@ -import { isOllamaAvailable, listModels } from "../inferencers/local/ollama"; +import { isOllamaAvailable, listModels, getBestAvailableModel, validateModel, getOllamaHost } from "../inferencers/local/ollama"; import { OllamaInferencer } from "../inferencers/local/ollamaInferencer"; import { GenerationParams } from "../types/models"; @@ -18,6 +18,7 @@ export class ChatCommandParser { this.register("@workspace", this.handleWorkspace); this.register("@setAssistant", this.handleAssistant); this.register("@ollama", this.handleOllama); + this.register("/ollama", this.handleOllama); this.register("/generate", this.handleGenerate); this.register("/g", this.handleGenerate); this.register("/workspace", this.handleWorkspace); @@ -73,13 +74,16 @@ export class ChatCommandParser { } } - private async handleAssistant(provider: string, ref, statusCallback?: (status: string) => Promise) { - if (provider === 'openai' || provider === 'mistralai' || provider === 'anthropic') { - await statusCallback?.('Setting AI provider...') - await ref.props.call('remixAI', 'setAssistantProvider', provider); - return "AI Provider set to `" + provider + "` successfully! " + private async handleAssistant(provider: string, ref) { + if (provider === 'openai' || provider === 'mistralai' || provider === 'anthropic' || provider === 'ollama') { + try { + await ref.props.call('remixAI', 'setAssistantProvider', provider); + return "AI Provider set to `" + provider + "` successfully! " + } catch (error) { + return `Failed to set AI Provider to \`${provider}\`: ${error.message || error}` + } } else { - return "Invalid AI Provider. Please use `openai`, `mistralai`, or `anthropic`." + return "Invalid AI Provider. Please use `openai`, `mistralai`, `anthropic`, or `ollama`." } } @@ -88,41 +92,112 @@ export class ChatCommandParser { if (prompt === "start") { const available = await isOllamaAvailable(); if (!available) { - return '❌ Ollama is not available. Consider enabling the (Ollama CORS)[https://objectgraph.com/blog/ollama-cors/]' + return 'Ollama is not available on any of the default ports (11434, 11435, 11436). Please ensure Ollama is running and CORS is enabled: https://objectgraph.com/blog/ollama-cors/'; } + + const host = getOllamaHost(); const models = await listModels(); - const res = "Available models: " + models.map((model: any) => `\`${model}\``).join("\n"); - return res + "\n\nOllama is now set up. You can use the command `/ollama select ` to start a conversation with a specific model. Make sure the model is being run on your local machine. 
See ollama run for more details."; + const bestModel = await getBestAvailableModel(); + + let response = `Ollama discovered on ${host}\n\n`; + response += `Available models (${models.length}):\n`; + response += models.map((model: any) => `• \`${model}\``).join("\n"); + + if (bestModel) { + response += `\n\nRecommended model: \`${bestModel}\``; + } + + response += "\n\nCommands:\n"; + response += "• `/ollama select ` - Select a specific model\n"; + response += "• `/ollama auto` - Auto-select best available model\n"; + response += "• `/ollama status` - Check current status\n"; + response += "• `/ollama stop` - Stop Ollama integration"; + + return response; } else if (prompt.trimStart().startsWith("select")) { const model = prompt.split(" ")[1]; if (!model) { - return "Please provide a model name to select."; + return "Please provide a model name to select.\nExample: `/ollama select llama2:7b`"; + } + + const available = await isOllamaAvailable(); + if (!available) { + return 'Ollama is not available. Please ensure it is running and try `/ollama start` first.'; + } + + const isValid = await validateModel(model); + if (!isValid) { + const models = await listModels(); + return `Model \`${model}\` is not available.\n\nAvailable models:\n${models.map(m => `• \`${m}\``).join("\n")}`; + } + + // instantiate ollama with selected model + ref.props.remoteInferencer = new OllamaInferencer(model); + ref.props.remoteInferencer.event.on('onInference', () => { + ref.props.isInferencing = true; + }); + ref.props.remoteInferencer.event.on('onInferenceDone', () => { + ref.props.isInferencing = false; + }); + + return `Model set to \`${model}\`. You can now start chatting with it.`; + } else if (prompt === "auto") { + const available = await isOllamaAvailable(); + if (!available) { + return 'Ollama is not available. Please ensure it is running and try `/ollama start` first.'; } + + const bestModel = await getBestAvailableModel(); + if (!bestModel) { + return 'No models available. Please install a model first using `ollama pull `.'; + } + + ref.props.remoteInferencer = new OllamaInferencer(bestModel); + ref.props.remoteInferencer.event.on('onInference', () => { + ref.props.isInferencing = true; + }); + ref.props.remoteInferencer.event.on('onInferenceDone', () => { + ref.props.isInferencing = false; + }); + + return `Auto-selected model: \`${bestModel}\`. You can now start chatting with it.`; + } else if (prompt === "status") { const available = await isOllamaAvailable(); if (!available) { - return '❌ Ollama is not available. Consider enabling the (Ollama CORS)[https://objectgraph.com/blog/ollama-cors/]' + return 'Ollama is not available on any of the default ports.'; } + + const host = getOllamaHost(); const models = await listModels(); - if (models.includes(model)) { - // instantiate ollama in remixai - ref.props.remoteInferencer = new OllamaInferencer() - ref.props.remoteInferencer.event.on('onInference', () => { - ref.props.isInferencing = true - }) - ref.props.remoteInferencer.event.on('onInferenceDone', () => { - ref.props.isInferencing = false - }) - return `Model set to \`${model}\`. You can now start chatting with it.`; + const currentModel = ref.props.remoteInferencer?.model_name || 'None selected'; + + let response = `Ollama Status:\n`; + response += `• Host: ${host}\n`; + response += `• Available models: ${models.length}\n`; + response += `• Current model: \`${currentModel}\`\n`; + response += `• Integration: ${ref.props.remoteInferencer ? 
'Active' : 'Inactive'}`; + + return response; + } else if (prompt === "stop") { + if (ref.props.remoteInferencer) { + ref.props.remoteInferencer = null; + ref.props.initialize() + + return "Ollama integration stopped. Switched back to remote inference."; } else { - return `Model \`${model}\` is not available. Please check the list of available models.`; + return "ℹOllama integration is not currently active."; } - } else if (prompt === "stop") { - return "Ollama generation stopped."; } else { - return "Invalid command. Use `/ollama start` to initialize Ollama, `/ollama select ` to select a model, or `/ollama stop` to stop the generation."; + return `Invalid command. Available commands: +• \`/ollama start\` - Initialize and discover Ollama +• \`/ollama select \` - Select a specific model +• \`/ollama auto\` - Auto-select best available model +• \`/ollama status\` - Check current status +• \`/ollama stop\` - Stop Ollama integration`; } } catch (error) { - return "Ollama generation failed. Please try again."; + console.error("Ollama command error:", error); + return `Ollama command failed: ${error.message || 'Unknown error'}. Please try again.`; } } } diff --git a/libs/remix-ai-core/src/helpers/streamHandler.ts b/libs/remix-ai-core/src/helpers/streamHandler.ts index c3ff2d97948..76d68a3e933 100644 --- a/libs/remix-ai-core/src/helpers/streamHandler.ts +++ b/libs/remix-ai-core/src/helpers/streamHandler.ts @@ -206,3 +206,59 @@ export const HandleAnthropicResponse = async (streamResponse, cb: (streamText: s } } } + +export const HandleOllamaResponse = async (streamResponse: any, cb: (streamText: string) => void, done_cb?: (result: string) => void) => { + const reader = streamResponse.body?.getReader(); + const decoder = new TextDecoder("utf-8"); + let resultText = ""; + + if (!reader) { // normal response, not a stream + cb(streamResponse.result || streamResponse.response || ""); + done_cb?.(streamResponse.result || streamResponse.response || ""); + return; + } + + try { + // eslint-disable-next-line no-constant-condition + while (true) { + const { done, value } = await reader.read(); + if (done) break; + + const chunk = decoder.decode(value, { stream: true }); + const lines = chunk.split('\n').filter(line => line.trim()); + + for (const line of lines) { + try { + const parsed = JSON.parse(line); + let content = ""; + + if (parsed.response) { + // For /api/generate endpoint + content = parsed.response; + } else if (parsed.message?.content) { + // For /api/chat endpoint + content = parsed.message.content; + } + + if (content) { + cb(content); + resultText += content; + } + + if (parsed.done) { + done_cb?.(resultText); + return; + } + } catch (parseError) { + console.warn("⚠️ Ollama: Skipping invalid JSON line:", line); + continue; + } + } + } + + done_cb?.(resultText); + } catch (error) { + console.error("⚠️ Ollama Stream error:", error); + done_cb?.(resultText); + } +} diff --git a/libs/remix-ai-core/src/helpers/textSanitizer.ts b/libs/remix-ai-core/src/helpers/textSanitizer.ts new file mode 100644 index 00000000000..d41552dc26d --- /dev/null +++ b/libs/remix-ai-core/src/helpers/textSanitizer.ts @@ -0,0 +1,94 @@ +export function sanitizeCompletionText(text: string): string { + if (!text || typeof text !== 'string') { + return ''; + } + + let sanitized = text; + + // Extract content from markdown code blocks (```language ... 
```) + const codeBlockRegex = /```[\w]*\n?([\s\S]*?)```/g; + const codeBlocks: string[] = []; + let match: RegExpExecArray | null; + + while ((match = codeBlockRegex.exec(text)) !== null) { + codeBlocks.push(match[1].trim()); + } + + // If code blocks are found, return only the code content + if (codeBlocks.length > 0) { + return codeBlocks.join('\n\n'); + } + + // If no code blocks found, proceed with general sanitization + // Remove any remaining markdown code block markers + sanitized = sanitized.replace(/```[\w]*\n?/g, ''); + + // Remove inline code markers (`code`) + sanitized = sanitized.replace(/`([^`]+)`/g, '$1'); + + // Remove markdown headers (# ## ### etc.) + sanitized = sanitized.replace(/^#{1,6}\s+/gm, ''); + + // Remove markdown bold/italic (**text** or *text* or __text__ or _text_) + // but preserve math expressions like 10**decimalsValue + sanitized = sanitized.replace(/(\*\*|__)([^*_]*?)\1/g, '$2'); + sanitized = sanitized.replace(/(?]*>/g, ''); + + // Remove common explanation phrases that aren't code + const explanationPatterns = [ + /^Here's.*?:\s*/i, + /^This.*?:\s*/i, + /^The.*?:\s*/i, + /^To.*?:\s*/i, + /^You can.*?:\s*/i, + /^I'll.*?:\s*/i, + /^Let me.*?:\s*/i, + /^First.*?:\s*/i, + /^Now.*?:\s*/i, + /^Next.*?:\s*/i, + /^Finally.*?:\s*/i, + /^Note:.*$/gmi, + /^Explanation:.*$/gmi, + /^Example:.*$/gmi + ]; + + explanationPatterns.forEach(pattern => { + sanitized = sanitized.replace(pattern, ''); + }); + + // Only filter out obvious explanatory lines, be more permissive for code + const lines = sanitized.split('\n'); + const filteredLines = lines.filter(line => { + const trimmedLine = line.trim(); + + // Keep empty lines for code formatting + if (!trimmedLine) return true; + + // Skip lines that are clearly explanatory text (be more conservative) + const obviousExplanatoryPatterns = [ + /^(Here's|Here is|This is|The following|You can|I'll|Let me)\s/i, + /^(Explanation|Example|Note):\s/i, + /^(To complete|To fix|To add|To implement)/i, + /\s+explanation\s*$/i + ]; + + const isObviousExplanation = obviousExplanatoryPatterns.some(pattern => pattern.test(trimmedLine)); + + // Keep all lines except obvious explanations + return !isObviousExplanation; + }); + + sanitized = filteredLines.join('\n'); + + // Clean up extra whitespace while preserving code indentation + sanitized = sanitized.replace(/\n\s*\n\s*\n/g, '\n\n'); + sanitized = sanitized.trim(); + + return sanitized; +} \ No newline at end of file diff --git a/libs/remix-ai-core/src/index.ts b/libs/remix-ai-core/src/index.ts index 84be5a9019c..1a8a9693bdd 100644 --- a/libs/remix-ai-core/src/index.ts +++ b/libs/remix-ai-core/src/index.ts @@ -6,13 +6,18 @@ import { ModelType } from './types/constants' import { DefaultModels, InsertionParams, CompletionParams, GenerationParams, AssistantParams } from './types/models' import { buildChatPrompt } from './prompts/promptBuilder' import { RemoteInferencer } from './inferencers/remote/remoteInference' +import { OllamaInferencer } from './inferencers/local/ollamaInferencer' +import { isOllamaAvailable, getBestAvailableModel, listModels, discoverOllamaHost } from './inferencers/local/ollama' +import { FIMModelManager, FIMModelConfig, FIM_MODEL_CONFIGS } from './inferencers/local/fimModelConfig' import { ChatHistory } from './prompts/chat' import { downloadLatestReleaseExecutable } from './helpers/inferenceServerReleases' import { ChatCommandParser } from './helpers/chatCommandParser' export { IModel, IModelResponse, ChatCommandParser, ModelType, DefaultModels, ICompletions, 
IParams, IRemoteModel, buildChatPrompt, - RemoteInferencer, InsertionParams, CompletionParams, GenerationParams, AssistantParams, + RemoteInferencer, OllamaInferencer, isOllamaAvailable, getBestAvailableModel, listModels, discoverOllamaHost, + FIMModelManager, FIMModelConfig, FIM_MODEL_CONFIGS, + InsertionParams, CompletionParams, GenerationParams, AssistantParams, ChatEntry, AIRequestType, ChatHistory, downloadLatestReleaseExecutable } diff --git a/libs/remix-ai-core/src/inferencers/local/fimModelConfig.ts b/libs/remix-ai-core/src/inferencers/local/fimModelConfig.ts new file mode 100644 index 00000000000..8613120d601 --- /dev/null +++ b/libs/remix-ai-core/src/inferencers/local/fimModelConfig.ts @@ -0,0 +1,106 @@ +export interface FIMTokens { + prefix: string; + suffix: string; + middle: string; +} + +export interface FIMModelConfig { + name: string; + patterns: string[]; + supportsNativeFIM: boolean; // Uses direct prompt/suffix parameters + fimTokens?: FIMTokens; // For token-based FIM models + description?: string; +} + +// Comprehensive list of FIM-supported models +export const FIM_MODEL_CONFIGS: FIMModelConfig[] = [ + // Models with native FIM support (use prompt/suffix directly) + { + name: "Codestral", + patterns: ["codestral"], + supportsNativeFIM: true, + description: "Mistral's code model with native FIM support" + }, + { + name: "starcoder", + patterns: ["starcoder"], + supportsNativeFIM: true, + description: "StarCoder models" + }, + + // Token-based FIM models + { + name: "DeepSeek Coder", + patterns: ["deepseek-coder", "deepseek"], + supportsNativeFIM: false, + fimTokens: { + prefix: "<|fim▁begin|>", + suffix: "<|fim▁hole|>", + middle: "<|fim▁end|>" + }, + description: "DeepSeek's code model with FIM support" + } +]; + +export class FIMModelManager { + private static instance: FIMModelManager; + private modelConfigs: FIMModelConfig[]; + private userSelectedModels: Set = new Set(); + + private constructor() { + this.modelConfigs = [...FIM_MODEL_CONFIGS]; + } + + public static getInstance(): FIMModelManager { + if (!FIMModelManager.instance) { + FIMModelManager.instance = new FIMModelManager(); + } + return FIMModelManager.instance; + } + + public supportsFIM(modelName: string): boolean { + const config = this.findModelConfig(modelName); + if (!config) return false; + + // Check if user has explicitly selected this model for FIM + return this.userSelectedModels.has(config.name) || this.isAutoDetected(modelName); + } + + public usesNativeFIM(modelName: string): boolean { + const config = this.findModelConfig(modelName); + return config?.supportsNativeFIM || false; + } + + public getFIMTokens(modelName: string): FIMTokens | null { + const config = this.findModelConfig(modelName); + return config?.fimTokens || null; + } + + public buildFIMPrompt(prefix: string, suffix: string, modelName: string): string { + const tokens = this.getFIMTokens(modelName); + if (!tokens) { + throw new Error(`Model ${modelName} does not support token-based FIM`); + } + + return `${tokens.prefix}${prefix}${tokens.suffix}${suffix}${tokens.middle}`; + } + + private isAutoDetected(modelName: string): boolean { + const lowerModelName = modelName.toLowerCase(); + const autoDetectPatterns = ['codestral', 'codellama', 'deepseek-coder']; + + return autoDetectPatterns.some(pattern => + lowerModelName.includes(pattern.toLowerCase()) + ); + } + + private findModelConfig(modelName: string): FIMModelConfig | null { + const lowerModelName = modelName.toLowerCase(); + return this.modelConfigs.find(config => + 
config.patterns.some(pattern => + lowerModelName.includes(pattern.toLowerCase()) + ) + ) || null; + } + +} \ No newline at end of file diff --git a/libs/remix-ai-core/src/inferencers/local/ollama.ts b/libs/remix-ai-core/src/inferencers/local/ollama.ts index 39e564dc1c4..a5d48b3a1c1 100644 --- a/libs/remix-ai-core/src/inferencers/local/ollama.ts +++ b/libs/remix-ai-core/src/inferencers/local/ollama.ts @@ -1,39 +1,104 @@ import axios from 'axios'; -const OLLAMA_HOST = 'http://localhost:11434'; +// default Ollama ports to check (11434 is the legacy/standard port) +const OLLAMA_PORTS = [11434, 11435, 11436]; +const OLLAMA_BASE_HOST = 'http://localhost'; + +let discoveredOllamaHost: string | null = null; + +export async function discoverOllamaHost(): Promise { + if (discoveredOllamaHost) { + return discoveredOllamaHost; + } + + for (const port of OLLAMA_PORTS) { + const host = `${OLLAMA_BASE_HOST}:${port}`; + try { + const res = await axios.get(`${host}/api/tags`, { timeout: 2000 }); + if (res.status === 200) { + discoveredOllamaHost = host; + console.log(`Ollama discovered on ${host}`); + return host; + } + } catch (error) { + continue; // next port + } + } + return null; +} export async function isOllamaAvailable(): Promise { + const host = await discoverOllamaHost(); + return host !== null; +} + +export async function listModels(): Promise { + const host = await discoverOllamaHost(); + if (!host) { + throw new Error('Ollama is not available'); + } + try { - const res = await axios.get(`${OLLAMA_HOST}/api/tags`); - return res.status === 200; + const res = await axios.get(`${host}/api/tags`); + return res.data.models.map((model: any) => model.name); + } catch (error) { + throw new Error('Failed to list Ollama models'); + } +} + +export function getOllamaHost(): string | null { + return discoveredOllamaHost; +} + +export function resetOllamaHost(): void { + discoveredOllamaHost = null; +} + +export async function pullModel(modelName: string): Promise { + // in case the user wants to pull a model from registry + const host = await discoverOllamaHost(); + if (!host) { + throw new Error('Ollama is not available'); + } + + try { + await axios.post(`${host}/api/pull`, { name: modelName }); + console.log(`Model ${modelName} pulled successfully`); + } catch (error) { + console.error('Error pulling model:', error); + throw new Error(`Failed to pull model: ${modelName}`); + } +} + +export async function validateModel(modelName: string): Promise { + try { + const models = await listModels(); + return models.includes(modelName); } catch (error) { return false; } } -export async function listModels(): Promise { - const res = await axios.get(`${OLLAMA_HOST}/api/tags`); - return res.data.models.map((model: any) => model.name); -} - -export async function setSystemPrompt(model: string, prompt: string): Promise { - const payload = { - model, - system: prompt, - messages: [], - }; - const res = await axios.post(`${OLLAMA_HOST}/api/chat`, payload); - return res.data; -} - -export async function chatWithModel(model: string, systemPrompt: string, userMessage: string): Promise { - const payload = { - model, - system: systemPrompt, - messages: [ - { role: 'user', content: userMessage } - ], - }; - const res = await axios.post(`${OLLAMA_HOST}/api/chat`, payload); - return res.data.message?.content || '[No response]'; +export async function getBestAvailableModel(): Promise { + try { + const models = await listModels(); + if (models.length === 0) return null; + + // Prefer code-focused models for IDE + const codeModels 
= models.filter(m => + m.includes('codellama') || + m.includes('code') || + m.includes('deepseek-coder') || + m.includes('starcoder') + ); + + if (codeModels.length > 0) { + return codeModels[0]; + } + // TODO get model stats and get best model + return models[0]; + } catch (error) { + console.error('Error getting best available model:', error); + return null; + } } diff --git a/libs/remix-ai-core/src/inferencers/local/ollamaInferencer.ts b/libs/remix-ai-core/src/inferencers/local/ollamaInferencer.ts index 8ccd40c3b81..c610a1f4ec1 100644 --- a/libs/remix-ai-core/src/inferencers/local/ollamaInferencer.ts +++ b/libs/remix-ai-core/src/inferencers/local/ollamaInferencer.ts @@ -1,126 +1,475 @@ import { AIRequestType, ICompletions, IGeneration, IParams } from "../../types/types"; import { CompletionParams, GenerationParams } from "../../types/models"; -import EventEmitter from "events"; -import { ChatHistory } from "../../prompts/chat"; -import { isOllamaAvailable } from "./ollama"; +import { discoverOllamaHost, listModels } from "./ollama"; +import { HandleOllamaResponse } from "../../helpers/streamHandler"; +import { sanitizeCompletionText } from "../../helpers/textSanitizer"; +import { FIMModelManager } from "./fimModelConfig"; +import { + CONTRACT_PROMPT, + WORKSPACE_PROMPT, + CHAT_PROMPT, + CODE_COMPLETION_PROMPT, + CODE_INSERTION_PROMPT, + CODE_GENERATION_PROMPT, + CODE_EXPLANATION_PROMPT, + ERROR_EXPLANATION_PROMPT, + SECURITY_ANALYSIS_PROMPT +} from "./systemPrompts"; import axios from "axios"; import { RemoteInferencer } from "../remote/remoteInference"; const defaultErrorMessage = `Unable to get a response from Ollama server`; -export class OllamaInferencer extends RemoteInferencer implements ICompletions { - ollama_api_url: string = "http://localhost:11434/api/generate"; +export class OllamaInferencer extends RemoteInferencer implements ICompletions, IGeneration { + private ollama_host: string | null = null; model_name: string = "llama2:13b"; // Default model + private isInitialized: boolean = false; + private modelSupportsInsert: boolean | null = null; + private currentSuffix: string = ""; + private fimManager: FIMModelManager; constructor(modelName?: string) { super(); - this.api_url = this.ollama_api_url; this.model_name = modelName || this.model_name; + this.fimManager = FIMModelManager.getInstance(); + this.initialize(); } - override async _makeRequest(payload: any, rType:AIRequestType): Promise { + private async initialize(): Promise { + if (this.isInitialized) return; + + this.ollama_host = await discoverOllamaHost(); + if (!this.ollama_host) { + throw new Error('Ollama is not available on any of the default ports'); + } + + // Default to generate endpoint, will be overridden per request type + this.api_url = `${this.ollama_host}/api/generate`; + this.isInitialized = true; + + try { + const availableModels = await listModels(); + if (availableModels.length > 0 && !availableModels.includes(this.model_name)) { + this.model_name = availableModels[0]; + console.log(`Auto-selected model: ${this.model_name}`); + } + } catch (error) { + console.warn('Could not auto-select model. 
Make sure you have at least one model installed:', error); + } + } + + private getEndpointForRequestType(rType: AIRequestType): string { + switch (rType) { + case AIRequestType.COMPLETION: + return `${this.ollama_host}/api/generate`; + case AIRequestType.GENERAL: + return `${this.ollama_host}/api/chat`; + default: + return `${this.ollama_host}/api/generate`; + } + } + + private removeSuffixOverlap(completion: string, suffix: string): string { + if (!suffix || !completion) return completion; + + const trimmedCompletion = completion.trimEnd(); + const trimmedSuffix = suffix.trimStart(); + + if (!trimmedCompletion || !trimmedSuffix) return completion; + + // Helper function to normalize whitespace for comparison + const normalizeWhitespace = (str: string): string => { + return str.replace(/\s+/g, ' ').trim(); + }; + + // Helper function to find whitespace-flexible overlap + const findFlexibleOverlap = (compEnd: string, suffStart: string): number => { + const normalizedCompEnd = normalizeWhitespace(compEnd); + const normalizedSuffStart = normalizeWhitespace(suffStart); + + if (normalizedCompEnd === normalizedSuffStart) { + return compEnd.length; + } + return 0; + }; + + let bestOverlapLength = 0; + let bestOriginalLength = 0; + + // Start from longer overlaps for better performance (early exit on first match) + const maxOverlap = Math.min(trimmedCompletion.length, trimmedSuffix.length); + + // Limit search to reasonable overlap lengths for performance + const searchLimit = Math.min(maxOverlap, 50); + + for (let i = searchLimit; i >= 1; i--) { + const completionEnd = trimmedCompletion.slice(-i); + const suffixStart = trimmedSuffix.slice(0, i); + + // First try exact match for performance + if (completionEnd === suffixStart) { + bestOverlapLength = i; + bestOriginalLength = i; + break; + } + + // Then try whitespace-flexible match + const flexibleOverlap = findFlexibleOverlap(completionEnd, suffixStart); + if (flexibleOverlap > 0 && flexibleOverlap > bestOriginalLength) { + bestOverlapLength = flexibleOverlap; + bestOriginalLength = flexibleOverlap; + break; + } + } + + // Also check for partial semantic overlaps (like "){" matching " ) { ") + if (bestOverlapLength === 0) { + // Extract significant characters (non-whitespace) from end of completion + const significantCharsRegex = /[^\s]+[\s]*$/; + const compMatch = trimmedCompletion.match(significantCharsRegex); + + if (compMatch) { + const significantEnd = compMatch[0]; + const normalizedSignificant = normalizeWhitespace(significantEnd); + + // Check if this appears at the start of suffix (with flexible whitespace) + for (let i = 1; i <= Math.min(significantEnd.length + 10, trimmedSuffix.length); i++) { + const suffixStart = trimmedSuffix.slice(0, i); + const normalizedSuffStart = normalizeWhitespace(suffixStart); + + if (normalizedSignificant === normalizedSuffStart) { + bestOverlapLength = significantEnd.length; + console.log(`Found semantic overlap: "${significantEnd}" matches "${suffixStart}"`); + break; + } + } + } + } + + // Remove the overlapping part from the completion + if (bestOverlapLength > 0) { + const result = trimmedCompletion.slice(0, -bestOverlapLength); + console.log(`Removed ${bestOverlapLength} overlapping characters from completion`); + return result; + } + + return completion; + } + + private async checkModelInsertSupport(): Promise { + try { + const response = await axios.post(`${this.ollama_host}/api/show`, { + name: this.model_name + }); + + if (response.status === 200 && response.data) { + // Check if the model 
template or parameters indicate insert support + const modelInfo = response.data; + const template = modelInfo.template || ''; + const parameters = modelInfo.parameters || {}; + console.log('model parameters', parameters) + console.log('model template', template) + + // Look for FIM/insert indicators in the template or model info + const hasInsertSupport = template.includes('fim') || + template.includes('suffix') || + template.includes('') + + console.log(`Model ${this.model_name} insert support:`, hasInsertSupport); + return hasInsertSupport; + } + } catch (error) { + console.warn(`Failed to check model insert support: ${error}`); + } + return false; + } + + private buildOllamaOptions(payload: any) { + const options: any = {}; + + if (payload.max_tokens || payload.max_new_tokens) options.num_predict = payload.max_tokens || payload.max_new_tokens; + + if (payload.stop) options.stop = Array.isArray(payload.stop) ? payload.stop : [payload.stop]; + + if (payload.temperature !== undefined) options.temperature = payload.temperature; + if (payload.top_p !== undefined) options.top_p = payload.top_p; + + if (payload.top_k !== undefined) options.top_k = payload.top_k; + + if (payload.repeat_penalty !== undefined) options.repeat_penalty = payload.repeat_penalty; + + if (payload.seed !== undefined) options.seed = payload.seed; + return Object.keys(options).length > 0 ? options : undefined; + } + + override async _makeRequest(payload: any, rType: AIRequestType): Promise { this.event.emit("onInference"); - payload['stream'] = false; - payload['model'] = this.model_name; - console.log("calling _makeRequest Ollama API URL:", this.api_url); + + const endpoint = this.getEndpointForRequestType(rType); + const options = this.buildOllamaOptions(payload); + let requestPayload = payload + + if (rType === AIRequestType.COMPLETION) { + // Use /api/generate for completion requests + if (options) { + requestPayload.options = options; + } + } else { + // Use /api/chat for general requests + requestPayload = { + model: this.model_name, + messages: payload.messages || [{ role: "user", content: payload.prompt || "" }], + stream: false, + system: payload.system + }; + if (options) requestPayload.options = options; + } + try { - const result = await axios.post(this.api_url, payload, { + const result = await axios.post(endpoint, requestPayload, { headers: { "Content-Type": "application/json" }, }); if (result.status === 200) { - const text = result.data.message?.content || ""; - return text; + let text = ""; + if (rType === AIRequestType.COMPLETION) { + console.log('text before processing', result.data.response) + const rawResponse = result.data.response || ""; + + // Skip sanitization for any FIM-capable models (user-selected or API-detected) + const userSelectedFIM = this.fimManager.supportsFIM(this.model_name); + const hasAnyFIM = userSelectedFIM || this.modelSupportsInsert; + + if (hasAnyFIM) { + console.log('Skipping sanitization for FIM-capable model') + text = rawResponse; + } else { + text = sanitizeCompletionText(rawResponse); + console.log('text after sanitization', text) + } + } else { + text = result.data.message?.content || ""; + } + return text.trimStart(); } else { return defaultErrorMessage; } } catch (e: any) { - console.error("Error making Ollama request:", e.message); return defaultErrorMessage; } finally { this.event.emit("onInferenceDone"); } } - override async _streamInferenceRequest(payload: any, rType:AIRequestType) { + override async _streamInferenceRequest(payload: any, rType: AIRequestType) { 
this.event.emit("onInference"); - payload['model'] = this.model_name; - console.log("payload in stream request", payload); - console.log("calling _streammakeRequest Ollama API URL:", this.api_url); - - const response = await fetch(this.api_url, { - method: "POST", - headers: { "Content-Type": "application/json" }, - body: JSON.stringify({ + + const endpoint = this.getEndpointForRequestType(rType); + const options = this.buildOllamaOptions(payload); + let streamPayload: any; + + if (rType === AIRequestType.COMPLETION) { + // Use /api/generate for completion requests + streamPayload = { + model: this.model_name, + prompt: payload.prompt || payload.messages?.[0]?.content || "", stream: true, + system: payload.system || CODE_COMPLETION_PROMPT + }; + if (options) { + streamPayload.options = options; + } + } else { + // Use /api/chat for general requests + streamPayload = { model: this.model_name, - messages: [{ role: "user", content: payload.prompt }], - }), - }); + messages: payload.messages || [{ role: "user", content: payload.prompt || "" }], + stream: true, + system: payload.system + }; + if (options) { + streamPayload.options = options; + } + } - console.log("response in stream request", response); - // if (payload.return_stream_response) { - // return response - // } + try { + const response = await fetch(endpoint, { + method: "POST", + headers: { "Content-Type": "application/json" }, + body: JSON.stringify(streamPayload), + }); - const reader = response.body?.getReader(); - const decoder = new TextDecoder(); + if (!response.ok) { + throw new Error(`HTTP ${response.status}: ${response.statusText}`); + } + if (payload.return_stream_response) { + return response + } - let resultText = ""; + // Use the centralized Ollama stream handler + let resultText = ""; + await HandleOllamaResponse( + response, + (chunk: string) => { + resultText += chunk; + this.event.emit("onStreamResult", chunk); + }, + (finalText: string) => { + resultText = finalText; + } + ); - try { - // eslint-disable-next-line no-constant-condition - while (true) { - const { done, value } = await reader.read(); - if (done) break; - - const chunk = decoder.decode(value, { stream: true }); - console.log("chunk", chunk); - resultText += chunk; - this.event.emit("onStreamResult", chunk); - } return resultText; } catch (e: any) { - console.error("Streaming error from Ollama:", e.message); return defaultErrorMessage; } finally { this.event.emit("onInferenceDone"); } } - private _buildPayload(prompt: string, system?: string) { + private _buildPayload(prompt: string, payload: any, system?: string) { return { model: this.model_name, - system: system || "You are a helpful assistant.", - messages: [{ role: "user", content: prompt }], + system: system || CHAT_PROMPT, + messages: [{ role: "user", content: prompt }, { role:"assistant", content:system }], + ...payload }; } - // async code_completion(context: any, ctxFiles: any, fileName: any, options: IParams = CompletionParams) { - // } + async buildCompletionPrompt(prfx:string, srfx:string) { + const prompt = prfx + '' + srfx; + return `Complete the code at the position. 
Provide only the code that should be inserted at the cursor position, without any explanations or markdown formatting:\n\n${prompt}`; + } + + async code_completion(prompt: string, promptAfter: string, ctxFiles: any, fileName: any, options: IParams = CompletionParams): Promise { + console.log("Code completion called") + + // Store the suffix for overlap removal + this.currentSuffix = promptAfter || ""; + + let payload: any; + + // Check FIM support: user selection first, then API detection for native FIM + const userSelectedFIM = this.fimManager.supportsFIM(this.model_name); + + // Check API for native FIM support if not user-selected + if (!userSelectedFIM && this.modelSupportsInsert === null) { + this.modelSupportsInsert = await this.checkModelInsertSupport(); + } + + const hasNativeFIM = userSelectedFIM ? this.fimManager.usesNativeFIM(this.model_name) : this.modelSupportsInsert; + const hasTokenFIM = userSelectedFIM && !this.fimManager.usesNativeFIM(this.model_name); + + console.log("modelSupportsInsert;", this.modelSupportsInsert) + console.log("usesNativeFim;", this.fimManager.usesNativeFIM(this.model_name) ) + if (hasNativeFIM) { + // Native FIM support (prompt/suffix parameters) + payload = { + model: this.model_name, + prompt: prompt, + suffix: promptAfter, + stream: false, + ...options + }; + console.log('using native FIM params', payload); + } else if (hasTokenFIM) { + // Token-based FIM support + const fimPrompt = this.fimManager.buildFIMPrompt(prompt, promptAfter, this.model_name); + payload = { + model: this.model_name, + prompt: fimPrompt, + stream: false, + ...options + }; + console.log('using token FIM params', payload); + } else { + // No FIM support, use completion prompt + console.log(`Model ${this.model_name} does not support FIM, using completion prompt`); + const completionPrompt = await this.buildCompletionPrompt(prompt, promptAfter); + payload = this._buildPayload(completionPrompt, options, CODE_COMPLETION_PROMPT); + } + + const result = await this._makeRequest(payload, AIRequestType.COMPLETION); + + // Apply suffix overlap removal if we have both result and suffix + if (result && this.currentSuffix) { + return this.removeSuffixOverlap(result, this.currentSuffix); + } + + return result; + } + + async code_insertion(msg_pfx: string, msg_sfx: string, ctxFiles: any, fileName: any, options: IParams = GenerationParams): Promise { + console.log("Code insertion called") + // Delegate to code_completion which already handles suffix overlap removal + return await this.code_completion(msg_pfx, msg_sfx, ctxFiles, fileName, options); + } - // async code_insertion(prompt: string, options: IParams = GenerationParams) { - // } + async code_generation(prompt: string, options: IParams = GenerationParams): Promise { + const payload = this._buildPayload(prompt, options, CODE_GENERATION_PROMPT); + if (options.stream_result) { + return await this._streamInferenceRequest(payload, AIRequestType.GENERAL); + } else { + return await this._makeRequest(payload, AIRequestType.GENERAL); + } + } - // async code_generation(prompt: string, options: IParams = GenerationParams) { - // } + async generate(userPrompt: string, options: IParams = GenerationParams): Promise { + const payload = this._buildPayload(userPrompt, options, CONTRACT_PROMPT); + if (options.stream_result) { + return await this._streamInferenceRequest(payload, AIRequestType.GENERAL); + } else { + return await this._makeRequest(payload, AIRequestType.GENERAL); + } + } - // async generate(userPrompt: string, options: IParams = 
GenerationParams): Promise { - // } + async generateWorkspace(prompt: string, options: IParams = GenerationParams): Promise { + const payload = this._buildPayload(prompt, options, WORKSPACE_PROMPT); - // async generateWorkspace(prompt: string, options: IParams = GenerationParams): Promise { - // } + if (options.stream_result) { + return await this._streamInferenceRequest(payload, AIRequestType.GENERAL); + } else { + return await this._makeRequest(payload, AIRequestType.GENERAL); + } + } - // async answer(prompt: string, options: IParams = GenerationParams): Promise { - // } + async answer(prompt: string, options: IParams = GenerationParams): Promise { + const payload = this._buildPayload(prompt, options, CHAT_PROMPT); + if (options.stream_result) { + return await this._streamInferenceRequest(payload, AIRequestType.GENERAL); + } else { + return await this._makeRequest(payload, AIRequestType.GENERAL); + } + } - // async code_explaining(prompt, context:string="", options:IParams=GenerationParams): Promise { - // } + async code_explaining(prompt: string, context: string = "", options: IParams = GenerationParams): Promise { + const payload = this._buildPayload(prompt, options, CODE_EXPLANATION_PROMPT); + if (options.stream_result) { + return await this._streamInferenceRequest(payload, AIRequestType.GENERAL); + } else { + return await this._makeRequest(payload, AIRequestType.GENERAL); + } + } - // async error_explaining(prompt, options:IParams=GenerationParams): Promise { + async error_explaining(prompt: string, options: IParams = GenerationParams): Promise { - // } + const payload = this._buildPayload(prompt, options, ERROR_EXPLANATION_PROMPT); + if (options.stream_result) { + return await this._streamInferenceRequest(payload, AIRequestType.GENERAL); + } else { + return await this._makeRequest(payload, AIRequestType.GENERAL); + } + } - // async vulnerability_check(prompt: string, options: IParams = GenerationParams): Promise { - // } + async vulnerability_check(prompt: string, options: IParams = GenerationParams): Promise { + const payload = this._buildPayload(prompt, options, SECURITY_ANALYSIS_PROMPT); + if (options.stream_result) { + return await this._streamInferenceRequest(payload, AIRequestType.GENERAL); + } else { + return await this._makeRequest(payload, AIRequestType.GENERAL); + } + } } diff --git a/libs/remix-ai-core/src/inferencers/local/systemPrompts.ts b/libs/remix-ai-core/src/inferencers/local/systemPrompts.ts new file mode 100644 index 00000000000..e325ebc2193 --- /dev/null +++ b/libs/remix-ai-core/src/inferencers/local/systemPrompts.ts @@ -0,0 +1,61 @@ +export const CONTRACT_PROMPT = `You are a Web3 developer. Generate a Web3 project, specify the GitHub tag in the library import path if existent and return only a JSON object with the following structure: + +{ + "projectName": "", + "files": [ + { + "fileName": "", + "content": "" + } + ] +} +Requirements: +Project Naming: Provide a meaningful and concise project name that reflects the purpose of the smart contract(s) or scripts. Each contract source file must have a SPDX license identifier MIT. Make sure the imports are relative to the directory names. Do not use truffle as test library. Use mocha/chai unit tests in typescript. Make sure the json format is respected by ensuring double-quoted property name and omit the unnecessary comma ins the json format. Do not use any local import references. 
If applicable only use openzeppelin library version 5 onwards for smart contract and generate contracts with adequate compiler version greater or equal that 0.8.20 + +The primary language for smart contract is Solidity and for script Javascript or typescript, except the user request a specific language. + +Folder Structure: +Test files should be placed in a tests/ folder. +Additional necessary configurations (if required) should be placed in appropriate folders (e.g., scripts/, config/). +Code Requirements: +The content field must contain only valid code, with no additional comments, formatting, or explanations. +Ensure the code is syntactically correct and follows best practices for code development. +Use proper contract structuring, access control, and error handling. +Minimize File Count: Keep the number of files minimal while maintaining a clean and functional structure. +Use Latest Libraries: If external libraries (e.g., OpenZeppelin) are relevant, include them and ensure they are up-to-date. +Use \`@+libname\` for imports. e.g. for importing openzeppelin library use \`@openzeppelin\` +Internet Search: If necessary, search the internet for the latest libraries, best practices, and security recommendations before finalizing the code. + +Output Example: +For a simple ERC-20 token contract, the JSON output might look like this: + +{ + "projectName": "MyToken", + "files": [ + { + "fileName": "contracts/MyToken.sol", + "content": "// SPDX-License-Identifier: MIT\\npragma solidity ^0.8.0; ... (contract code) ..." + }, + { + "fileName": "tests/MyTokenTest.ts", + "content": "// SPDX-License-Identifier: MIT\\n pragma solidity ^0.8.0;\\n import \\"../contracts/MyToken.sol\\";... (test code) ..." + } + ] +}`; + +export const WORKSPACE_PROMPT = "You are a coding assistant with full access to the user's project workspace.\nWhen the user provides a prompt describing a desired change or feature, follow these steps:\nAnalyze the Prompt: Understand the user's intent, including what functionality or change is required.\nInspect the Codebase: Review the relevant parts of the workspace to identify which files are related to the requested change.\nDetermine Affected Files: Decide which files need to be modified or created.\nGenerate Full Modified Files: For each affected file, return the entire updated file content, not just the diff or patch.\n\nOutput format\n {\n \"files\": [\n {\n \"fileName\": \"\",\n \"content\": \"FULL CONTENT OF THE MODIFIED FILE HERE\"\n }\n ]\n }\nOnly include files that need to be modified or created. Do not include files that are unchanged.\nBe precise, complete, and maintain formatting and coding conventions consistent with the rest of the project.\nIf the change spans multiple files, ensure that all related parts are synchronized.\n" + +export const CHAT_PROMPT = "You are a Web3 AI assistant integrated into the Remix IDE named RemixAI. Your primary role is to help developers write, understand, debug, and optimize smart contracts and other related Web3 code. You must provide secure, gas-efficient, and up-to-date advice. 
Be concise and accurate, especially when dealing with smart contract vulnerabilities, compiler versions, and Ethereum development best practices.\nYour capabilities include:\nExplaining Major web3 programming (solidity, noir, circom, Vyper) syntax, security issues (e.g., reentrancy, underflow/overflow), and design patterns.\nReviewing and improving smart contracts for gas efficiency, security, and readability.\nHelping with Remix plugins, compiler settings, and deployment via the Remix IDE interface.\nExplaining interactions with web3.js, ethers.js, Hardhat, Foundry, OpenZeppelin, etc., if needed.\nWriting and explaining unit tests, especially in JavaScript/typescript or Solidity.\nRules:\nPrioritize secure coding and modern Solidity (e.g., ^0.8.x).\nNever give advice that could result in loss of funds (e.g., suggest unguarded delegatecall).\nIf unsure about a version-specific feature or behavior, clearly state the assumption.\nDefault to using best practices (e.g., require, SafeERC20, OpenZeppelin libraries).\nBe helpful but avoid speculative or misleading answers — if a user asks for something unsafe, clearly warn them.\nIf a user shares code, analyze it carefully and suggest improvements with reasoning. If they ask for a snippet, return a complete, copy-pastable example formatted in Markdown code blocks." + +// Additional system prompts for specific use cases +export const CODE_COMPLETION_PROMPT = "You are a code completion assistant. Complete the code provided, focusing on the immediate next lines needed. Provide only the code that should be added, without explanations or comments unless they are part of the code itself. Do not return ``` for signalising code." + +export const CODE_INSERTION_PROMPT = "You are a code completion assistant. Fill in the missing code between the given prefix and suffix. Ensure the code fits naturally and maintains proper syntax and formatting." + +export const CODE_GENERATION_PROMPT = "You are a code generation assistant. Generate clean, well-documented code based on the user's requirements. Follow best practices and include necessary imports, error handling, and comments where appropriate." + +export const CODE_EXPLANATION_PROMPT = "You are a code explanation assistant. Provide clear, educational explanations of code functionality and concepts. Break down complex code into understandable parts and explain the logic, patterns, and best practices used." + +export const ERROR_EXPLANATION_PROMPT = "You are a debugging assistant. Help explain errors and provide practical solutions. Focus on what the error means, common causes, step-by-step solutions, and prevention tips." + +export const SECURITY_ANALYSIS_PROMPT = "You are a security analysis assistant. Identify vulnerabilities and provide security recommendations for code. Check for common security issues, best practice violations, potential attack vectors, and provide detailed recommendations for fixes." 
diff --git a/libs/remix-ui/remix-ai-assistant/src/components/prompt.tsx b/libs/remix-ui/remix-ai-assistant/src/components/prompt.tsx index 3e7ab88c7f4..eb9ee544ee3 100644 --- a/libs/remix-ui/remix-ai-assistant/src/components/prompt.tsx +++ b/libs/remix-ui/remix-ai-assistant/src/components/prompt.tsx @@ -15,18 +15,25 @@ export interface PromptAreaProps { setShowContextOptions: React.Dispatch> showAssistantOptions: boolean setShowAssistantOptions: React.Dispatch> + showModelOptions: boolean + setShowModelOptions: React.Dispatch> contextChoice: AiContextType setContextChoice: React.Dispatch> assistantChoice: AiAssistantType setAssistantChoice: React.Dispatch> + availableModels: string[] + selectedModel: string | null contextFiles: string[] clearContext: () => void handleAddContext: () => void handleSetAssistant: () => void + handleSetModel: () => void + handleModelSelection: (modelName: string) => void handleGenerateWorkspace: () => void dispatchActivity: (type: ActivityType, payload?: any) => void contextBtnRef: React.RefObject modelBtnRef: React.RefObject + modelSelectorBtnRef: React.RefObject aiContextGroupList: groupListType[] aiAssistantGroupList: groupListType[] textareaRef?: React.RefObject @@ -44,18 +51,25 @@ export const PromptArea: React.FC = ({ setShowContextOptions, showAssistantOptions, setShowAssistantOptions, + showModelOptions, + setShowModelOptions, contextChoice, setContextChoice, assistantChoice, setAssistantChoice, + availableModels, + selectedModel, contextFiles, clearContext, handleAddContext, handleSetAssistant, + handleSetModel, + handleModelSelection, handleGenerateWorkspace, dispatchActivity, contextBtnRef, modelBtnRef, + modelSelectorBtnRef, aiContextGroupList, aiAssistantGroupList, textareaRef, @@ -135,18 +149,34 @@ export const PromptArea: React.FC = ({ />
- + +
+ + {assistantChoice === 'ollama' && availableModels.length > 0 && ( + + )} +
)} + {showModelOptions && assistantChoice === 'ollama' && ( +
+
Ollama Model
+ ({ + label: model, + bodyText: `Use ${model} model`, + icon: 'fa-solid fa-check', + stateValue: model, + dataId: `ollama-model-${model.replace(/[^a-zA-Z0-9]/g, '-')}` + }))} + /> +
+ )}