🚀 Local AI-powered code completion and development assistant - A free, privacy-focused alternative to GitHub Copilot
Transform your coding experience with AI-powered assistance that runs entirely on your machine. No cloud dependencies, no data sharing, no subscription fees - just powerful AI helping you code faster and smarter.
- Real-time inline suggestions - Gray text appears as you type
- Context-aware completions - Understands your entire codebase
- Press Tab to accept - Familiar workflow
- Multi-language support - JavaScript, Python, TypeScript, Java, C#, Go, Rust, and more
- Full conversation interface for coding questions
- Code generation with clickable "Insert" buttons
- Debug assistance and error explanations
- Architecture discussions and best practices
- 🧪 Generate Unit Tests - Comprehensive test suites with edge cases
- 📝 Auto Documentation - JSDoc, docstrings, and inline comments
- 💡 Code Explanations - Understand complex algorithms instantly
- 🔧 Code Refactoring - Improve performance and readability
- 🩹 Bug Detection & Fixes - Identify and resolve issues automatically
- 100% Local - Your code never leaves your machine
- No API keys required - No cloud dependencies
- GPU acceleration - Lightning-fast responses with NVIDIA GPUs
- Offline capable - Works without internet connection
- Multiple model sizes - From ultra-fast (300MB) to high-quality (3.8GB)
Option A: Docker (Recommended)
```bash
# CPU only
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

# With GPU acceleration (NVIDIA)
docker run -d --gpus all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```
Option B: Direct Installation
- Windows/Mac: Download from ollama.ai
- Linux:
```bash
curl https://ollama.ai/install.sh | sh
```
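Whichever route you choose, it's worth confirming the Ollama server is reachable before pulling models. The commands below assume the default port; `ollama --version` applies to direct installs:

```bash
# Direct install: should print the installed version
ollama --version

# Docker or direct: should return a JSON list of installed models (empty at first)
curl http://localhost:11434/api/tags
```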
```bash
# Ultra-fast model for code completion (300MB)
docker exec -it ollama ollama pull qwen2:0.5b

# Balanced model for code completion (637MB)
docker exec -it ollama ollama pull tinyllama:1.1b

# High-quality model for chat and explanations (3.8GB)
docker exec -it ollama ollama pull codellama:7b-instruct
```
From VS Code Marketplace:
- Open VS Code
- Go to Extensions (`Ctrl+Shift+X`)
- Search for "AI Developer Agent"
- Click "Install"
From GitHub Releases:
- Download the latest `.vsix` from Releases
- `Ctrl+Shift+P` → "Extensions: Install from VSIX"
- Select the downloaded file
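If the `code` command-line tool is on your PATH, the `.vsix` can also be installed from a terminal; the filename below is a placeholder for whichever release you downloaded:

```bash
# Install a downloaded .vsix without opening the Extensions view
code --install-extension ai-developer-agent-1.0.0.vsix
```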
Open VS Code Settings (`Ctrl+,`) and search for "AI Dev Agent", or add to `settings.json`:
```json
{
  "aiDevAgent.ollamaUrl": "http://localhost:11434",
  "aiDevAgent.model": "tinyllama:1.1b",
  "aiDevAgent.chat.model": "codellama:7b-instruct",
  "aiDevAgent.codeCompletion.enabled": true
}
```
- Type code → Gray suggestions appear → Press Tab to accept
- Select code → Right-click → Choose AI action
- Open AI Chat → `Ctrl+Shift+P` → "AI Agent: Open Chat"
| Setting | Default | Description |
|---|---|---|
| `aiDevAgent.ollamaUrl` | `http://localhost:11434` | Ollama server URL |
| `aiDevAgent.model` | `tinyllama:1.1b` | Model for code completion |
| `aiDevAgent.chat.model` | `codellama:7b-instruct` | Model for chat interface |
| `aiDevAgent.codeCompletion.enabled` | `true` | Enable/disable code completion |
| Setting | Default | Description |
|---|---|---|
| `aiDevAgent.keepAlive` | `-1` | Keep model loaded (`-1` = indefinitely) |
| `aiDevAgent.maxTokens` | `2000` | Maximum response length |
| `aiDevAgent.temperature` | `0.1` | Creativity (0 = consistent, 1 = creative) |
| `aiDevAgent.codeCompletion.timeoutMs` | `6000` | Completion timeout (milliseconds) |
| `aiDevAgent.codeCompletion.debounceMs` | `800` | Delay before triggering completion |
| `aiDevAgent.codeCompletion.maxLines` | `2` | Maximum lines per completion |
Ultra-Fast Setup (GPU recommended):
```json
{
  "aiDevAgent.model": "qwen2:0.5b",
  "aiDevAgent.codeCompletion.timeoutMs": 3000,
  "aiDevAgent.codeCompletion.maxLines": 1,
  "aiDevAgent.keepAlive": "-1"
}
```
Balanced Quality & Speed:
```json
{
  "aiDevAgent.model": "tinyllama:1.1b",
  "aiDevAgent.chat.model": "codellama:7b-instruct",
  "aiDevAgent.codeCompletion.timeoutMs": 6000,
  "aiDevAgent.codeCompletion.maxLines": 2,
  "aiDevAgent.keepAlive": "-1"
}
```
Maximum Quality (Powerful hardware):
```json
{
  "aiDevAgent.model": "codellama:7b-code",
  "aiDevAgent.chat.model": "deepseek-r1:7b",
  "aiDevAgent.codeCompletion.timeoutMs": 10000,
  "aiDevAgent.codeCompletion.maxLines": 3,
  "aiDevAgent.maxTokens": 4000
}
```
Inline completion example:
```python
def fibonacci(n):
    if n <= 1:
        return n
    # Type here - AI suggests: return fibonacci(n-1) + fibonacci(n-2)
```
- Gray text appears as you type
- Press Tab to accept suggestion
- Press Escape to dismiss
- Context-aware - understands your entire file
- Open Chat: `Ctrl+Shift+P` → "AI Agent: Open Chat"
- Ask questions: "How do I implement binary search in Python?"
- Generate code: "Create a React login component with validation"
- Debug help: "Why is my async function not working?"
- Insert code: Click "Insert" button to add code to editor
- 🧪 Generate Unit Tests - Select function → Right-click → "Generate Unit Tests"
- 📝 Generate Documentation - Select code → Right-click → "Generate Documentation"
- 💡 Explain Code - Select complex code → Right-click → "Explain Code"
- 🔧 Refactor Code - Select code → Right-click → "Refactor Code"
- 🩹 Fix Code Issues - Select buggy code → Right-click → "Fix Code Issues"
| Model | Size | Speed (CPU) | Speed (GPU) | Quality | Use Case |
|---|---|---|---|---|---|
| `qwen2:0.5b` | 300MB | 3-5s | 0.5-1s | ⭐⭐⭐ | Ultra-fast completion |
| `tinyllama:1.1b` | 637MB | 4-7s | 1-2s | ⭐⭐⭐⭐ | Recommended balance |
| `qwen2.5-coder:1.5b` | 900MB | 6-10s | 2-3s | ⭐⭐⭐⭐⭐ | High-quality completion |
| `codellama:7b-code` | 3.8GB | 15-30s | 3-5s | ⭐⭐⭐⭐⭐ | Best quality (slow) |
| Model | Size | Speed | Quality | Use Case |
|---|---|---|---|---|
| `codellama:7b-instruct` | 3.8GB | 5-15s | ⭐⭐⭐⭐⭐ | Recommended for chat |
| `deepseek-r1:7b` | 7GB | 30-60s | ⭐⭐⭐⭐⭐ | Complex reasoning |
| `llama3.1:8b` | 4.7GB | 10-20s | ⭐⭐⭐⭐ | General purpose |
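To sanity-check a chat model's speed and quality before pointing the extension at it, you can prompt it directly through the same Docker container (assumes the Docker setup from the installation steps):

```bash
# One-shot prompt; the model replies and the command exits
docker exec -it ollama ollama run codellama:7b-instruct "Explain what a closure is in JavaScript."
```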
❌ "Cannot connect to Ollama"
```bash
# Check if Ollama is running
docker ps | grep ollama

# Start Ollama if not running
docker start ollama

# Test connection
curl http://localhost:11434/api/tags
```
❌ "Model not found"
```bash
# List available models
docker exec -it ollama ollama list

# Pull missing model
docker exec -it ollama ollama pull tinyllama:1.1b
```
❌ "Completion timeout"
- Increase timeout: Set `aiDevAgent.codeCompletion.timeoutMs` to `10000` (the timing check below helps pick a value)
- Use a faster model: Switch to `qwen2:0.5b`
- Check Docker resources: Allocate 4GB+ memory to Docker
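To match the timeout to your hardware, you can time a raw completion request against Ollama's standard `/api/generate` endpoint (the model and prompt here are just examples):

```bash
# Time a single non-streaming completion request
time curl -s http://localhost:11434/api/generate \
  -d '{"model": "tinyllama:1.1b", "prompt": "def add(a, b):", "stream": false}'
```

If the request itself takes longer than `aiDevAgent.codeCompletion.timeoutMs`, raise the setting or switch to a smaller model.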
❌ "Code completion not working"
- Check the model is loaded: `docker exec -it ollama ollama ps`
- Verify settings: Ensure `aiDevAgent.codeCompletion.enabled` is `true`
- Restart the extension: `Ctrl+Shift+P` → "Developer: Reload Window"
🚀 GPU Acceleration (NVIDIA)
```bash
# Stop current container
docker stop ollama && docker rm ollama

# Start with GPU support
docker run -d --gpus all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

# Verify GPU usage
nvidia-smi  # Should show GPU activity during completion
```
💾 Memory Optimization
- Docker Desktop: Settings → Resources → Memory (6GB+); on plain Docker you can cap memory at run time, as sketched below
- Keep models loaded: Set `aiDevAgent.keepAlive` to `"-1"`
- Use smaller models: `qwen2:0.5b` for resource-constrained systems
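If you run plain Docker rather than Docker Desktop, a memory limit can be set when the container is created; `8g` below is an illustrative value, not a tested recommendation:

```bash
# Recreate the Ollama container with an explicit memory cap
docker stop ollama && docker rm ollama
docker run -d --memory=8g -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```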
Fully Supported Languages:
- JavaScript/TypeScript
- Python
- Java
- C#
- Go
- Rust
- C/C++
- PHP
- Ruby
- Swift
- Kotlin
Partial Support:
- HTML/CSS
- SQL
- Shell/Bash
- Markdown
- JSON/YAML
We welcome contributions! Please see our Contributing Guide.
```bash
# Clone repository
git clone https://github.com/hariprasathys22/AI-Agent.git
cd AI-Agent
# Install dependencies
npm install
# Build extension
npm run compile
# Test extension
# Press F5 in VS Code to open the Extension Development Host
```
- 🐛 Bug reports: Create issue
- 💡 Feature requests: Create issue
- ❓ Questions: GitHub Discussions
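To produce an installable `.vsix` from the development setup above, the standard VS Code packaging tool can be used (a sketch; `@vscode/vsce` is not part of this repo's documented toolchain):

```bash
# Package the extension into a .vsix in the project root
npx @vscode/vsce package
```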
- Multi-file context - Understand project structure
- Jupyter Notebook support - AI assistance in notebooks
- Git integration - Smart commit messages and PR descriptions
- Custom model training - Fine-tune on your codebase
- Collaborative features - Team model sharing
- More languages - Expand language support
| Feature | AI Dev Agent | GitHub Copilot | TabNine | CodeWhisperer |
|---|---|---|---|---|
| Privacy | ✅ 100% Local | ❌ Cloud-based | ❌ Cloud-based | ❌ Cloud-based |
| Cost | ✅ Free | ❌ $10/month | ❌ $12/month | ✅ Free tier |
| Offline | ✅ Yes | ❌ No | ❌ No | ❌ No |
| GPU Support | ✅ Yes | N/A | N/A | N/A |
| Custom Models | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Chat Interface | ✅ Yes | ✅ Yes | ❌ No | ❌ No |
| Code Actions | ✅ Yes | ❌ Limited | ❌ No | ❌ Limited |
This project is licensed under the MIT License - see the LICENSE file for details.
- Ollama - For providing the local AI runtime
- VS Code team - For the excellent extension API
- Hugging Face - For open-source model hosting
- Contributors - Thank you to all who have contributed!
- 📚 Documentation: GitHub Wiki
- 💬 Community: GitHub Discussions
- 🐛 Issues: GitHub Issues
- 🔄 Updates: Release Notes
⭐ If this extension helps you code faster, please star the repository!
Made with ❤️ for developers who value privacy and performance