Commit 35278b0
author: Zvi Fried (committed)
major llm refactor
1 parent 28c260d

26 files changed: +3732 −358 lines

README.md

Lines changed: 201 additions & 50 deletions
@@ -4,7 +4,8 @@
   <img src="assets/mcp-as-a-judge.png" alt="MCP as a Judge Logo" width="200">
 </div>

-> **Prevent bad coding practices with AI-powered evaluation and user-driven decision making**
+> **MCP as a Judge acts as a validation layer between AI coding assistants and LLMs, helping ensure safer and higher-quality code.**

 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 [![Python 3.13+](https://img.shields.io/badge/python-3.13+-blue.svg)](https://www.python.org/downloads/)
@@ -86,7 +87,25 @@ MCP as a Judge is heavily dependent on **MCP Sampling** and **MCP Elicitation**

 #### **System Prerequisites**

-- **Python 3.13+** - Required for running the MCP server
+- **Docker Desktop** / **Python 3.13+** - Required for running the MCP server
+
+#### **Supported AI Assistants**
+
+| AI Assistant | Platform | MCP Support | Status | Notes |
+|---------------|----------|-------------|---------|-------|
+| **GitHub Copilot** | Visual Studio Code | ✅ Full | **Recommended** | Complete MCP integration with sampling and elicitation |
+| **Claude Code** | - | ⚠️ Partial | Requires LLM API key | [Sampling Support feature request](https://github.com/anthropics/claude-code/issues/1785)<br>[Elicitation Support feature request](https://github.com/anthropics/claude-code/issues/2799) |
+| **Cursor** | - | ⚠️ Partial | Requires LLM API key | MCP support available, but sampling/elicitation limited |
+| **Augment** | - | ⚠️ Partial | Requires LLM API key | MCP support available, but sampling/elicitation limited |
+| **Qodo** | - | ⚠️ Partial | Requires LLM API key | MCP support available, but sampling/elicitation limited |
+
+**✅ Recommended Setup:** GitHub Copilot in Visual Studio Code for the best MCP as a Judge experience.
+
+**⚠️ LLM API Key Requirement:**
+- **GitHub Copilot + VS Code**: ✅ **No API key needed** - uses built-in MCP sampling
+- **All other assistants**: ⚠️ **Requires an LLM API key** - limited MCP sampling support
+
+Configure an LLM API key (OpenAI, Anthropic, Google, etc.) as described in the [LLM API Configuration](#-llm-api-configuration-optional) section.


 #### **💡 Recommendations**
@@ -96,97 +115,221 @@ MCP as a Judge is heavily dependent on **MCP Sampling** and **MCP Elicitation**



-## 🔧 **Configuration**
+## 🔧 **Visual Studio Code Configuration**

-Configure **MCP as a Judge** with your preferred AI coding assistant:
+Configure **MCP as a Judge** in Visual Studio Code with GitHub Copilot:

-### **Cursor**
+### **Method 1: Using Docker (Recommended)**

-1. **Open Cursor Settings:**
-   - Go to `File` → `Preferences` → `Cursor Settings`
-   - Navigate to the `MCP` tab
-   - Click `+ Add` to add a new MCP server
+1. **Configure MCP Settings:**
+
+   Add this to your Visual Studio Code MCP configuration file:

-2. **Add MCP Server Configuration:**
 ```json
 {
   "mcpServers": {
     "mcp-as-a-judge": {
-      "command": "uv",
-      "args": ["tool", "run", "mcp-as-a-judge"],
-      "env": {}
+      "command": "docker",
+      "args": ["run", "--rm", "-i", "--pull=always", "ghcr.io/hepivax/mcp-as-a-judge:latest"],
+      "env": {
+        "LLM_API_KEY": "sk-your-api-key-here",
+        "LLM_MODEL_NAME": "gpt-4o-mini"
+      }
     }
   }
 }
 ```

-3. **Alternative: Edit mcp.json directly:**
-   - Create or edit `.cursor/mcp.json` in your project or home directory
-   - Add the server configuration above
+**📝 Configuration Options (All Optional):**
+- **LLM_API_KEY**: Optional for GitHub Copilot + VS Code (has built-in MCP sampling)
+- **LLM_MODEL_NAME**: Optional custom model (see [Supported LLM Providers](#supported-llm-providers) for defaults)
+- The `--pull=always` flag ensures you always get the latest version automatically

-### **Claude Code**
+Then manually update when needed:

-1. **Add MCP Server via CLI (Local):**
 ```bash
-claude mcp add mcp-as-a-judge -- uv tool run mcp-as-a-judge
+# Pull the latest version
+docker pull ghcr.io/hepivax/mcp-as-a-judge:latest
 ```

-2. **Add MCP Server via CLI (Remote - Cloudflare Workers):**
+### **Method 2: Using uv**
+
+1. **Install the package:**
+
 ```bash
-claude mcp add --transport http mcp-as-a-judge https://mcp-as-a-judge.workers.dev/mcp
+uv tool install mcp-as-a-judge
 ```

-3. **Alternative: Manual Configuration:**
-   - Create or edit `~/.config/claude-code/mcp_servers.json`
+2. **Configure MCP Settings:**
+
+   The MCP server will be automatically detected by Visual Studio Code.
+
+**📝 Notes:**
+- **No additional configuration needed for GitHub Copilot + VS Code** (has built-in MCP sampling)
+- LLM_API_KEY is optional and can be set via environment variable if needed
+
+3. **To update to the latest version:**
+
+```bash
+# Update MCP as a Judge to the latest version
+uv tool upgrade mcp-as-a-judge
+```
+
+## 🔑 **LLM API Configuration (Optional)**
+
+For AI assistants without full MCP sampling support (Cursor, Claude Code, Augment, Qodo), you can configure an LLM API key as a fallback. This ensures MCP as a Judge works even when the client doesn't support MCP sampling.
+
+### **Supported LLM Providers**
+
+| Rank | Provider | API Key Format | Default Model | Notes |
+|------|----------|----------------|---------------|-------|
+| **1** | **OpenAI** | `sk-...` | `gpt-5` | Latest frontier model with built-in reasoning |
+| **2** | **Anthropic** | `sk-ant-...` | `claude-sonnet-4-20250514` | High-performance with exceptional reasoning |
+| **3** | **Google** | `AIza...` | `gemini-2.5-pro` | Most advanced model with built-in thinking |
+| **4** | **Azure OpenAI** | `[a-f0-9]{32}` | `gpt-5` | Same as OpenAI but via Azure |
+| **5** | **AWS Bedrock** | AWS credentials | `anthropic.claude-sonnet-4-20250514-v1:0` | Aligned with Anthropic |
+| **6** | **Vertex AI** | Service Account JSON | `gemini-2.5-pro` | Enterprise Gemini via Google Cloud |
+| **7** | **Groq** | `gsk_...` | `deepseek-r1` | Best reasoning model with speed advantage |
+| **8** | **OpenRouter** | `sk-or-...` | `deepseek/deepseek-r1` | Best reasoning model available |
+| **9** | **xAI** | `xai-...` | `grok-code-fast-1` | Latest coding-focused model (Aug 2025) |
+| **10** | **Mistral** | `[a-f0-9]{64}` | `pixtral-large` | Most advanced model (124B params) |
+
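For illustration, the per-vendor defaults in this table amount to a simple lookup; a minimal sketch, assuming a plain dict keyed by vendor name (the names `DEFAULT_MODELS` and `default_model_for` are hypothetical, not taken from this commit):

```python
# Hypothetical sketch of "default model selection per vendor", using the
# defaults documented in the table above; names here are illustrative only.
DEFAULT_MODELS: dict[str, str] = {
    "openai": "gpt-5",
    "anthropic": "claude-sonnet-4-20250514",
    "google": "gemini-2.5-pro",
    "groq": "deepseek-r1",
    "openrouter": "deepseek/deepseek-r1",
    "xai": "grok-code-fast-1",
    "mistral": "pixtral-large",
}


def default_model_for(vendor: str) -> str:
    """Return the documented default model for a detected vendor."""
    return DEFAULT_MODELS.get(vendor, "gpt-5")
```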
+### **🎯 Model Selection Rationale**
+
+All default models are optimized for **coding and reasoning tasks** based on 2025 research:
+
+- **🧠 Reasoning-Focused**: GPT-5, Claude Sonnet 4, Gemini 2.5 Pro with built-in thinking
+- **⚡ Speed + Quality**: DeepSeek R1 on Groq/OpenRouter for fast reasoning
+- **🎨 Multimodal**: Pixtral Large combines Mistral Large 2 with vision capabilities
+- **🚀 Coding-Specialized**: Grok Code Fast 1 designed specifically for agentic coding
+- **🏢 Enterprise**: AWS Bedrock and Vertex AI provide enterprise-grade access
+
+### **⚙️ Coding-Optimized Configuration**
+
+**Temperature: 0.1** (low, for deterministic, precise code generation)
+- Ensures consistent, reliable code suggestions
+- Reduces randomness for better debugging and maintenance
+- Optimized for technical accuracy over creativity
+
+**Top-P: 0.9** (nucleus sampling tuned for coding tasks)
+- Maintains flexibility for multiple valid coding approaches
+- Filters out very low-probability tokens that could break syntax
+- Balances precision with reasonable alternatives
+
+### **🔑 When Do You Need an LLM API Key?**
+
+| Coding Assistant | API Key Required? | Reason |
+|------------------|-------------------|---------|
+| **GitHub Copilot + VS Code** | ✅ **No** | Full MCP sampling support built-in |
+| **Claude Code** | ⚠️ **Yes** | Limited MCP sampling support |
+| **Cursor** | ⚠️ **Yes** | Limited MCP sampling support |
+| **Augment** | ⚠️ **Yes** | Limited MCP sampling support |
+| **Qodo** | ⚠️ **Yes** | Limited MCP sampling support |
+
+**💡 Recommendation**: Use GitHub Copilot + VS Code for the best experience without needing API keys.
+
+**🔍 Why Some Assistants Need API Keys:**
+- **MCP Sampling**: GitHub Copilot supports advanced MCP sampling for dynamic prompts
+- **Fallback Required**: Other assistants use LLM APIs when MCP sampling is unavailable
+- **Future-Proof**: As more assistants add MCP sampling support, API keys become optional
+
+### **Configuration Steps**
+
+1. **Set the environment variables** for your provider (for example `LLM_API_KEY`, and optionally `LLM_MODEL_NAME`).
+2. **Restart your MCP client** to pick up the environment variables.
+
+### **How It Works**
+
+- **Primary**: MCP as a Judge always tries MCP sampling first (when available)
+- **Fallback**: If MCP sampling fails or isn't available, it uses your configured LLM API
+- **Automatic**: No extra configuration needed - the system detects your API key and selects the appropriate provider
+- **Privacy**: Your API key is only used when MCP sampling is not available
+
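A rough sketch of this sampling-first, API-fallback flow; `request_mcp_sampling` here is a hypothetical stand-in for the server's real sampling call, which this page does not show:

```python
import litellm


async def request_mcp_sampling(prompt: str) -> str:
    # Hypothetical stub: in the real server this would ask the MCP client's
    # own model via MCP sampling.
    raise RuntimeError("MCP sampling not available in this client")


async def evaluate(prompt: str, model: str = "gpt-4o-mini") -> str:
    # Primary: try MCP sampling first.
    try:
        return await request_mcp_sampling(prompt)
    except Exception:
        pass  # intentional fallback (cf. the S110 ruff ignore in pyproject.toml)

    # Fallback: use the configured LLM API via LiteLLM.
    response = litellm.completion(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.1,
        top_p=0.9,
    )
    return response.choices[0].message.content
```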
+### **Client-Specific Setup**
+
+#### **Cursor**
+
+1. **Open Cursor Settings:**
+   - Go to `File` → `Preferences` → `Cursor Settings`
+   - Navigate to the `MCP` tab
+   - Click `+ Add` to add a new MCP server
+
+2. **Add MCP Server Configuration:**
 ```json
 {
   "mcpServers": {
     "mcp-as-a-judge": {
       "command": "uv",
-      "args": ["tool", "run", "mcp-as-a-judge"]
+      "args": ["tool", "run", "mcp-as-a-judge"],
+      "env": {
+        "LLM_API_KEY": "sk-your-api-key-here",
+        "LLM_MODEL_NAME": "gpt-4.1"
+      }
     }
   }
 }
 ```

-### **VS Code (GitHub Copilot)**
+**📝 Configuration Options:**
+- **LLM_API_KEY**: Required for Cursor (limited MCP sampling)
+- **LLM_MODEL_NAME**: Optional custom model (see [Supported LLM Providers](#supported-llm-providers) for defaults)
+
+#### **Claude Code**

-1. **Install via Command Palette:**
-   - Open Command Palette (`Ctrl+Shift+P` / `Cmd+Shift+P`)
-   - Run `MCP: Add Server`
-   - Choose `Workspace Settings` or `Global`
-   - Enter server details
+1. **Add MCP Server via CLI:**
+```bash
+# Set environment variables first (optional model override)
+export LLM_API_KEY="sk-ant-your-api-key-here"
+export LLM_MODEL_NAME="claude-3-5-haiku"  # Optional: faster/cheaper model

-2. **Manual Configuration:**
-   - Create `.vscode/mcp.json` in your workspace, or
-   - Edit global config via `MCP: Open User Configuration`
+# Add MCP server
+claude mcp add mcp-as-a-judge -- uv tool run mcp-as-a-judge
+```
+
+2. **Alternative: Manual Configuration:**
+   - Create or edit `~/.config/claude-code/mcp_servers.json`
 ```json
 {
-  "servers": {
+  "mcpServers": {
     "mcp-as-a-judge": {
       "command": "uv",
-      "args": ["tool", "run", "mcp-as-a-judge"]
+      "args": ["tool", "run", "mcp-as-a-judge"],
+      "env": {
+        "LLM_API_KEY": "sk-ant-your-api-key-here",
+        "LLM_MODEL_NAME": "claude-3-5-haiku"
+      }
     }
   }
 }
 ```

-### **Installation Prerequisites**
+**📝 Configuration Options:**
+- **LLM_API_KEY**: Required for Claude Code (limited MCP sampling)
+- **LLM_MODEL_NAME**: Optional custom model (see [Supported LLM Providers](#supported-llm-providers) for defaults)

-Before configuring, install the MCP server:
+#### **Other MCP Clients**

-```bash
-# Install with uv (recommended)
-uv tool install mcp-as-a-judge
-```
-
-3. **To update to the latest version:**
+For other MCP-compatible clients, use the standard MCP server configuration:

-```bash
-# Update MCP as a Judge to the latest version
-uv tool upgrade mcp-as-a-judge
+```json
+{
+  "mcpServers": {
+    "mcp-as-a-judge": {
+      "command": "uv",
+      "args": ["tool", "run", "mcp-as-a-judge"],
+      "env": {
+        "LLM_API_KEY": "sk-your-api-key-here",
+        "LLM_MODEL_NAME": "gpt-4o-mini"
+      }
+    }
+  }
+}
 ```

+**📝 Configuration Options:**
+- **LLM_API_KEY**: Required for most MCP clients (except GitHub Copilot + VS Code)
+- **LLM_MODEL_NAME**: Optional custom model (see [Supported LLM Providers](#supported-llm-providers) for defaults)
+
+

 ## 📖 **How It Works**

@@ -239,13 +382,22 @@ Once MCP as a Judge is configured with your AI coding assistant, it automatically
 - **User-driven decisions** - You're involved whenever your original request cannot be satisfied
 - **Professional standards** - Consistent application of software engineering best practices

-## 🔒 **Privacy & API Key Free**
+## 🔒 **Privacy & Flexible AI Integration**

-### **🔑 No LLM API Key Required**
+### **🔑 MCP Sampling (Preferred) + LLM API Key Fallback**

+**Primary Mode: MCP Sampling**
 - All judgments are performed using **MCP Sampling** capability
 - No need to configure or pay for external LLM API services
 - Works directly with your MCP-compatible client's existing AI model
+- **Currently supported by:** GitHub Copilot + VS Code
+
+**Fallback Mode: LLM API Key**
+- When MCP sampling is not available, the server can use LLM API keys
+- Supports multiple providers via LiteLLM: OpenAI, Anthropic, Google, Azure, Groq, Mistral, xAI
+- Automatic vendor detection from API key patterns
+- Default model selection per vendor when no model is specified
+- Set environment variables like `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, etc.
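The "automatic vendor detection" bullet maps naturally onto the key formats in the providers table earlier on this page; a hedged sketch follows (the function name and exact rules are illustrative, not the commit's actual implementation; note the specific `sk-ant-`/`sk-or-` prefixes must be tried before the generic `sk-`):

```python
import re

# Illustrative prefix rules derived from the Supported LLM Providers table;
# not the commit's actual detection code.
KEY_PATTERNS: list[tuple[str, str]] = [
    ("anthropic", r"^sk-ant-"),
    ("openrouter", r"^sk-or-"),
    ("openai", r"^sk-"),
    ("google", r"^AIza"),
    ("groq", r"^gsk_"),
    ("xai", r"^xai-"),
    ("mistral", r"^[a-f0-9]{64}$"),
    ("azure", r"^[a-f0-9]{32}$"),
]


def detect_vendor(api_key: str) -> str | None:
    """Best-effort vendor guess from an API key's shape."""
    for vendor, pattern in KEY_PATTERNS:
        if re.match(pattern, api_key):
            return vendor
    return None
```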

 ### **🛡️ Your Privacy Matters**

@@ -300,8 +452,7 @@ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file
 ## 🙏 **Acknowledgments**

 - [Model Context Protocol](https://modelcontextprotocol.io/) by Anthropic
-- The amazing MCP community for inspiration and best practices
-- All developers who will benefit from better coding practices
+- [LiteLLM](https://github.com/BerriAI/litellm) for unified LLM API integration

 ---

pyproject.toml

Lines changed: 2 additions & 0 deletions
@@ -31,6 +31,7 @@ dependencies = [
     "mcp[cli]>=1.13.0",
     "pydantic>=2.0.0",
     "jinja2>=3.1.0",
+    "litellm>=1.0.0",
 ]

 [project.urls]
@@ -104,6 +105,7 @@ ignore = [
     "B008", # do not perform function calls in argument defaults
     "S101", # use of assert detected
     "T201", # print found
+    "S110", # try-except-pass detected (intentional in fallback scenarios)
 ]
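For context, `S110` is ruff's try-except-pass check; the shape now being permitted looks roughly like this (a hypothetical example, not code from the commit):

```python
import json


def load_optional_config(path: str) -> dict:
    """Read an optional config file, silently falling back to defaults."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except (OSError, json.JSONDecodeError):
        pass  # S110: intentional in fallback scenarios
    return {}
```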

 [tool.ruff.format]

src/mcp_as_a_judge/__init__.py

Lines changed: 4 additions & 0 deletions
@@ -7,15 +7,19 @@

 __version__ = "1.0.0"

+
 # Lazy imports to avoid dependency issues in Cloudflare Workers
 def __getattr__(name):
     if name == "JudgeResponse":
         from mcp_as_a_judge.models import JudgeResponse
+
         return JudgeResponse
     elif name == "mcp":
         from mcp_as_a_judge.server import mcp
+
         return mcp
     elif name == "main":
         from mcp_as_a_judge.server import main
+
         return main
     raise AttributeError(f"module '{__name__}' has no attribute '{name}'")
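For context: this file uses PEP 562 module-level `__getattr__`, so the heavyweight imports (server, models, and their dependencies) run only on first attribute access rather than at package import time. A small usage sketch of the resulting behavior:

```python
# Importing the package stays cheap: server and models are not loaded yet.
import mcp_as_a_judge

# First attribute access triggers __getattr__, which performs the real
# import (from mcp_as_a_judge.models import JudgeResponse) and returns it.
JudgeResponse = mcp_as_a_judge.JudgeResponse
```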
