Feature Request: New /recommendations Command for Provider and Model Recommendations #43

@will-lamerton

Description

This document outlines the implementation plan to create a new /recommendations command that analyzes the user's system and provides intelligent suggestions for AI providers and models based on system capabilities and requirements.

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Documentation update

Background

Currently, users must manually research and experiment to find the best AI models for their system and use cases. This new /recommendations command will provide intelligent, personalized suggestions based on:

  • System resources (CPU, memory, GPU availability)
  • Network connectivity
  • Model capabilities for specific tasks (coding vs. agentic workflows)
  • Cost considerations (free local vs. paid API models)
  • Performance trade-offs

Implementation Plan

1. System Detection Module

File: source/system/detector.ts

Create a new system detection module that analyzes:

  • Hardware capabilities: CPU cores, available RAM, GPU presence (NVIDIA/AMD/Apple Silicon)
  • Network status: Connection speed, stability for API-based providers
  • Local AI availability: Ollama installation, available local models
  • Platform specifics: macOS Metal, CUDA support, etc.
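
The hardware and platform probes above can be sketched with Node's built-in os module. This is a minimal illustration only: the BasicCapabilities shape is an assumption, and GPU and Ollama detection (which need platform-specific probing) are reduced to a simple Apple Silicon check.

```typescript
import * as os from "os";

// Illustrative subset of the detector's output; not the final API.
interface BasicCapabilities {
  cpuCores: number;
  architecture: string;
  totalMemoryGB: number;
  freeMemoryGB: number;
  platform: NodeJS.Platform;
  appleSilicon: boolean;
}

function detectBasicCapabilities(): BasicCapabilities {
  const bytesPerGB = 1024 ** 3;
  return {
    cpuCores: os.cpus().length,
    architecture: os.arch(),
    totalMemoryGB: os.totalmem() / bytesPerGB,
    freeMemoryGB: os.freemem() / bytesPerGB,
    platform: os.platform(),
    // Apple Silicon reports as darwin + arm64
    appleSilicon: os.platform() === "darwin" && os.arch() === "arm64",
  };
}
```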

2. Provider Recommendation Engine

File: source/recommendations/provider-engine.ts

Develop recommendation logic that suggests providers based on:

  • Local-first preference: Prioritize Ollama for privacy-conscious users
  • Performance requirements: Fast local models vs. powerful cloud models
  • Cost considerations: Free vs. paid options
  • Specific use cases: Code generation, chat, analysis
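
As a rough illustration of this logic, a priority scorer might map user preferences onto provider rankings. The Preferences flags and the specific weighting below are illustrative assumptions, not part of the plan.

```typescript
// Hypothetical preference flags; the real engine would read these
// from the configuration integration described later in this plan.
interface Preferences {
  privacyFirst: boolean;
  budgetSensitive: boolean;
}

type Priority = "high" | "medium" | "low";

function providerPriority(
  provider: "ollama" | "openrouter" | "openai",
  prefs: Preferences,
): Priority {
  // Local-first: Ollama wins outright for privacy-conscious users
  if (provider === "ollama") return prefs.privacyFirst ? "high" : "medium";
  // OpenRouter tends to be the cheaper API route
  if (provider === "openrouter") return prefs.budgetSensitive ? "high" : "medium";
  // Direct OpenAI: powerful but pricier
  return prefs.budgetSensitive ? "low" : "medium";
}
```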

3. Model Recommendation Database

File: source/recommendations/model-database.ts

Maintain a curated database of models with detailed capability assessments:

  • System requirements: Minimum RAM, GPU requirements, CPU cores
  • Capability ratings: Coding quality, agentic task performance, context handling
  • Use case suitability: Simple tasks, complex workflows, long-form coding
  • Performance characteristics: Speed, accuracy, reliability scores

4. Model Matching Engine

File: source/recommendations/model-engine.ts

Implement intelligent model matching that:

  • Filters by system compatibility: Only show models the system can run
  • Warns about limitations: Alert when capable models exist but can't run locally
  • Provides capability context: Explain what each model is good/bad at
  • Suggests alternatives: Local vs. cloud options for different needs
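
A minimal sketch of the compatibility filter, with field names trimmed from the ModelEntry requirements schema defined later in this plan; the cutoff between "good" and "marginal" is an illustrative assumption.

```typescript
// Trimmed shapes; the full ModelEntry schema appears later in this plan.
interface SystemSnapshot { memoryGB: number; cpuCores: number; }
interface ModelReqs { minMemory: number; recommendedMemory: number; minCpuCores: number; }

type Compatibility = "perfect" | "good" | "marginal" | "incompatible";

function classifyCompatibility(sys: SystemSnapshot, req: ModelReqs): Compatibility {
  // Hard filter: never recommend a model the system cannot run
  if (sys.memoryGB < req.minMemory || sys.cpuCores < req.minCpuCores) {
    return "incompatible";
  }
  if (sys.memoryGB >= req.recommendedMemory) return "perfect";
  // Meets minimums but not the recommended spec; midpoint split is arbitrary
  const midpoint = (req.minMemory + req.recommendedMemory) / 2;
  return sys.memoryGB >= midpoint ? "good" : "marginal";
}
```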

5. New Recommendations Command

File: source/commands/recommendations.ts

Create a new /recommendations command that provides:

  • System capability summary
  • Recommended providers with reasoning
  • Suggested models for different use cases
  • Setup instructions for recommended options
  • Cost analysis and trade-offs
  • Performance expectations for agentic tasks

6. Configuration Integration

File: source/config/recommendations.ts

Add configuration options for:

  • Recommendation preferences (privacy, performance, cost)
  • System override settings
  • Cached recommendation results
  • User feedback on recommendations

Technical Implementation

System Detection

interface SystemCapabilities {
  cpu: {
    cores: number;
    architecture: string;
  };
  memory: {
    total: number;
    available: number;
  };
  gpu: {
    available: boolean;
    type: 'nvidia' | 'amd' | 'apple' | 'intel' | 'none';
    memory?: number;
  };
  platform: NodeJS.Platform;
  network: {
    connected: boolean;
    speed?: 'slow' | 'medium' | 'fast';
  };
  ollama: {
    installed: boolean;
    running: boolean;
    models: string[];
  };
}

Model Database Schema

interface ModelEntry {
  name: string;
  provider: string;
  size: string; // "7B", "13B", "70B", "Unknown" for API models
  type: "local" | "api";
  requirements: {
    minMemory: number; // GB (minimal for API models)
    recommendedMemory: number; // GB
    minCpuCores: number;
    gpuRequired: boolean;
    gpuMemory?: number; // GB
  };
  capabilities: {
    codingQuality: 1 | 2 | 3 | 4 | 5; // 1=basic, 5=excellent
    agenticTasks: 1 | 2 | 3 | 4 | 5; // 1=poor, 5=excellent
    contextHandling: 1 | 2 | 3 | 4 | 5; // 1=limited, 5=excellent
    longFormCoding: 1 | 2 | 3 | 4 | 5; // 1=struggles, 5=excellent
    toolUsage: 1 | 2 | 3 | 4 | 5; // 1=basic, 5=advanced
  };
  useCases: {
    quickQuestions: boolean;
    simpleEdits: boolean;
    complexRefactoring: boolean;
    multiFileProjects: boolean;
    longWorkflows: boolean;
  };
  limitations: string[]; // ["Requires internet", "Usage costs apply"]
  downloadSize: number; // GB (0 for API models)
  cost: {
    type: "free" | "pay-per-use" | "subscription";
    details: string; // Pricing details
    estimatedDaily?: string; // Estimated daily cost for typical usage
  };
}

interface ProviderRecommendation {
  provider: string;
  priority: 'high' | 'medium' | 'low';
  reasoning: string[];
  setupInstructions: string;
  models: ModelRecommendation[];
}

interface ModelRecommendation {
  model: ModelEntry;
  compatibility: 'perfect' | 'good' | 'marginal' | 'incompatible';
  warnings: string[]; // ["May be slow on your system", "Limited agentic capabilities"]
  recommendation: string; // "Excellent for complex coding tasks"
}

Recommendations Command Output

The /recommendations command will display:

  1. System Summary: Brief overview of detected capabilities
  2. Quick Start: Top recommendation with one-line setup
  3. Compatible Models: Models that can run on your system with capability ratings
  4. Upgrade Recommendations: Better models available with hardware upgrades
  5. Cloud Alternatives: API-based options for resource-constrained systems
  6. Capability Warnings: Clear explanations of model limitations for agentic tasks

Example output:

System: 16GB RAM, Apple M2, 8 cores

🚀 RECOMMENDED FOR YOUR SYSTEM:
✅ qwen2.5-coder:7b (Local, Free) - Good coding, limited for complex workflows
💰 deepseek-coder-v2.5 (API, $0.10-1.00/day) - Excellent coding & agentic tasks

📋 OTHER OPTIONS:
⚠️  llama3.1:8b (Local, Free) - Decent coding, poor agentic performance
💎 claude-3.5-sonnet (API, $0.50-5.00/day) - Best overall, premium cost
❌ qwen2.5-coder:32b (Local, Free) - Excellent but needs 24GB+ RAM

💡 RECOMMENDATIONS:
• For budget-conscious: Start with qwen2.5-coder:7b locally
• For best results: Use deepseek-coder-v2.5 via OpenRouter
• For complex workflows: Consider Claude 3.5 Sonnet (premium)

Testing Strategy

Manual Testing

  • Different Systems: Test on various hardware configurations
    • Low-end laptops (limited RAM/CPU)
    • High-end workstations (GPU available)
    • Apple Silicon Macs
    • Linux servers
  • Network Scenarios: Test with different connection types
    • High-speed broadband
    • Mobile/limited bandwidth
    • Offline environments
  • Ollama States: Test various local AI setups
    • Fresh install (no Ollama)
    • Ollama installed but not running
    • Various local models available

Integration Testing

  • Verify recommendations work with existing provider system
  • Test recommendation caching and updates
  • Ensure backward compatibility with current help command

Code Style Considerations

  • Follow existing TypeScript patterns in the codebase
  • Use proper error handling for system detection failures
  • Implement graceful degradation when detection fails
  • Cache expensive system checks to avoid repeated calls
  • Use existing configuration and preferences systems
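
The caching point above can be sketched as a small TTL memoizer wrapped around any expensive detection call; cached and its parameters are hypothetical names for illustration.

```typescript
// Wraps an expensive synchronous check so repeated calls within ttlMs
// reuse the cached value instead of re-probing the system.
function cached<T>(detect: () => T, ttlMs: number): () => T {
  let value: T | undefined;
  let fetchedAt = 0;
  return () => {
    const now = Date.now();
    if (value === undefined || now - fetchedAt > ttlMs) {
      value = detect();
      fetchedAt = now;
    }
    return value;
  };
}
```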

Documentation Updates

  • Update README with new /recommendations command
  • Add examples of recommendation output in help documentation
  • Document configuration options for recommendations
  • Include troubleshooting guide for system detection issues
  • Update help command to mention /recommendations for model guidance

Implementation Timeline

  1. Phase 1: System detection module (CPU, memory, platform)
  2. Phase 2: Model database with curated entries and capability ratings
  3. Phase 3: Model matching engine that filters by system compatibility
  4. Phase 4: Provider recommendation engine with model integration
  5. Phase 5: New /recommendations command implementation with clear capability warnings
  6. Phase 6: Configuration, caching, and optimization
  7. Phase 7: Testing across different system configurations
  8. Phase 8: Documentation and refinement

Sample Model Database Entries

const MODEL_DATABASE: ModelEntry[] = [
  // Local Models (Ollama)
  {
    name: "llama3.1:8b",
    provider: "ollama",
    size: "8B",
    type: "local",
    requirements: {
      minMemory: 8,
      recommendedMemory: 16,
      minCpuCores: 4,
      gpuRequired: false,
      gpuMemory: 6
    },
    capabilities: {
      codingQuality: 4,
      agenticTasks: 2, // Limited for complex workflows
      contextHandling: 4,
      longFormCoding: 3,
      toolUsage: 3
    },
    useCases: {
      quickQuestions: true,
      simpleEdits: true,
      complexRefactoring: true,
      multiFileProjects: false,
      longWorkflows: false
    },
    limitations: [
      "May lose context in long conversations",
      "Limited planning abilities for multi-step tasks"
    ],
    downloadSize: 4.7,
    cost: { type: "free", details: "Local inference only" }
  },
  {
    name: "qwen2.5-coder:32b",
    provider: "ollama",
    size: "32B",
    type: "local",
    requirements: {
      minMemory: 24,
      recommendedMemory: 32,
      minCpuCores: 8,
      gpuRequired: false,
      gpuMemory: 20
    },
    capabilities: {
      codingQuality: 5,
      agenticTasks: 4,
      contextHandling: 5,
      longFormCoding: 5,
      toolUsage: 4
    },
    useCases: {
      quickQuestions: true,
      simpleEdits: true,
      complexRefactoring: true,
      multiFileProjects: true,
      longWorkflows: true
    },
    limitations: [
      "Requires significant RAM",
      "Slower inference on CPU-only systems"
    ],
    downloadSize: 19,
    cost: { type: "free", details: "Local inference only" }
  },

  // API Models (OpenRouter/OpenAI)
  {
    name: "claude-3.5-sonnet",
    provider: "openrouter",
    size: "Unknown",
    type: "api",
    requirements: {
      minMemory: 1, // Just needs to run the client
      recommendedMemory: 2,
      minCpuCores: 1,
      gpuRequired: false
    },
    capabilities: {
      codingQuality: 5,
      agenticTasks: 5, // Excellent for complex workflows
      contextHandling: 5,
      longFormCoding: 5,
      toolUsage: 5
    },
    useCases: {
      quickQuestions: true,
      simpleEdits: true,
      complexRefactoring: true,
      multiFileProjects: true,
      longWorkflows: true
    },
    limitations: [
      "Requires internet connection",
      "Usage costs apply"
    ],
    downloadSize: 0,
    cost: {
      type: "pay-per-use",
      details: "$3/1M input tokens, $15/1M output tokens",
      estimatedDaily: "$0.50-5.00 for typical coding sessions"
    }
  },
  {
    name: "gpt-4o",
    provider: "openai",
    size: "Unknown",
    type: "api",
    requirements: {
      minMemory: 1,
      recommendedMemory: 2,
      minCpuCores: 1,
      gpuRequired: false
    },
    capabilities: {
      codingQuality: 5,
      agenticTasks: 4,
      contextHandling: 5,
      longFormCoding: 5,
      toolUsage: 5
    },
    useCases: {
      quickQuestions: true,
      simpleEdits: true,
      complexRefactoring: true,
      multiFileProjects: true,
      longWorkflows: true
    },
    limitations: [
      "Requires internet connection",
      "Higher cost than OpenRouter"
    ],
    downloadSize: 0,
    cost: {
      type: "pay-per-use",
      details: "$2.50/1M input tokens, $10/1M output tokens",
      estimatedDaily: "$1-10 for typical coding sessions"
    }
  },
  {
    name: "deepseek-coder-v2.5",
    provider: "openrouter",
    size: "236B",
    type: "api",
    requirements: {
      minMemory: 1,
      recommendedMemory: 2,
      minCpuCores: 1,
      gpuRequired: false
    },
    capabilities: {
      codingQuality: 5,
      agenticTasks: 4,
      contextHandling: 5,
      longFormCoding: 5,
      toolUsage: 4
    },
    useCases: {
      quickQuestions: true,
      simpleEdits: true,
      complexRefactoring: true,
      multiFileProjects: true,
      longWorkflows: true
    },
    limitations: [
      "Requires internet connection"
    ],
    downloadSize: 0,
    cost: {
      type: "pay-per-use",
      details: "$0.14/1M input tokens, $0.28/1M output tokens",
      estimatedDaily: "$0.10-1.00 for typical coding sessions"
    }
  }
];
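
Entries like those above might be queried by the matching engine as follows; the Entry shape is trimmed to the fields the filter needs, and runnableOn is a hypothetical helper, not part of the planned API.

```typescript
// Trimmed entries mirroring the sample database above.
interface Entry { name: string; type: "local" | "api"; minMemory: number; }

const entries: Entry[] = [
  { name: "llama3.1:8b", type: "local", minMemory: 8 },
  { name: "qwen2.5-coder:32b", type: "local", minMemory: 24 },
  { name: "claude-3.5-sonnet", type: "api", minMemory: 1 },
];

// Keep anything the system can actually run; API models almost always
// qualify since they only need enough memory for the client.
function runnableOn(systemMemoryGB: number, db: Entry[]): string[] {
  return db.filter((e) => e.minMemory <= systemMemoryGB).map((e) => e.name);
}
```

On a 16GB system this keeps llama3.1:8b and claude-3.5-sonnet while filtering out qwen2.5-coder:32b, matching the example output earlier in this plan.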

Success Criteria

  • /recommendations command provides relevant, actionable recommendations
  • System detection works reliably across platforms
  • Recommendations improve user onboarding experience
  • Performance impact is minimal (< 100ms for recommendations command)
  • Graceful handling of edge cases and errors

Potential Challenges

  • Cross-platform system detection reliability
  • Accurate performance prediction for different models
  • Balancing recommendation complexity vs. simplicity
  • Handling cases where no good recommendations exist
  • Maintaining recommendations as new providers/models emerge

Future Enhancements

  • Machine learning-based recommendations from usage patterns
  • Community-driven model ratings and reviews
  • Integration with model performance benchmarks
  • Automatic provider/model switching based on task type
  • Integration with external model repositories and registries

Metadata

Labels: enhancement (New feature or request)