[ROADMAP][1.12] Implement MCP servers for research agent and document retrieval #564

@MODSetter

Feature Description

Build Model Context Protocol (MCP) servers that expose SurfSense's research agent and document retrieval as "context as a service", so that other AI applications can use SurfSense's research and retrieval features.

Target Deployment

  • SurfSense Cloud (hosted version)
  • Self-hosted version

Problem Statement

SurfSense's retrieval and research capabilities are currently accessible only through its own interface. By implementing MCP servers, we can:

  • Allow other AI tools to access SurfSense's document store
  • Enable research agent usage from any MCP-compatible client
  • Position SurfSense as infrastructure for AI applications
  • Enable integrations with tools such as Claude and Cursor

Proposed Solution

MCP Server Architecture

┌───────────────────────────────────────────────────────────┐
│                  MCP-Compatible Clients                   │
│       [Claude Desktop]  [Cursor]  [Other MCP Clients]     │
│               │             │              │              │
│               └─────────────┼──────────────┘              │
│                             │                             │
│                             ▼                             │
│  ┌─────────────────────────────────────────────────────┐  │
│  │                SurfSense MCP Servers                │  │
│  │                                                     │  │
│  │ ┌───────────────────────┐ ┌───────────────────────┐ │  │
│  │ │ Retrieval Server      │ │ Research Server       │ │  │
│  │ │                       │ │                       │ │  │
│  │ │ - search_documents    │ │ - quick_research      │ │  │
│  │ │ - get_document        │ │ - deep_research       │ │  │
│  │ │ - list_search_spaces  │ │ - summarize_documents │ │  │
│  │ └───────────────────────┘ └───────────────────────┘ │  │
│  │                          │                          │  │
│  │                          ▼                          │  │
│  │               ┌─────────────────────┐               │  │
│  │               │   SurfSense Core    │               │  │
│  │               │   (API/Database)    │               │  │
│  │               └─────────────────────┘               │  │
│  └─────────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────┘

MCP Tools to Implement

Retrieval Server

// Tools
- search_documents: Search across user's documents
- get_document: Retrieve specific document by ID
- list_search_spaces: List available search spaces
- get_recent_documents: Get recently updated documents
- semantic_search: Perform semantic search with filters

// Resources
- document://{space_id}/{doc_id}: Document content
- search_space://{space_id}: Search space metadata
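
The two resource URI templates above could be resolved with a small parser along these lines. This is only a sketch: `ResourceRef` and `parse_resource_uri` are hypothetical names, and IDs are treated as opaque strings. One caveat worth flagging: RFC 3986 does not allow underscores in a URI scheme, so strict parsers (Python's `urllib.parse` among them) will not recognize `search_space` as a scheme. The sketch therefore splits on `://` by hand.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResourceRef:
    """Parsed form of a SurfSense resource URI (hypothetical helper type)."""
    kind: str                      # "document" or "search_space"
    space_id: str
    doc_id: Optional[str] = None   # only set for document resources


def parse_resource_uri(uri: str) -> ResourceRef:
    """Parse document://{space_id}/{doc_id} or search_space://{space_id}."""
    scheme, separator, rest = uri.partition("://")
    if not separator:
        raise ValueError(f"not a resource URI: {uri!r}")

    segments = rest.split("/")
    if any(not segment for segment in segments):
        raise ValueError(f"empty path segment in {uri!r}")

    if scheme == "document" and len(segments) == 2:
        return ResourceRef("document", segments[0], segments[1])
    if scheme == "search_space" and len(segments) == 1:
        return ResourceRef("search_space", segments[0])
    raise ValueError(f"unsupported resource URI: {uri!r}")
```

For example, `parse_resource_uri("document://7/42")` yields a `ResourceRef` with `space_id="7"` and `doc_id="42"`, while a malformed URI such as `document://7` raises `ValueError`.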

Research Server

// Tools
- quick_research: Fast research on a topic
- deep_research: Comprehensive multi-step research
- summarize_documents: Summarize specific documents
- compare_documents: Compare multiple sources
- extract_facts: Extract key facts from documents

// Resources
- research_history://{user_id}: Past research results
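
To make the research tools discoverable, each one needs a descriptor with a name, a description, and a JSON Schema for its arguments (the `inputSchema` field that MCP clients receive from `tools/list`). A rough sketch of what those descriptors might look like follows. The tool names and descriptions come from the list above; the argument names (`query`, `search_space_id`, `max_steps`, `document_ids`) and the helpers `tool` and `missing_arguments` are assumptions made for illustration.

```python
from typing import Any, Dict, List


def tool(name: str, description: str,
         properties: Dict[str, Any], required: List[str]) -> Dict[str, Any]:
    """Build one tool descriptor in the shape returned by tools/list."""
    return {
        "name": name,
        "description": description,
        "inputSchema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


# Argument names below are illustrative assumptions, not part of the proposal.
QUERY = {"type": "string"}
SPACE_ID = {"type": "string"}
DOC_IDS = {"type": "array", "items": {"type": "string"}, "minItems": 1}

RESEARCH_TOOLS = [
    tool("quick_research", "Fast research on a topic",
         {"query": QUERY, "search_space_id": SPACE_ID}, ["query"]),
    tool("deep_research", "Comprehensive multi-step research",
         {"query": QUERY, "search_space_id": SPACE_ID,
          "max_steps": {"type": "integer", "minimum": 1}}, ["query"]),
    tool("summarize_documents", "Summarize specific documents",
         {"document_ids": DOC_IDS}, ["document_ids"]),
    tool("compare_documents", "Compare multiple sources",
         {"document_ids": {**DOC_IDS, "minItems": 2}}, ["document_ids"]),
    tool("extract_facts", "Extract key facts from documents",
         {"document_ids": DOC_IDS}, ["document_ids"]),
]


def missing_arguments(descriptor: Dict[str, Any],
                      arguments: Dict[str, Any]) -> List[str]:
    """Return the required argument names absent from a tools/call payload."""
    required = descriptor["inputSchema"]["required"]
    return [name for name in required if name not in arguments]
```

A check like `missing_arguments` would let the server reject an incomplete call with a specific message before the research agent is started.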

Authentication

  • API key-based authentication
  • Per-user access scoping
  • Rate limiting
  • Audit logging
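
The four points above could fit together roughly as follows in `surfsense_mcp/auth.py`. Everything here is an assumption made for illustration: the class names, the in-memory key table (a real deployment would presumably look keys up in the SurfSense database), the per-user token bucket, and the logger name. It is a sketch, not a proposed final design.

```python
import hashlib
import logging
import time
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet

audit_log = logging.getLogger("surfsense_mcp.audit")  # logger name is an assumption


class AuthError(Exception):
    """Unknown API key, or a key used outside its allowed search spaces."""


class RateLimited(Exception):
    """The caller has exhausted its request budget."""


@dataclass(frozen=True)
class Principal:
    user_id: str
    allowed_space_ids: FrozenSet[str]


def hash_key(api_key: str) -> str:
    """Hash a key so that only digests need to be stored server-side."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


class TokenBucket:
    """Allows bursts up to `capacity`, refilled continuously over time."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._capacity = float(capacity)
        self._refill = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)
        self._updated_at = clock()

    def try_acquire(self) -> bool:
        now = self._clock()
        earned = (now - self._updated_at) * self._refill
        self._tokens = min(self._capacity, self._tokens + earned)
        self._updated_at = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


class Authenticator:
    def __init__(self, keys_by_digest: Dict[str, Principal],
                 requests_per_minute: int = 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._keys_by_digest = keys_by_digest
        self._rpm = requests_per_minute
        self._clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def authorize(self, api_key: str, space_id: str) -> Principal:
        """Check key, scope, and rate limit; write one audit record per decision."""
        principal = self._keys_by_digest.get(hash_key(api_key))
        if principal is None:
            audit_log.warning("rejected request: unknown API key")
            raise AuthError("invalid API key")
        if space_id not in principal.allowed_space_ids:
            audit_log.warning("user=%s denied space=%s", principal.user_id, space_id)
            raise AuthError(f"search space {space_id!r} is not accessible with this key")

        bucket = self._buckets.get(principal.user_id)
        if bucket is None:
            bucket = TokenBucket(self._rpm, self._rpm / 60.0, self._clock)
            self._buckets[principal.user_id] = bucket
        if not bucket.try_acquire():
            audit_log.warning("user=%s rate limited", principal.user_id)
            raise RateLimited("rate limit exceeded, retry later")

        audit_log.info("user=%s allowed space=%s", principal.user_id, space_id)
        return principal
```

Each tool handler would call `authorize` before touching the document store, so scoping and rate limiting are enforced in one place. The injectable `clock` exists only to make the limiter testable without sleeping.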

Benefits

  • Extends SurfSense's reach to the entire MCP ecosystem
  • "Context as a Service" business model potential
  • Powerful integrations with AI tools
  • API-first architecture benefits
  • Community-driven use cases

Use Case Examples

  1. User asks Claude to research using their SurfSense documents
  2. Cursor uses SurfSense for codebase-related document retrieval
  3. Custom AI app leverages SurfSense's research capabilities
  4. Team shares research infrastructure via MCP

Implementation Considerations

  • This may require backend changes (MCP server implementation)
  • This may require frontend changes (configuration UI)
  • This may require database changes
  • This may affect existing features (uses existing retrieval)

Files to Create

  • surfsense_mcp/ - New MCP server package
  • surfsense_mcp/retrieval_server.py - Retrieval MCP server
  • surfsense_mcp/research_server.py - Research MCP server
  • surfsense_mcp/auth.py - Authentication handling
  • Documentation for MCP setup

Acceptance Criteria

  • MCP servers implement the required protocol
  • Authentication works correctly
  • Tools are discoverable by MCP clients
  • Search/retrieval functions correctly
  • Research agent accessible via MCP
  • Rate limiting is enforced
  • Comprehensive documentation provided
  • Example configurations for popular clients
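
As a starting point for the example configurations, here is a sketch that generates a Claude Desktop style config. Claude Desktop reads an `mcpServers` map of `command`, `args`, and `env` entries from `claude_desktop_config.json`. The module paths follow the files listed under "Files to Create"; launching them with `python -m`, the server names, and the environment variable names `SURFSENSE_API_KEY` and `SURFSENSE_BASE_URL` are assumptions.

```python
import json
from typing import Any, Dict


def build_client_config(api_key: str, base_url: str) -> Dict[str, Any]:
    """Build an mcpServers config that launches both servers over stdio."""

    def server(module: str) -> Dict[str, Any]:
        return {
            "command": "python",
            "args": ["-m", module],
            # Variable names are hypothetical; the servers would need to read them.
            "env": {"SURFSENSE_API_KEY": api_key, "SURFSENSE_BASE_URL": base_url},
        }

    return {
        "mcpServers": {
            "surfsense-retrieval": server("surfsense_mcp.retrieval_server"),
            "surfsense-research": server("surfsense_mcp.research_server"),
        }
    }


if __name__ == "__main__":
    # Placeholder values; the URL of a self-hosted instance will differ.
    config = build_client_config("<your-api-key>", "http://localhost:8000")
    print(json.dumps(config, indent=2))
```

The printed JSON could be pasted into the client's config file. For SurfSense Cloud the same shape would apply, with `base_url` pointing at the hosted API.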

Technical Notes

  • Use the official MCP Python SDK
  • Implement stdio and HTTP transports
  • Consider SSE for streaming results
  • Add comprehensive error messages
  • Document self-hosting requirements
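
To illustrate what the stdio transport and the error-message requirement involve on the wire, here is a standard-library sketch of the JSON-RPC 2.0 exchange behind `tools/list` and `tools/call`. It deliberately does not use the official SDK, which would handle all of this (plus the `initialize` handshake, omitted here) in the real implementation. The `search_documents` handler is a stub, and `handle_request` and `serve_stdio` are hypothetical names.

```python
import json
import sys
from typing import Any, Dict, TextIO

METHOD_NOT_FOUND = -32601  # standard JSON-RPC error codes
INVALID_PARAMS = -32602


def search_documents(arguments: Dict[str, Any]) -> str:
    """Stub handler; a real one would call the SurfSense retrieval API."""
    query = arguments["query"]
    return f"(stub) no backend wired up; query was: {query}"


TOOLS: Dict[str, Dict[str, Any]] = {
    "search_documents": {
        "description": "Search across user's documents",
        "inputSchema": {"type": "object",
                        "properties": {"query": {"type": "string"}},
                        "required": ["query"]},
        "handler": search_documents,
    },
}


def _error(request_id: Any, code: int, message: str) -> Dict[str, Any]:
    return {"jsonrpc": "2.0", "id": request_id,
            "error": {"code": code, "message": message}}


def handle_request(request: Dict[str, Any]) -> Dict[str, Any]:
    """Dispatch one JSON-RPC request and return the response object."""
    request_id = request.get("id")
    method = request.get("method")
    params = request.get("params") or {}

    if method == "tools/list":
        tools = [{"name": name, "description": spec["description"],
                  "inputSchema": spec["inputSchema"]}
                 for name, spec in TOOLS.items()]
        return {"jsonrpc": "2.0", "id": request_id, "result": {"tools": tools}}

    if method == "tools/call":
        name = params.get("name")
        spec = TOOLS.get(name)
        if spec is None:
            # Unknown tools are a protocol-level error.
            return _error(request_id, INVALID_PARAMS, f"Unknown tool: {name!r}")
        try:
            text, is_error = spec["handler"](params.get("arguments") or {}), False
        except Exception as exc:
            # Failures inside a tool are reported in the result, not as RPC errors.
            text, is_error = f"{type(exc).__name__}: {exc}", True
        return {"jsonrpc": "2.0", "id": request_id,
                "result": {"content": [{"type": "text", "text": text}],
                           "isError": is_error}}

    return _error(request_id, METHOD_NOT_FOUND, f"Method not found: {method}")


def serve_stdio(stdin: TextIO = sys.stdin, stdout: TextIO = sys.stdout) -> None:
    """Read newline-delimited JSON-RPC messages and write one response per request."""
    for line in stdin:
        if not line.strip():
            continue
        request = json.loads(line)
        if "id" not in request:
            continue  # notifications get no response
        stdout.write(json.dumps(handle_request(request)) + "\n")
        stdout.flush()
```

Keeping the dispatch in `handle_request` separate from the I/O loop means the same logic could sit behind an HTTP endpoint. It also shows the two error paths an MCP client sees: a JSON-RPC `error` object for protocol problems, and `isError: true` inside a normal result for tool failures.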

Resources

Related Issues

  • Depends on: Issues 1.1-1.5 (retriever improvements, deep agent)
