A remote MCP (Model Context Protocol) server that exposes Harper Cortex memory and development context as tools to Claude, Cursor, Windsurf, Copilot, and any MCP-compatible client.
This is the lowest-friction entry point into the Harper ecosystem. Users can add persistent, distributed memory to any AI tool by pointing it at a public URL — no local installation, no CLI, no code required.
- Persistent Memory — Store and retrieve facts, decisions, and context using semantic search
- Multi-Client Support — Works with Claude (web, desktop, mobile), Claude Code, Cursor, Windsurf, Copilot, and any MCP-compatible client
- Server-Side Embeddings — No API keys needed on the client; Cortex handles all embedding with ONNX
- Multi-Agent Sharing — Share memory across agents and users with namespace isolation
- Self-Hosted — Deploy on your own infrastructure or Harper Cloud
- Production-Ready — Real database with ACID guarantees, no local files or SQLite limitations
- Open Settings → Connectors → Add custom connector
- Enter the URL:
https://my-instance.harpercloud.com/mcp
- (Optional) Authenticate with Bearer token if required
- You now have persistent memory tools available in Claude
# Install cortex-mcp-server on your machine or server
npm install -g @harperfast/cortex-mcp-server
# Add it as an MCP server
claude mcp add cortex -- cortex-mcp-server --url https://my-instance.harpercloud.com
Add to your MCP configuration file (.cursor/mcp.json or similar):
{
"mcpServers": {
"cortex": {
"url": "https://my-instance.harpercloud.com/mcp",
"env": {
"CORTEX_TOKEN": "your-bearer-token"
}
}
}
}
# Clone the repo
git clone https://github.com/HarperFast/cortex-mcp-server.git
cd cortex-mcp-server
# Install dependencies
npm install
# Start in HTTP mode
npm run dev
# Or use npx directly
npx @harperfast/cortex-mcp-server --url https://my-cortex.harpercloud.com --port 3000
- CORTEX_URL (required): URL of your Cortex instance (e.g., https://my-instance.harpercloud.com)
- CORTEX_TOKEN (optional): Bearer token for authentication
- CORTEX_SCHEMA (optional): Schema name in Cortex (default: data)
- PORT (optional): Port to listen on for HTTP server (default: 3000)
- HOST (optional): Host to bind to (default: 0.0.0.0)
- AUTH_REQUIRED (optional): Require authentication (default: true)
- HTTP_SERVER (optional): Use HTTP transport instead of stdio (default: false)
- MULTI_TENANT (optional): Set to "true" to enable multi-tenant mode (JWT auth, namespace enforcement, rate limiting)
- JWKS_URL (optional): JWKS endpoint for JWT validation (required in multi-tenant mode)
- ADMIN_TOKEN (optional): Static token for admin API access
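The defaults and requirements listed above can be sketched as a small config loader. This is an illustration only, not the server's actual startup code: the `ServerConfig` shape and the `loadConfig` and `parseBool` names are hypothetical, and the environment is passed in as a plain object (for example `process.env`) so the sketch stays self-contained.

```typescript
// Hypothetical config shape; field names are assumptions for illustration.
interface ServerConfig {
  cortexUrl: string;
  cortexToken: string | undefined;
  cortexSchema: string;
  port: number;
  host: string;
  authRequired: boolean;
  httpServer: boolean;
  multiTenant: boolean;
  jwksUrl: string | undefined;
  adminToken: string | undefined;
}

type Env = Record<string, string | undefined>;

// Treats only the literal string "true" (any case) as true; unset values use the fallback.
function parseBool(value: string | undefined, fallback: boolean): boolean {
  if (value === undefined || value === "") return fallback;
  return value.toLowerCase() === "true";
}

function loadConfig(env: Env): ServerConfig {
  const cortexUrl = env["CORTEX_URL"];
  if (!cortexUrl) {
    throw new Error("CORTEX_URL is required");
  }

  const port = Number(env["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("PORT must be an integer between 1 and 65535");
  }

  // Multi-tenant mode is assumed to be off unless MULTI_TENANT is "true".
  const multiTenant = parseBool(env["MULTI_TENANT"], false);
  const jwksUrl = env["JWKS_URL"];
  if (multiTenant && !jwksUrl) {
    throw new Error("JWKS_URL is required when MULTI_TENANT is true");
  }

  return {
    cortexUrl,
    cortexToken: env["CORTEX_TOKEN"],
    cortexSchema: env["CORTEX_SCHEMA"] ?? "data",
    port,
    host: env["HOST"] ?? "0.0.0.0",
    authRequired: parseBool(env["AUTH_REQUIRED"], true),
    httpServer: parseBool(env["HTTP_SERVER"], false),
    multiTenant,
    jwksUrl,
    adminToken: env["ADMIN_TOKEN"],
  };
}
```

Failing fast on a missing CORTEX_URL or JWKS_URL surfaces misconfiguration at startup rather than on the first tool call.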
cortex-mcp-server \
--url https://my-cortex.harpercloud.com \
--token your-bearer-token \
--port 3000 \
--host localhost \
--no-auth \
--multi-tenant \
--jwks-url <url> \
--admin-token <token>
WARNING: Running with --no-auth exposes all memory data without authentication. Only use in isolated development environments.
| Tool | Description | Input | Output |
|---|---|---|---|
| memory_search | Search memories by semantic similarity | query, limit?, filters? | Results with similarity scores |
| memory_store | Store a new memory | text, source?, classification?, metadata? | Memory ID and timestamp |
| memory_recall | Retrieve a specific memory by ID | id | Full memory record |
| memory_forget | Delete a memory | id | Deletion confirmation |
| memory_count | Count stored memories | filters? | Total count |
| synapse_search | Search development context | query, projectId, limit?, filters? | Context entries with scores |
| synapse_ingest | Ingest context from a tool | source, content, projectId, parentId?, references? | Stored entries and count |
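MCP clients invoke these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such requests for memory_store and memory_search using the input names from the table. The `buildToolCall` helper and the example argument values (including their types) are assumptions for illustration; in practice the MCP client constructs these messages for you.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

// Hypothetical helper that wraps a tool name and its arguments in a request envelope.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Store a memory; only "text" is required according to the table.
const storeRequest = buildToolCall(1, "memory_store", {
  text: "Order service uses event-driven architecture",
  source: "chat", // optional; the value is an example
});

// Search for it later; "limit" is optional and assumed to be a number.
const searchRequest = buildToolCall(2, "memory_search", {
  query: "order service architecture",
  limit: 5,
});
```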
| Tool | Description |
|---|---|
| admin_create_tenant | Create a new tenant with namespace and security policy |
| admin_list_tenants | List all tenants, optionally filtered by status |
| admin_get_tenant | Get details for a specific tenant |
| admin_update_tenant | Update tenant name, tier, status, or quotas |
| admin_issue_token | Generate JWT claims for a tenant |
| admin_revoke_token | Revoke a specific JWT token |
You: "Remember that we use event-driven architecture for our order service"
Claude: [Uses memory_store] "Stored. I've saved that your order service uses event-driven architecture."
You: "What's the architecture for the order service?"
Claude: [Uses memory_search] "Based on our notes, your order service uses event-driven architecture."
When ingesting context from your codebase:
[Claude/Cursor detects you're working on authentication]
[Uses synapse_ingest] Stores: "Intent: Implement JWT-based auth"
Later:
[Uses synapse_search] Retrieves: "Previous decision: JWT-based auth with 24h expiry"
npx @harperfast/cortex-mcp-server --url https://my-cortex.harpercloud.com --port 3000
Runs on any Node.js host (VPS, laptop, container orchestration).
# Build
docker build -t cortex-mcp-server:latest .
# Run
docker run \
-e CORTEX_URL=https://my-cortex.harpercloud.com \
-e CORTEX_TOKEN=your-token \
-p 3000:3000 \
cortex-mcp-server:latest
Deploy directly on Harper:
harper deploy cortex-mcp-server
The MCP server runs in the same instance as your Cortex data, with zero additional infrastructure.
version: '3.8'
services:
cortex-mcp:
image: harperfast/cortex-mcp-server:latest
environment:
CORTEX_URL: https://my-cortex.harpercloud.com
CORTEX_TOKEN: ${CORTEX_TOKEN}
PORT: 3000
ports:
- "3000:3000"
restart: unless-stopped
The server supports Bearer token authentication:
Authorization: Bearer <your-token>
In multi-tenant setups, include the user ID in the token:
Authorization: Bearer user-123:secret-token
The server extracts user-123 and scopes all memory operations to that user's namespace.
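As a rough illustration of that extraction, the sketch below splits a `user-id:secret` Bearer token at the first colon. The `parseTenantBearer` name, the return shape, and the split-on-first-colon rule are assumptions; the server's actual parsing and validation may differ.

```typescript
interface TenantCredentials {
  userId: string;
  secret: string;
}

// Returns null when the header is missing, is not a Bearer header,
// or does not contain a non-empty "userId:secret" pair.
function parseTenantBearer(header: string | undefined): TenantCredentials | null {
  if (!header) return null;

  const match = /^Bearer\s+(.+)$/i.exec(header.trim());
  if (!match) return null;

  const token = match[1] ?? "";
  // Assumption: the user ID is everything before the first colon.
  const sep = token.indexOf(":");
  if (sep <= 0 || sep === token.length - 1) return null;

  return {
    userId: token.slice(0, sep),
    secret: token.slice(sep + 1),
  };
}

// Example using the header shown above.
const exampleCredentials = parseTenantBearer("Bearer user-123:secret-token");
```

Here `exampleCredentials` holds `userId` "user-123" and `secret` "secret-token"; the user ID would then select the namespace for memory operations.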
Auth Layering: Cortex relies on Harper/Fabric platform authentication. Ensure authentication.requireAuthentication is enabled in your Harper config to enforce security at the instance level.
┌─────────────────────────────────────┐
│ Claude / Cursor / Windsurf / etc. │
│ (MCP-compatible client) │
└────────────┬────────────────────────┘
│
│ Streamable HTTP or Stdio
│ MCP Protocol
│
┌────────────▼────────────────────────┐
│ cortex-mcp-server │
│ │
│ ├─ memory_search │
│ ├─ memory_store │
│ ├─ memory_recall │
│ ├─ memory_forget │
│ ├─ memory_count │
│ ├─ synapse_search │
│ └─ synapse_ingest │
└────────────┬────────────────────────┘
│
│ HTTP + Bearer auth
│ @harperfast/cortex-client
│
┌────────────▼────────────────────────┐
│ Harper Cortex │
│ (Memory + Synapse database) │
│ │
│ ├─ Vector Search (ONNX) │
│ ├─ Metadata Filtering │
│ ├─ Multi-agent Namespaces │
│ └─ ACID Transactions │
└─────────────────────────────────────┘
git clone https://github.com/HarperFast/cortex-mcp-server.git
cd cortex-mcp-server
npm install
npm run build
npm test
npm run dev
This starts the server in HTTP mode with hot reload. By default, it connects to Cortex at http://localhost:8000.
- Check that CORTEX_URL is correct and the Cortex instance is running
- Verify network connectivity:
curl https://my-cortex.harpercloud.com/api/health
- Ensure CORTEX_TOKEN is set correctly if your Cortex instance requires auth
- Check that the token hasn't expired
- Verify Cortex is using a persistent database (not in-memory)
- Check that the CORTEX_SCHEMA matches your Cortex configuration
- Restart your MCP client after deploying a new version
- Check that the server is running and reachable:
curl http://localhost:3000/health
If the MCP connection fails, Claude may silently fall back to local file-based memory. Verify the connection is active by:
curl http://localhost:3000/mcp/health
Or use Claude's built-in diagnostic:
/mcp
This shows all connected MCP servers and their status.
For detailed API specifications, see cortex-client.
Operators are responsible for ensuring Prohibited Data (PII, PHI, government IDs) is not stored unless covered by their Harper Order. This is specified in PaaS ToS Section 3.3.
All memory storage operations are protected by content sanitization that detects and blocks or sanitizes injection patterns, control characters, and oversized payloads. However, this protection assumes legitimate data. Do not store sensitive personal information without explicit legal coverage.
By default, the MCP server runs in single-tenant mode: one Cortex instance per team. Auth is handled by Harper's native HTTP auth layer (Basic auth or Bearer tokens configured at the instance level). The MCP server inherits this, so no additional auth is needed beyond what Harper provides.
VectorSearch is intentionally excluded from MCP. The VectorSearch endpoint accepts pre-computed embedding vectors and is available for trusted server-to-server paths (e.g., LangChain running in your backend). It is not exposed through MCP because untrusted clients could craft adversarial vectors to poison the vector space or trick dedup into overwriting legitimate memories.
For multi-tenant deployments where multiple users share a single Cortex instance, the server implements:
- JWT auth with RS256 JWKS validation — Tokens validated against JWKS endpoint for secure, stateless auth
- Server-side namespace enforcement — agentId bound from JWT ns claim, client values overwritten
- Per-tenant rate limiting with 3 tiers:
- Free: 60 reads/20 writes per minute
- Team: 300/100 per minute
- Enterprise: 1000/500 per minute
- Scope-based access control — memory:read, memory:write, synapse:read, synapse:write
- Token revocation with 60s cache TTL
- Content audit logging — All operations logged for compliance
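The tier limits above can be illustrated with a small in-memory limiter. This sketch uses a fixed-window counter, which is an assumption: the source does not say which algorithm the server uses. It also assumes the Team and Enterprise figures are reads/writes, by analogy with the Free tier. The `FixedWindowLimiter` class and its method names are hypothetical.

```typescript
type Tier = "free" | "team" | "enterprise";
type Operation = "read" | "write";

// Per-minute budgets taken from the tier list above.
const TIER_LIMITS: Record<Tier, Record<Operation, number>> = {
  free: { read: 60, write: 20 },
  team: { read: 300, write: 100 },
  enterprise: { read: 1000, write: 500 },
};

const WINDOW_MS = 60000;

interface WindowState {
  start: number;
  count: number;
}

class FixedWindowLimiter {
  // One counter per tenant and operation, so reads and writes have separate budgets.
  private windows = new Map<string, WindowState>();

  // "now" is passed in (milliseconds) so the limiter is easy to test.
  allow(tenantId: string, tier: Tier, op: Operation, now: number): boolean {
    const key = tenantId + ":" + op;
    const current = this.windows.get(key);

    // Start a new window if none exists or the previous one has expired.
    if (current === undefined || now - current.start >= WINDOW_MS) {
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }

    if (current.count >= TIER_LIMITS[tier][op]) {
      return false;
    }

    current.count += 1;
    return true;
  }
}
```

A fixed window is simple but allows short bursts at window boundaries; a sliding window or token bucket would smooth that out at the cost of more state.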
Important: Ensure all tenants are provisioned as authorized Users under your Harper subscription (per PaaS ToS Section 3.2).
See docs/multi-tenant-design.md for the full architecture proposal.
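Namespace enforcement and scope checks, as listed above, might combine along these lines. The sketch assumes the JWT has already been verified and decoded. The `VerifiedClaims` shape (in particular `scopes` as an array), the tool-to-scope mapping, and the `authorizeToolCall` name are assumptions for illustration; only the `ns` claim, the `agentId` field, and the four scope names come from this document.

```typescript
type Scope = "memory:read" | "memory:write" | "synapse:read" | "synapse:write";

// Assumed shape of already-verified JWT claims.
interface VerifiedClaims {
  ns: string;
  scopes: Scope[];
}

// Assumed mapping from tool name to the scope it requires.
const REQUIRED_SCOPE: Record<string, Scope | undefined> = {
  memory_search: "memory:read",
  memory_recall: "memory:read",
  memory_count: "memory:read",
  memory_store: "memory:write",
  memory_forget: "memory:write",
  synapse_search: "synapse:read",
  synapse_ingest: "synapse:write",
};

// Checks the scope, then returns the arguments with agentId forced to the
// tenant namespace so a client-supplied value can never cross tenants.
function authorizeToolCall(
  claims: VerifiedClaims,
  tool: string,
  args: Record<string, unknown>
): Record<string, unknown> {
  const required = REQUIRED_SCOPE[tool];
  if (required === undefined) {
    throw new Error("Unknown tool: " + tool);
  }
  if (claims.scopes.indexOf(required) === -1) {
    throw new Error("Missing scope: " + required);
  }
  // The spread copies client arguments; agentId is written last and wins.
  return { ...args, agentId: claims.ns };
}
```

Overwriting `agentId` instead of rejecting mismatched values keeps well-behaved clients working while making namespace spoofing ineffective.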
All memory storage operations pass through content sanitization that:
- Detects and strips prompt injection patterns (system markers, instruction overrides, delimiter injection)
- Removes script tags, SQL-like injection, and control characters
- Enforces content length limits (16KB)
- Normalizes Unicode (NFKC)
- Blocks content with detected injection patterns (configurable: block vs. sanitize-and-store)
Retrieval also applies a lighter sanitization pass to prevent stored payloads from reaching LLM clients.
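A minimal sketch of such a write-path pipeline follows. The patterns are illustrative examples, not the server's actual rule set, and the sketch measures the 16KB limit in UTF-16 code units, which is an assumption since the unit is not specified here. All names (`sanitizeMemoryText`, `SanitizeResult`, `INJECTION_PATTERNS`) are hypothetical.

```typescript
type SanitizeMode = "block" | "sanitize";

interface SanitizeResult {
  ok: boolean;       // false when the content is rejected
  text: string;      // cleaned text, or "" when rejected
  reasons: string[]; // names of the rules that fired
}

// Assumption: limit counted in UTF-16 code units rather than bytes.
const MAX_LENGTH = 16 * 1024;

// Control characters except tab, newline, and carriage return.
const CONTROL_CHARS = /[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g;

// Example patterns only; a real deployment would maintain a broader list.
const INJECTION_PATTERNS: { name: string; pattern: RegExp }[] = [
  { name: "instruction-override", pattern: /ignore (?:all )?(?:previous|prior) instructions/gi },
  { name: "system-marker", pattern: /<\|?(?:system|im_start|im_end)\|?>/gi },
  { name: "script-tag", pattern: /<script\b[^>]*>[\s\S]*?<\/script>/gi },
];

function sanitizeMemoryText(input: string, mode: SanitizeMode = "block"): SanitizeResult {
  // Normalize first so look-alike characters cannot bypass the patterns.
  let text = input.normalize("NFKC").replace(CONTROL_CHARS, "");

  if (text.length > MAX_LENGTH) {
    return { ok: false, text: "", reasons: ["too-long"] };
  }

  const reasons: string[] = [];
  for (const { name, pattern } of INJECTION_PATTERNS) {
    const stripped = text.replace(pattern, "");
    if (stripped !== text) {
      reasons.push(name);
      text = stripped;
    }
  }

  // In "block" mode any detection rejects the write; in "sanitize" mode
  // the stripped text is stored and the reasons are reported.
  if (reasons.length > 0 && mode === "block") {
    return { ok: false, text: "", reasons };
  }
  return { ok: true, text, reasons };
}
```

The `mode` parameter mirrors the configurable block versus sanitize-and-store behavior described in the list.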
Embedding generation and vector search are compute-intensive. In production:
- Harper deployment: Configure rate limits at the Harper instance level via config.yaml or Fabric policies
- Standalone deployment: Place an HTTP rate limiter (nginx, Cloudflare, express-rate-limit) in front of the MCP server
- Rate limits should be per-tenant for multi-tenant deployments, with separate budgets for reads vs. writes
- Internal/team use: Deploy alongside Cortex in your private network. No DMZ needed.
- Public-facing: Place the MCP server in a DMZ with strict ingress controls. Cortex must NOT be directly accessible from the internet. The MCP server acts as the auth boundary.
- OpenShell/NemoClaw: These environments block internal IPs by default. Cortex must be reachable at a routable HTTPS address. Use the standalone deployment mode with CORTEX_URL pointing to your public Cortex endpoint.
When deployed as a Harper component, the MCP endpoint runs inside the Cortex process with direct table access. This eliminates network round-trips and inherits Harper's auth, TLS, and rate limiting automatically. See the harper/ directory.
MIT
Contributions welcome! Please open issues and PRs on GitHub.
- Harper Cortex — The memory database powering this server
- cortex-client — TypeScript SDK for Cortex
- LangChain Harper Integration — Production RAG with Cortex
- OpenClaw — Multi-agent orchestration with shared memory
- Docs: https://harperdb.io/docs
- Discord: https://discord.gg/harperdb
- Email: support@harperfast.io