feat(mcp) add MCP control for Cap. #1942

Open
onyedikachi-david wants to merge 1 commit into CapSoftware:main from onyedikachi-david:codex/cap-desktop-mcp
Conversation

onyedikachi-david (Contributor) commented Jun 22, 2026

A quick demo:
https://github.com/user-attachments/assets/cfb44b64-e736-4379-85e5-f1debe664347

Fixes: #1943

Greptile Summary

This PR adds full MCP (Model Context Protocol) support to Cap, enabling AI agents and external tools to control Cap Desktop via both a local stdio shim (cap mcp) and a direct HTTP transport.

  • Desktop MCP server (apps/desktop/src-tauri/src/mcp.rs): A new ~2000-line Axum-based HTTP server runs inside the desktop app, exposing 40+ tools behind bearer-token authentication and session management with a 30-minute TTL.
  • CLI stdio shim (apps/cli/src/mcp.rs): cap mcp reads the MCP port and token from the desktop store on disk and proxies stdin/stdout JSON-RPC messages to the local HTTP server.
  • Settings UI (apps/desktop/src/routes/(window-chrome)/settings/cli.tsx): A new MCP section in CLI settings lets users enable/disable the server, rotate the token, and copy ready-to-paste client configs for both local and HTTP transports.
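
The 30-minute session TTL mentioned in the first bullet could be modeled roughly as follows. This is a minimal sketch, not the PR's code: `SessionStore`, `touch`, `is_live`, and `prune` are hypothetical names, and the sketch assumes the TTL is measured from the last request (a sliding window), which the summary does not state.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// 30-minute TTL, as stated in the summary.
const SESSION_TTL: Duration = Duration::from_secs(30 * 60);

/// Hypothetical in-memory session table: session id -> last activity time.
#[derive(Default)]
struct SessionStore {
    last_seen: HashMap<String, Instant>,
}

impl SessionStore {
    /// Record activity for a session, creating it if needed.
    fn touch(&mut self, id: &str, now: Instant) {
        self.last_seen.insert(id.to_string(), now);
    }

    /// A session is live if it exists and was seen less than SESSION_TTL ago.
    fn is_live(&self, id: &str, now: Instant) -> bool {
        self.last_seen
            .get(id)
            .map_or(false, |seen| now.duration_since(*seen) < SESSION_TTL)
    }

    /// Drop every expired session and return how many were removed.
    fn prune(&mut self, now: Instant) -> usize {
        let before = self.last_seen.len();
        self.last_seen
            .retain(|_, seen| now.duration_since(*seen) < SESSION_TTL);
        before - self.last_seen.len()
    }
}
```

Passing `now` in explicitly keeps the expiry logic testable without sleeping.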

Confidence Score: 3/5

The change is broadly well-structured and the auth/session plumbing is sound, but the unknown-tool error response is semantically wrong and could cause MCP client sessions to abort instead of continuing after a bad tool call.

The desktop MCP server is a large new subsystem. The most notable issue is that an unrecognized tool name in tools/call returns a JSON-RPC protocol error (-32602) rather than an isError: true tool result — strict MCP clients may abort the session instead of handling it gracefully. The other findings are cosmetic or low-risk. The auth model is solid and the stdio shim logic is clean and well-tested.

Files needing the most attention: apps/desktop/src-tauri/src/mcp.rs, specifically the execute_tool unknown-tool branch and the token comparison in validate_authorization.

Security Review

  • Non-constant-time token comparison (validate_authorization): Bearer tokens are compared with standard string equality rather than a constant-time function. Practical risk is very low (localhost only, 73-char token).
  • No secrets are stored in source code; tokens are generated at runtime with two UUID v4 values.
  • The validate_origin function correctly blocks non-local origins as a secondary CSRF defense.
  • Tools such as open_external_link and start_video_import accept arbitrary URLs and paths but all require a valid bearer token.
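
As an illustration of the origin check mentioned above, here is a minimal sketch of a local-origin allowlist. It is not the PR's `validate_origin`: the function name `is_local_origin`, the accepted host list, and the choice to allow requests that carry no `Origin` header (as non-browser clients typically do) are all assumptions.

```rust
/// Returns true when an `Origin` header value points at the local machine.
/// Hypothetical helper; the accepted host list is an assumption.
fn is_local_origin(origin: Option<&str>) -> bool {
    let origin = match origin {
        Some(value) => value,
        // Assumption: non-browser clients (such as a CLI shim) send no Origin.
        None => return true,
    };
    // Strip the scheme; anything that is not http(s) is rejected.
    let rest = match origin
        .strip_prefix("http://")
        .or_else(|| origin.strip_prefix("https://"))
    {
        Some(rest) => rest,
        None => return false,
    };
    // An Origin has no path or userinfo; reject values that smuggle one in.
    if rest.contains('/') || rest.contains('@') {
        return false;
    }
    // Split off an optional port, handling the bracketed IPv6 form.
    let host = if let Some(stripped) = rest.strip_prefix('[') {
        match stripped.split_once(']') {
            Some((host, _port)) => host,
            None => return false,
        }
    } else {
        rest.split(':').next().unwrap_or("")
    };
    matches!(host, "localhost" | "127.0.0.1" | "::1")
}
```

Matching the whole host, rather than testing for a `localhost` prefix, is what keeps a value such as `http://localhost.evil.com` out.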

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/desktop/src-tauri/src/mcp.rs | New 2034-line MCP HTTP server with bearer-token auth, session management, 40+ tool handlers, and JSON-RPC dispatch. Unknown tool returns wrong error code and there is a non-constant-time token comparison. |
| apps/cli/src/mcp.rs | New stdio-to-HTTP shim: reads port/token from the desktop store, forwards JSON-RPC lines to the local MCP server, and cleans up the session on exit. |
| apps/desktop/src-tauri/src/lib.rs | Wires up three new MCP Tauri commands, calls mcp::init at startup, and adds a paused field to CurrentRecording. |
| apps/desktop/src/routes/(window-chrome)/settings/cli.tsx | Adds MCP settings UI. HTTP config preview renders literal null values when the server has never been started. |
| apps/desktop/src/utils/tauri.ts | Adds three new MCP invoke stubs and exports McpServerConfig type. Purely additive. |
| apps/cli/src/main.rs | Registers the new mcp module and adds the Mcp CLI command variant. |

Comments Outside Diff (1)

  1. apps/desktop/src-tauri/src/mcp.rs, lines 2359-2385

    P2 rejects_invalid_tool_schema test does not exercise schema validation

    Despite its name, this test never reaches the schema-validation path. handle_request checks for app.is_none() and returns -32603 ("Cap Desktop app handle is unavailable") before calling execute_tool. The actual invalid-argument path (ToolFailure::InvalidParams, which maps to -32602) is never asserted.
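
To make the ordering problem concrete, here is a toy model of the dispatch this comment describes. It is a sketch under assumptions: the name `handle_request` and the two error codes come from the comment, while the simplified signature, the `AppHandle` stand-in, and the `args_valid` flag are hypothetical.

```rust
/// JSON-RPC error codes named in the review comment.
const INTERNAL_ERROR: i32 = -32603;
const INVALID_PARAMS: i32 = -32602;

/// Stand-in for the real desktop app handle (hypothetical).
struct AppHandle;

/// Toy dispatch: the app-handle check runs before argument validation,
/// mirroring the order the comment describes.
fn handle_request(app: Option<&AppHandle>, args_valid: bool) -> Result<(), i32> {
    if app.is_none() {
        return Err(INTERNAL_ERROR);
    }
    if !args_valid {
        return Err(INVALID_PARAMS);
    }
    Ok(())
}
```

With `app` set to `None`, invalid arguments still produce -32603, so a test that wants to assert -32602 has to supply some app handle or call the validation step directly.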

Prompt To Fix All With AI
Fix the following 4 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 4
apps/desktop/src-tauri/src/mcp.rs:1267
**Unknown tool falls through as "Invalid params"**

When a caller invokes a tool name that doesn't exist, `execute_tool` returns `ToolFailure::InvalidParams`, which `handle_request` maps to JSON-RPC error code `-32602` ("Invalid params"). MCP clients that inspect the error code cannot distinguish "you passed bad arguments" from "that tool doesn't exist," and the MCP spec recommends returning an `isError: true` tool result (not a protocol-level error) for tool-level failures such as an unrecognized name. Under the current mapping, strict clients may treat the response as a protocol violation rather than a tool error and abort the session.

### Issue 2 of 4
apps/desktop/src/routes/(window-chrome)/settings/cli.tsx:78-90
**HTTP config preview renders literal `null` values when server has never been enabled**

When `config.token` is `null` (server was never enabled), the template literal produces `"Authorization": "Bearer null"` and `"url": null` in the preview. While the Copy button is disabled in this state, the displayed snippet is misleading. Guard the config display so it only renders the HTTP tab content when both `endpoint` and `token` are non-null.

```suggestion
	const httpMcpClientConfig = (config: McpServerConfig) => {
		if (!config.endpoint || !config.token) {
			return "Enable MCP and start the server to view the HTTP configuration.";
		}
		return JSON.stringify(
			{
				cap: {
					url: config.endpoint,
					headers: {
						Authorization: `Bearer ${config.token}`,
					},
				},
			},
			null,
			2,
		);
	};
```

### Issue 3 of 4
apps/desktop/src-tauri/src/mcp.rs:2359-2385
**`rejects_invalid_tool_schema` test does not exercise schema validation**

Despite its name, this test never reaches the schema-validation path. `handle_request` checks for `app.is_none()` and returns `-32603` ("Cap Desktop app handle is unavailable") before calling `execute_tool`. The actual invalid-argument path (`ToolFailure::InvalidParams`, which maps to `-32602`) is never asserted.

### Issue 4 of 4
apps/desktop/src-tauri/src/mcp.rs:1432
**Non-constant-time bearer token comparison**

`auth.strip_prefix("Bearer ") != Some(token)` is a standard string equality check, which can leak timing information. Even on localhost, a sufficiently capable adversary making many rapid requests could theoretically exploit this. The `subtle` crate provides `ConstantTimeEq` for this pattern. Given the token is 73 characters and the server is localhost-only, the practical risk is very low, but aligning with best practice here is straightforward.

Reviews (1): Last reviewed commit: "feat(mcp) add MCP control for Cap."

Greptile also left 3 inline comments on this PR.

@superagent-security

Superagent didn't find any vulnerabilities or security issues in this PR.

no_args_schema(),
ToolAnnotations::mutating(),
),
tool(

P1 Unknown tool falls through as "Invalid params"

When a caller invokes a tool name that doesn't exist, execute_tool returns ToolFailure::InvalidParams, which handle_request maps to JSON-RPC error code -32602 ("Invalid params"). MCP clients that inspect the error code cannot distinguish "you passed bad arguments" from "that tool doesn't exist," and the MCP spec recommends returning an isError: true tool result (not a protocol-level error) for tool-level failures such as an unrecognized name. Under the current mapping, strict clients may treat the response as a protocol violation rather than a tool error and abort the session.
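
For reference, the two response shapes this comment contrasts look roughly as follows. This is a hand-written sketch using only the standard library; `esc`, `protocol_error_json`, and `tool_error_json` are hypothetical helpers, not functions from the PR. The `error.code` and `error.message` fields follow JSON-RPC 2.0, and the `content` and `isError` fields follow the MCP tool-result shape. Which shape a server should use for an unknown tool name is worth confirming against the error-handling section of the MCP tools specification before changing the mapping.

```rust
/// Minimal escaping for the two characters that would break a JSON string.
/// Control characters are not handled; a real server would use a JSON library.
fn esc(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Shape 1: a JSON-RPC protocol error. The request itself is rejected.
fn protocol_error_json(id: u64, code: i32, message: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{id},"error":{{"code":{code},"message":"{}"}}}}"#,
        esc(message)
    )
}

/// Shape 2: a successful JSON-RPC response whose tool result is flagged
/// with `isError: true`, so the failure is reported at the tool level.
fn tool_error_json(id: u64, text: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{id},"result":{{"content":[{{"type":"text","text":"{}"}}],"isError":true}}}}"#,
        esc(text)
    )
}
```

A client sees the first shape as a failed request and the second as a completed call that reported a failure, which is the distinction the comment is about.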

Comment on lines +78 to +90
const httpMcpClientConfig = (config: McpServerConfig) =>
	JSON.stringify(
		{
			cap: {
				url: config.endpoint,
				headers: {
					Authorization: `Bearer ${config.token}`,
				},
			},
		},
		null,
		2,
	);

P2 HTTP config preview renders literal null values when server has never been enabled

When config.token is null (server was never enabled), the template literal produces "Authorization": "Bearer null" and "url": null in the preview. While the Copy button is disabled in this state, the displayed snippet is misleading. Guard the config display so it only renders the HTTP tab content when both endpoint and token are non-null.

Suggested change
- const httpMcpClientConfig = (config: McpServerConfig) =>
- 	JSON.stringify(
- 		{
- 			cap: {
- 				url: config.endpoint,
- 				headers: {
- 					Authorization: `Bearer ${config.token}`,
- 				},
- 			},
- 		},
- 		null,
- 		2,
- 	);
+ const httpMcpClientConfig = (config: McpServerConfig) => {
+ 	if (!config.endpoint || !config.token) {
+ 		return "Enable MCP and start the server to view the HTTP configuration.";
+ 	}
+ 	return JSON.stringify(
+ 		{
+ 			cap: {
+ 				url: config.endpoint,
+ 				headers: {
+ 					Authorization: `Bearer ${config.token}`,
+ 				},
+ 			},
+ 		},
+ 		null,
+ 		2,
+ 	);
+ };

"copy_video_to_clipboard",
"Copy Video To Clipboard",
"Copy a local video file path to the clipboard.",
path_schema("path"),

P2 security Non-constant-time bearer token comparison

auth.strip_prefix("Bearer ") != Some(token) is a standard string equality check, which can leak timing information. Even on localhost, a sufficiently capable adversary making many rapid requests could theoretically exploit this. The subtle crate provides ConstantTimeEq for this pattern. Given the token is 73 characters and the server is localhost-only, the practical risk is very low, but aligning with best practice here is straightforward.
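
The idea behind a constant-time comparison can be sketched with the standard library alone. This is a hand-rolled illustration, not the `subtle` API the comment recommends and not the PR's code; `constant_time_eq` and `bearer_matches` are hypothetical names. An optimizing compiler is free to transform a loop like this, which is one reason a vetted crate is the safer choice in production.

```rust
/// Compare two byte strings without returning early at the first mismatch.
/// Hand-rolled for illustration only.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    // The token length is fixed and not secret, so an early return on a
    // length mismatch does not reveal anything about the token's contents.
    if a.len() != b.len() {
        return false;
    }
    // OR together the XOR of every byte pair; the loop always runs to the end.
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}

/// Hypothetical stand-in for the bearer check in `validate_authorization`.
fn bearer_matches(header: &str, token: &str) -> bool {
    match header.strip_prefix("Bearer ") {
        Some(presented) => constant_time_eq(presented.as_bytes(), token.as_bytes()),
        None => false,
    }
}
```

The running time then depends on the token length rather than on how many leading bytes of a guess happen to be correct.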

Comment thread apps/cli/src/mcp.rs
let mut request = client
    .post(&endpoint.endpoint)
    .bearer_auth(&endpoint.token)
    .header("Accept", "application/json, text/event-stream")

You’re advertising text/event-stream here but the client always tries to parse the response body as JSON. If the desktop endpoint ever switches to SSE for streaming responses, this will fail; consider restricting the Accept header to JSON unless you’re going to implement SSE parsing.

Suggested change
- .header("Accept", "application/json, text/event-stream")
+ .header("Accept", "application/json")
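
If the shim were to keep advertising text/event-stream instead, it would need to branch on the response Content-Type before parsing. The following is a minimal sketch of that branch, assuming each SSE event carries one JSON-RPC message in its `data:` lines. `extract_payloads` is a hypothetical helper, the Content-Type match is case-sensitive for brevity, and flushing a final event that lacks a terminating blank line is a deliberately lenient choice.

```rust
/// Return the message payload(s) contained in a response body.
/// For JSON responses this is the body itself; for SSE it is the joined
/// `data:` lines of each event.
fn extract_payloads(content_type: &str, body: &str) -> Vec<String> {
    if !content_type.starts_with("text/event-stream") {
        // Plain JSON response: the whole body is a single message.
        return vec![body.to_string()];
    }
    let mut payloads = Vec::new();
    let mut data_lines: Vec<&str> = Vec::new();
    for line in body.lines() {
        if line.is_empty() {
            // A blank line ends the current event.
            if !data_lines.is_empty() {
                payloads.push(data_lines.join("\n"));
                data_lines.clear();
            }
        } else if let Some(value) = line.strip_prefix("data:") {
            // One optional space after the colon is not part of the value.
            data_lines.push(value.strip_prefix(' ').unwrap_or(value));
        }
        // Other fields (event:, id:, retry:) and ":" comments are ignored.
    }
    // Lenient: also accept a final event with no trailing blank line.
    if !data_lines.is_empty() {
        payloads.push(data_lines.join("\n"));
    }
    payloads
}
```

Each returned string can then go through the same JSON parsing path the shim already uses.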

Comment on lines +321 to +329
let mut server = runtime.server.lock().await;
if let Some(handle) = server.as_ref() {
    settings.port = Some(handle.port);
    settings.endpoint = Some(endpoint_for_port(handle.port));
    save_settings(app, &settings)?;
    return Ok(());
}

let listener = bind_listener(settings.port).await?;

runtime.server.lock().await is held across the bind_listener(...).await call here. It’s probably fine in practice, but it’s an easy way to introduce lock contention / deadlocks later; I’d drop the lock before awaiting and then re-acquire when writing the handle.

Suggested change
- let mut server = runtime.server.lock().await;
- if let Some(handle) = server.as_ref() {
-     settings.port = Some(handle.port);
-     settings.endpoint = Some(endpoint_for_port(handle.port));
-     save_settings(app, &settings)?;
-     return Ok(());
- }
- let listener = bind_listener(settings.port).await?;
+ {
+     let mut server = runtime.server.lock().await;
+     if let Some(handle) = server.as_ref() {
+         settings.port = Some(handle.port);
+         settings.endpoint = Some(endpoint_for_port(handle.port));
+         save_settings(app, &settings)?;
+         return Ok(());
+     }
+ }
+ let listener = bind_listener(settings.port).await?;
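
The full pattern the comment describes (check under the lock, release it, do the slow step, then re-acquire and re-check before writing) can be sketched as below. To stay self-contained this uses `std::sync::Mutex` and a plain closure in place of the PR's async mutex and `bind_listener(...).await`; the scoping idea carries over. `ensure_started` and the port-only slot are hypothetical.

```rust
use std::sync::Mutex;

/// Start the server at most once without holding the lock during the slow
/// bind. `slot` holds the bound port; `bind` stands in for the real,
/// potentially slow listener setup (hypothetical signature).
fn ensure_started(slot: &Mutex<Option<u16>>, bind: impl FnOnce() -> u16) -> u16 {
    {
        // Fast path: check under the lock; the guard is released at the end
        // of this scope.
        let guard = slot.lock().unwrap();
        if let Some(port) = *guard {
            return port;
        }
    }
    // The slow step runs with no lock held.
    let port = bind();
    // Re-acquire and re-check: another caller may have started the server
    // while this one was binding.
    let mut guard = slot.lock().unwrap();
    match *guard {
        // Lost the race: keep the existing server and discard the new bind.
        Some(existing) => existing,
        None => {
            *guard = Some(port);
            port
        }
    }
}
```

The trade-off is that two callers can both reach the bind step; the second check is what keeps only one result.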

@@ -2265,15 +2267,18 @@ async fn get_current_recording(
) -> Result<JsonValue<Option<CurrentRecording>>, ()> {
let state = state.read().await;

This state.read().await guard is held through the whole match below, including an .await (inner.is_paused().await). Holding an async lock across an await can cause subtle contention/deadlocks; consider extracting what you need and dropping the guard before awaiting.
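
One way to "extract what you need and drop the guard" is to clone a cheap handle inside a short scope. The sketch below uses `std::sync::RwLock` and a synchronous `is_paused` so that it is self-contained; the PR's state uses an async lock and an awaited call. `Recording`, `AppState`, and `current_paused` are hypothetical, and the sketch assumes the recording can be shared behind an `Arc`.

```rust
use std::sync::{Arc, RwLock};

/// Stand-in for the in-progress recording handle (hypothetical).
struct Recording {
    paused: bool,
}

impl Recording {
    /// Stand-in for the awaited `is_paused` call in the PR.
    fn is_paused(&self) -> bool {
        self.paused
    }
}

/// Stand-in for the shared app state (hypothetical).
struct AppState {
    current: Option<Arc<Recording>>,
}

/// Read the paused flag without holding the state lock during the call.
fn current_paused(state: &RwLock<AppState>) -> Option<bool> {
    // Clone the Arc inside a short scope so the read guard is released
    // before the potentially slow call below.
    let recording = {
        let guard = state.read().unwrap();
        guard.current.clone()
    };
    recording.map(|r| r.is_paused())
}
```

Cloning an `Arc` only bumps a reference count, so the guard is held for a very short time.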

Development

Successfully merging this pull request may close these issues.

Add local MCP server integration for Cap Desktop

1 participant