feat(mcp): Add optional workspace_id parameter to cloud tools with conditional schema hiding #876

Merged
Aaron ("AJ") Steers (aaronsteers) merged 2 commits into main from devin/1764107552-mcp-workspace-id-param on Nov 25, 2025

@aaronsteers Aaron ("AJ") Steers (aaronsteers) commented Nov 25, 2025

feat(mcp): Add optional workspace_id parameter to cloud tools with conditional schema hiding

Summary

This PR adds an optional workspace_id parameter to all 22 cloud MCP tools, enabling workspace context switching for admin/ops use cases. The parameter is conditionally hidden from the MCP schema based on whether the AIRBYTE_CLOUD_WORKSPACE_ID environment variable is set:

  • When env var IS set: workspace_id param is hidden from MCP schema (uses env var value)
  • When env var is NOT set: workspace_id param is exposed and must be provided in tool calls

This follows the same exclude_args pattern used in connector-builder-mcp for the manifest parameter.

Changes:

  • Modified _get_cloud_workspace() to accept optional workspace_id override
  • Added workspace_id parameter to all 22 cloud tools in cloud_ops.py
  • Implemented exclude_args logic in register_tools() in _tool_utils.py
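
The registration change listed above can be pictured roughly as follows. This is a minimal sketch, not the code from _tool_utils.py: compute_exclude_args and list_sources are hypothetical names, and the environment is passed in as a plain mapping so the example runs without FastMCP. Per the review comments further down, the PR itself stores the env-var check in a module-level constant.

```python
import inspect
import os
from typing import Callable, List, Mapping, Optional


def compute_exclude_args(
    fn: Callable[..., object],
    *,
    domain: str,
    env: Mapping[str, str] = os.environ,
) -> Optional[List[str]]:
    """Return argument names to hide from the tool schema, or None."""
    workspace_is_set = bool(env.get("AIRBYTE_CLOUD_WORKSPACE_ID", "").strip())
    if domain != "cloud" or not workspace_is_set:
        return None
    params = inspect.signature(fn).parameters
    # Only hide arguments that the function actually declares.
    hidden = [name for name in ["workspace_id"] if name in params]
    return hidden or None


def list_sources(*, workspace_id: Optional[str] = None) -> List[str]:
    """Hypothetical cloud tool, used only for this illustration."""
    return []
```

The returned value would then be handed to the tool registration call as exclude_args, which is the part this sketch leaves out.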

Closes #875

Review & Testing Checklist for Human

  • Verify exclude_args works with FastMCP: Test that when AIRBYTE_CLOUD_WORKSPACE_ID is set, the workspace_id parameter does NOT appear in the MCP tool schema
  • Test workspace override: When env var is NOT set, verify that providing workspace_id in a tool call correctly overrides the workspace context
  • Verify resolve_cloud_workspace_id() accepts input: Confirm the underlying function in airbyte/cloud/auth.py correctly handles the optional input_value parameter
  • Check for regressions: Ensure existing behavior (using env var) still works when workspace_id is not provided
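
For reviewers working through the checklist, the expected resolution order can be sketched like this. resolve_workspace_id is a hypothetical stand-in for resolve_cloud_workspace_id() in airbyte/cloud/auth.py, whose body is not shown in this PR; raising ValueError when neither source provides a value is an assumption.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "AIRBYTE_CLOUD_WORKSPACE_ID"


def resolve_workspace_id(
    input_value: Optional[str] = None,
    *,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Prefer an explicit value, then fall back to the environment variable."""
    if input_value:
        return input_value
    from_env = env.get(ENV_VAR, "").strip()
    if from_env:
        return from_env
    # Assumption: the real function may signal this case differently.
    raise ValueError(f"No workspace ID provided and {ENV_VAR} is not set.")
```

An explicit workspace_id wins over the env var, and omitting it keeps the existing env-var behavior, which is the regression case in the last checklist item.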

Recommended test plan:

  1. Set AIRBYTE_CLOUD_WORKSPACE_ID env var, start MCP server, verify workspace_id is NOT in tool schemas
  2. Unset env var, start MCP server, verify workspace_id IS in tool schemas
  3. With env var unset, call a tool with explicit workspace_id and verify it works
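
The three steps above could be rehearsed locally with something like the following. It is only a stand-in: schema_params_at_startup and run_cloud_sync are hypothetical and do not touch FastMCP, so a real check would still need to inspect the tool list of a running MCP server.

```python
import inspect
import os
from unittest import mock


def run_cloud_sync(connection_id, *, workspace_id=None):
    """Hypothetical cloud tool used only for this sketch."""
    return f"{connection_id}@{workspace_id}"


def schema_params_at_startup(fn):
    """Stand-in for server startup: list the params a client would see."""
    hide = bool(os.environ.get("AIRBYTE_CLOUD_WORKSPACE_ID", "").strip())
    names = list(inspect.signature(fn).parameters)
    return [n for n in names if not (hide and n == "workspace_id")]


# Step 1: env var set, so workspace_id should be hidden.
with mock.patch.dict(os.environ, {"AIRBYTE_CLOUD_WORKSPACE_ID": "ws-123"}):
    params_when_set = schema_params_at_startup(run_cloud_sync)

# Step 2: env var unset, so workspace_id should be exposed.
with mock.patch.dict(os.environ, {}, clear=True):
    params_when_unset = schema_params_at_startup(run_cloud_sync)
    # Step 3: an explicit workspace_id is accepted by the tool.
    result = run_cloud_sync("conn-1", workspace_id="ws-456")
```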

Notes

Summary by CodeRabbit

  • New Features
    • Cloud operations now accept an optional workspace identifier so actions (deploy, list, logs, updates, runs) can be scoped to a specific Airbyte Cloud workspace while keeping previous defaults.
    • If a workspace is pre-configured via environment, the workspace selection/argument is automatically hidden to simplify commands.

…nditional schema hiding

- Add workspace_id parameter to all 22 cloud MCP tools
- Modify _get_cloud_workspace() to accept optional workspace_id override
- Implement exclude_args pattern in register_tools() to hide workspace_id when
  AIRBYTE_CLOUD_WORKSPACE_ID env var is set
- When env var is set, workspace_id param is hidden from MCP schema (uses env var)
- When env var is not set, workspace_id param is exposed (must be provided in tool calls)

Closes #875

Co-Authored-By: AJ Steers <aj@airbyte.io>
@devin-ai-integration
Contributor

Original prompt from AJ Steers
Received message in Slack channel #ask-devin-ai:

@Devin, don't take action on this yet, but make a plan to do so against this PR: <https://github.com/airbytehq/airbyte/pull/70202>

Your plan should include Airbyte MCP server tools (via PyAirbyte MCP) wherever possible.
Thread URL: https://airbytehq-team.slack.com/archives/C08BHPUMEPJ/p1764095206226589

@github-actions

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1764107552-mcp-workspace-id-param' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1764107552-mcp-workspace-id-param'

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /fix-pr - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test-pr - Runs tests with the updated PyAirbyte

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.

@coderabbitai
Contributor

coderabbitai bot commented Nov 25, 2025

Walkthrough

Added optional workspace_id support across cloud MCP operations and updated tool registration to conditionally hide the workspace_id argument for cloud tools when an AIRBYTE_CLOUD_WORKSPACE_ID environment variable is set.

Changes

Cohort / File(s) | Summary
Tool registration (airbyte/mcp/_tool_utils.py) | Added module-level constant AIRBYTE_CLOUD_WORKSPACE_ID_IS_SET; imported inspect; updated register_tools to always build tool description and pass an exclude_args list to app.tool; when domain == "cloud" and the env var is set, inspect function signatures and exclude "workspace_id" from the registered tool schema.
Cloud operations, workspace scoping (airbyte/mcp/cloud_ops.py) | _get_cloud_workspace() now accepts `workspace_id: str

Sequence Diagram(s)

sequenceDiagram
    participant Registrar as register_tools
    participant Inspect as inspect
    participant App as app.tool
    rect rgb(240,248,255)
    Note right of Registrar: Registering cloud tools\n(per-tool signature inspection)
    end
    Registrar->>Inspect: inspect.signature(callable_fn)
    Inspect-->>Registrar: parameter list
    alt AIRBYTE_CLOUD_WORKSPACE_ID_IS_SET == True
        Registrar->>Registrar: exclude_args = ["workspace_id"]
    else AIRBYTE_CLOUD_WORKSPACE_ID_IS_SET == False
        Registrar->>Registrar: exclude_args = None
    end
    Registrar->>App: app.tool(callable_fn, annotations=..., exclude_args=exclude_args)
    App-->>Registrar: registration complete

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Rationale: Many functions updated (20+), signature changes across a single module plus registration behavior that inspects runtime signatures — heterogeneous edits requiring per-function verification.
  • Pay extra attention to:
    • airbyte/mcp/_tool_utils.py — correctness of signature inspection and exclude_args logic.
    • airbyte/mcp/cloud_ops.py — consistent addition of workspace_id annotations, correct propagation into _get_cloud_workspace(workspace_id), and no accidental positional parameter changes.
    • Public API stability — ensure default behavior when workspace_id is omitted remains unchanged.
    • Tests: coverage for both env-var-set and env-var-unset registration paths.

Would you like me to suggest a small test matrix (env var set/unset × a representative cloud tool) to validate registration and runtime behavior, wdyt?

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
Check name | Status | Explanation
Linked Issues check | ✅ Passed | All acceptance criteria from #875 are met: workspace_id parameter added to cloud tools, exclude_args logic implemented for conditional hiding, threading through _get_cloud_workspace, and backward compatibility maintained.
Out of Scope Changes check | ✅ Passed | All changes are scoped to implementing the workspace_id parameter feature: two files modified with focused, relevant changes; no unrelated modifications detected.
Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.
Title check | ✅ Passed | The PR title clearly and specifically describes the main change: adding an optional workspace_id parameter to cloud tools with conditional schema hiding based on environment variables.
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
airbyte/mcp/cloud_ops.py (2)

540-565: Consider adding error handling for workspace retrieval in list functions.

I noticed that list_deployed_cloud_source_connectors (and similar list functions) don't wrap the _get_cloud_workspace call in a try/except block, unlike the deploy functions. If the workspace retrieval fails (e.g., invalid workspace_id), it would raise an unhandled exception.

This might be intentional for read-only operations, but would you consider adding consistent error handling for a better user experience, wdyt?
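
One possible shape for that handling is sketched below, with the caveat that everything here is hypothetical: _get_workspace stands in for _get_cloud_workspace(), WorkspaceLookupError is an invented exception, and the excerpt does not show how the deploy functions actually report failures.

```python
class WorkspaceLookupError(RuntimeError):
    """Hypothetical error type that adds context to a failed lookup."""


def _get_workspace(workspace_id):
    """Hypothetical stand-in for _get_cloud_workspace(); fails on unknown IDs."""
    if workspace_id != "ws-123":
        raise ValueError(f"Unknown workspace: {workspace_id!r}")
    return {"sources": ["source-a", "source-b"]}


def list_sources(*, workspace_id=None):
    """List sources, re-raising lookup failures with a clearer message."""
    try:
        workspace = _get_workspace(workspace_id)
    except ValueError as ex:
        raise WorkspaceLookupError(
            f"Could not list sources for workspace {workspace_id!r}: {ex}"
        ) from ex
    return list(workspace["sources"])
```

Re-raising keeps the list function's return type unchanged, whereas returning an error string would mix lists and strings.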


410-421: Minor style inconsistency in parameter annotations.

I noticed that name and unique parameters lack the Annotated[..., Field(...)] wrapper that other tools use consistently, while the new workspace_id parameter does have it. This is pre-existing, but would you consider aligning the style for consistency across tools, wdyt?

 def deploy_noop_destination_to_cloud(
-    name: str = "No-op Destination",
+    name: Annotated[
+        str,
+        Field(
+            description="The name to use when deploying the destination.",
+            default="No-op Destination",
+        ),
+    ] = "No-op Destination",
     *,
     workspace_id: Annotated[
         str | None,
         Field(
             description="Workspace ID. Defaults to AIRBYTE_CLOUD_WORKSPACE_ID env var.",
             default=None,
         ),
     ] = None,
-    unique: bool = True,
+    unique: Annotated[
+        bool,
+        Field(
+            description="Whether to require a unique name.",
+            default=True,
+        ),
+    ] = True,
 ) -> str:
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f1cb667 and ad201e6.

📒 Files selected for processing (2)
  • airbyte/mcp/_tool_utils.py (3 hunks)
  • airbyte/mcp/cloud_ops.py (41 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/mcp/cloud_ops.py (3)
tests/integration_tests/cloud/conftest.py (1)
  • workspace_id (49-50)
airbyte/cloud/workspaces.py (2)
  • CloudWorkspace (64-612)
  • list_custom_source_definitions (554-582)
airbyte/cloud/auth.py (1)
  • resolve_cloud_workspace_id (36-41)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (No Creds)
🔇 Additional comments (9)
airbyte/mcp/_tool_utils.py (3)

10-10: LGTM!

The inspect import is appropriately added to support the new signature inspection logic for excluding parameters.


29-29: LGTM!

Clean boolean flag derived from the environment variable. The strip() and bool() pattern correctly handles empty strings and whitespace.
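
As a small illustration of why the strip() and bool() combination matters, here is a sketch with a hypothetical helper, env_var_is_set, that takes the environment as a mapping. The PR evaluates the equivalent expression once into the AIRBYTE_CLOUD_WORKSPACE_ID_IS_SET constant.

```python
import os


def env_var_is_set(name, env=os.environ):
    """True only when the variable holds a non-blank value."""
    return bool(env.get(name, "").strip())


missing = env_var_is_set("AIRBYTE_CLOUD_WORKSPACE_ID", env={})
empty = env_var_is_set("AIRBYTE_CLOUD_WORKSPACE_ID", env={"AIRBYTE_CLOUD_WORKSPACE_ID": ""})
blank = env_var_is_set("AIRBYTE_CLOUD_WORKSPACE_ID", env={"AIRBYTE_CLOUD_WORKSPACE_ID": "  "})
real = env_var_is_set("AIRBYTE_CLOUD_WORKSPACE_ID", env={"AIRBYTE_CLOUD_WORKSPACE_ID": "ws-123"})
```

Only the last case counts as set, so an exported but empty variable does not hide the parameter.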


167-179: exclude_args usage verified.

A web search and codebase analysis confirm that:

  1. exclude_args is a valid FastMCP parameter, per the FastMCP documentation
  2. The pinned FastMCP version supports it: the project uses fastmcp >=2.11.3,<3.0.0 (exclude_args is deprecated in 2.14+ but still works)
  3. workspace_id has a default value: workspace_id: str | None = None satisfies FastMCP's constraint that only arguments with defaults can be excluded
  4. The implementation is correct: workspace_id is excluded only when the domain is "cloud" AND the environment variable is set

The code also passes None to app.tool() when there are no args to exclude.

Minor note: The list comprehension on line 171 iterates over a single-element list ["workspace_id"], which works but could be simplified to a direct membership check if preferred.

airbyte/mcp/cloud_ops.py (6)

66-78: LGTM on the helper function update!

The _get_cloud_workspace function cleanly handles the optional workspace_id by delegating to resolve_cloud_workspace_id, which per the relevant code snippets handles the env var fallback. Good encapsulation.


95-101: Consistent parameter pattern.

The workspace_id parameter follows the established pattern with proper Annotated type hint and Field description. Nice work keeping it uniform across all the tools!


173-179: Consistent workspace_id propagation across deployment and sync tools.

The pattern is uniformly applied to deploy_destination_to_cloud, create_connection_on_cloud, and run_cloud_sync. Each correctly passes the optional workspace_id through to _get_cloud_workspace.

Also applies to: 265-271, 318-324


374-383: Nice addition of workspace override capability to the health check.

Adding workspace_id to check_airbyte_cloud_workspace allows users to verify connectivity to a specific workspace without relying on the environment variable. The keyword-only signature with * is appropriate here.
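
To make the keyword-only point concrete, here is a tiny sketch. check_workspace is a hypothetical function, not the real check_airbyte_cloud_workspace.

```python
def check_workspace(*, workspace_id=None):
    """Hypothetical tool: workspace_id can only be passed by keyword."""
    return f"checked {workspace_id or 'default workspace'}"


try:
    check_workspace("ws-123")  # positional use is rejected by Python
    positional_allowed = True
except TypeError:
    positional_allowed = False

by_keyword = check_workspace(workspace_id="ws-123")
by_default = check_workspace()
```

Because the bare * forces keyword use, adding workspace_id cannot shift the meaning of any existing positional argument.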


989-990: Safe mode check ordering is correct.

The check_guid_created_in_session is called before _get_cloud_workspace, which is the right order - it validates permission before incurring the cost of workspace authentication. Good defensive pattern!
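
The ordering described here can be sketched as follows. The function bodies are assumptions made for illustration: the real check_guid_created_in_session is not shown in this PR, and delete_connection and get_workspace are hypothetical.

```python
created_in_session = {"conn-1"}
workspace_lookups = []  # records each (potentially expensive) workspace lookup


def check_guid_created_in_session(guid):
    """Assumed behavior: raise unless the object was created in this session."""
    if guid not in created_in_session:
        raise PermissionError(f"{guid} was not created in this session.")


def get_workspace(workspace_id):
    """Hypothetical stand-in for the authenticated workspace lookup."""
    workspace_lookups.append(workspace_id)
    return {"id": workspace_id}


def delete_connection(connection_id, *, workspace_id=None):
    check_guid_created_in_session(connection_id)  # cheap guard runs first
    workspace = get_workspace(workspace_id)  # only reached when allowed
    return f"Deleted {connection_id} from {workspace['id']}"
```

A rejected call never reaches get_workspace, so no authentication work is spent on an operation that would be refused anyway.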


1316-1326: Overall implementation looks solid!

The workspace_id parameter has been consistently added across all cloud operations, and the register_cloud_ops_tools function correctly delegates to register_tools which now handles the conditional parameter exclusion. This is a clean implementation of per-workspace context support.

@github-actions

github-actions bot commented Nov 25, 2025

PyTest Results (Fast Tests Only, No Creds)

320 tests (+276): 320 ✅ (+277), 0 💤 (±0), 0 ❌ (-1)
1 suite (±0), 1 file (±0)
Duration: 5m 52s (+5m 15s)

Results for commit 4203f57. ± Comparison against base commit f1cb667.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
airbyte/mcp/cloud_ops.py (1)

95-101: Consider extracting a type alias to reduce repetition, wdyt?

The workspace_id annotation is repeated identically 22 times. You could define a type alias at the module level to DRY this up:

WorkspaceIdParam = Annotated[
    str | None,
    Field(
        description="Workspace ID. Defaults to AIRBYTE_CLOUD_WORKSPACE_ID env var.",
        default=None,
    ),
]

Then each function signature becomes simply workspace_id: WorkspaceIdParam = None (the Python-level default stays, since only arguments with defaults can be excluded). This would make future changes (e.g., updating the description) a single-line edit rather than 22.
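
A stdlib-only sketch of the alias idea follows. ParamInfo is a hypothetical stand-in for pydantic's Field so the example runs without third-party packages, and the two tool functions are invented for illustration.

```python
import inspect
from dataclasses import dataclass
from typing import Annotated, Optional


@dataclass(frozen=True)
class ParamInfo:
    """Hypothetical stand-in for pydantic's Field, to keep this stdlib-only."""

    description: str


# Defined once at module level and reused by every tool signature.
WorkspaceIdParam = Annotated[
    Optional[str],
    ParamInfo("Workspace ID. Defaults to AIRBYTE_CLOUD_WORKSPACE_ID env var."),
]


def list_sources(*, workspace_id: WorkspaceIdParam = None) -> list:
    return []


def list_destinations(*, workspace_id: WorkspaceIdParam) -> list:
    """Same alias, but without the Python-level default."""
    return []


description = WorkspaceIdParam.__metadata__[0].description
with_default = inspect.signature(list_sources).parameters["workspace_id"].default
without_default = inspect.signature(list_destinations).parameters["workspace_id"].default
```

The alias carries the description, but not a Python-level default: without_default is inspect.Parameter.empty, so a direct call to list_destinations() would fail. How FastMCP and pydantic treat a default declared only inside Field is not verified here.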

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between ad201e6 and 4203f57.

📒 Files selected for processing (1)
  • airbyte/mcp/cloud_ops.py (41 hunks)
🔇 Additional comments (3)
airbyte/mcp/cloud_ops.py (3)

66-78: LGTM! Clean implementation of workspace_id threading.

The helper function correctly passes the optional workspace_id to resolve_cloud_workspace_id, which handles the fallback to the environment variable. Nice and clean!


374-403: Nice backward-compatible addition!

The check_airbyte_cloud_workspace function signature change is clean - it now accepts an optional workspace_id while maintaining backward compatibility for existing callers who rely on the environment variable.


1316-1326: The exclude_args implementation in _tool_utils.py is correctly handling workspace_id filtering.

The code at lines 167-172 properly implements the conditional hiding of workspace_id from the MCP schema:

  • Line 29 defines AIRBYTE_CLOUD_WORKSPACE_ID_IS_SET which checks if the AIRBYTE_CLOUD_WORKSPACE_ID environment variable is set
  • Lines 169-172 conditionally build an exclude_args list that filters out workspace_id only when the env var is present and the domain is "cloud"
  • Line 178 passes exclude_args to app.tool(), which FastMCP uses to hide the parameter from the schema

This implementation ensures that when AIRBYTE_CLOUD_WORKSPACE_ID is configured, the workspace_id parameter is automatically excluded from the MCP tool schema, while still allowing it to be passed programmatically when the function is called. The logic correctly handles all 20+ cloud operation functions that now have the workspace_id parameter.

@aaronsteers Aaron ("AJ") Steers (aaronsteers) marked this pull request as ready for review November 25, 2025 22:15
@aaronsteers Aaron ("AJ") Steers (aaronsteers) changed the title from "feat(mcp): Add optional workspace_id parameter to cloud tools with conditional schema hiding (do not merge)" to "feat(mcp): Add optional workspace_id parameter to cloud tools with conditional schema hiding" on Nov 25, 2025
@github-actions

PyTest Results (Full)

389 tests (±0): 372 ✅ (-1), 16 💤 (±0), 1 ❌ (+1)
1 suite (±0), 1 file (±0)
Duration: 25m 0s (-3m 14s)

For more details on these failures, see this check.

Results for commit 4203f57. ± Comparison against base commit f1cb667.

Development

Successfully merging this pull request may close these issues.

feat(mcp): Add optional workspace_id parameter to cloud MCP tools with conditional schema hiding