feat(mcp): Add organization info to check_airbyte_cloud_workspace tool #897
- Add get_workspace_with_org_info function to api_util.py that retrieves workspace details including organization ID and name
- Update check_airbyte_cloud_workspace MCP tool to return CloudWorkspaceResult with workspace ID, workspace name, organization ID, and organization name
- Add organization_name field to CloudWorkspaceResult model
Co-Authored-By: AJ Steers <aj@airbyte.io>
👋 Greetings, Airbyte Team Member!
Here are some helpful tips and reminders for your convenience.
Testing This PyAirbyte Version
You can test this version of PyAirbyte using the following:
# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1765226177-mcp-check-workspace-org-info' pyairbyte --help
# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1765226177-mcp-check-workspace-org-info'
Helpful Resources
PR Slash Commands
Airbyte Maintainers can execute the following slash commands on your PR:
Community Support
Questions? Join the #pyairbyte channel in our Slack workspace.
📝 Walkthrough
Replaces string-based workspace checks with a structured CloudWorkspaceResult (renamed fields, added optional workspace_url and organization_name). check_airbyte_cloud_workspace now resolves credentials, calls api_util.get_workspace and api_util.get_workspace_organization_info, and returns a populated CloudWorkspaceResult. The CloudWorkspace model caches organization info.
Sequence Diagram(s)
sequenceDiagram
autonumber
participant Caller
participant CloudOps as CloudOps.check_airbyte_cloud_workspace
participant APIUtil as APIUtil
participant ConfigAPI as Config API
Caller->>CloudOps: check_airbyte_cloud_workspace(workspace_id, ...)
CloudOps->>CloudOps: resolve api_root, client_id, client_secret
CloudOps->>APIUtil: get_workspace(workspace_id, api_root, client_id, client_secret)
APIUtil->>ConfigAPI: GET /workspaces/{id}
ConfigAPI-->>APIUtil: 200 OK (workspace data)
APIUtil-->>CloudOps: workspace dict
CloudOps->>APIUtil: get_workspace_organization_info(workspace_id, api_root, client_id, client_secret)
APIUtil->>ConfigAPI: POST /workspaces/get_organization_info { "workspaceId": workspace_id }
ConfigAPI-->>APIUtil: 200 OK (organization data)
APIUtil-->>CloudOps: org info dict
CloudOps->>CloudOps: assemble CloudWorkspaceResult(workspace_id, workspace_name, workspace_url, organization_id, organization_name)
CloudOps-->>Caller: CloudWorkspaceResult
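The flow in the diagram can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the PyAirbyte implementation: `WorkspaceSummary`, `check_workspace`, and the two injected fetcher callables are hypothetical stand-ins for `CloudWorkspaceResult`, the MCP tool, and the `api_util` helpers, and the `"name"` key on the workspace payload is assumed. The `organizationId`/`organizationName` keys and the fallback string are the ones quoted elsewhere in this PR discussion.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

# Fallback text quoted in the PR discussion for a missing organization ID.
ORG_ID_FALLBACK = "[error: organization ID not discovered]"


@dataclass
class WorkspaceSummary:
    """Hypothetical stand-in for CloudWorkspaceResult."""

    workspace_id: str
    workspace_name: str
    organization_id: str
    workspace_url: Optional[str] = None
    organization_name: Optional[str] = None


def check_workspace(
    workspace_id: str,
    fetch_workspace: Callable[[str], dict[str, Any]],
    fetch_org_info: Callable[[str], dict[str, Any]],
    workspace_url: Optional[str] = None,
) -> WorkspaceSummary:
    """Fetch workspace details, then organization info, then assemble one result."""
    workspace = fetch_workspace(workspace_id)  # diagram steps 3-6
    org_info = fetch_org_info(workspace_id)  # diagram steps 7-10
    return WorkspaceSummary(
        workspace_id=workspace_id,
        workspace_name=workspace["name"],  # assumed payload key
        workspace_url=workspace_url,
        organization_id=org_info.get("organizationId") or ORG_ID_FALLBACK,
        organization_name=org_info.get("organizationName"),
    )


# Usage with in-memory fakes instead of real HTTP calls.
summary = check_workspace(
    "ws-123",
    fetch_workspace=lambda _id: {"name": "demo-workspace"},
    fetch_org_info=lambda _id: {"organizationId": "org-9", "organizationName": "Example Org"},
)
```

Injecting the fetchers keeps the sketch testable without credentials; the real tool resolves `api_root`, `client_id`, and `client_secret` first, as the diagram shows.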
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
…space info
Addresses review feedback: The previous implementation iterated through all organizations and workspaces, which would fail at scale for Airbyte team members with access to thousands of customer workspaces.
Changes:
- Remove get_workspace_with_org_info function from api_util.py
- Update check_airbyte_cloud_workspace to use existing get_workspace API
- Return workspace ID and name from public API
- Set organization_id/organization_name to empty/None (not available from public API without expensive lookup)
Co-Authored-By: AJ Steers <aj@airbyte.io>
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
airbyte/mcp/cloud_ops.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/mcp/cloud_ops.py (3)
airbyte/cloud/workspaces.py (1)
CloudWorkspace (64-651)
airbyte/cloud/auth.py (3)
resolve_cloud_api_url (25-33)
resolve_cloud_client_id (17-22)
resolve_cloud_client_secret (9-14)
airbyte/_util/api_util.py (1)
get_workspace (100-127)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Windows)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.10, Windows)
- GitHub Check: Pytest (No Creds)
🔇 Additional comments (3)
airbyte/mcp/cloud_ops.py (3)
474-485: Good decision to use direct workspace lookup instead of expensive organization iteration.
The implementation correctly resolves credentials and calls api_util.get_workspace() directly, which avoids the performance concern mentioned in the PR objectives about iterating through all organizations and workspaces. This is a much more efficient approach for the common case.
161-162: Consider documenting that organization_name is currently unavailable or defer field addition.
The organization_name field is a useful addition to the model, but if the implementation cannot populate it (due to API constraints), clarify this in the field's docstring or defer adding the field until organization data is actually available. Leaving unpopulated fields can create confusion for users who see them in the model but never receive values.
468-468: Return type change improves API with structured data.
The function now returns CloudWorkspaceResult instead of a string, providing structured workspace information (id, name, organization details). Since MCP tools operate through JSON-RPC serialization via FastMCP, Pydantic models are natively supported and serialize cleanly to JSON. This is not a breaking change for MCP consumers; it's an improvement that provides more useful, well-defined data instead of an unstructured string.
PyTest Results (Full)
416 tests ±0, 399 ✅ -1, 26m 6s ⏱️ -13s
For more details on these failures, see this check.
Results for commit a7a0331. ± Comparison against base commit 69f6c96.
♻️ This comment has been updated with latest results.
Found the /v1/workspaces/get_organization_info Config API endpoint, which provides an efficient O(1) lookup to get organization info for a workspace.
Changes:
- Add get_workspace_organization_info function to api_util.py that calls the Config API endpoint POST /workspaces/get_organization_info
- Update check_airbyte_cloud_workspace to use the new function to return organization_id and organization_name
This replaces the previous brute-force approach that would have iterated through all organizations and workspaces.
Co-Authored-By: AJ Steers <aj@airbyte.io>
Actionable comments posted: 0
♻️ Duplicate comments (1)
airbyte/mcp/cloud_ops.py (1)
459-499: Workspace URL no longer surfaced from this tool; still acceptable?
The updated implementation returns a structured CloudWorkspaceResult but does not expose a workspace_url field, so consumers lose the direct link that older string responses included. Since we already have workspace: CloudWorkspace in scope, would it be worth adding a workspace_url field to CloudWorkspaceResult and setting it from workspace.workspace_url to preserve that affordance for agents, or are we intentionally dropping it here, wdyt?
🧹 Nitpick comments (3)
airbyte/_util/api_util.py (1)
1677-1710: New Config API helper looks consistent; consider tightening the contract?
The helper is nicely aligned with the existing _make_config_api_request pattern and solves the O(1) lookup need. Would it be worth either (a) documenting in the docstring that this can raise AirbyteError on non-2xx responses, or (b) in the future, wrapping the raw dict[str, Any] in a small typed model so callers don't rely on magic keys like "organizationId"/"organizationName" everywhere, wdyt?
airbyte/mcp/cloud_ops.py (2)
152-162: CloudWorkspaceResult shape now includes organization_name; confirm expectations around organization_id emptiness?
Nice addition making organization_name optional so we can populate it where available. Given that some callers (like check_airbyte_cloud_workspace) now have a reliable organizationId and others may not, do we still expect organization_id to be strictly non-empty, or should we consider allowing str | None there as well so "unknown" is explicitly represented instead of "", wdyt?
459-499: check_airbyte_cloud_workspace now does real org lookup; consider failure behavior and consumer impact?
The new flow (public API workspace fetch + Config API get_organization_info + structured CloudWorkspaceResult) looks good and fixes the earlier brute-force concern.
Two follow-ups to consider:
- Failure behavior for org lookup: Right now, if /workspaces/get_organization_info fails (auth, permissions, feature flag, etc.), the whole tool fails. For a "check connectivity + basic info" tool, would you prefer to catch AirbyteError around get_workspace_organization_info and still return workspace id/name with organization_id/organization_name left unset, so users don't lose the basic check when org metadata is unavailable, wdyt?
- Breaking return type change: Since the tool now returns CloudWorkspaceResult instead of a str, have we confirmed that all MCP consumers of check_airbyte_cloud_workspace are prepared to handle the structured result (and that any prompt templates / expectations of a plain string were updated), wdyt?
📒 Files selected for processing (2)
airbyte/_util/api_util.py (1 hunks)
airbyte/mcp/cloud_ops.py (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-07-11T19:53:44.427Z
Learnt from: yohannj
Repo: airbytehq/PyAirbyte PR: 716
File: airbyte/logs.py:384-402
Timestamp: 2025-07-11T19:53:44.427Z
Learning: In the PyAirbyte project, when reviewing PRs, maintain clear separation of concerns. Don't suggest changes that are outside the scope of the PR's main objective, even if they would improve consistency or fix other issues. This helps with reviewing changes and potential reverts.
Applied to files:
airbyte/mcp/cloud_ops.py
🧬 Code graph analysis (1)
airbyte/_util/api_util.py (1)
airbyte/secrets/base.py (1)
SecretString (38-143)
…kspace_name
Per review feedback, use more explicit field names:
- id -> workspace_id
- name -> workspace_name
Co-Authored-By: AJ Steers <aj@airbyte.io>

Per review feedback, use '[error: organization ID not discovered]' instead of empty string when organization ID is not found.
Co-Authored-By: AJ Steers <aj@airbyte.io>

…CloudWorkspace
- Added cached_property _organization_info that fetches org info via Config API
- Added organization_id and organization_name properties that use the cached result
- Updated check_airbyte_cloud_workspace MCP tool to use CloudWorkspace properties instead of calling get_workspace_organization_info directly
Co-Authored-By: AJ Steers <aj@airbyte.io>

…k error)
Co-Authored-By: AJ Steers <aj@airbyte.io>
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
airbyte/mcp/cloud_ops.py (1)
459-491: Implementation uses indirect API calls via CloudWorkspace properties; verify breaking change consumer impact.
The function correctly fetches workspace details and organization info through CloudWorkspace properties. The organization_id and organization_name properties internally call the get_workspace_organization_info endpoint (POST /v1/workspaces/get_organization_info) via a cached property, ensuring efficient lookups.
The return type has changed from str to CloudWorkspaceResult. Please confirm:
- Breaking change: All MCP consumers have been updated to handle the structured CloudWorkspaceResult response instead of the previous string format.
🧹 Nitpick comments (1)
airbyte/mcp/cloud_ops.py (1)
152-162: Model changes look good; consider organization_id type and workspace_url.
The field renames to workspace_id and workspace_name improve clarity, and the new organization_name field is properly typed as optional. A couple of points to consider:
- workspace_url removal: Past review comments noted that the previous implementation returned workspace_url, but it's not included in this model. The CloudWorkspace class has a workspace_url property that could be included here. The PR objectives mention verifying this removal is acceptable. Have you confirmed with MCP consumers that they don't need the URL, wdyt?
- organization_id type: The field is typed as str (not str | None), but workspace.organization_id can return None. The implementation handles this with the fallback string "[error: organization ID not discovered]", but semantically it's a bit unusual to have an error message as a valid ID value. Would it be cleaner to make organization_id: str | None and let consumers handle None explicitly?
Based on past review comments, this was previously discussed. Have the concerns been resolved?
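To make the `str | None` alternative concrete, here is a small sketch of what a nullable `organization_id` would mean for consumers. It uses a standard-library dataclass so it runs on its own; the actual model is a Pydantic model, and `WorkspaceResultSketch` and `org_label` are hypothetical names introduced only for this illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class WorkspaceResultSketch:
    """Hypothetical variant of CloudWorkspaceResult with a nullable organization_id."""

    workspace_id: str
    workspace_name: str
    organization_id: Optional[str] = None
    organization_name: Optional[str] = None


def org_label(result: WorkspaceResultSketch) -> str:
    """Consumers branch on None instead of string-matching an error message."""
    if result.organization_id is None:
        return "organization unknown"
    return result.organization_name or result.organization_id


unknown = WorkspaceResultSketch(workspace_id="ws-123", workspace_name="demo-workspace")
payload = json.dumps(asdict(unknown))  # None serializes as JSON null
```

The trade-off raised in the review is visible here: a nullable field pushes a `None` check onto every consumer, while the sentinel string keeps the field type simple at the cost of an ID value that is not really an ID.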
📒 Files selected for processing (2)
airbyte/cloud/workspaces.py (2 hunks)
airbyte/mcp/cloud_ops.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/mcp/cloud_ops.py (3)
airbyte/cloud/workspaces.py (2)
organization_id (101-106)
organization_name (109-114)
airbyte/cloud/auth.py (3)
resolve_cloud_api_url (25-33)
resolve_cloud_client_id (17-22)
resolve_cloud_client_secret (9-14)
airbyte/_util/api_util.py (1)
get_workspace (100-127)
🔇 Additional comments (3)
airbyte/cloud/workspaces.py (3)
39-39: LGTM on the import addition!
The cached_property import is correctly placed and necessary for the organization info caching functionality below.
100-106: Clean implementation of organization_id property!
The property correctly returns str | None and uses .get("organizationId") to safely handle cases where the organization ID isn't available. The caching via _organization_info ensures efficient lookups.
108-114: LGTM on organization_name property!
Consistent implementation with organization_id. The property safely handles missing organization names and leverages the cached lookup.
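The caching pattern described in these comments can be illustrated with `functools.cached_property` from the standard library. This is a trimmed-down sketch, not the actual `CloudWorkspace` class: `WorkspaceSketch`, the injected `fetch_org_info` callable, and `fake_fetch` are hypothetical, while the `_organization_info` name and the `organizationId`/`organizationName` keys follow the PR discussion.

```python
from functools import cached_property
from typing import Any, Callable, Optional


class WorkspaceSketch:
    """Hypothetical, trimmed-down stand-in for CloudWorkspace."""

    def __init__(
        self,
        workspace_id: str,
        fetch_org_info: Callable[[str], dict[str, Any]],
    ) -> None:
        self.workspace_id = workspace_id
        self._fetch_org_info = fetch_org_info  # injected in place of the real API helper

    @cached_property
    def _organization_info(self) -> dict[str, Any]:
        # Runs once per instance; the result is stored on the instance afterwards.
        return self._fetch_org_info(self.workspace_id)

    @property
    def organization_id(self) -> Optional[str]:
        return self._organization_info.get("organizationId")

    @property
    def organization_name(self) -> Optional[str]:
        return self._organization_info.get("organizationName")


calls: list[str] = []


def fake_fetch(workspace_id: str) -> dict[str, Any]:
    """Records each call so the caching behavior can be observed."""
    calls.append(workspace_id)
    return {"organizationId": "org-9", "organizationName": "Example Org"}


workspace = WorkspaceSketch("ws-123", fake_fetch)
values = (workspace.organization_id, workspace.organization_name, workspace.organization_id)
```

After the three property reads, `calls` holds a single entry, which is the "one API round-trip per workspace object" behavior the review comments describe.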
Devin, I need you to add back the workspace id in the result of the MCP tool. Looks like that got dropped.
Hi AJ, I just tested the MCP tool locally using
{
  "workspace_id": "19d7a891-8e0e-40ac-8a8c-5faf8d11e47c",
  "workspace_name": "aj-dataops-demos",
  "organization_id": "664c690e-5263-49ba-b01f-4a6759b3330a",
  "organization_name": "Airbyte Team"
}
Are you perhaps testing against the published PyAirbyte package (v0.33.6) instead of the PR branch? The published version wouldn't have these changes yet. To test the PR branch, you can use:
poe mcp-tool-test check_airbyte_cloud_workspace '{"workspace_id": "YOUR_WORKSPACE_ID"}'
Or install from the PR branch:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1765226177-mcp-check-workspace-org-info'
Let me know if you're seeing something different!
Co-Authored-By: AJ Steers <aj@airbyte.io>
Actionable comments posted: 1
📒 Files selected for processing (1)
airbyte/mcp/cloud_ops.py (2 hunks)
🔇 Additional comments (1)
airbyte/mcp/cloud_ops.py (1)
470-494: Implementation is correct - CloudWorkspace properties are properly defined.
The structured CloudWorkspaceResult return type is a solid improvement over the previous string return. The implementation correctly combines data from workspace_response (for workspace details) and the workspace object (for URL and organization info).
The workspace.organization_id and workspace.organization_name properties are both defined in CloudWorkspace at lines 36–50 of airbyte/cloud/workspaces.py. Both are implemented as @property decorators that retrieve values from a @cached_property method (_organization_info), which lazily fetches organization data via api_util.get_workspace_organization_info(). Since these properties return str | None, the fallback message "[error: organization ID not discovered]" in the code correctly handles the None case.
One note: The return type change from str to CloudWorkspaceResult is a breaking change for existing MCP tool consumers; this is documented in the PR objectives, so awareness is good.
…esult fields Co-Authored-By: AJ Steers <aj@airbyte.io>
Actionable comments posted: 0
🧹 Nitpick comments (2)
airbyte/mcp/cloud_ops.py (2)
461-494: check_airbyte_cloud_workspace behavior and payload look solid; two tiny polish ideas
The new implementation cleanly:
- Resolves credentials,
- Verifies the workspace via the API,
- Returns a structured CloudWorkspaceResult including URL and org info (with a clear fallback string when org lookup fails).
Functionally this looks good to ship. Two minor follow-ups you might consider:
- The docstring (Line 473) only calls out ID, name, and org info. Since you now also return workspace_url, would it be helpful to mention that explicitly so tool consumers know it's available, wdyt?
- You resolve api_root, client_id, and client_secret separately from _get_cloud_workspace, even though CloudWorkspace already holds these. Would reusing the values from workspace (for a single source of truth and one fewer env read) be a small simplification worth doing here, or is the explicit re-resolution intentional, wdyt?
1259-1264: list_cloud_workspaces mapping now matches the updated model; optional enrichment
Using workspace_id= and workspace_name= here fixes the earlier mismatch with the model and should avoid any construction issues; this looks correct to me.
One optional enhancement: if the list_workspaces_in_organization API ever exposes fields that would let you populate workspace_url and/or organization_name without extra round-trips, would you want to enrich the CloudWorkspaceResult here for parity with check_airbyte_cloud_workspace, or is keeping the list payload minimal an intentional choice, wdyt?
📒 Files selected for processing (1)
airbyte/mcp/cloud_ops.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/mcp/cloud_ops.py (3)
airbyte/cloud/workspaces.py (4)
workspace_url (83-85)
organization_id (101-106)
organization_name (109-114)
CloudWorkspace (65-681)
airbyte/cloud/auth.py (3)
resolve_cloud_api_url (25-33)
resolve_cloud_client_id (17-22)
resolve_cloud_client_secret (9-14)
airbyte/_util/api_util.py (1)
get_workspace (100-127)
🔇 Additional comments (1)
airbyte/mcp/cloud_ops.py (1)
155-164: CloudWorkspaceResult shape looks consistent and expressive
The switch to workspace_id/workspace_name plus optional workspace_url/organization_name lines up nicely with how other result models are structured; I don't see any functional issues here, and the required organization_id keeps callers' expectations clear. LGTM from my side.
9de76d2 into main
Summary
Updates the check_airbyte_cloud_workspace MCP tool to return organization ID and name alongside workspace information. This helps users who don't know their organization ID.
Changes:
- … get_workspace_organization_info() function in api_util.py that calls the Config API endpoint POST /v1/workspaces/get_organization_info for an efficient O(1) lookup
- … CloudWorkspaceResult model to include organization_name field
- … check_airbyte_cloud_workspace return type from str to CloudWorkspaceResult with structured data
Updates since last revision:
- … /v1/workspaces/get_organization_info endpoint in the OpenAPI spec
- … CloudWorkspaceResult fields from id/name to workspace_id/workspace_name for clarity
- … [error: organization ID not discovered] when organization ID lookup fails
Review & Testing Checklist for Human
- … str to CloudWorkspaceResult. Verify this doesn't break existing MCP tool consumers.
- … POST /v1/workspaces/get_organization_info is used. Verify this endpoint is available in all target Airbyte Cloud environments.
- … workspace_url but the new one doesn't. Verify this isn't needed by consumers.
- … check_airbyte_cloud_workspace MCP tool with valid credentials and verify it returns the expected organization info (organizationId and organizationName).
Notes