
feat(mcp): Add definition_id to source results and org-level search tool #907

Closed
devin-ai-integration[bot] wants to merge 1 commit into main from
devin/1765422640-add-definition-id-to-mcp-tools

@devin-ai-integration
Contributor

feat(mcp): Add definition_id to source results and org-level search tool

Summary

This PR adds the ability to identify sources by their connector type (definition_id) rather than just by name. It includes:

  1. New definition_id property on CloudSource - Exposes the connector definition ID (source type) from the underlying API response
  2. Updated CloudSourceResult - Now includes definition_id field in the response
  3. Updated list_deployed_cloud_source_connectors - Returns definition_id for each source
  4. New list_sources_by_definition_in_organization tool - Searches all workspaces in an organization for sources matching a specific connector type

This enables use cases like "find all YouTube Analytics sources in an organization" regardless of what the sources are named.
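A minimal sketch of the org-level search described above, using plain dataclasses rather than the real PyAirbyte classes. `Source`, `Workspace`, and `find_sources_by_definition` are hypothetical names chosen for illustration; only the `definition_id` field and the YouTube Analytics definition ID come from this PR.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Source:
    """Hypothetical stand-in for a deployed source; not the real CloudSource."""

    name: str
    definition_id: str  # connector type, shared by every source of that type


@dataclass
class Workspace:
    """Hypothetical stand-in for a workspace holding deployed sources."""

    workspace_id: str
    sources: list[Source] = field(default_factory=list)


def find_sources_by_definition(
    workspaces: list[Workspace], definition_id: str
) -> dict[str, list[Source]]:
    """Return matching sources per workspace, skipping workspaces with no match."""
    matches: dict[str, list[Source]] = {}
    for ws in workspaces:  # sequential, one workspace at a time
        found = [s for s in ws.sources if s.definition_id == definition_id]
        if found:
            matches[ws.workspace_id] = found
    return matches


YOUTUBE_ANALYTICS = "afa734e4-3571-11ec-991a-1e0031268139"
workspaces = [
    Workspace("ws-1", [Source("Marketing videos", YOUTUBE_ANALYTICS)]),
    Workspace("ws-2", [Source("Billing DB", "some-other-definition-id")]),
]
result = find_sources_by_definition(workspaces, YOUTUBE_ANALYTICS)
```

Matching on `definition_id` instead of `name` is what lets the search ignore whatever the sources happen to be called.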

Review & Testing Checklist for Human

  • Verify definition_id is correctly populated - The property accesses _connector_info.definition_id which should be available from the Airbyte API's SourceResponse. Confirm this field exists and is populated correctly.
  • Test the new org-level tool with a real organization - Run list_sources_by_definition_in_organization against an org with known sources to verify it finds them correctly
  • Check performance for large organizations - The tool iterates workspaces sequentially; verify this is acceptable for orgs with many workspaces (77+ in the original use case)
  • Verify backward compatibility - Ensure existing MCP tool consumers handle the new definition_id field gracefully
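If the sequential iteration flagged in the checklist turns out to be too slow, one option is to fan out the per-workspace lookups with a thread pool. This is only a sketch and not what the PR implements: `fetch_sources` is a hypothetical placeholder for whatever per-workspace API call the tool makes, and the workspace and definition IDs are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_sources(workspace_id: str) -> list[dict]:
    """Hypothetical placeholder for one per-workspace API call."""
    fake_api = {
        "ws-1": [{"name": "yt", "definition_id": "def-a"}],
        "ws-2": [{"name": "pg", "definition_id": "def-b"}],
        "ws-3": [],
    }
    return fake_api[workspace_id]


def search_concurrently(
    workspace_ids: list[str], definition_id: str, max_workers: int = 8
) -> dict[str, list[dict]]:
    """Query workspaces in parallel and keep only those with matching sources."""
    matches: dict[str, list[dict]] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with workspace_ids.
        per_workspace = pool.map(fetch_sources, workspace_ids)
        for ws_id, sources in zip(workspace_ids, per_workspace):
            hits = [s for s in sources if s["definition_id"] == definition_id]
            if hits:
                matches[ws_id] = hits
    return matches


matches = search_concurrently(["ws-1", "ws-2", "ws-3"], "def-a")
```

Threads suit this case because the work is I/O-bound; `max_workers` would need tuning against any API rate limits.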

Recommended Test Plan

  1. Use the MCP CLI to call list_deployed_cloud_source_connectors and verify definition_id appears in results
  2. Call list_sources_by_definition_in_organization with a known connector definition ID (e.g., afa734e4-3571-11ec-991a-1e0031268139 for YouTube Analytics) and verify it finds expected sources
  3. Verify the tool returns empty results gracefully when no matches are found

Notes

  • No unit tests were added - consider adding tests for the new CloudSource.definition_id property and the org-level tool
  • The org-level tool may be slow for large organizations as it queries each workspace individually
  • Lint checks pass (poe fix-and-check)
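As a starting point for the missing unit tests, the property could be checked along these lines. `FakeCloudSource` is a hypothetical stand-in that mirrors the behavior described in this PR (reading `_connector_info.definition_id`); a real test would construct `CloudSource` with a stubbed API response instead.

```python
from types import SimpleNamespace


class FakeCloudSource:
    """Hypothetical stand-in mirroring the property described in this PR."""

    def __init__(self, connector_info: SimpleNamespace) -> None:
        self._connector_info = connector_info

    @property
    def definition_id(self) -> str:
        # Per the PR description, the value is read from the API response object.
        return self._connector_info.definition_id


def test_definition_id_is_exposed() -> None:
    info = SimpleNamespace(definition_id="afa734e4-3571-11ec-991a-1e0031268139")
    assert FakeCloudSource(info).definition_id == info.definition_id


test_definition_id_is_exposed()
```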

Link to Devin run: https://app.devin.ai/sessions/d8212a7bceaf4995a63369460134dc03
Requested by: aldo.gonzalez@airbyte.io

- Add definition_id property to CloudSource class
- Add definition_id field to CloudSourceResult model
- Update list_deployed_cloud_source_connectors to include definition_id
- Add new list_sources_by_definition_in_organization tool for searching
  sources by connector type across all workspaces in an organization

Co-Authored-By: aldo.gonzalez@airbyte.io <aldo.gonzalez@airbyte.io>
@devin-ai-integration
Contributor Author

Original prompt from aldo.gonzalez@airbyte.io
Received message in Slack channel #ask-devin-ai:

@Devin can you check all the workspaces in org c48aad53-0882-4012-874c-34cc4f439f93, I want to find all the connections that have Youtube Analytics from as sources, bring me those connections links please.
Thread URL: https://airbytehq-team.slack.com/archives/C08BHPUMEPJ/p1765414896273579

@devin-ai-integration
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1765422640-add-definition-id-to-mcp-tools' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1765422640-add-definition-id-to-mcp-tools'

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /fix-pr - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test-pr - Runs tests with the updated PyAirbyte

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.


@github-actions

PyTest Results (Fast Tests Only, No Creds)

348 tests  ±0   348 ✅ ±0   5m 59s ⏱️ -7s
  1 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit ece066f. ± Comparison against base commit 441105e.

Comment on lines +1598 to +1603
    return OrgSourcesByDefinitionResult(
        organization_id=resolved_org_id,
        definition_id=definition_id,
        total_sources_found=total_sources_found,
        workspaces_with_matches=workspaces_with_matches,
    )

Something feels odd: the name is list_sources_by_definition_in_organization, but we return workspaces_with_matches=workspaces_with_matches. Meh, I don't like it. Do you have other ideas for this?

@github-actions

PyTest Results (Full)

416 tests  ±0   398 ✅  - 1   24m 26s ⏱️ -2s
  1 suites ±0    17 💤 ±0 
  1 files   ±0     1 ❌ +1 

For more details on these failures, see this check.

Results for commit ece066f. ± Comparison against base commit 441105e.

        return self.connector_id

    @property
    def definition_id(self) -> str:

Aldo Gonzalez (@aldogonzalez8) (and Devin I guess): Can we call this connector_definition_id? On its own, "definition ID" could be misunderstood to mean something about the actor (instance) ID. I'm also curious whether we could get something similar working with the canonical ID (source-something) rather than requiring the definition ID to be exposed to users.

One more thing to think about: this might have a better home in the new Internal-Ops. It looks like you and I are maybe thinking similarly?

(Or source_definition_id/destination_definition_id? I think I have some references already to this concept in the custom connections interfaces.)

Contributor Author

Good suggestions! I can make these changes if Aldo Gonzalez (@aldogonzalez8) agrees:

  1. Rename to connector_definition_id - Makes it clearer this refers to the connector type, not the actor instance

  2. Support canonical ID - Could add a connector_type parameter (e.g., source-youtube-analytics) as an alternative to the UUID. The tool would resolve it to the definition ID internally.

Waiting for direction on whether to proceed with these changes here or move this to Internal-Ops.
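A rough sketch of how the proposed `connector_type` resolution could work, assuming a simple lookup table. `CANONICAL_TO_DEFINITION_ID` and `resolve_definition_id` are hypothetical names, and pairing `source-youtube-analytics` with the UUID from the test plan is an assumption based on the examples in this thread; a real implementation would presumably consult a connector registry.

```python
import re

# Hypothetical lookup table; assumed mapping based on the examples in this PR.
CANONICAL_TO_DEFINITION_ID = {
    "source-youtube-analytics": "afa734e4-3571-11ec-991a-1e0031268139",
}

# Matches the hyphenated 8-4-4-4-12 hex form used by definition IDs.
_UUID_RE = re.compile(r"^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$", re.IGNORECASE)


def resolve_definition_id(connector: str) -> str:
    """Accept either a definition UUID or a canonical connector name."""
    if _UUID_RE.match(connector):
        return connector.lower()  # already a definition ID; normalize case
    try:
        return CANONICAL_TO_DEFINITION_ID[connector]
    except KeyError:
        raise ValueError(f"Unknown connector type: {connector!r}") from None


resolved = resolve_definition_id("source-youtube-analytics")
```

Accepting both forms in one parameter keeps existing UUID-based callers working while letting users pass the friendlier canonical name.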

Contributor Author

source_definition_id makes sense since this is on the CloudSource class - it's more specific than the generic connector_definition_id. I can rename it once there's consensus on the preferred naming.

@devin-ai-integration
Contributor Author

Closing this PR as we've decided to move this functionality to Internal-Ops per discussion with Aaron ("AJ") Steers (@aaronsteers) and Aldo Gonzalez (@aldogonzalez8). Will open a new PR there with the updated design (connection-centric, connector_definition_id naming, canonical name support).

2 participants