fix: Skip local validation when deploying connectors to Cloud via MCP#851
Conversation
- Pass config_spec_jsonschema=None to resolve_config() to skip hardcoded secrets validation - Call set_config(validate=False) to avoid triggering local connector execution - This prevents installation errors when deploying to Cloud without local Python/Docker Fixes #850 Co-Authored-By: AJ Steers <aj@airbyte.io>
Original prompt from AJ Steers |
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This PyAirbyte VersionYou can test this version of PyAirbyte using the following: # Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1762289840-fix-mcp-cloud-destination-install' pyairbyte --help
# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1762289840-fix-mcp-cloud-destination-install'Helpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
Community SupportQuestions? Join the #pyairbyte channel in our Slack workspace. |
|
Warning Rate limit exceededdevin-ai-integration[bot] has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 4 minutes and 12 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughAdd a NoOp executor path and registry-spec utilities; get_source/get_destination gain a no_executor flag; cloud deploys request server-side validation via set_config(..., validate=True); connector listing defaults to Docker-installable connectors. Changes
Sequence Diagram(s)sequenceDiagram
participant Caller
participant SourcesUtil as get_source()
participant ExecUtil as get_connector_executor()
participant Registry as ConnectorRegistry
participant NoOp as NoOpExecutor
Caller->>SourcesUtil: get_source(name, version, no_executor=True)
SourcesUtil->>ExecUtil: get_connector_executor(name, version, no_executor=True)
ExecUtil->>Registry: get_connector_metadata(name)
Registry-->>ExecUtil: metadata
ExecUtil->>NoOp: instantiate NoOpExecutor(name, metadata, target_version)
NoOp-->>ExecUtil: executor
ExecUtil-->>SourcesUtil: executor
SourcesUtil-->>Caller: Source(executor=NoOp)
sequenceDiagram
participant User
participant CloudOps
participant Registry
participant CloudAPI
User->>CloudOps: deploy_source_to_cloud(name, config)
CloudOps->>Registry: get_connector_metadata(name)
Registry-->>CloudOps: metadata
CloudOps->>CloudAPI: set_config(config, validate=true, executor_noop=true)
CloudAPI-->>CloudOps: deploy result
CloudOps-->>User: status
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches✅ Passed checks (2 passed)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
airbyte/mcp/connector_registry.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/mcp/connector_registry.py (1)
airbyte/sources/registry.py (1)
get_available_connectors(225-281)
🪛 GitHub Actions: Run Linters
airbyte/mcp/connector_registry.py
[error] 74-74: Undefined name 'InstalledType' (F821) in type annotation. Referenced as InstalledType.DOCKER.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Windows)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Windows)
- GitHub Check: Pytest (No Creds)
- GitHub Check: Pytest (Fast)
🔇 Additional comments (1)
airbyte/mcp/connector_registry.py (1)
73-74: Verify: Should we list all connectors regardless of local Docker availability?This changes the default behavior from environment-aware filtering to always returning all Docker connectors. Looking at the
get_available_connectors()implementation, wheninstall_type=None, it intelligently checks whether Docker is installed and filters accordingly. Withinstall_type=InstallType.DOCKER, it always returns all connectors.In environments where Docker is not available, this will now list connectors that cannot actually be installed or run locally. Is this intentional for Cloud-focused workflows, or should we preserve the environment-aware behavior? Wdyt?
If the intent is to make this Cloud-focused (where Docker is always available), consider updating the docstring to clarify that the list represents connectors available for Cloud deployment rather than local execution.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
…al installation - Implement NoOpExecutor that fetches connector specs from registry - Add helper functions get_connector_spec_from_registry() and validate_connector_config_from_registry() - Add no_executor parameter to get_source() and get_destination() - Update MCP cloud deploy tools to use NoOpExecutor by default - Enables pre-validation of connector configs without Docker/Python installation - Fixes connector_registry.py typo: InstalledType -> InstallType Closes #850 Co-Authored-By: AJ Steers <aj@airbyte.io>
…ttps://git-manager.devin.ai/proxy/github.com/airbytehq/PyAirbyte into devin/1762289840-fix-mcp-cloud-destination-install
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
airbyte/mcp/connector_registry.py (1)
17-17: Import fix looks good!This correctly addresses the undefined name error flagged in the previous review.
🧹 Nitpick comments (4)
airbyte/sources/util.py (1)
121-140: Same concerns as destinations/util.py apply here.The conditional logic mirrors the destination implementation (lines 79-97 in
airbyte/destinations/util.py). The same verification request about None metadata handling and the refactoring suggestion to extract common logic apply here as well.airbyte/_util/registry_spec.py (2)
19-19: Consider making the registry URL configurable.The spec URL is currently hardcoded to
connectors.airbyte.com. I noticed in the error messages fromairbyte/_executors/util.pythere's mention of a_REGISTRY_ENV_VARfor custom registries. Should this URL template also respect custom registry configuration for consistency? Wdyt?
27-27: Consider documenting the platform parameter more thoroughly.The
platformparameter accepts"cloud"or"oss", but there's no explanation of when to use which. I see thatNoOpExecutortries "cloud" first then falls back to "oss" (lines 78-88 inairbyte/_executors/noop.py). It might help users to document:
- What's the difference between cloud and OSS specs?
- When should callers prefer one over the other?
Wdyt about expanding the docstring?
airbyte/_executors/noop.py (1)
76-88: Consider making platform preference configurable.The executor currently tries "cloud" first, then falls back to "oss". This makes sense for Cloud deployments, but some users might want to prefer OSS specs. Would it make sense to add an optional
platform_preferenceparameter to theNoOpExecutorconstructor that could be passed through fromget_source()/get_destination()? This would be consistent with howregistry_spec.pyalready accepts aplatformparameter. Wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
airbyte/_executors/noop.py(1 hunks)airbyte/_util/registry_spec.py(1 hunks)airbyte/destinations/util.py(4 hunks)airbyte/mcp/cloud_ops.py(2 hunks)airbyte/mcp/connector_registry.py(2 hunks)airbyte/sources/util.py(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- airbyte/mcp/cloud_ops.py
🧰 Additional context used
🧬 Code graph analysis (5)
airbyte/destinations/util.py (3)
airbyte/_executors/noop.py (1)
NoOpExecutor(26-131)airbyte/_executors/util.py (1)
get_connector_executor(158-371)airbyte/sources/registry.py (1)
get_connector_metadata(195-222)
airbyte/sources/util.py (3)
airbyte/_executors/noop.py (1)
NoOpExecutor(26-131)airbyte/_executors/util.py (1)
get_connector_executor(158-371)airbyte/sources/registry.py (1)
get_connector_metadata(195-222)
airbyte/_executors/noop.py (4)
airbyte/_executors/base.py (1)
Executor(159-248)airbyte/_util/registry_spec.py (1)
get_connector_spec_from_registry(23-98)airbyte/sources/registry.py (1)
ConnectorMetadata(58-89)airbyte/exceptions.py (1)
AirbyteConnectorExecutableNotFoundError(340-341)
airbyte/mcp/connector_registry.py (1)
airbyte/sources/registry.py (2)
InstallType(41-47)get_available_connectors(225-281)
airbyte/_util/registry_spec.py (3)
airbyte/sources/registry.py (1)
get_connector_metadata(195-222)airbyte/version.py (1)
get_version(12-14)airbyte/exceptions.py (1)
AirbyteConnectorNotRegisteredError(288-297)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.11, Windows)
- GitHub Check: Pytest (All, Python 3.10, Windows)
- GitHub Check: Pytest (No Creds)
🔇 Additional comments (9)
airbyte/destinations/util.py (3)
11-14: Imports look good!The new imports are necessary for the no_executor pathway and are properly used below.
36-36: Parameter addition looks good!The
no_executorparameter with a default ofFalsemaintains backward compatibility while enabling the new Cloud deployment pathway.
79-97: Code duplication in executor creation can be reduced with a helper function.The None metadata handling is actually fine—
get_connector_spec_from_registry()explicitly handlesversion=Noneby using the latest available version from metadata. However, you've got identical executor creation logic in bothget_destination()andget_source()(lines 79-97 in destinations/util.py and lines 121-140 in sources/util.py).Would extracting a
_create_noop_executor()helper function make sense here? It could acceptnameandversionparameters and return the executor, reducing duplication and making future updates easier to maintain consistently across both functions. Wdyt?airbyte/sources/util.py (1)
10-14: Consistent implementation with destinations!The imports and structure mirror the changes in
airbyte/destinations/util.py, which is great for consistency.airbyte/_util/registry_spec.py (2)
23-98: Excellent error handling!The function gracefully handles multiple failure scenarios (timeouts, request failures, missing specs) by logging warnings and returning
None. This allows callers likeNoOpExecutorto decide how to handle unavailable specs without crashing. The logging provides good debugging context too.
101-141: Clean validation implementation!The function properly reuses
get_connector_spec_from_registryand provides clear return values. The tuple return type(bool, str | None)makes it easy for callers to check validity and get error details when needed.airbyte/_executors/noop.py (3)
26-53: Well-designed class structure!The
NoOpExecutorproperly extends theExecutorbase class and maintains a cached spec to minimize registry requests. The docstring clearly explains the use case and limitations.
90-96: Good graceful degradation!When the spec can't be fetched, logging a warning and yielding nothing (instead of crashing) allows Cloud deployments to proceed even if the registry is temporarily unavailable. The warning provides debugging context without blocking the deployment flow.
120-131: No-op installation methods are appropriate!Since
NoOpExecutordoesn't require local installation (the whole point!), these no-op implementations ofensure_installation(),install(), anduninstall()are exactly right. The docstrings clearly explain why they're no-ops.
…d tests - Refactored get_connector_executor() to handle no_executor parameter directly - Removed NoOpExecutor branching logic from get_source() and get_destination() - Both helpers now pass no_executor through to get_connector_executor() - Fixed NoOpExecutor error guidance to reference correct parameter (no_executor) - Added integration tests for registry_spec.py helper functions - Uses local import in get_connector_executor() to avoid import cycles Addresses PR feedback from @aaronsteers on PR #851 Co-Authored-By: AJ Steers <aj@airbyte.io>
- Changed invalid test case from using unknown fields to type/boundary violations - Added test for type violation: count as string instead of integer - Added test for boundary violation: count = 0 when minimum is 1 - These test actual JSON schema validation rather than relying on required fields Co-Authored-By: AJ Steers <aj@airbyte.io>
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
airbyte/_executors/noop.py (1)
120-123: Remove the unnecessary pass statement.As noted in the previous review, line 123's
passis redundant since line 122 (_ = auto_fix # Unused) already keeps the method body non-empty. Wdyt about removing it for consistency?Apply this diff:
def ensure_installation(self, *, auto_fix: bool = True) -> None: """No-op: NoOpExecutor doesn't require installation.""" _ = auto_fix # Unused - pass
🧹 Nitpick comments (1)
airbyte/_executors/noop.py (1)
76-107: Solid spec command implementation with good fallback logic.The cloud-first fallback to OSS is a smart approach, and the caching prevents redundant fetches. The SPEC message structure matches the expected Airbyte protocol format.
Minor nitpick: Line 95 uses
yield from []which is equivalent to justreturnon line 96. Wdyt about simplifying to just the return statement?If you'd like to simplify, apply this diff:
if spec is None: logger.warning( f"Could not fetch spec for connector '{self.name}' from registry. " f"Pre-validation will not be available." ) - yield from [] return
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
airbyte/_executors/noop.py(1 hunks)airbyte/_executors/util.py(1 hunks)airbyte/destinations/util.py(2 hunks)airbyte/sources/util.py(2 hunks)tests/integration_tests/test_registry_spec.py(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-10-08T15:34:31.026Z
Learnt from: aaronsteers
Repo: airbytehq/PyAirbyte PR: 347
File: tests/integration_tests/fixtures/registry.json:48-48
Timestamp: 2024-10-08T15:34:31.026Z
Learning: Test fixtures in the PyAirbyte project do not need to align with real Docker repositories.
Applied to files:
tests/integration_tests/test_registry_spec.py
🧬 Code graph analysis (5)
airbyte/destinations/util.py (2)
airbyte/_executors/util.py (1)
get_connector_executor(158-384)airbyte/destinations/base.py (1)
Destination(38-300)
airbyte/sources/util.py (2)
airbyte/_executors/util.py (1)
get_connector_executor(158-384)airbyte/sources/base.py (1)
Source(67-972)
airbyte/_executors/noop.py (5)
airbyte/_executors/base.py (1)
Executor(159-248)airbyte/_util/registry_spec.py (1)
get_connector_spec_from_registry(23-98)airbyte/_message_iterators.py (1)
AirbyteMessageIterator(61-207)airbyte/sources/registry.py (1)
ConnectorMetadata(58-89)airbyte/exceptions.py (1)
AirbyteConnectorExecutableNotFoundError(340-341)
airbyte/_executors/util.py (3)
airbyte/_executors/base.py (1)
Executor(159-248)airbyte/_executors/noop.py (1)
NoOpExecutor(26-131)airbyte/sources/registry.py (1)
get_connector_metadata(195-222)
tests/integration_tests/test_registry_spec.py (1)
airbyte/_util/registry_spec.py (2)
get_connector_spec_from_registry(23-98)validate_connector_config_from_registry(101-141)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Windows)
- GitHub Check: Pytest (All, Python 3.10, Windows)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (No Creds)
🔇 Additional comments (7)
tests/integration_tests/test_registry_spec.py (2)
14-32: LGTM! Clean integration test for registry spec fetching.The parameterization covers relevant connector types and platforms, and the assertions appropriately verify the basic spec structure.
35-74: Verify that the invalid config case reliably triggers validation errors.This is an integration test that fetches live specs from Airbyte's registry. The concern remains valid:
{"invalid_field": "value"}will only fail validation if source-faker's spec explicitly sets"additionalProperties": false. Since JSON Schema defaults to allowing additional properties, this test might unexpectedly pass if the spec permits them.Rather than relying on an unexpected field, wdyt about using a more definitive invalid config—one that violates the schema itself? For example:
- Providing a wrong type:
{"count": "not-a-number"}instead of an integer- Missing a required field (if one exists)
This would make the test resilient regardless of the spec's additionalProperties setting. Would that be worth exploring?
airbyte/destinations/util.py (1)
34-95: LGTM! Clean implementation of no_executor parameter.The changes appropriately delegate executor creation to
get_connector_executorwith theno_executorflag, addressing the previous review feedback. The docstring clearly explains the use case for this parameter.airbyte/sources/util.py (1)
62-139: LGTM! Consistent implementation with destinations.The changes mirror the pattern in
destinations/util.pyand properly delegate executor creation toget_connector_executor, addressing the previous review feedback. The implementation is clean and well-documented.airbyte/_executors/noop.py (2)
40-61: LGTM! Clean initialization and placeholder CLI.The constructor properly initializes the spec cache, and the
_cliplaceholder with its explanatory comment is appropriate given thatexecute()is overridden.
109-118: LGTM! Error handling with helpful guidance.The error message correctly references
no_executor=Trueand provides clear guidance for users who attempt unsupported operations. This addresses the previous review comment about the parameter reference.airbyte/_executors/util.py (1)
176-186: Add error handling to the no_executor path for consistency with the main flowGood catch! The code shows an inconsistency: the
no_executorpath (line 181) callsget_connector_metadata(name)without error handling, while the main execution path (lines 230-235) wraps the same call in a try-except block to handleAirbyteConnectorNotRegisteredError.If a non-existent connector is requested with
no_executor=True, the exception will propagate uncaught, whereas in the normal path it's caught, logged, and re-raised with context. For consistency and better error diagnostics, should we add similar error handling to theno_executorblock? This would ensure all code paths handle registry failures uniformly, wdyt?
…stall method - Moved NoOpExecutor import to top of util.py (no import cycles detected) - Included no_executor in install_method_count validation - Moved no_executor handling to be inline with other install methods - Updated error message to include no_executor in the list - Verified functionality works correctly with local tests Addresses PR feedback from @aaronsteers Co-Authored-By: AJ Steers <aj@airbyte.io>
- Added test cases that specify explicit version (6.2.0) instead of defaulting to latest - Tests both OSS and Cloud platforms with versioned specs - Addresses PR feedback from @aaronsteers Co-Authored-By: AJ Steers <aj@airbyte.io>
d8fd0be
into
main
fix: Skip local validation when deploying connectors to Cloud via MCP
Summary
Fixes #850 - MCP was trying to install connectors locally before deploying them to Cloud, which failed when Python/Docker wasn't available or the connector couldn't be installed.
The root cause:
deploy_source_to_cloud()anddeploy_destination_to_cloud()were calling.config_specwhich triggers local connector execution via_execute(), requiring the connector to be installed locally.The fix skips local validation entirely by:
config_spec_jsonschema=Nonetoresolve_config()- skips hardcoded secrets detectionvalidate=Falsetoset_config()- skips config schema validationSince
CloudWorkspace.deploy_source/destination()only accesses.nameand._hydrated_config, no local execution is needed. All validation now happens server-side via the Cloud API.Review & Testing Checklist for Human
Test Plan
deploy_source_to_cloudanddeploy_destination_to_cloudMCP tools to deploy connectorsNotes
install_if_missing=Falseparameter was already present but insufficient because.config_specstill triggered local executioninstall_if_missingthrough all executors (too broad for this fix)Link to Devin run: https://app.devin.ai/sessions/d1818e9eff6f4b0ebd7b0cb0b9b85865
Requested by: AJ Steers (Aaron ("AJ") Steers (@aaronsteers))
Summary by CodeRabbit
New Features
Bug Fixes
Changes
Tests