
feat(mcp): Add AIRBYTE_MCP_DOMAINS and AIRBYTE_MCP_DOMAINS_DISABLED env vars#890

Merged
Aaron ("AJ") Steers (aaronsteers) merged 6 commits into main from
devin/1764726550-mcp-domain-filtering
Dec 3, 2025


@aaronsteers Aaron ("AJ") Steers (aaronsteers) commented Dec 3, 2025

Summary

Adds two new environment variables to control which MCP tool domains are advertised by the server:

  • AIRBYTE_MCP_DOMAINS: CSV of enabled domains (e.g., registry,cloud)
  • AIRBYTE_MCP_DOMAINS_DISABLED: CSV of disabled domains (e.g., registry)

Domain filtering logic:

| AIRBYTE_MCP_DOMAINS | AIRBYTE_MCP_DOMAINS_DISABLED | Result |
|---|---|---|
| (not set) | (not set) | All domains enabled |
| registry,cloud | (not set) | Only registry and cloud enabled |
| (not set) | registry | All except registry enabled |
| registry,cloud | registry | Only cloud enabled (disabled subtracts from enabled) |

Values are case-insensitive and whitespace is trimmed.
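
The table above can be read as "start from the allowlist (or every domain if none is given), then subtract the denylist". The following is a minimal sketch of that rule, not the code merged in this PR: `ALL_DOMAINS`, `parse_domains_csv`, and `resolve_enabled_domains` are hypothetical names chosen for illustration, and the functions take the raw strings as arguments instead of reading the environment.

```python
from __future__ import annotations

# Assumption: the three domains named in this PR description.
ALL_DOMAINS = frozenset({"cloud", "local", "registry"})


def parse_domains_csv(raw: str | None) -> set[str] | None:
    """Split a CSV string into a trimmed, lowercased set; None if unset or empty."""
    if raw is None:
        return None
    values = {part.strip().lower() for part in raw.split(",") if part.strip()}
    return values or None


def resolve_enabled_domains(
    enabled_csv: str | None,
    disabled_csv: str | None,
) -> set[str]:
    """Return the effective set of enabled domains for one pair of settings."""
    enabled = parse_domains_csv(enabled_csv)
    disabled = parse_domains_csv(disabled_csv) or set()
    # No allowlist means "all domains"; the denylist always subtracts.
    base = set(ALL_DOMAINS) if enabled is None else enabled
    return base - disabled
```

Calling `resolve_enabled_domains("registry,cloud", "registry")` corresponds to the last row of the table, where only `cloud` remains enabled.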

Updates since last revision

  • Added MCPToolDomain(str, Enum) as single source of truth for valid domains (cloud, local, registry)
  • Added warning when unknown domains are specified in env vars (shows list of valid domains)
  • Moved domain constants to dedicated airbyte/mcp/constants.py module for better pdoc documentation
  • Fixed case-insensitive handling in readonly mode check
  • Test count: 39 unit tests (added 2 for unknown domain warnings)
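
As a rough illustration of the enum-as-source-of-truth idea and the unknown-domain warning, here is a sketch under stated assumptions. The member names (`CLOUD`, `LOCAL`, `REGISTRY`), the helper `warn_on_unknown_domains`, and the exact warning wording are assumptions; only the enum name, its base classes, and the three values come from the description above.

```python
import warnings
from enum import Enum


class MCPToolDomain(str, Enum):
    """Valid MCP tool domains (values taken from the PR description)."""

    CLOUD = "cloud"
    LOCAL = "local"
    REGISTRY = "registry"


# Deriving the valid set from the enum keeps the two from drifting apart.
MCP_TOOL_DOMAINS = {d.value for d in MCPToolDomain}


def warn_on_unknown_domains(var_name: str, domains: set[str]) -> set[str]:
    """Warn about entries that are not valid domains; return the unknown ones."""
    unknown = domains - MCP_TOOL_DOMAINS
    if unknown:
        # Hypothetical message text; the merged wording may differ.
        warnings.warn(
            f"{var_name} contains unknown domain(s): {sorted(unknown)}. "
            f"Known MCP domains are: {sorted(MCP_TOOL_DOMAINS)}",
            stacklevel=2,
        )
    return unknown
```

With this sketch, `warn_on_unknown_domains("AIRBYTE_MCP_DOMAINS", {"cloud", "typo_domain"})` emits a single warning that names `typo_domain` and lists the valid domains.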

Review & Testing Checklist for Human

  • Verify domain filtering works end-to-end: Start the MCP server with AIRBYTE_MCP_DOMAINS=cloud and confirm only cloud tools are advertised (not registry or local tools)
  • Verify disabled list works: Start with AIRBYTE_MCP_DOMAINS_DISABLED=registry and confirm registry tools are hidden but cloud and local are available
  • Verify backward compatibility: Start with neither env var set and confirm all tools are still advertised (existing behavior preserved)
  • Verify interaction with readonly mode: The should_register_tool function was refactored - confirm that AIRBYTE_CLOUD_MCP_READONLY_MODE still works correctly for cloud tools
  • Verify unknown domain warning: Set AIRBYTE_MCP_DOMAINS=cloud,typo_domain and confirm a warning is printed listing valid domains

Recommended test plan:

# Test 1: Only cloud domain enabled
AIRBYTE_MCP_DOMAINS=cloud poetry run airbyte-mcp
# Verify: only cloud tools appear in tool list

# Test 2: Registry disabled
AIRBYTE_MCP_DOMAINS_DISABLED=registry poetry run airbyte-mcp
# Verify: cloud and local tools appear, registry tools hidden

# Test 3: No env vars (backward compat)
poetry run airbyte-mcp
# Verify: all tools appear

# Test 4: Unknown domain warning
AIRBYTE_MCP_DOMAINS=cloud,invalid_domain poetry run airbyte-mcp
# Verify: warning printed with valid domains list

Notes

  • Env vars are parsed at module import time (consistent with existing pattern for AIRBYTE_CLOUD_MCP_READONLY_MODE, etc.)
  • MCPToolDomain enum follows existing pattern from airbyte/strategies.py (WriteStrategy, WriteMethod)
  • The Literal["cloud", "local", "registry"] type hints in _tool_utils.py were intentionally left unchanged to avoid large refactoring scope

Link to Devin run: https://app.devin.ai/sessions/1dd27d375eba46dbaee62820a6d9e0da
Requested by: AJ Steers (Aaron ("AJ") Steers (@aaronsteers))

Summary by CodeRabbit

  • New Features

    • MCP tools can be scoped by domain via environment variables to enable/disable cloud, local, and registry domains.
    • Domain names are normalized (case-insensitive) and precedence is applied: explicit disables take priority over enables, affecting which tools register.
  • Tests

    • Added comprehensive unit tests covering domain filtering, env var parsing and normalization, precedence behavior, readonly interactions, and warnings for unknown domains.


Important

Auto-merge enabled.

This PR is set to merge automatically when all requirements are met.

…nv vars

Add environment variable options to control which MCP tool domains are advertised:

- AIRBYTE_MCP_DOMAINS: CSV of enabled domains (e.g., 'registry,cloud')
- AIRBYTE_MCP_DOMAINS_DISABLED: CSV of disabled domains (e.g., 'registry')

Domain filtering logic:
- If neither is set: all domains enabled
- If only AIRBYTE_MCP_DOMAINS is set: only those domains enabled
- If only AIRBYTE_MCP_DOMAINS_DISABLED is set: all except those enabled
- If both are set: disabled list subtracts from enabled list

Added comprehensive unit tests for all scenarios.

Co-Authored-By: AJ Steers <aj@airbyte.io>
@devin-ai-integration
Contributor

Original prompt from AJ Steers
Received message in Slack channel #ask-devin-ai:

@Devin - In the PyAirbyte/Coral MCP server, add a new env var option AIRBYTE_MCP_DOMAINS and its inverse AIRBYTE_MCP_DOMAINS_DISABLED. If either is set, they accept an inline csv specifying which tool domains should be advertised. If both are set (and the sets intersect), disabled domains subtract from declared ones. Otherwise, they operate independently.

# Scenario A: Both registry and cloud (cloud_ops?) domains are enabled (only)
AIRBYTE_MCP_DOMAINS=registry,cloud

# Scenario B: Everything _except_ registry domain tools are enabled

AIRBYTE_MCP_DOMAINS_DISABLED=registry

# Scenario C: Intersection, disabled list takes precedence (only 'cloud' would be enabled)

AIRBYTE_MCP_DOMAINS=registry,cloud
AIRBYTE_MCP_DOMAINS_DISABLED=registry

# Scenario D: Non-intersection, disabled list has no effect (only cloud is enabled)

AIRBYTE_MCP_DOMAINS=cloud
AIRBYTE_MCP_DOMAINS_DISABLED=registry
Thread URL: https://airbytehq-team.slack.com/archives/C08BHPUMEPJ/p1764726455177679


@github-actions

github-actions bot commented Dec 3, 2025

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1764726550-mcp-domain-filtering' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1764726550-mcp-domain-filtering'

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /fix-pr - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test-pr - Runs tests with the updated PyAirbyte

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.


@coderabbitai
Contributor

coderabbitai bot commented Dec 3, 2025

Walkthrough

Adds environment-configurable MCP domains and domain-aware gating for MCP tool registration, including cached resolution, validation/warnings for unknown domains, a public is_domain_enabled(domain) helper, and unit tests covering parsing and registration behavior.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Configuration Constants: airbyte/constants.py | Adds MCP_TOOL_DOMAINS and environment-parsed AIRBYTE_MCP_DOMAINS / AIRBYTE_MCP_DOMAINS_DISABLED (comma-split, trimmed, lowercased; None if empty). Documents that disabled-list takes precedence. |
| Domain Filtering Logic: airbyte/mcp/_tool_utils.py | Adds _resolve_mcp_domain_filters() (LRU-cached) to compute enabled/disabled domains and warn on unknown entries. Adds public is_domain_enabled(domain: str) -> bool with four-case logic and integrates domain-based gating into should_register_tool() before safe-mode and read-only checks; normalizes domain names. |
| Unit Tests: tests/unit_tests/test_mcp_tool_utils.py | New tests for is_domain_enabled() and should_register_tool() across enabled/disabled sets, case normalization, env var parsing, unknown-domain warnings, and runtime reload behavior using monkeypatch/importlib.reload. |
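
To illustrate why an LRU-cached resolver warns only once and why tests need to clear the cache, here is a small sketch. It is not the merged `_resolve_mcp_domain_filters()`; the names `resolve_domain_filters`, `ENABLED_FROM_ENV`, `DISABLED_FROM_ENV`, and `KNOWN_DOMAINS` are hypothetical stand-ins, and the warning text is an assumption.

```python
from __future__ import annotations

import warnings
from functools import lru_cache

KNOWN_DOMAINS = ("cloud", "local", "registry")

# Hypothetical stand-ins for the module-level constants parsed from env vars.
ENABLED_FROM_ENV: list[str] | None = ["cloud", "typo_domain"]
DISABLED_FROM_ENV: list[str] | None = None


@lru_cache(maxsize=1)
def resolve_domain_filters() -> tuple[frozenset[str], frozenset[str]]:
    """Compute (enabled, disabled) once; later calls reuse the cached result."""
    enabled = frozenset(ENABLED_FROM_ENV or ())
    disabled = frozenset(DISABLED_FROM_ENV or ())
    unknown = (enabled | disabled) - set(KNOWN_DOMAINS)
    if unknown:
        # Runs only on a cache miss, so repeated lookups do not repeat the warning.
        warnings.warn(
            f"Unknown MCP domain(s): {sorted(unknown)}. "
            f"Known MCP domains are: {list(KNOWN_DOMAINS)}",
            stacklevel=2,
        )
    return enabled, disabled
```

Because the result is cached, a test that changes the underlying constants would need to call `resolve_domain_filters.cache_clear()` before expecting a fresh computation or a fresh warning.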

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–30 minutes

  • Verify LRU cache behavior vs. tests that use importlib.reload or expect dynamic env changes.
  • Confirm normalization (lowercasing/trimming) and splitting correctly handles empty values and whitespace.
  • Ensure precedence logic: disabled domains override enabled domains in all code paths.
  • Check warning messages for clarity and that unknown-domain detection only triggers for truly unknown values.

Possibly related PRs

Suggested reviewers

  • aldogonzalez8

Would you like additional reviewers added for platform or testing coverage, wdyt?

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately captures the primary change: introducing two new environment variables for MCP domain filtering configuration. |

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (5)
airbyte/mcp/_tool_utils.py (2)

31-45: Nice, robust env parsing for domain lists

The CSV → lowercased set[str] parsing with whitespace trimming and empty filtering is clean and matches the tests. Since both AIRBYTE_MCP_DOMAINS and AIRBYTE_MCP_DOMAINS_DISABLED share the same parsing pattern, would it be worth extracting a tiny helper like _parse_domain_env(var_name: str) -> set[str] to reduce duplication and keep future tweaks (e.g., logging invalid entries) in one place, wdyt?


85-116: is_domain_enabled logic looks sound and well-covered

The four-way logic (no envs → allow all; only enabled; only disabled; both with disabled subtracting) reads clearly and matches the docstring and tests. The case-insensitive handling via domain.lower() and lowercased sets is especially nice for operator UX. Since this function is now part of the public surface, would adding a short note in the docstring about case-insensitivity help future readers understand that behavior at a glance, wdyt?

tests/unit_tests/test_mcp_tool_utils.py (3)

11-155: Parametrized coverage for is_domain_enabled looks great

The matrix here nicely exercises all combinations of enabled/disabled sets plus a couple of case-insensitivity scenarios, which aligns well with the implementation. To fully mirror the implementation behavior, would you consider adding a case where both lists are set and the domain appears in neither (e.g., enabled {"cloud"}, disabled {"registry"}, domain "other", expecting False) just to make that branch explicit in tests, wdyt?


158-295: Add a case for uppercase tool_domain under readonly mode?

These tests do a nice job validating the interaction between domain filters and AIRBYTE_CLOUD_MCP_READONLY_MODE. Given that is_domain_enabled is case-insensitive but should_register_tool currently checks domain == "cloud" for readonly, would it be worth adding a case like:

  • enabled_domains={"cloud"}, tool_domain="CLOUD", readonly_mode=True, is_readonly=False, expected=False

to codify the intended behavior and back up the suggested change in should_register_tool, wdyt?
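
A minimal sketch of the ordering and normalization discussed in this comment follows. The real `should_register_tool` signature is not shown in this conversation, so the parameters here are assumptions made for illustration, and the safe-mode check mentioned elsewhere is omitted.

```python
from __future__ import annotations


def should_register_tool(
    tool_domain: str,
    *,
    is_readonly_tool: bool,
    enabled_domains: set[str] | None,
    disabled_domains: set[str] | None,
    cloud_readonly_mode: bool,
) -> bool:
    """Hypothetical gate: domain filter first, then the cloud readonly check."""
    # Normalize once so both checks below are case-insensitive.
    domain = tool_domain.strip().lower()

    # Domain filtering takes precedence over everything else.
    if enabled_domains is not None and domain not in enabled_domains:
        return False
    if disabled_domains and domain in disabled_domains:
        return False

    # Assumption: readonly mode only hides non-readonly tools in the cloud domain.
    if domain == "cloud" and cloud_readonly_mode and not is_readonly_tool:
        return False
    return True
```

Under these assumptions, the case proposed above (`tool_domain="CLOUD"`, readonly mode on, tool not readonly) returns `False` because the comparison uses the normalized domain.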


346-380: Consider parity tests for AIRBYTE_MCP_DOMAINS_DISABLED parsing

The disabled-domains parsing tests validate basic cases (empty, single, multiple), which is solid. Since the implementation shares the same parsing logic as AIRBYTE_MCP_DOMAINS, would you be up for adding one or two extra cases mirroring the main env var tests (e.g., uppercase normalization and empty entries) to ensure behavior stays in sync if parsing ever evolves, wdyt?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cf60864 and eb36226.

📒 Files selected for processing (2)
  • airbyte/mcp/_tool_utils.py (3 hunks)
  • tests/unit_tests/test_mcp_tool_utils.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/unit_tests/test_mcp_tool_utils.py (1)
airbyte/mcp/_tool_utils.py (2)
  • is_domain_enabled (85-115)
  • should_register_tool (118-139)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (No Creds)
🔇 Additional comments (1)
tests/unit_tests/test_mcp_tool_utils.py (1)

298-343: Env parsing tests for AIRBYTE_MCP_DOMAINS are thorough

Nice use of patch.dict + importlib.reload to validate parsing behavior without polluting global state. The cases cover empties, spacing, multiple values, uppercase normalization, and empty entries, which lines up with the comprehension in _tool_utils. Since you already reload the module after exiting the patched env, this should also keep the rest of the suite clean. Nothing blocking here from my side.

@github-actions

github-actions bot commented Dec 3, 2025

PyTest Results (Fast Tests Only, No Creds)

348 tests  +28   348 ✅ +28   5m 51s ⏱️ -37s
  1 suites ± 0     0 💤 ± 0 
  1 files   ± 0     0 ❌ ± 0 

Results for commit c995a02. ± Comparison against base commit cf60864.

♻️ This comment has been updated with latest results.

…-insensitive handling

- Move AIRBYTE_MCP_DOMAINS and AIRBYTE_MCP_DOMAINS_DISABLED to airbyte/mcp/constants.py for better documentation in pdoc docs
- Fix case-insensitive readonly mode check in should_register_tool
- Standardize test imports to use module import style consistently
- Update tests to reload constants module for env var parsing tests

Co-Authored-By: AJ Steers <aj@airbyte.io>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
tests/unit_tests/test_mcp_tool_utils.py (2)

333-344: Consider wrapping the cleanup reload in a try/finally for test isolation, wdyt?

If the assertion on line 341 fails, the cleanup importlib.reload(constants) on line 343 won't execute, potentially leaving the constants module in a modified state for subsequent tests.

 def test_domain_env_var_parsing(env_value: str, expected_set: set[str]) -> None:
     """Test that AIRBYTE_MCP_DOMAINS env var is parsed correctly."""
     import importlib
 
     import airbyte.mcp.constants as constants
 
-    with patch.dict("os.environ", {"AIRBYTE_MCP_DOMAINS": env_value}, clear=False):
-        importlib.reload(constants)
-        assert constants.AIRBYTE_MCP_DOMAINS == expected_set
-
-    importlib.reload(constants)
+    try:
+        with patch.dict("os.environ", {"AIRBYTE_MCP_DOMAINS": env_value}, clear=False):
+            importlib.reload(constants)
+            assert constants.AIRBYTE_MCP_DOMAINS == expected_set
+    finally:
+        importlib.reload(constants)

366-380: Same try/finally pattern could apply here for consistency.

Same cleanup concern as the other env var parsing test—if you decide to apply the try/finally pattern above, might as well keep this one consistent, wdyt?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eb36226 and 48a2fd1.

📒 Files selected for processing (3)
  • airbyte/mcp/_tool_utils.py (4 hunks)
  • airbyte/mcp/constants.py (1 hunks)
  • tests/unit_tests/test_mcp_tool_utils.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte/mcp/_tool_utils.py
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: yohannj
Repo: airbytehq/PyAirbyte PR: 716
File: airbyte/logs.py:384-402
Timestamp: 2025-07-11T19:53:44.427Z
Learning: In the PyAirbyte project, when reviewing PRs, maintain clear separation of concerns. Don't suggest changes that are outside the scope of the PR's main objective, even if they would improve consistency or fix other issues. This helps with reviewing changes and potential reverts.
🧬 Code graph analysis (1)
tests/unit_tests/test_mcp_tool_utils.py (1)
airbyte/mcp/_tool_utils.py (2)
  • is_domain_enabled (74-104)
  • should_register_tool (107-129)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (4)
tests/unit_tests/test_mcp_tool_utils.py (2)

11-156: Nice comprehensive test coverage for is_domain_enabled!

The parametrized test cases cover all the documented scenarios well—empty sets, enabled-only, disabled-only, both set, and case insensitivity. The patching approach is correct, targeting the constants in the module where they're used.


158-296: Good coverage of the domain + readonly mode interaction!

I particularly like the test case at line 251-259 (domain_filter_blocks_even_readonly_cloud_tool) which validates that domain filtering takes precedence—this matches the implementation logic in should_register_tool where domain check comes first.

airbyte/mcp/constants.py (2)

14-21: Clean parsing logic with good handling of edge cases!

The set comprehension correctly handles comma-separated values with whitespace trimming and lowercase normalization. The empty value filtering with if d.strip() ensures spurious empty strings from "a,,b" are excluded.


22-51: Excellent documentation!

The docstrings clearly explain the behavior, precedence rules, and include practical examples. The note about disabled list taking precedence (subtracting from enabled) aligns with the implementation in is_domain_enabled.

- Add MCPToolDomain(str, Enum) as single source of truth for valid domains
- Derive MCP_TOOL_DOMAINS set from enum values
- Add warning when unknown domains are specified in env vars
- Add tests for unknown domain warning functionality

Co-Authored-By: AJ Steers <aj@airbyte.io>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
tests/unit_tests/test_mcp_tool_utils.py (2)

346-381: Consider mirroring the richer parsing cases for _DISABLED

Right now test_domain_disabled_env_var_parsing only covers empty/single/multi, whereas the enabled env var tests also cover spaces, uppercase normalization, and empty entries. Would it be worth adding the same extra cases here for symmetry and future‑proofing, so both env vars are validated against the same edge cases, wdyt?


383-425: Unknown-domain warning tests are clear; maybe add a combined case

The warning assertions are nicely targeted (specific substring + “Known MCP domains are”) and resilient to extra warnings. If you ever want to tighten coverage further, would you consider adding a parametrized case where both enabled and disabled env vars contain unknown domains, to exercise the branch that concatenates both messages into _parts, wdyt?

airbyte/mcp/constants.py (1)

15-89: Domain constants and env parsing match the documented behavior

The enum + derived MCP_TOOL_DOMAINS set, lowercased/trimmed parsing, and unknown-domain validation all look consistent with the described semantics. One small ergonomics thought: would bumping stacklevel in warnings.warn (e.g., to 2) so that the warning points at the import site (or higher-level caller) instead of inside constants.py make debugging misconfigured env vars a bit friendlier, wdyt?
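
For readers unfamiliar with the `stacklevel` argument, the sketch below shows its effect in a plain function call. `check_domains` is a hypothetical validator, not code from this PR, and the example deliberately avoids the import-time case discussed above.

```python
from __future__ import annotations

import warnings


def check_domains(domains: list[str]) -> None:
    """Hypothetical validator: warns when a domain is not recognized."""
    known = {"cloud", "local", "registry"}
    unknown = sorted(set(domains) - known)
    if unknown:
        # stacklevel=1 (the default) would attribute the warning to this line;
        # stacklevel=2 attributes it to the caller of check_domains().
        warnings.warn(f"Unknown MCP domain(s): {unknown}", stacklevel=2)
```

The reported file and line then point at the code that passed in the bad value, which is usually where a misconfiguration is easiest to fix.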

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 48a2fd1 and 75e3a17.

📒 Files selected for processing (2)
  • airbyte/mcp/constants.py (1 hunks)
  • tests/unit_tests/test_mcp_tool_utils.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (3)
tests/unit_tests/test_mcp_tool_utils.py (3)

11-155: is_domain_enabled test matrix looks solid

The parametrization here does a nice job of encoding the spec (allowlist/denylist semantics + case-insensitivity) in a readable way, and the direct patching of the imported constants keeps things focused on behavior instead of env parsing. I don’t see any gaps relative to the described rules.


158-295: Good coverage of domain filtering vs readonly mode

The matrix for test_should_register_tool nicely exercises the interaction between domain filters and AIRBYTE_CLOUD_MCP_READONLY_MODE, including the “domain filter wins over readonly” case. This should protect against regressions in the ordering of checks inside should_register_tool.


298-343: Env parsing tests for AIRBYTE_MCP_DOMAINS are thorough

The parsing cases (empty, single, multi, spaces, uppercase, double commas) map cleanly to the implementation and confirm the normalization behavior; reloading the constants module in and out of the patched env block also avoids leaking state across tests. Looks good to me.

@aaronsteers Aaron ("AJ") Steers (aaronsteers) changed the title feat(mcp): Add AIRBYTE_MCP_DOMAINS and AIRBYTE_MCP_DOMAINS_DISABLED env vars feat(mcp): Add AIRBYTE_MCP_DOMAINS and AIRBYTE_MCP_DOMAINS_DISABLED env vars Dec 3, 2025
…u_cache helper

- Move MCP domain constants from airbyte/mcp/constants.py to airbyte/constants.py
- Store env var values as list[str] | None instead of sets
- Add _resolve_mcp_domain_filters() helper with @lru_cache to avoid warning multiple times
- Delete airbyte/mcp/constants.py (no bifurcation)
- Update tests to work with new structure

Addresses PR feedback from AJ.

Co-Authored-By: AJ Steers <aj@airbyte.io>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
airbyte/constants.py (2)

241-249: MCP_TOOL_DOMAINS is clear; consider deriving from the Enum to avoid drift?

The literal list of domains looks correct and matches the documented values. Since the PR also introduces an MCPToolDomain enum as the single source of truth, would it make sense to derive MCP_TOOL_DOMAINS from that enum (e.g., [d.value for d in MCPToolDomain]) if module layering allows, so the set of valid domains can’t silently diverge over time, wdyt?


251-281: Env parsing logic is solid; consider a tiny CSV parsing helper to DRY this up?

The normalization and or None behavior for both AIRBYTE_MCP_DOMAINS and AIRBYTE_MCP_DOMAINS_DISABLED look correct and handle empty/whitespace values gracefully. Since the parsing pattern is duplicated and might be reused for future MCP-related env vars, would extracting a small helper improve readability and reduce chances of subtle divergence, wdyt?

For example, you could introduce something like:

def _parse_csv_env(name: str) -> list[str] | None:
    raw = os.getenv(name, "").strip()
    if not raw:
        return None
    values = [d.strip().lower() for d in raw.split(",") if d.strip()]
    return values or None

and then:

AIRBYTE_MCP_DOMAINS: list[str] | None = _parse_csv_env("AIRBYTE_MCP_DOMAINS")
AIRBYTE_MCP_DOMAINS_DISABLED: list[str] | None = _parse_csv_env("AIRBYTE_MCP_DOMAINS_DISABLED")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 75e3a17 and 27b15aa.

📒 Files selected for processing (3)
  • airbyte/constants.py (1 hunks)
  • airbyte/mcp/_tool_utils.py (4 hunks)
  • tests/unit_tests/test_mcp_tool_utils.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • airbyte/mcp/_tool_utils.py
  • tests/unit_tests/test_mcp_tool_utils.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (Fast)

@github-actions

github-actions bot commented Dec 3, 2025

PyTest Results (Full)

416 tests  +28   400 ✅ +28   25m 10s ⏱️ - 1m 2s
  1 suites ± 0    16 💤 ± 0 
  1 files   ± 0     0 ❌ ± 0 

Results for commit c995a02. ± Comparison against base commit cf60864.

♻️ This comment has been updated with latest results.

Combine intermediate variables into single expressions while preserving
whitespace trimming and empty value filtering.

Co-Authored-By: AJ Steers <aj@airbyte.io>
- Reduce from 450 lines to 104 lines (77% reduction)
- Move all imports to top of file (no inline imports)
- Combine domain logic tests into single parametrized test
- Use compact tuple tables instead of verbose pytest.param blocks

Co-Authored-By: AJ Steers <aj@airbyte.io>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/unit_tests/test_mcp_tool_utils.py (1)

69-121: Consider DRY‑ing the env‑parsing patch/reload pattern

_ENV_PARSE_CASES and test_env_parsing correctly validate normalization, splitting, and empty‑value handling by reloading airbyte.constants under different os.environ values. To keep this test lean and easier to tweak later, would it be worth extracting the patch.dict(...) + importlib.reload(constants) / reset‑reload pattern into a small helper or pytest fixture so the body of the test focuses purely on the cases table and assertions, wdyt?
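
One possible shape for such a helper is sketched below as a plain context manager, which could also be wrapped in a pytest fixture. `reloaded_with_env` is a hypothetical name; the sketch assumes the target module reads its environment variables at import time, as described in this PR.

```python
from __future__ import annotations

import importlib
from contextlib import contextmanager
from types import ModuleType
from typing import Iterator
from unittest.mock import patch


@contextmanager
def reloaded_with_env(module: ModuleType, env: dict[str, str]) -> Iterator[ModuleType]:
    """Reload `module` under a patched environment, then restore it afterwards."""
    try:
        with patch.dict("os.environ", env, clear=False):
            # The reload re-runs module-level env parsing with the patched values.
            yield importlib.reload(module)
    finally:
        # Runs even if an assertion fails, so later tests see the original state.
        importlib.reload(module)
```

A test body would then reduce to something like `with reloaded_with_env(constants, {"AIRBYTE_MCP_DOMAINS": value}) as mod:` followed by the assertions on `mod`.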

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e95c4cf and c995a02.

📒 Files selected for processing (1)
  • tests/unit_tests/test_mcp_tool_utils.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/unit_tests/test_mcp_tool_utils.py (1)
airbyte/mcp/_tool_utils.py (3)
  • _resolve_mcp_domain_filters (78-110)
  • is_domain_enabled (113-144)
  • should_register_tool (147-169)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (No Creds)
🔇 Additional comments (2)
tests/unit_tests/test_mcp_tool_utils.py (2)

16-67: Domain/readonly matrix covers behavior thoroughly

The _DOMAIN_CASES matrix plus test_domain_logic do a nice job exercising all the key combinations of enabled/disabled domains and readonly mode, and clearing the _resolve_mcp_domain_filters cache inside the patched context keeps each parametrized case isolated. I don’t see any functional gaps here.


123-153: Warning tests correctly exercise unknown‑domain messaging

The _WARNING_CASES data plus test_unknown_domain_warning validate both the specific “unknown domain(s)” fragment and the “Known MCP domains are:” suffix, and the combination of importlib.reload(constants), importlib.reload(tool_utils), and _resolve_mcp_domain_filters.cache_clear() ensures the LRU cache and module‑level constants reflect the patched environment for each case. This looks solid to me.

@aaronsteers Aaron ("AJ") Steers (aaronsteers) merged commit 2ccbeb2 into main Dec 3, 2025
23 checks passed
@aaronsteers Aaron ("AJ") Steers (aaronsteers) deleted the devin/1764726550-mcp-domain-filtering branch December 3, 2025 03:40