feat(mcp): Add list_cloud_sync_jobs tool with pagination support #902

Merged
Aaron ("AJ") Steers (aaronsteers) merged 6 commits into main from devin/1765393451-add-job-pagination
Dec 10, 2025

@devin-ai-integration devin-ai-integration bot commented Dec 10, 2025

Summary

Adds a new MCP tool list_cloud_sync_jobs that enables pagination through sync job history for a connection. This addresses the issue where Devin could only retrieve the "latest" job and couldn't browse historical syncs.

Changes:

  • Added offset and order_by parameters to api_util.get_job_logs()
  • Enhanced CloudConnection.get_previous_sync_logs() with full pagination support (limit, offset, from_tail)
  • Added JOB_ORDER_BY_CREATED_AT_DESC and JOB_ORDER_BY_CREATED_AT_ASC constants to api_util.py
  • Added new MCP tool list_cloud_sync_jobs with parameters:
    • max_jobs: Maximum jobs to return (default 20, capped at 500)
    • jobs_offset: Number of jobs to skip
    • from_tail: Order newest-first (True) or oldest-first (False)
  • Added SyncJobResult and SyncJobListResult Pydantic models
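As a rough illustration of how the from_tail flag could map onto the two ordering constants, here is a minimal sketch. The constant names come from the list above and the "createdAt|DESC" string is quoted in the review checklist; the ASC value, the helper names resolve_order_by and build_job_query, and the query keys are assumptions for illustration, not the actual PyAirbyte implementation.

```python
from __future__ import annotations

# Constant names are taken from the PR description. The DESC value is quoted
# in the PR; the ASC value is an assumption that mirrors it.
JOB_ORDER_BY_CREATED_AT_DESC = "createdAt|DESC"
JOB_ORDER_BY_CREATED_AT_ASC = "createdAt|ASC"


def resolve_order_by(from_tail: bool) -> str:
    """Return the ordering for newest-first (True) or oldest-first (False)."""
    return JOB_ORDER_BY_CREATED_AT_DESC if from_tail else JOB_ORDER_BY_CREATED_AT_ASC


def build_job_query(
    max_jobs: int = 20,
    jobs_offset: int = 0,
    from_tail: bool = True,
) -> dict[str, object]:
    """Assemble hypothetical query parameters for a job-listing request."""
    return {
        "limit": max_jobs,
        "offset": jobs_offset,
        "order_by": resolve_order_by(from_tail),
    }
```

The key names in the returned dictionary are placeholders; the real request shape is defined by api_util.get_job_logs().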

The design follows the pattern established in PR #888 for log pagination.

Updates Since Last Revision

  • Consolidated all pagination functionality into get_previous_sync_logs() per code review feedback
  • Removed the separate list_sync_jobs() method to avoid having two ways of doing the same thing
  • Updated docstring to clarify that the method returns SyncResult objects (job metadata), not actual log text
  • Capped max_jobs at 500 to avoid overloading agent context (per AJ's feedback)
  • Removed the "0 = no limit" option; if max_jobs <= 0, it now defaults to 20
  • Note: Default limit changed from 10 to 20 to align with MCP tool defaults
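The max_jobs rules described above (fall back to 20 when the value is zero or negative, otherwise cap at 500) could be expressed roughly as follows. The function and constant names are hypothetical and only illustrate the stated behavior.

```python
# Hypothetical names; the values 20 and 500 are the ones stated in the PR.
DEFAULT_MAX_JOBS = 20
MAX_JOBS_CAP = 500


def normalize_max_jobs(max_jobs: int) -> int:
    """Apply the default for non-positive values, then enforce the cap."""
    if max_jobs <= 0:
        return DEFAULT_MAX_JOBS
    return min(max_jobs, MAX_JOBS_CAP)
```

With these rules a caller can no longer request an unbounded listing, which is the point of dropping the "0 = no limit" option.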

Review & Testing Checklist for Human

  • Verify default limit change: get_previous_sync_logs() default limit changed from 10 to 20. Confirm this is acceptable for existing callers (e.g., get_sync_result() still passes limit=1 explicitly, so it's unaffected).
  • Verify explicit ordering: The method now explicitly orders by createdAt|DESC by default. Confirm this matches the previous implicit API behavior.
  • Verify max_jobs cap: max_jobs is now capped at 500 and defaults to 20 if <= 0. Confirm this behavior is acceptable.
  • Test with real connection: Tested with connection acb716fc-93c9-448b-bc81-74f349bbf60a - verified jobs returned in correct order, offset pagination works, and from_tail=False returns oldest-first
  • Test validation: Verified that passing both jobs_offset and from_tail=True raises PyAirbyteInputError
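The validation item in the checklist can be pictured with the sketch below. InputError is a local stand-in for PyAirbyteInputError so the snippet runs without PyAirbyte installed, and validate_pagination is a hypothetical helper. Treating an offset of 0 or None as "not provided" is also an assumption.

```python
from __future__ import annotations


class InputError(ValueError):
    """Local stand-in for PyAirbyteInputError (assumption, for illustration)."""


def validate_pagination(jobs_offset: int | None, from_tail: bool) -> None:
    """Reject the combination of a non-zero offset with newest-first ordering."""
    if jobs_offset and from_tail:
        raise InputError("Cannot combine jobs_offset with from_tail=True.")
```

Under this sketch, offset-based paging requires from_tail=False, which matches the "oldest-first" ordering verified in the checklist.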

Suggested test plan: Use the MCP tool via poe mcp to call list_cloud_sync_jobs on a connection with historical jobs and verify pagination works as expected.
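For anyone scripting the test plan, a paging loop might look like the sketch below. It does not call the real MCP tool: fetch_page is a hypothetical callable that takes (max_jobs, jobs_offset) and returns one page of jobs, and fake_fetch_page simulates it over an in-memory list in oldest-first order.

```python
from __future__ import annotations

from typing import Callable, Iterator


def iter_all_jobs(
    fetch_page: Callable[[int, int], list[dict]],
    page_size: int = 20,
) -> Iterator[dict]:
    """Yield jobs page by page until a short (or empty) page is returned."""
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        yield from page
        if len(page) < page_size:
            return
        offset += page_size


# In-memory stand-in for a connection's job history (oldest first).
FAKE_JOBS = [{"job_id": i} for i in range(45)]


def fake_fetch_page(max_jobs: int, jobs_offset: int) -> list[dict]:
    """Simulate one page of results; not the real tool."""
    return FAKE_JOBS[jobs_offset : jobs_offset + max_jobs]


all_jobs = list(iter_all_jobs(fake_fetch_page, page_size=20))
```

When the history length is an exact multiple of the page size, the loop makes one extra request that returns an empty page before stopping.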

Notes

  • No unit tests were added for the new pagination logic
  • get_previous_sync_logs is now the single method for listing sync jobs with pagination support

Requested by: aldo.gonzalez@airbyte.io
Link to Devin run: https://app.devin.ai/sessions/b70da243df1d4ae39831615a1d93bd59

Important

Auto-merge enabled.

This PR is set to merge automatically when all requirements are met.

- Add offset and order_by parameters to api_util.get_job_logs
- Add CloudConnection.list_sync_jobs method with pagination support
- Add new MCP tool list_cloud_sync_jobs with parameters:
  - max_jobs: Maximum number of jobs to return (default 20, 0 = no limit)
  - jobs_offset: Number of jobs to skip from the beginning
  - from_tail: Order jobs newest-first (True) or oldest-first (False)
- Add SyncJobResult and SyncJobListResult Pydantic models for structured output
- Validation: Cannot combine jobs_offset with from_tail=True
- Default behavior: from_tail=True (newest jobs first)

Co-Authored-By: aldo.gonzalez@airbyte.io <aldo.gonzalez@airbyte.io>
@devin-ai-integration

Original prompt from aldo.gonzalez@airbyte.io
Received message in Slack channel #ask-devin-ai:

@Devin Can you investigate why Devin can't paginate syncs using the MCP tools? <https://airbytehq-team.slack.com/archives/C08BHPUMEPJ/p1765391038279299?thread_ts=1765387691.173249&cid=C08BHPUMEPJ>
Thread URL: https://airbytehq-team.slack.com/archives/C08BHPUMEPJ/p1765391757365319

Quote of conversation (https://airbytehq-team.slack.com/archives/C08BHPUMEPJ/p1765391038279299?thread_ts=1765387691.173249&cid=C08BHPUMEPJ):
> From Devin
> Thanks for the screenshot and clarification. I can see the recent syncs now - this is a different situation than before:
> _Current State (from your screenshot):_
> • Syncs are now _succeeding_ (green checkmarks)
> • But still _0 bytes, 0 records_ for every sync
> • Each sync takes _22-36 minutes_
> • Schema refresh succeeded on Dec 9 with the correct new stream names (`channel_basic_a3`, `channel_combined_a3`, etc.)
> _Important distinction:_ The YouTube Analytics connector uses the _YouTube Reporting API_ (bulk data exports), which is different from the YouTube Analytics API that powers the YouTube Studio dashboard. This is likely why the client sees data in their dashboard but Airbyte shows 0 records.
> _Most likely cause:_ The YouTube Reporting API requires:
> 1. _Reporting jobs to be created_ for each report type
> 2. _Time for YouTube to generate report files_ (can take 24-48+ hours after job creation)
> 3. The account may need to be a _Content Owner_ (not just a regular channel) for certain bulk reports
> _To investigate further, I need:_
> 1. Can you share the _job ID_ from one of the recent successful syncs (Dec 9-10)? The MCP API I'm using is only returning the old failed job from September.
> 2. Or if you can access the logs for the Dec 10, 9:29 AM sync, look for:
>     ◦ Any messages about "No jobs found" or "No reports found"
>     ◦ Whether job creation calls to `<http://youtubereporting.googleapis.com/v1/jobs|youtubereporting.googleapis.com/v1/jobs>` are succeedin... (280 chars truncated...)

@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

@github-actions

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1765393451-add-job-pagination' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1765393451-add-job-pagination'

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /fix-pr - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test-pr - Runs tests with the updated PyAirbyte

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.

github-actions bot commented Dec 10, 2025

PyTest Results (Fast Tests Only, No Creds)

348 tests  ±0   348 ✅ ±0   5m 57s ⏱️ -11s
  1 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit dd19947. ± Comparison against base commit 1b48476.

♻️ This comment has been updated with latest results.

devin-ai-integration bot and others added 2 commits December 10, 2025 19:34
…ethods

Co-Authored-By: aldo.gonzalez@airbyte.io <aldo.gonzalez@airbyte.io>
…ve list_sync_jobs

Co-Authored-By: aldo.gonzalez@airbyte.io <aldo.gonzalez@airbyte.io>

🚀

Co-Authored-By: aldo.gonzalez@airbyte.io <aldo.gonzalez@airbyte.io>

github-actions bot commented Dec 10, 2025

PyTest Results (Full)

416 tests  ±0   399 ✅  - 1   24m 45s ⏱️ -30s
  1 suites ±0    17 💤 +1 
  1 files   ±0     0 ❌ ±0 

Results for commit dd19947. ± Comparison against base commit 1b48476.

This pull request skips 1 test.
tests.integration_tests.cloud.test_cloud_workspaces ‑ test_deploy_connection

♻️ This comment has been updated with latest results.

devin-ai-integration bot and others added 2 commits December 10, 2025 20:37
…issue

Co-Authored-By: aldo.gonzalez@airbyte.io <aldo.gonzalez@airbyte.io>
Co-Authored-By: aldo.gonzalez@airbyte.io <aldo.gonzalez@airbyte.io>
@aaronsteers Aaron ("AJ") Steers (aaronsteers) merged commit a4fa901 into main Dec 10, 2025
23 checks passed
@aaronsteers Aaron ("AJ") Steers (aaronsteers) deleted the devin/1765393451-add-job-pagination branch December 10, 2025 21:13