134 changes: 96 additions & 38 deletions .github/workflows/connectors_up_to_date.yml
@@ -10,8 +10,8 @@ on:
workflow_dispatch:
inputs:
connectors-options:
description: "Options to pass to the 'airbyte-ci connectors' command group."
default: "--language=python --language=low-code --language=manifest-only"
description: "Extra flags to pass to `airbyte-ops local connector list` (e.g. --name=source-github)."
default: ""
Comment on lines 11 to +14

We should add additional optional workflow inputs: bump-base-image, bump-cdk, and bump-external-dependencies. These can all be 'true' or 'false', defaulting to 'true'. And 'bump-cdk' can also be 'latest' which would ignore any existing CDK constraints and instead bump all the way to the latest.

However, doing so is probably out of scope for this PR. Please log an issue in the ops repo to add capability to support this. This would resolve a long-standing issue where a failed image bump can block the CDK from being updated, and vice versa. CDK updates are most relevant for behavioral and bug fixes, whereas image updates are most relevant for security patches. Both are important and we want one to be able to pass even if the other is blocked.
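
As a rough illustration of the proposal above, the three inputs might be declared along these lines. This is only a sketch: the input names come from the comment, while the `choice` type, the descriptions, and the use of string values are assumptions, not something the workflow defines today.

```yaml
# Hypothetical sketch: input names are taken from the review comment;
# the types, option lists, and descriptions are assumptions.
on:
  workflow_dispatch:
    inputs:
      bump-base-image:
        description: "Bump the connector base image."
        type: choice
        options: ["true", "false"]
        default: "true"
      bump-cdk:
        description: "Bump the CDK. 'latest' ignores existing CDK constraints."
        type: choice
        options: ["true", "false", "latest"]
        default: "true"
      bump-external-dependencies:
        description: "Bump external (non-CDK) dependencies."
        type: choice
        options: ["true", "false"]
        default: "true"
```

A step could then be gated with something like `if: inputs.bump-cdk != 'false'`. That condition should also hold on scheduled runs, where no inputs are set, so each bump stays enabled by default and a failure in one does not have to block the others.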

Agreed this is out of scope for this PR. Logged as https://github.com/airbytehq/airbyte-ops-mcp/issues/625 — covers the bump-base-image, bump-cdk, and bump-external-dependencies workflow inputs and the rationale for independent failure handling.


Created three separate issues in the ops repo and added TK-TODO(url) references in the workflow:

  1. airbytehq/airbyte-ops-mcp#626 (bump-base-image): Implement CLI command to update base image in metadata.yaml
  2. airbytehq/airbyte-ops-mcp#627 (bump-cdk): Implement CLI command to bump CDK dependency (with --latest option)
  3. airbytehq/airbyte-ops-mcp#628 (bump-external-dependencies): Implement CLI command for external dependency updates

Closed the combined airbytehq/airbyte-ops-mcp#625 in favor of these three.

Each TK-TODO comment in the workflow now links to its tracking issue.


jobs:
generate_matrix:
name: Generate matrix
@@ -21,42 +21,83 @@ jobs:
steps:
- name: Checkout Airbyte
uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
- name: Run airbyte-ci connectors list [SCHEDULED TRIGGER]
if: github.event_name == 'schedule'
id: airbyte-ci-connectors-list-scheduled
uses: ./.github/actions/run-airbyte-ci
with:
context: "master"
subcommand: 'connectors --language=python --language=low-code --language=manifest-only --metadata-query="\"-rc.\" not in data.dockerImageTag and \"source-declarative-manifest\" not in data.dockerRepository" list --output=selected_connectors.json'
- name: Run airbyte-ci connectors list [MANUAL TRIGGER]
if: github.event_name == 'workflow_dispatch'
id: airbyte-ci-connectors-list-workflow-dispatch
uses: ./.github/actions/run-airbyte-ci
with:
context: "master"
subcommand: 'connectors ${{ github.event.inputs.connectors-options }} --metadata-query="\"-rc.\" not in data.dockerImageTag and \"source-declarative-manifest\" not in data.dockerRepository" list --output=selected_connectors.json'
# We generate a dynamic matrix from the list of selected connectors.
# A matrix is required in this situation because the number of connectors is large and running them all in a single job would exceed the maximum job time limit of 6 hours.
# We use 25 connectors per job, with hopes they can finish within the 1 hours duration of the GitHub App's
# token.

- name: Install uv
uses: astral-sh/setup-uv@6b9c6063abd6010835644d4c2e1bef4cf5cd0fca # v6.0.1

- name: Install airbyte-ops CLI
run: uv tool install airbyte-internal-ops

# --- Connector selection (three-call pattern) ---
# Call 1: Sources - all support levels, Python/low-code/manifest-only languages
- name: List source connectors
id: list-sources
env:
CONNECTOR_OPTIONS: ${{ github.event.inputs.connectors-options }}
run: |
SOURCES=$(
airbyte-ops local connector list \
--repo-path "$GITHUB_WORKSPACE" \
--connector-type=source \
--language python --language low-code --language manifest-only \
--exclude-connectors=source-declarative-manifest \
$CONNECTOR_OPTIONS \
--output-format=csv
)
echo "sources=$SOURCES" | tee -a $GITHUB_OUTPUT

# Call 2: Destinations - certified only, Python/low-code/manifest-only languages
- name: List destination connectors
id: list-destinations
env:
CONNECTOR_OPTIONS: ${{ github.event.inputs.connectors-options }}
run: |
DESTINATIONS=$(
airbyte-ops local connector list \
--repo-path "$GITHUB_WORKSPACE" \
--connector-type=destination \
--certified-only \
--language python --language low-code --language manifest-only \
$CONNECTOR_OPTIONS \
--output-format=csv
)
echo "destinations=$DESTINATIONS" | tee -a $GITHUB_OUTPUT

# Call 3: Combine both sets and output as GH Actions matrix
- name: Generate matrix - 25 connectors per job
id: generate_matrix
run: |
matrix=$(jq -c -r '{include: [.[] | "--name=" + .] | to_entries | group_by(.key / 25 | floor) | map(map(.value) | {"connector_names": join(" ")})}' selected_connectors.json)
echo "generated_matrix=$matrix" >> $GITHUB_OUTPUT
echo "generated_matrix=$( \
airbyte-ops local connector list \
--repo-path "$GITHUB_WORKSPACE" \
--connectors-filter="${{ steps.list-sources.outputs.sources }},${{ steps.list-destinations.outputs.destinations }}" \
--output-format=json-gh-matrix \
)" | tee -a $GITHUB_OUTPUT

run_connectors_up_to_date:
needs: generate_matrix
name: Connectors up-to-date
runs-on: connector-up-to-date-medium
continue-on-error: true
strategy:
matrix: ${{fromJson(needs.generate_matrix.outputs.generated_matrix)}}
matrix: ${{ fromJson(needs.generate_matrix.outputs.generated_matrix) }}
permissions:
pull-requests: write
contents: write
env:
CONNECTOR_NAME: ${{ matrix.connector }}
CONNECTOR_DIR: ${{ matrix.connector_dir }}
CONNECTOR_LANGUAGE: ${{ matrix.connector_language }}
steps:
- name: Checkout Airbyte
uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0

- name: Install uv
uses: astral-sh/setup-uv@6b9c6063abd6010835644d4c2e1bef4cf5cd0fca # v6.0.1

- name: Install airbyte-ops CLI
run: uv tool install airbyte-internal-ops

- name: Authenticate as 'octavia-bot-hoard' GitHub App
uses: actions/create-github-app-token@67018539274d69449ef7c02e8e71183d1719ab42 # v2.1.4
id: get-app-token
@@ -65,19 +106,36 @@ jobs:
repositories: ${{ inputs.repo || github.event.repository.name }}
app-id: ${{ secrets.OCTAVIA_BOT_HOARD_APP_ID }}
private-key: ${{ secrets.OCTAVIA_BOT_HOARD_PRIVATE_KEY }}
# Token is good for 1 hour
- name: Run airbyte-ci connectors up-to-date [WORKFLOW]
id: airbyte-ci-connectors-up-to-date-workflow-dispatch
uses: ./.github/actions/run-airbyte-ci

- name: Docker login
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # v3.6.0
with:
context: "master"
dagger_cloud_token: ${{ secrets.DAGGER_CLOUD_TOKEN_CACHE_3 }}
docker_hub_password: ${{ secrets.DOCKER_HUB_PASSWORD }}
docker_hub_username: ${{ secrets.DOCKER_HUB_USERNAME }}
gcp_gsm_credentials: ${{ secrets.GCP_GSM_CREDENTIALS }}
gcs_credentials: ${{ secrets.METADATA_SERVICE_PROD_GCS_CREDENTIALS }}
github_token: ${{ steps.get-app-token.outputs.token }}
sentry_dsn: ${{ secrets.SENTRY_AIRBYTE_CI_DSN }}
s3_build_cache_access_key_id: ${{ secrets.SELF_RUNNER_AWS_ACCESS_KEY_ID }}
s3_build_cache_secret_key: ${{ secrets.SELF_RUNNER_AWS_SECRET_ACCESS_KEY }}
subcommand: "connectors --concurrency=10 ${{ matrix.connector_names}} up-to-date --create-prs --auto-merge"
username: ${{ secrets.DOCKER_HUB_USERNAME }}
password: ${{ secrets.DOCKER_HUB_PASSWORD }}

# TK-TODO: The `airbyte-ops local connector up-to-date` command does not exist yet.
# It needs to be implemented in the airbyte-ops-mcp repo to replace the Dagger-based
# `airbyte-ci connectors up-to-date` pipeline. The command should perform these steps
# for each connector:
# 1. Update base image in metadata.yaml to the latest stable tag
# 2. Run `poetry update` (for Python/Poetry connectors)
# 3. Bump connector version (patch)
# 4. Add changelog entry ("Update dependencies")
# 5. Create or update a GitHub PR with the changes
# 6. Optionally set auto-merge label on the PR

We can tackle most of these already without code changes to the ops CLI.

1. Update base image in metadata.yaml to the latest stable tag

Create this as a dedicated step, but move it toward the end of the process. The imagined CLI would be something like:

airbyte-ops local connector bump-base-image

2. Run poetry update (for Python/Poetry connectors)

You can do this already.

Simply add two steps, both conditional ('if') on the language being Python. The first step sets up Poetry and the second runs poetry update.

3. Bump connector version (patch)

4. Add changelog entry ("Update dependencies")

As you already hinted, we can do this already. Just call the bump-version CLI we already have and pass it an explicit changelog string.

5. Create or update a GitHub PR with the changes

We have patterns to do this already using native GitHub Actions. You'll want to do it twice: once up top, so that you have a known PR number to pass into bump-version for the changelog entry, and again at the end to update the PR with the resulting changes.

6. Optionally set auto-merge label on the PR

Again, we have GitHub Actions examples for you to do this natively already. (1) There are labels we can add and (2) there's precedent for setting GitHub's native auto-merge. (We can just do both.)
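
For item 2, the two conditional steps could look roughly like the sketch below. It reuses the `connector_language` and `connector_dir` matrix fields that the job already exposes through `env`; installing Poetry via `pipx` is an assumption about what is available on the runner, and `--lock` (update the lockfile without installing) is one reasonable choice rather than a requirement.

```yaml
# Hypothetical sketch: the install method and step names are assumptions.
- name: Set up Poetry
  if: matrix.connector_language == 'python'
  # Assumes pipx is available on the runner image.
  run: pipx install poetry

- name: Run poetry update
  if: matrix.connector_language == 'python'
  working-directory: ${{ matrix.connector_dir }}
  # --lock refreshes poetry.lock without installing the packages.
  run: poetry update --lock
```

Because both steps carry the same `if`, non-Python connectors (low-code, manifest-only) skip them entirely instead of failing on a missing pyproject.toml.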

Implemented in ca967db — replaced the monolithic up-to-date pseudocode with individual workflow steps per your guidance:

  1. Git author config: octavia-bot-hoard pattern (from auto-upgrade-certified-connectors-cdk.yml)
  2. Step 5a: peter-evans/create-pull-request (early, to get PR number for changelog). Creates draft PR with auto-merge label.
  3. Step 2: poetry update --lock, conditional on CONNECTOR_LANGUAGE == 'python' (setup-python + install-poetry)
  4. Steps 3+4: airbyte-ops local connector bump-version --bump-type patch --changelog-message "Update dependencies" --pr-number "$PR_NUMBER"
  5. Step 1: airbyte-ops local connector bump-base-image, still TK-TODO (command doesn't exist yet), placed near end per your guidance
  6. Step 5b: git add / git commit / git push to the PR branch, then gh pr ready
  7. Step 6: gh pr merge "$PR_NUMBER" --squash --auto (native auto-merge)

Only remaining TK-TODO: bump-base-image CLI command.
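
For readers following along, the tail end of that sequence might be wired up roughly as below. This is a sketch, not the contents of ca967db: the `create-pr` step id is hypothetical, the bump-version flags are copied from the list above rather than verified against the CLI, and the sketch assumes `pull-request-number` is read from the output of the earlier peter-evans/create-pull-request step.

```yaml
# Hypothetical sketch: `create-pr` is an assumed step id for the earlier
# peter-evans/create-pull-request step; other names come from the workflow.
- name: Bump version and add changelog entry
  env:
    PR_NUMBER: ${{ steps.create-pr.outputs.pull-request-number }}
  run: |
    # Flags as quoted in the comment above; any connector-selection
    # flags the command may need are not shown here.
    airbyte-ops local connector bump-version \
      --bump-type patch \
      --changelog-message "Update dependencies" \
      --pr-number "$PR_NUMBER"

- name: Mark PR ready and enable auto-merge
  env:
    GH_TOKEN: ${{ steps.get-app-token.outputs.token }}
    PR_NUMBER: ${{ steps.create-pr.outputs.pull-request-number }}
  run: |
    gh pr ready "$PR_NUMBER"
    gh pr merge "$PR_NUMBER" --squash --auto
```

Passing the PR number through `env` rather than interpolating `${{ ... }}` directly into the script keeps the shell commands free of expression expansion, which matches how the workflow already handles `CONNECTOR_OPTIONS`.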


#
# Steps 3 and 4 are already available via `airbyte-ops local connector bump-version`.
# Steps 1, 2, 5, and 6 need new implementations. See the analysis in the PR description
# for detailed requirements.
- name: Run connectors up-to-date
env:
GITHUB_TOKEN: ${{ steps.get-app-token.outputs.token }}
DOCKER_HUB_USERNAME: ${{ secrets.DOCKER_HUB_USERNAME }}
DOCKER_HUB_PASSWORD: ${{ secrets.DOCKER_HUB_PASSWORD }}
run: |
airbyte-ops local connector up-to-date \
--repo-path "$GITHUB_WORKSPACE" \
--concurrency=10 \
--name="$CONNECTOR_NAME" \
--create-prs \
--auto-merge