Add DigitalOcean infrastructure e2e monitoring workflow#4259
| runs-on: ubuntu-latest | ||
| outputs: | ||
| environment_url: ${{ steps.vercel.outputs.environment_url }} | ||
| steps: | ||
| - name: Check out repository | ||
| uses: actions/checkout@v4 | ||
| - name: Get DO branch Vercel preview URL | ||
| id: vercel | ||
| env: | ||
| BRANCH_NAME: ${{ env.DO_BRANCH }} | ||
| FILTER_BRANCH: ${{ env.DO_BRANCH }} | ||
| VERCEL_TOKEN: ${{ secrets.VERCEL_TOKEN }} | ||
| VERCEL_PROJECT: ${{ secrets.VERCEL_PROJECT }} | ||
| run: | | ||
| echo "Looking for latest READY deployment for branch: $FILTER_BRANCH" | ||
| cd .github && python await_deployment.py | ||
| - name: Echo resolved URL | ||
| env: | ||
| environment_url: ${{ steps.vercel.outputs.environment_url }} | ||
| run: | | ||
| echo "DO Preview URL: https://$environment_url" | ||
| echo "This preview is configured to use DO endpoints via Vercel branch env vars" | ||
|
|
||
| # Server synthetic tests against DO SQS | ||
| server-e2e-tests: |
Check warning
Code scanning / CodeQL
Workflow does not contain permissions Medium
Copilot Autofix
AI 28 days ago
To fix the issue, add an explicit permissions block that limits the GITHUB_TOKEN to the minimal required scope. From the provided snippet, the jobs only need to read the repository (for actions/checkout) and do not appear to perform any GitHub write operations. A safe minimal configuration is contents: read at the workflow root so it applies to all jobs that don’t override it.
Concretely, in .github/workflows/monitoring-do-e2e-tests.yml, add a top‑level permissions: block after the on: section (e.g., after line 16). This block should specify contents: read. No changes are needed to individual jobs unless some job (not shown) truly needs additional scopes; based on the provided snippet, that is not the case. No imports or external tools are required; this is purely a YAML configuration change within the workflow file.
| @@ -15,6 +15,9 @@ | ||
| type: boolean | ||
| # Schedule removed - triggered by dispatch-do-monitoring.yml on stage branch | ||
|
|
||
| permissions: | ||
| contents: read | ||
|
|
||
| # DO endpoint configuration | ||
| env: | ||
| DO_BRANCH: "jason/do-endpoints-e2e" |
| name: do-server-tests | ||
| needs: resolve-do-preview | ||
| runs-on: ubuntu-latest | ||
| if: ${{ github.event.inputs.skip_server_tests != 'true' }} | ||
| steps: | ||
| - name: Checkout Repo | ||
| uses: actions/checkout@v4 | ||
|
|
||
| - name: Setup Node.js | ||
| uses: actions/setup-node@v4 | ||
| with: | ||
| node-version: 22.x | ||
|
|
||
| - name: Cache dependencies | ||
| uses: actions/cache@v4 | ||
| with: | ||
| path: "**/node_modules" | ||
| key: ${{ runner.OS }}-22.x-${{ hashFiles('**/yarn.lock') }} | ||
| restore-keys: | | ||
| ${{ runner.OS }}-22.x- | ||
|
|
||
| - name: Install Dependencies | ||
| run: yarn install --frozen-lockfile | ||
|
|
||
| - name: Echo DO SQS Server URL | ||
| run: | | ||
| echo "Testing SQS Server URL: ${{ env.DO_SQS_ENDPOINT }}" | ||
|
|
||
| - name: Run Server Tests against DO SQS | ||
| id: tests | ||
| run: yarn test:e2e --filter=@osmosis-labs/server | ||
| env: | ||
| NEXT_PUBLIC_SIDECAR_BASE_URL: ${{ env.DO_SQS_ENDPOINT }} | ||
| NEXT_PUBLIC_TIMESERIES_DATA_URL: https://data.app.osmosis.zone | ||
|
|
||
| # FE swap monitoring tests (US region - no proxy) | ||
| fe-swap-us: |
Check warning
Code scanning / CodeQL
Workflow does not contain permissions Medium
Copilot Autofix
AI 28 days ago
In general, the fix is to add an explicit permissions: block that grants the minimal necessary access to the GITHUB_TOKEN. This can be done at the workflow root (applies to all jobs) or for individual jobs. Since all visible jobs only need to read repository contents, the minimal safe baseline is permissions: { contents: read }.
The best fix here, without changing functionality, is to define a single workflow-level permissions: block near the top of .github/workflows/monitoring-do-e2e-tests.yml, after the on: section and before env: or jobs:. This will satisfy CodeQL’s complaint about the server-e2e-tests job (and all other jobs), while keeping functionality unchanged, because they only require read access to check out code and use caches/artifacts. No additional methods, imports, or definitions are needed; this is purely a YAML configuration change.
Specifically, in .github/workflows/monitoring-do-e2e-tests.yml, insert:
    permissions:
      contents: read
between the on: block (ending at line 16) and the env: block (starting at line 19). No other changes are required.
| @@ -15,6 +15,9 @@ | ||
| type: boolean | ||
| # Schedule removed - triggered by dispatch-do-monitoring.yml on stage branch | ||
|
|
||
| permissions: | ||
| contents: read | ||
|
|
||
| # DO endpoint configuration | ||
| env: | ||
| DO_BRANCH: "jason/do-endpoints-e2e" |
| timeout-minutes: 15 | ||
| name: do-fe-swap | ||
| needs: resolve-do-preview | ||
| runs-on: macos-14 | ||
| steps: | ||
| - name: Echo IP | ||
| run: curl -L "https://ipinfo.io" -s | ||
| - name: Check out repository | ||
| uses: actions/checkout@v4 | ||
| with: | ||
| sparse-checkout: | | ||
| packages/e2e | ||
| - name: Setup Node.js | ||
| uses: actions/setup-node@v4 | ||
| with: | ||
| node-version: 22.x | ||
| - name: Cache dependencies | ||
| uses: actions/cache@v4 | ||
| with: | ||
| path: "**/node_modules" | ||
| key: ${{ runner.OS }}-22.x-${{ hashFiles('**/yarn.lock') }} | ||
| restore-keys: | | ||
| ${{ runner.OS }}-22.x- | ||
| - name: Install Playwright | ||
| run: | | ||
| yarn --cwd packages/e2e install --frozen-lockfile && npx playwright install --with-deps chromium | ||
| - name: Run Swap tests against DO preview | ||
| env: | ||
| BASE_URL: "https://${{ needs.resolve-do-preview.outputs.environment_url }}" | ||
| REST_ENDPOINT: ${{ env.DO_LCD_ENDPOINT }} | ||
| PRIVATE_KEY: ${{ secrets.TEST_PRIVATE_KEY_3 }} | ||
| WALLET_ID: ${{ secrets.TEST_WALLET_ID_3 }} | ||
| run: | | ||
| echo "Testing DO preview at: $BASE_URL" | ||
| cd packages/e2e | ||
| npx playwright test monitoring.swap | ||
| - name: upload test results | ||
| if: failure() | ||
| uses: actions/upload-artifact@v4 | ||
| with: | ||
| name: do-swap-test-results-${{ github.run_id }} | ||
| path: packages/e2e/playwright-report | ||
|
|
||
| # FE market order test | ||
| fe-trade: |
Check warning
Code scanning / CodeQL
Workflow does not contain permissions Medium
Copilot Autofix
AI 28 days ago
To fix the problem, explicitly set permissions for the workflow so the GITHUB_TOKEN has only the minimal scopes required. Since the jobs here only read repository contents and upload artifacts, we can safely set contents: read at the workflow level. This documents the requirement and ensures the workflow won’t gain excessive privileges if repo/org defaults change.
The best minimal change is to add a root-level permissions: block near the top of .github/workflows/monitoring-do-e2e-tests.yml, after the on: trigger (or before env: / jobs:). This will apply to all jobs (resolve-do-preview, server-e2e-tests, fe-swap, fe-limit, etc.) that don’t have their own permissions block. No additional imports, methods, or definitions are needed; it’s a pure YAML configuration change.
Concretely:
- Edit .github/workflows/monitoring-do-e2e-tests.yml.
- Insert:
    permissions:
      contents: read
  at the root level (same indentation as on: and env:), for example between the on: block ending at line 16 and the env: block starting at line 19. This constrains the GITHUB_TOKEN for every job and addresses the CodeQL warning.
| @@ -15,6 +15,9 @@ | ||
| type: boolean | ||
| # Schedule removed - triggered by dispatch-do-monitoring.yml on stage branch | ||
|
|
||
| permissions: | ||
| contents: read | ||
|
|
||
| # DO endpoint configuration | ||
| env: | ||
| DO_BRANCH: "jason/do-endpoints-e2e" |
| timeout-minutes: 15 | ||
| needs: [resolve-do-preview, fe-swap] | ||
| name: do-fe-limit | ||
| runs-on: macos-14 | ||
| outputs: | ||
| unexpected: ${{ steps.set-output.outputs.unexpected }} | ||
| steps: | ||
| - name: Check out repository | ||
| uses: actions/checkout@v4 | ||
| with: | ||
| sparse-checkout: | | ||
| packages/e2e | ||
| - name: Setup Node.js | ||
| uses: actions/setup-node@v4 | ||
| with: | ||
| node-version: 22.x | ||
| - name: Cache dependencies | ||
| uses: actions/cache@v4 | ||
| with: | ||
| path: "**/node_modules" | ||
| key: ${{ runner.OS }}-22.x-${{ hashFiles('**/yarn.lock') }} | ||
| restore-keys: | | ||
| ${{ runner.OS }}-22.x- | ||
| - name: Install Playwright | ||
| run: | | ||
| yarn --cwd packages/e2e install --frozen-lockfile && npx playwright install --with-deps chromium | ||
| - name: Run Limit tests against DO preview | ||
| env: | ||
| BASE_URL: "https://${{ needs.resolve-do-preview.outputs.environment_url }}" | ||
| REST_ENDPOINT: ${{ env.DO_LCD_ENDPOINT }} | ||
| PRIVATE_KEY: ${{ secrets.TEST_PRIVATE_KEY_3 }} | ||
| WALLET_ID: ${{ secrets.TEST_WALLET_ID_3 }} | ||
| run: | | ||
| cd packages/e2e | ||
| npx playwright test monitoring.limit --timeout 180000 | ||
| - name: set-output | ||
| if: failure() || success() | ||
| id: set-output | ||
| run: echo "unexpected=$(jq -r '.stats.unexpected' packages/e2e/playwright-report/test-results.json)" >> $GITHUB_OUTPUT | ||
| - name: upload test results | ||
| if: failure() | ||
| uses: actions/upload-artifact@v4 | ||
| with: | ||
| name: do-limit-test-results-${{ github.run_id }} | ||
| path: packages/e2e/playwright-report | ||
|
|
||
| # Alert on critical failures (2nd attempt with multiple failures) | ||
| fe-bot-alert: |
Check warning
Code scanning / CodeQL
Workflow does not contain permissions Medium
Copilot Autofix
AI 28 days ago
In general, to fix this issue you should explicitly define a permissions: block either at the root of the workflow (to apply to all jobs by default) or per-job, and restrict the GITHUB_TOKEN to the least privileges necessary. For workflows that only need to read code and interact with Actions features (like artifacts and cache) but do not push commits, modify releases, or change issues/PRs, contents: read is usually sufficient as a base.
For this specific workflow in .github/workflows/monitoring-do-e2e-tests.yml, none of the shown jobs perform write operations against the GitHub API or repository contents. They use actions/checkout, actions/cache, actions/upload-artifact, run Playwright tests, and call external services (Vercel, Datadog) using secrets. These all function with a read-only contents permission. The simplest, least-invasive fix is therefore:
- Add a top-level permissions: block immediately after the name: (or before on:) that sets contents: read.
- This will apply to all jobs in the workflow, including the fe-limit job at line 135 that CodeQL specifically flagged, without changing any functional behavior.
No new methods, definitions, or imports are needed; it’s just a YAML configuration change.
| @@ -1,5 +1,8 @@ | ||
| name: Synthetic DO Load Balanced Infrastructure Monitoring tests | ||
|
|
||
| permissions: | ||
| contents: read | ||
|
|
||
| # This workflow validates the DigitalOcean global infrastructure endpoints | ||
| # by running e2e tests against a Vercel preview configured to use DO RPC/LCD/SQS. | ||
| # Used for migration validation before shifting traffic from GCP to DO. |
| runs-on: ubuntu-latest | ||
| needs: [server-e2e-tests, fe-limit] | ||
| if: failure() && github.run_attempt == 2 && needs.fe-limit.outputs.unexpected > 1 | ||
| steps: | ||
| - name: Install Datadog CI | ||
| run: | | ||
| echo "Installing Datadog CI..." | ||
| curl -L --fail "https://github.com/DataDog/datadog-ci/releases/download/v4.1.2/datadog-ci_linux-x64" --output "/usr/local/bin/datadog-ci" | ||
| chmod +x /usr/local/bin/datadog-ci | ||
| echo "Datadog CI installed" | ||
|
|
||
| - name: Verify Datadog CI Installation | ||
| run: | | ||
| echo "Verifying Datadog CI installation..." | ||
| datadog-ci version | ||
| echo "Datadog CI is ready to use" | ||
|
|
||
| - name: Send Datadog alert for DO infrastructure failure | ||
| env: | ||
| DD_API_KEY: ${{ secrets.DATADOG_API_KEY }} | ||
| DD_APP_KEY: ${{ secrets.DATADOG_APPLICATION_KEY }} | ||
| DD_SITE: ${{ secrets.DATADOG_SITE }} | ||
| run: | | ||
| echo "Sending DO infrastructure failure metrics to Datadog..." | ||
|
|
||
| # Tag as DO infrastructure failure | ||
| datadog-ci tag --level pipeline \ | ||
| --tags "critical_failure:true" \ | ||
| --tags "infrastructure:digitalocean" \ | ||
| --tags "dd_gh_run_attempt:${{ github.run_attempt }}" | ||
|
|
||
| # Add tags for unexpected failure counts | ||
| datadog-ci tag --level pipeline \ | ||
| --tags "do_fe_limit_unexpected:${{ needs.fe-limit.outputs.unexpected }}" | ||
|
|
||
| echo "Metrics sent to Datadog successfully" | ||
|
|
||
| # Clean up deployments | ||
| delete-deployments: |
Check warning
Code scanning / CodeQL
Workflow does not contain permissions Medium
Copilot Autofix
AI 28 days ago
In general, fix this by adding an explicit permissions: block that scopes the GITHUB_TOKEN to the minimum required. Where jobs only need to read repository contents or metadata, use contents: read. Where a job must modify deployments (like delete-deployments), grant deployments: write for that job only. This documents the workflow’s needs and prevents accidental escalation if org/repo defaults change.
For this workflow, the simplest and least intrusive fix is:
- Add a root-level permissions: block (just below on:) that sets the token to read-only: contents: read.
- Add a job-level permissions: block to delete-deployments that grants the additional deployments: write permission that actions/github-script needs to list, mark inactive, and delete deployments.
- Leave other jobs (e.g., resolve-do-preview, server-e2e-tests, fe-limit, fe-bot-alert) with the inherited read-only permissions, since they do not appear to use write operations on the GitHub API.
Concretely:
- In .github/workflows/monitoring-do-e2e-tests.yml, insert:
    permissions:
      contents: read
  after the on: block (after line 16/18 region, before env:).
- In the delete-deployments job definition (line 221 onwards), insert:
    permissions:
      contents: read
      deployments: write
  immediately under delete-deployments: and before runs-on:. This overrides the workflow default for that job only and grants the minimal required write scope.
No new imports or external libraries are needed, just YAML changes to the workflow.
| @@ -15,6 +15,9 @@ | ||
| type: boolean | ||
| # Schedule removed - triggered by dispatch-do-monitoring.yml on stage branch | ||
|
|
||
| permissions: | ||
| contents: read | ||
|
|
||
| # DO endpoint configuration | ||
| env: | ||
| DO_BRANCH: "jason/do-endpoints-e2e" | ||
| @@ -219,6 +222,9 @@ | ||
|
|
||
| # Clean up deployments | ||
| delete-deployments: | ||
| permissions: | ||
| contents: read | ||
| deployments: write | ||
| runs-on: ubuntu-latest | ||
| if: always() | ||
| needs: [server-e2e-tests, fe-limit, fe-bot-alert] |
| runs-on: ubuntu-latest | ||
| if: always() | ||
| needs: [server-e2e-tests, fe-limit, fe-bot-alert] | ||
| steps: | ||
| - name: Delete Previous deployments | ||
| uses: actions/github-script@v7 | ||
| with: | ||
| debug: true | ||
| script: | | ||
| const deployments = await github.rest.repos.listDeployments({ | ||
| owner: context.repo.owner, | ||
| repo: context.repo.repo, | ||
| sha: context.sha | ||
| }); | ||
| await Promise.all( | ||
| deployments.data.map(async (deployment) => { | ||
| await github.rest.repos.createDeploymentStatus({ | ||
| owner: context.repo.owner, | ||
| repo: context.repo.repo, | ||
| deployment_id: deployment.id, | ||
| state: 'inactive' | ||
| }); | ||
| return github.rest.repos.deleteDeployment({ | ||
| owner: context.repo.owner, | ||
| repo: context.repo.repo, | ||
| deployment_id: deployment.id | ||
| }); | ||
| }) | ||
| ); |
Check warning
Code scanning / CodeQL
Workflow does not contain permissions Medium
Copilot Autofix
AI 28 days ago
At a high level, the fix is to define explicit GITHUB_TOKEN permissions for this workflow, reducing them to the minimum needed. The simplest and safest approach is to add a permissions: block at the workflow root that grants read-only access to repository contents and other resources by default, and then override permissions for specific jobs that need additional write scopes.
For this workflow, most jobs are running tests and uploading artifacts; they only need contents: read (id-token: write would be needed only if they used OIDC, which is not shown here). The fe-bot-alert job only sends data to Datadog using API keys in secrets and does not call GitHub APIs, so it can use the default read-only permissions. The delete-deployments job uses the GitHub REST API (via actions/github-script) to list deployments and then update and delete them, which requires deployments: write. To implement the fix without changing existing behavior, we will:
- Add a workflow-level permissions: block after the on: section, setting contents: read as the default (and you could add other read scopes if needed elsewhere).
- Add a job-level permissions: block under delete-deployments: granting deployments: write (and keeping contents: read), so that this job has the rights it needs while other jobs stay read-only.
All changes are confined to .github/workflows/monitoring-do-e2e-tests.yml in the shown regions: one insertion near the top (after the on: block, around line 16–18), and one insertion under the delete-deployments job (around line 221–223). No new imports or external dependencies are required.
| @@ -15,6 +15,9 @@ | ||
| type: boolean | ||
| # Schedule removed - triggered by dispatch-do-monitoring.yml on stage branch | ||
|
|
||
| permissions: | ||
| contents: read | ||
|
|
||
| # DO endpoint configuration | ||
| env: | ||
| DO_BRANCH: "jason/do-endpoints-e2e" | ||
| @@ -222,6 +225,9 @@ | ||
| runs-on: ubuntu-latest | ||
| if: always() | ||
| needs: [server-e2e-tests, fe-limit, fe-bot-alert] | ||
| permissions: | ||
| contents: read | ||
| deployments: write | ||
| steps: | ||
| - name: Delete Previous deployments | ||
| uses: actions/github-script@v7 |
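Since every annotation above repeats the same finding, it may help to check locally which jobs would still be flagged before pushing. The sketch below is only a rough approximation of the rule, not CodeQL's actual implementation: it assumes the workflow uses two-space indentation, scans the file line by line with the standard library (no YAML parser), and the function name jobs_without_permissions is made up for illustration.

```python
import re

# Assumption: root keys start at column 0, job ids are indented by two spaces,
# and a job-level permissions: key is indented by four spaces.
ROOT_KEY = re.compile(r"^([A-Za-z_][\w-]*)\s*:")
JOB_KEY = re.compile(r"^  ([A-Za-z_][\w-]*)\s*:\s*(#.*)?$")
JOB_PERMS = re.compile(r"^    permissions\s*:")


def jobs_without_permissions(workflow_text):
    """Return job ids that have no permissions block of their own.

    Returns an empty list when a root-level permissions: block exists,
    because that block applies to every job that does not override it.
    """
    root_section = None
    root_perms = False
    jobs = {}  # job id -> True if the job declares its own permissions
    current_job = None
    for line in workflow_text.splitlines():
        root = ROOT_KEY.match(line)
        if root:
            root_section = root.group(1)
            current_job = None
            if root_section == "permissions":
                root_perms = True
            continue
        if root_section != "jobs":
            continue
        job = JOB_KEY.match(line)
        if job:
            current_job = job.group(1)
            jobs[current_job] = False
        elif current_job and JOB_PERMS.match(line):
            jobs[current_job] = True
    if root_perms:
        return []
    return [job_id for job_id, has_perms in jobs.items() if not has_perms]


# Trimmed-down stand-in for the workflow, before and after the suggested fix.
BEFORE = """\
name: demo
on:
  workflow_dispatch:
jobs:
  server-e2e-tests:
    runs-on: ubuntu-latest
  delete-deployments:
    permissions:
      contents: read
      deployments: write
    runs-on: ubuntu-latest
"""
AFTER = BEFORE.replace("jobs:", "permissions:\n  contents: read\njobs:", 1)

if __name__ == "__main__":
    print(jobs_without_permissions(BEFORE))
    print(jobs_without_permissions(AFTER))
```

With the root-level block in place the list is empty, which matches the reasoning in the suggestions: one workflow-level default covers all jobs, and only delete-deployments needs its own override.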
What is the purpose of the change:
WIP. This PR lets us run Playwright tests against Vercel preview links that use DigitalOcean endpoints.
NB: Endpoints need to be manually set for this specific branch in Vercel secrets!
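For context on how the preview link is resolved: the resolve-do-preview job runs .github/await_deployment.py, which is not shown in this conversation. The following is a hypothetical sketch of the selection step that job logs ("latest READY deployment for branch"), not the actual script. It assumes the deployment list has already been fetched and that each entry carries state, created, url, and meta.githubCommitRef fields; those field names, the sample hostnames, and the helper names are assumptions for illustration.

```python
import os


def latest_ready_url(deployments, branch):
    """Pick the newest READY deployment for a branch and return its host.

    Assumed entry shape (not confirmed by this PR):
    {"state": str, "created": int, "url": str, "meta": {"githubCommitRef": str}}
    """
    candidates = [
        d for d in deployments
        if d.get("state") == "READY"
        and d.get("meta", {}).get("githubCommitRef") == branch
    ]
    if not candidates:
        return None
    newest = max(candidates, key=lambda d: d["created"])
    return newest["url"]


def write_output(name, value, path=None):
    """Append name=value to the step output file GitHub Actions provides.

    Falls back to printing when GITHUB_OUTPUT is not set (e.g. a local run).
    """
    path = path or os.environ.get("GITHUB_OUTPUT")
    if not path:
        print(f"{name}={value}")
        return
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(f"{name}={value}\n")


# Illustrative data only; the hostnames are made up.
SAMPLE = [
    {"state": "READY", "created": 100, "url": "old.example.vercel.app",
     "meta": {"githubCommitRef": "jason/do-endpoints-e2e"}},
    {"state": "READY", "created": 300, "url": "new.example.vercel.app",
     "meta": {"githubCommitRef": "jason/do-endpoints-e2e"}},
    {"state": "BUILDING", "created": 400, "url": "building.example.vercel.app",
     "meta": {"githubCommitRef": "jason/do-endpoints-e2e"}},
    {"state": "READY", "created": 500, "url": "other.example.vercel.app",
     "meta": {"githubCommitRef": "stage"}},
]

if __name__ == "__main__":
    host = latest_ready_url(SAMPLE, "jason/do-endpoints-e2e")
    if host is not None:
        write_output("environment_url", host)
```

The output is written as a bare hostname because the downstream jobs build BASE_URL by prepending https:// to environment_url.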
Linear Task
MER-59: Set up DigitalOcean Migration monitoring
Brief Changelog
Testing and Verifying