feat(ci): CI Improvements: Manual Trigger + Cross-Platform Test Control #1016

Open
SB2318 wants to merge 3 commits into Tracer-Cloud:main from SB2318:auth-ci-fix

SB2318 commented Apr 27, 2026

Fixes #926

Describe the changes you have made in this PR -

This PR enhances the CI pipeline by introducing a manual trigger (workflow_dispatch) and improving cross-platform test execution control.

Demo/Screenshot for feature changes and bug fixes -


Code Understanding and AI Usage

Did you use AI assistance (ChatGPT, Claude, Copilot, etc.) to write any part of this code?

  • No, I wrote all the code myself
  • Yes, I used AI assistance (continue below)

If you used AI assistance:

  • I have reviewed every single line of the AI-generated code
  • I can explain the purpose and logic of each function/component I added
  • I have tested edge cases and understand how the code handles them
  • I have modified the AI output to follow this project's coding standards and conventions

Explain your implementation approach:

✨ Key Changes

1. ✅ Manual CI Trigger Added

  • Added workflow_dispatch to allow on-demand CI runs
  • Supports running full test suite across Linux + Windows

2. ⚙️ Improved Matrix Strategy

  • PRs → run Ubuntu only (fast feedback)
  • Push to main → run Ubuntu + Windows
  • Manual trigger → configurable full matrix execution

3. 🔐 Safe Test Handling for Fork PRs

  • Forked PRs run safe tests only (no secrets exposure)
  • Internal PRs and pushes run full test suite

4. 🧪 CI Structure Maintained

  • quality → lint + format
  • typecheck → mypy validation
  • test → main CI gate
  • test-kubernetes → conditional infra tests
  • test-thorough → extended E2E (main branch only)

5. ☁️ AWS Validation (E2E Stability)

  • Added AWS credential configuration
  • Added runtime validation:
    • aws sts get-caller-identity
    • CloudWatch access check

Checklist before requesting a review

  • I have added proper PR title and linked to the issue
  • I have performed a self-review of my code
  • I can explain the purpose of every function, class, and logic block I added
  • I understand why my changes work and have tested them thoroughly
  • I have considered potential edge cases and how my code handles them
  • If it is a core feature, I have added thorough tests
  • My code follows the project's style guidelines and conventions

Note: Please check Allow edits from maintainers if you would like us to assist in the PR.

…atrix handling

fix: update timeout values for CI jobs

refactor(query_opensre_telemetry.py): format code for better readability

chore: add .gitattributes for consistent line endings

Co-authored-by: Copilot <copilot@github.com>

greptile-apps Bot commented Apr 27, 2026

Greptile Summary

This PR adds a workflow_dispatch trigger to the CI pipeline with a full_matrix input, expands the OS matrix for manual and push events to include Windows, adds AWS credential setup and validation steps to test-thorough, and includes minor formatting/gitattributes changes.

  • P1: The full_matrix dispatch input is declared but never consumed — all three matrix conditions use github.event_name == 'workflow_dispatch' without checking inputs.full_matrix, so manually triggering with full_matrix: false has no effect and the advertised "configurable" behavior does not work.

Confidence Score: 4/5

Safe to merge after fixing the unused full_matrix input; the remaining findings are minor quality issues

One P1 issue: the advertised configurable full-matrix behavior is broken because inputs.full_matrix is never referenced in any matrix condition. The remaining findings (CloudWatch no-op, redundant credentials, incomplete secret validation) are P2 quality/clarity issues that do not block CI correctness.

.github/workflows/ci.yml — specifically the matrix condition expressions on lines 46, 83, and 119

Important Files Changed

| Filename | Overview |
| --- | --- |
| .github/workflows/ci.yml | Adds workflow_dispatch trigger and AWS credential validation steps, but the full_matrix dispatch input is never referenced in the matrix conditions (always runs full matrix), and the CloudWatch validation step silently swallows failures |
| app/.gitattributes | New gitattributes enforcing LF line endings under app/; straightforward cross-platform hygiene, no issues |
| scripts/query_opensre_telemetry.py | Minor formatting fix: wraps a long add_parser call into a Black-compliant multi-line form; no logic changes |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Trigger] --> B{event_name?}
    B -->|push to main| C[Full matrix Ubuntu + Windows]
    B -->|pull_request| D[Ubuntu only]
    B -->|workflow_dispatch| E[full_matrix input NOT checked - always runs full matrix]
    E --> C
    C --> F[quality]
    D --> F
    F --> G[typecheck]
    G --> H[test]
    H --> I{push to main?}
    I -->|yes| J[test-thorough with AWS]
    I -->|no| K[done]
    J --> J1[configure-aws-credentials action - duplicates env block vars]
    J1 --> J2[sts get-caller-identity]
    J2 --> J3[validate secrets - only ANTHROPIC_API_KEY checked]
    J3 --> J4[probe CloudWatch - always succeeds via or-true]
    J4 --> J5[Run E2E test]

Reviews (1): Last reviewed commit: "feat(ci): add workflow_dispatch for manu..."

Comment thread .github/workflows/ci.yml
Comment on lines +24 to +29
workflow_dispatch:
  inputs:
    full_matrix:
      description: "Run full OS matrix (Linux + Windows)"
      required: false
      default: "true"
Contributor

P1 full_matrix input is never consumed

The workflow_dispatch input full_matrix is defined here, and the PR description explicitly states this enables "configurable full matrix execution," but none of the three matrix conditions reference it. All three jobs use github.event_name == 'workflow_dispatch' which unconditionally expands to the full OS matrix whenever the workflow is triggered manually. Setting full_matrix: false at dispatch time has no effect. The condition should be:

os: ${{ (github.event_name == 'push' || (github.event_name == 'workflow_dispatch' && inputs.full_matrix == 'true')) && fromJSON('["ubuntu-latest","windows-latest"]') || fromJSON('["ubuntu-latest"]') }}

This same fix is needed on lines 46, 83, and 119.

Contributor Author

Good catch — you're absolutely right.

The full_matrix input was defined but not wired into the matrix condition, so manual runs always executed the full OS matrix regardless of the input value.

I've updated the matrix logic across all jobs to respect inputs.full_matrix, making manual execution properly configurable.

Thanks for pointing this out.
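
To make the intended behavior concrete, the decision encoded by the suggested expression can be modeled as a small shell function. This is only a sketch for reasoning about the truth table, not the workflow syntax itself: `select_os` is a hypothetical helper, and its two arguments stand in for `github.event_name` and `inputs.full_matrix`.

```shell
# Hypothetical model of the suggested matrix condition (not workflow syntax).
# $1 stands in for github.event_name, $2 for inputs.full_matrix.
select_os() {
  event_name=$1
  full_matrix=${2:-true}   # mirrors the input's declared default of "true"
  if [ "$event_name" = "push" ] ||
     { [ "$event_name" = "workflow_dispatch" ] && [ "$full_matrix" = "true" ]; }; then
    echo '["ubuntu-latest","windows-latest"]'
  else
    echo '["ubuntu-latest"]'
  fi
}

select_os pull_request              # Ubuntu only
select_os workflow_dispatch false   # Ubuntu only, the case the P1 finding covers
select_os workflow_dispatch true    # Ubuntu + Windows
```

The case that matters for this thread is the second call: a manual run with `full_matrix` set to `false` should fall through to the Ubuntu-only list.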

Comment thread .github/workflows/ci.yml Outdated
Comment on lines +315 to +316
- name: Validate CloudWatch access (safe)
  run: aws logs describe-log-groups --limit 1 || true
Contributor

P2 CloudWatch validation step always succeeds

The || true idiom silently swallows any failure from aws logs describe-log-groups, making this step incapable of surfacing a permissions or connectivity problem. Calling it "Validate CloudWatch access" is misleading when the step can never report failure. If the intent is truly to validate access, remove the trailing || true; if this is a best-effort probe, rename the step accordingly (e.g., "Probe CloudWatch access (non-blocking)").

Contributor Author

Good point — the current step name implies validation, but the || true makes it non-blocking and hides failures.

I've updated this to a non-blocking probe with explicit logging so failures are visible without failing the workflow.

This keeps the intent clear while avoiding unnecessary CI failures for transient or non-critical issues.
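
The updated step itself is not shown in this thread. As a rough sketch of what a non-blocking probe with explicit logging could look like, the hypothetical `probe_nonblocking` helper below runs a command, prints a GitHub Actions `::warning::` annotation if it fails, and always returns success so the job continues. The helper name and message wording are assumptions for illustration.

```shell
# Hypothetical helper: run a command as a best-effort probe.
# $1 is a label for the log; the remaining arguments are the command to run.
probe_nonblocking() {
  label=$1
  shift
  status=0
  # "|| status=$?" records the failure without tripping "set -e".
  "$@" || status=$?
  if [ "$status" -eq 0 ]; then
    echo "$label: ok"
  else
    # Surfaces the failure as a warning annotation instead of hiding it.
    echo "::warning::$label failed with exit code $status (non-blocking)"
  fi
  return 0
}

# In the workflow step this might be called as (assumed usage):
# probe_nonblocking "CloudWatch access" aws logs describe-log-groups --limit 1
```

Unlike a bare `|| true`, the exit code is logged, so a permissions or connectivity problem shows up in the run output even though the step still passes.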

Comment thread .github/workflows/ci.yml
Comment on lines +295 to +301
- name: Configure AWS credentials
  uses: aws-actions/configure-aws-credentials@v4
  with:
    aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
    aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    aws-region: us-east-1
    mask-aws-account-id: true
Contributor

P2 AWS credentials configured redundantly

The test-thorough job already sets AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION, and AWS_DEFAULT_REGION in its env: block (lines 264–267). Adding aws-actions/configure-aws-credentials@v4 on top duplicates the credential setup. Consider removing the raw env vars and relying solely on the action, or vice versa.

Contributor Author

Good point — the duplication is valid.

However, some of the tests currently rely on AWS credentials being available via environment variables (e.g., using os.getenv), so removing them caused test failures.

For now, I’ve kept the configure-aws-credentials action as the primary setup and explicitly exported the required environment variables for test compatibility.

Longer term, we can refactor tests to rely on the AWS SDK credential chain instead of direct environment access.

Comment thread .github/workflows/ci.yml
Comment on lines +308 to +313
- name: Validate required runtime secrets
  run: |
    if [ -z "$ANTHROPIC_API_KEY" ]; then
     echo "Missing ANTHROPIC_API_KEY"
     exit 1
    fi
Contributor

P2 Secret validation incomplete and oddly indented

Only ANTHROPIC_API_KEY is checked here, yet test-thorough also depends on JWT_TOKEN, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY. A missing JWT_TOKEN would cause a hard-to-diagnose failure later in "Initialize tracer". Additionally, the shell if body uses a single-space indent inconsistent with the rest of the file.
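
One way to cover all four variables named above is to loop over them and report every missing one before failing. The following is a minimal sketch in POSIX shell, assuming a hypothetical `require_env` helper; the variable list is taken from this comment, not from the workflow file.

```shell
# Hypothetical helper: succeed only if every named variable is non-empty.
# Prints one "Missing NAME" line per absent variable, then fails once.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# In the workflow step this might be called as (assumed usage):
# require_env ANTHROPIC_API_KEY JWT_TOKEN AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY || exit 1
```

Reporting all missing names in one pass avoids fixing one secret only to hit the next missing one on the following run.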

SB2318 and others added 2 commits April 28, 2026 01:55
Co-authored-by: Copilot <copilot@github.com>

SB2318 commented Apr 27, 2026

Thanks for the detailed review — all points are valid.

I’ve fixed the matrix expression precedence to ensure full_matrix behaves correctly in workflow_dispatch.

I also cleaned up the secret validation logic for readability and early-fail behavior.

For AWS setup, I’ve aligned authentication around configure-aws-credentials and ensured consistency across jobs.

Finally, I clarified the CloudWatch validation step behavior to avoid ambiguity between probing and strict validation.

@muddlebee please review.

Development

Successfully merging this pull request may close these issues.

[BUG] CI instability: ruff formatting + multiple test failures — request 1-day observation window
