feat(ci): CI Improvements: Manual Trigger + Cross-Platform Test Control#1016
feat(ci): CI Improvements: Manual Trigger + Cross-Platform Test Control#1016SB2318 wants to merge 3 commits intoTracer-Cloud:mainfrom
Conversation
…atrix handling fix: update timeout values for CI jobs refactor(query_opensre_telemetry.py): format code for better readability chore: add .gitattributes for consistent line endings Co-authored-by: Copilot <copilot@github.com>
Greptile SummaryThis PR adds a
Confidence Score: 4/5Safe to merge after fixing the unused full_matrix input; the remaining findings are minor quality issues One P1 issue: the advertised configurable full-matrix behavior is broken because inputs.full_matrix is never referenced in any matrix condition. The remaining findings (CloudWatch no-op, redundant credentials, incomplete secret validation) are P2 quality/clarity issues that do not block CI correctness. .github/workflows/ci.yml — specifically the matrix condition expressions on lines 46, 83, and 119 Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Trigger] --> B{event_name?}
B -->|push to main| C[Full matrix Ubuntu + Windows]
B -->|pull_request| D[Ubuntu only]
B -->|workflow_dispatch| E[full_matrix input NOT checked - always runs full matrix]
E --> C
C --> F[quality]
D --> F
F --> G[typecheck]
G --> H[test]
H --> I{push to main?}
I -->|yes| J[test-thorough with AWS]
I -->|no| K[done]
J --> J1[configure-aws-credentials action - duplicates env block vars]
J1 --> J2[sts get-caller-identity]
J2 --> J3[validate secrets - only ANTHROPIC_API_KEY checked]
J3 --> J4[probe CloudWatch - always succeeds via or-true]
J4 --> J5[Run E2E test]
Reviews (1): Last reviewed commit: "feat(ci): add workflow_dispatch for manu..." | Re-trigger Greptile |
| workflow_dispatch: | ||
| inputs: | ||
| full_matrix: | ||
| description: "Run full OS matrix (Linux + Windows)" | ||
| required: false | ||
| default: "true" |
There was a problem hiding this comment.
full_matrix input is never consumed
The workflow_dispatch input full_matrix is defined here, and the PR description explicitly states this enables "configurable full matrix execution," but none of the three matrix conditions reference it. All three jobs use github.event_name == 'workflow_dispatch' which unconditionally expands to the full OS matrix whenever the workflow is triggered manually. Setting full_matrix: false at dispatch time has no effect. The condition should be:
os: ${{ (github.event_name == 'push' || (github.event_name == 'workflow_dispatch' && inputs.full_matrix == 'true')) && fromJSON('["ubuntu-latest","windows-latest"]') || fromJSON('["ubuntu-latest"]') }}This same fix is needed on lines 46, 83, and 119.
There was a problem hiding this comment.
Good catch — you're absolutely right.
The full_matrix input was defined but not wired into the matrix condition, so manual runs always executed the full OS matrix regardless of the input value.
I've updated the matrix logic across all jobs to respect inputs.full_matrix, making manual execution properly configurable.
Thanks for pointing this out.
| - name: Validate CloudWatch access (safe) | ||
| run: aws logs describe-log-groups --limit 1 || true |
There was a problem hiding this comment.
CloudWatch validation step always succeeds
The || true idiom silently swallows any failure from aws logs describe-log-groups, making this step incapable of surfacing a permissions or connectivity problem. Calling it "Validate CloudWatch access" is misleading when the step can never report failure. If the intent is truly to validate access, remove the trailing || true; if this is a best-effort probe, rename the step accordingly (e.g., "Probe CloudWatch access (non-blocking)").
There was a problem hiding this comment.
Good point — the current step name implies validation, but the || true makes it non-blocking and hides failures.
I've updated this to a non-blocking probe with explicit logging so failures are visible without failing the workflow.
This keeps the intent clear while avoiding unnecessary CI failures for transient or non-critical issues.
| - name: Configure AWS credentials | ||
| uses: aws-actions/configure-aws-credentials@v4 | ||
| with: | ||
| aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }} | ||
| aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }} | ||
| aws-region: us-east-1 | ||
| mask-aws-account-id: true |
There was a problem hiding this comment.
AWS credentials configured redundantly
The test-thorough job already sets AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION, and AWS_DEFAULT_REGION in its env: block (lines 264–267). Adding aws-actions/configure-aws-credentials@v4 on top duplicates the credential setup. Consider removing the raw env vars and relying solely on the action, or vice versa.
There was a problem hiding this comment.
Good point — the duplication is valid.
However, some of your tests currently rely on AWS credentials being available via environment variables (e.g., using os.getenv), so removing them caused test failures.
For now, I’ve kept the configure-aws-credentials action as the primary setup and explicitly exported the required environment variables for test compatibility.
Longer term, we can refactor tests to rely on the AWS SDK credential chain instead of direct environment access.
| - name: Validate required runtime secrets | ||
| run: | | ||
| if [ -z "$ANTHROPIC_API_KEY" ]; then | ||
| echo "Missing ANTHROPIC_API_KEY" | ||
| exit 1 | ||
| fi |
There was a problem hiding this comment.
Secret validation incomplete and oddly indented
Only ANTHROPIC_API_KEY is checked here, yet test-thorough also depends on JWT_TOKEN, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY. A missing JWT_TOKEN would cause a hard-to-diagnose failure later in "Initialize tracer". Additionally, the shell if body uses a single-space indent inconsistent with the rest of the file.
Co-authored-by: Copilot <copilot@github.com>
|
Thanks for the detailed review — all points are valid. I’ve fixed the matrix expression precedence to ensure I also cleaned up the secret validation logic for readability and early-fail behavior. For AWS setup, I’ve aligned authentication around Finally, I clarified the CloudWatch validation step behavior to avoid ambiguity between probing and strict validation. @muddlebee please review. |
Fixes #926
Describe the changes you have made in this PR -
This PR enhances the CI pipeline by introducing a manual trigger (workflow_dispatch) and improving cross-platform test execution control.
Demo/Screenshot for feature changes and bug fixes -
Code Understanding and AI Usage
Did you use AI assistance (ChatGPT, Claude, Copilot, etc.) to write any part of this code?
If you used AI assistance:
Explain your implementation approach:
✨ Key Changes
1. ✅ Manual CI Trigger Added
workflow_dispatchto allow on-demand CI runs2. ⚙️ Improved Matrix Strategy
main→ run Ubuntu + Windows3. 🔐 Safe Test Handling for Fork PRs
4. 🧪 CI Structure Maintained
quality→ lint + formattypecheck→ mypy validationtest→ main CI gatetest-kubernetes→ conditional infra teststest-thorough→ extended E2E (main branch only)5. ☁️ AWS Validation (E2E Stability)
aws sts get-caller-identityChecklist before requesting a review
Note: Please check Allow edits from maintainers if you would like us to assist in the PR.