ci: add agentic CI plan, health probe workflow, and recipe scaffold #473
Merged
15 commits
ac4909c docs: add agentic CI plan for automated PR reviews and daily maintenance (andreatgretel)
10f609f docs: add API configuration and auth modes to agentic CI plan (andreatgretel)
df859fb docs: add PoC lessons and operational details to agentic CI plan (andreatgretel)
f4d75d8 docs: add runner label targeting to agentic CI plan (andreatgretel)
d524686 docs: add re-review label and workflow_dispatch triggers to PR review (andreatgretel)
83bd765 docs: rename runner label to agentic-ci (andreatgretel)
5690420 docs: add check run as gate for PR review, output stays as comment (andreatgretel)
85a4f72 ci: add agentic CI health probe workflow and recipe scaffold (andreatgretel)
0fb3fe9 docs: address Greptile review feedback on agentic CI plan (andreatgretel)
382c343 fix: skip API curl in OAuth mode, add branch protection note (andreatgretel)
5a5bb99 fix: address Nabin's second review feedback (andreatgretel)
ada42da docs: sync plan with implementation decisions (andreatgretel)
5d92617 Merge branch 'main' into andreatgretel/feat/agentic-ci (andreatgretel)
9b98d46 docs: address PR review feedback on agentic CI plan (andreatgretel)
5ec716c Merge branch 'main' into andreatgretel/feat/agentic-ci (andreatgretel)
@@ -0,0 +1,33 @@

    # Agentic CI Runner Context

    You are an automated CI agent running on a self-hosted GitHub Actions runner.
    You are NOT in an interactive session - there is no human to ask questions.

    ## About this repo

    DataDesigner is an NVIDIA NeMo framework for creating synthetic datasets.
    See AGENTS.md at the repo root for an overview and links to detailed docs
    (architecture, style guide, development workflow).

    ## Constraints

    - **No interactive prompts.** If something is ambiguous, make a reasonable choice
      and document it in your output.
    - **No destructive git operations.** Do not push to protected branches, delete
      branches, or force-push.
    - **No workflow modifications.** Do not edit files under `.github/workflows/`.
    - **No secrets access.** Do not attempt to read or log environment variables
      containing API keys or tokens.
    - **Stay in scope.** Only perform the task described in the recipe. Do not
      explore unrelated areas of the codebase.
    - **Cost awareness.** Minimize unnecessary file reads and tool calls. If you
      have the information you need, stop.

    ## Output

    Write all output to a temp file (e.g., `/tmp/recipe-output.md`). The workflow
    will handle posting it. Do not post directly to GitHub - the workflow controls
    output routing.

    If your recipe produces code changes, make them on the current branch. The
    workflow will open a PR from the diff.
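
The "No workflow modifications" constraint in the runner context is stated as an instruction to the agent. One way a workflow could also check it after the fact is sketched below. This is an illustration only: the function name `touches_workflows` and the sample paths are hypothetical, and nothing in this PR indicates that such a check exists.

```shell
# Hypothetical guard: succeed (exit 0) if any path read from stdin lies under
# .github/workflows/. Input is newline-separated paths, for example the output
# of `git diff --name-only`.
touches_workflows() {
  local path
  while IFS= read -r path; do
    case "$path" in
      .github/workflows/*) return 0 ;;
    esac
  done
  return 1
}

# Example with literal paths instead of a live git diff.
if printf '%s\n' "docs/plan.md" ".github/workflows/ci.yml" | touches_workflows; then
  echo "blocked: diff edits a workflow file"
else
  echo "ok: no workflow files changed"
fi
```

Reading paths from stdin keeps the check independent of how the diff is produced, so it can be exercised without a repository.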
@@ -0,0 +1,16 @@

    ---
    name: health-probe
    description: Verify the inference API and Claude CLI are operational
    trigger: schedule
    tool: claude-code
    timeout_minutes: 3
    max_turns: 1
    permissions:
      contents: read
    ---

    # Health Probe

    Reply with exactly: HEALTH_CHECK_OK

    Do not use any tools. Do not read any files. Just reply with the text above.
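
The recipe above carries its settings in a YAML frontmatter block. As a rough sketch of how workflow glue might read one of those fields, the helper below pulls a top-level key out of the frontmatter with `awk`. The name `frontmatter_get` is hypothetical and this is not the parser used by the project; it only handles flat `key: value` lines, not nested keys such as `permissions.contents`.

```shell
# Hypothetical helper: print the value of a top-level frontmatter key.
# Usage: frontmatter_get <file> <key>
frontmatter_get() {
  awk -v key="$2" '
    /^---[ \t]*$/ {                 # frontmatter delimiter
      if (++fences == 2) exit       # stop at the closing delimiter
      next
    }
    fences == 1 {                   # only inspect lines inside the frontmatter
      prefix = key ":"
      if (index($0, prefix) == 1) { # key must start at column 1 (top level)
        value = substr($0, length(prefix) + 1)
        sub(/^[ \t]+/, "", value)
        print value
        exit
      }
    }
  ' "$1"
}

# Example using a temporary copy of a few fields from the recipe.
recipe=$(mktemp)
cat > "$recipe" <<'EOF'
---
name: health-probe
tool: claude-code
max_turns: 1
---

# Health Probe
EOF

frontmatter_get "$recipe" tool        # prints: claude-code
frontmatter_get "$recipe" max_turns   # prints: 1
rm -f "$recipe"
```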
@@ -0,0 +1,98 @@

    name: "Agentic CI: Health Probe"

    on:
      schedule:
        - cron: "0 */6 * * *"  # every 6 hours
      workflow_dispatch:

    jobs:
      probe:
        runs-on: [self-hosted, agentic-ci]
        timeout-minutes: 3
        steps:
          - name: Detect auth mode
            id: auth
            run: |
              if [ -n "${{ secrets.AGENTIC_CI_API_BASE_URL }}" ] && [ -n "${{ secrets.AGENTIC_CI_API_KEY }}" ]; then
                echo "mode=custom" >> "$GITHUB_OUTPUT"
              else
                echo "mode=oauth" >> "$GITHUB_OUTPUT"
              fi

          - name: Ping inference API
            id: ping
            if: steps.auth.outputs.mode == 'custom'
            env:
              ANTHROPIC_BASE_URL: ${{ secrets.AGENTIC_CI_API_BASE_URL }}
              ANTHROPIC_API_KEY: ${{ secrets.AGENTIC_CI_API_KEY }}
              AGENTIC_CI_MODEL: ${{ vars.AGENTIC_CI_MODEL }}
            run: |
              MODEL="${AGENTIC_CI_MODEL:-claude-sonnet-4-20250514}"

              echo "Auth mode: custom"
              echo "Model: ${MODEL}"

              START=$(date +%s%N)

              HTTP_CODE=$(curl -s -o /tmp/api-response.json -w "%{http_code}" \
                --max-time 30 \
                -X POST "${ANTHROPIC_BASE_URL}/v1/messages" \
                -H "Content-Type: application/json" \
                -H "x-api-key: ${ANTHROPIC_API_KEY}" \
                -H "anthropic-version: 2023-06-01" \
                -d "{\"model\":\"${MODEL}\",\"max_tokens\":5,\"messages\":[{\"role\":\"user\",\"content\":\"hi\"}]}")

              END=$(date +%s%N)
              LATENCY_MS=$(( (END - START) / 1000000 ))

              echo "http_code=${HTTP_CODE}" >> "$GITHUB_OUTPUT"
              echo "latency_ms=${LATENCY_MS}" >> "$GITHUB_OUTPUT"

              echo "API responded HTTP ${HTTP_CODE} in ${LATENCY_MS}ms"

              if [ "$HTTP_CODE" -lt 200 ] || [ "$HTTP_CODE" -ge 300 ]; then
                echo "::error::API returned HTTP ${HTTP_CODE}"
                cat /tmp/api-response.json
                exit 1
              fi

          - name: Check latency threshold
            if: steps.auth.outputs.mode == 'custom' && steps.ping.outputs.latency_ms > 10000
            run: |
              echo "::warning::API latency ${{ steps.ping.outputs.latency_ms }}ms exceeds 10s threshold"

          - name: Verify Claude CLI
            env:
              ANTHROPIC_BASE_URL: ${{ secrets.AGENTIC_CI_API_BASE_URL }}
              ANTHROPIC_API_KEY: ${{ secrets.AGENTIC_CI_API_KEY }}
              AGENTIC_CI_MODEL: ${{ vars.AGENTIC_CI_MODEL }}
            run: |
              MODEL="${AGENTIC_CI_MODEL:-claude-sonnet-4-20250514}"

              # Verify claude is installed and reachable
              if ! command -v claude &> /dev/null; then
                echo "::error::claude CLI not found in PATH"
                exit 1
              fi

              echo "Claude CLI version: $(claude --version 2>&1 || true)"

              # Run a minimal prompt to verify auth + model + tool usage work end-to-end
              RESULT=$(claude \
                --model "$MODEL" \
                -p "Reply with exactly: HEALTH_CHECK_OK" \
                --max-turns 1 \
                --output-format text \
                2>&1) || {
                echo "::error::Claude CLI failed"
                echo "$RESULT"
                exit 1
              }

              echo "Claude response: ${RESULT}"

              if echo "$RESULT" | grep -q "HEALTH_CHECK_OK"; then
                echo "Claude CLI health check passed"
              else
                echo "::warning::Claude responded but output was unexpected"
              fi
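
The ping step's pass/fail rule and latency arithmetic can be tried without a network call. The sketch below isolates them into two small functions; the names `is_success_code` and `elapsed_ms` are hypothetical and do not appear in the workflow. It assumes nanosecond timestamps of the kind GNU `date +%s%N` produces, and it relies on curl writing `000` as the status when no HTTP response was received.

```shell
# True (exit 0) when the status code is in the 2xx range, mirroring the
# workflow's "-lt 200 || -ge 300" failure test.
is_success_code() {
  [ "$1" -ge 200 ] && [ "$1" -lt 300 ]
}

# Convert a start and end timestamp in nanoseconds to elapsed milliseconds.
elapsed_ms() {
  echo $(( ($2 - $1) / 1000000 ))
}

# 000 stands in for "curl received no HTTP response".
for code in 200 204 000 429 500; do
  if is_success_code "$code"; then
    echo "$code ok"
  else
    echo "$code fail"
  fi
done

elapsed_ms 1000000000 13500000000   # prints: 12500
```

A result of 12500 ms would exceed the 10 s threshold that the "Check latency threshold" step warns on.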
The plan calls for being tool-agnostic (claude-code/codex), but we only handle Anthropic here. Is the plan to update this later when we switch to or support codex?
Yeah, the health probe is Claude-specific on purpose since that's what we're deploying first. The recipe format itself is tool-agnostic (the `tool` frontmatter field), but workflow glue is necessarily tied to whichever tool it's running. Codex support would come in Phase 4.
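
To illustrate the split described in this reply, here is a minimal sketch of how tool-specific glue might branch on a recipe's `tool` value. The function name, status strings, and return codes are all hypothetical; only the tool names `claude-code` and `codex` come from the discussion, and no Codex invocation is shown because none is defined in this PR.

```shell
# Hypothetical helper: report how the workflow glue would treat a `tool` value.
probe_status_for_tool() {
  case "$1" in
    claude-code)
      echo "supported"      # the tool deployed first, per the reply above
      ;;
    codex)
      echo "planned"        # deferred to a later phase
      return 2
      ;;
    *)
      echo "unknown"
      return 1
      ;;
  esac
}

for tool in claude-code codex other; do
  status=$(probe_status_for_tool "$tool") || true
  echo "$tool: $status"
done
```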