investigation(Job/gokore-runner-9zx7q-runner-6cc6m-step-9c2d569c): false positive - no infrastructure issue by k8s-mendabot[bot] · Pull Request #1600 · lenaxia/talos-ops-prod

k8s-mendabot · 2026-04-13T21:27:15Z

Summary

This PR documents an investigation of a failed Job detected by mechanic. The analysis indicates that this finding is a false positive: there is no infrastructure or GitOps configuration issue to fix. The Job is a GitHub Actions workflow step that correctly reported security vulnerabilities found by gosec.

Finding

Kind: Job
Resource: gokore-runner-9zx7q-runner-6cc6m-step-9c2d569c
Namespace: actions-runner-system
Parent: Job/gokore-runner-9zx7q-runner-6cc6m-step-9c2d569c
Fingerprint: 5670d2926db6

Evidence

Job Details

Image: securego/gosec:2.22.3
Command: -fmt sarif -out gosec-results.sarif ./...
Exit Code: 1
BackoffLimit: 0
TTLSecondsAfterFinished: 300

Pod State

Pod gokore-runner-9zx7q-runner-6cc6m-step-9c2d569c-jwghc terminated with exit code 1
Container ran for about 7 minutes (21:15:11 to 21:22:12)
PVC gokore-runner-9zx7q-runner-6cc6m-work was created (25Gi) and is now in Terminating state
No scheduling failures or infrastructure errors observed

Infrastructure Health

Actions Runner Controller pod running (7d uptime)
gokore-runner listener pod running (7d uptime)
Multiple other runner pods running successfully
HelmRelease gokore-runner is functioning correctly

GitOps Configuration

The gokore runner is configured in /workspace/repo/kubernetes/apps/actions-runner-system/actions-runner-controller/runners/gokore/helmrelease.yaml:

Uses GitHub Actions Runner Scale Set Controller
minRunners: 0, maxRunners: 15
Container mode: kubernetes
Storage: 25Gi PVC with openebs-hostpath StorageClass
Runner pod scheduled on worker nodes

Root Cause

This Job is not a Kubernetes infrastructure component. It is a GitHub Actions workflow step that was dynamically created by the Actions Runner Controller to run a gosec security scan on the goKore repository.

The Job "failed" because gosec correctly identified security vulnerabilities in the codebase and exited with code 1. This is expected and correct behavior for a security scanner:

Exit code 0: No vulnerabilities found
Exit code 1: Vulnerabilities found (job marked as failed in CI/CD)

The backoffLimit: 0 does not come from the GitOps configuration; it is set on the runner side when the step Job is created for the workflow in the goKore repository. It is appropriate for security scans: retrying automatically would not change the result when vulnerabilities are found.
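
The exit-code reading above can be cross-checked against the SARIF report the step wrote, assuming the report is still available (for example as a workflow artifact). The sketch below is illustrative only: `count_sarif_results` and `classify_scan` are hypothetical helpers, not part of mechanic or gosec, the sample report is made up, and the code assumes the SARIF 2.1.0 layout in which findings sit under `runs[].results`.

```python
import json


def count_sarif_results(sarif_text: str) -> int:
    """Count findings in a SARIF 2.1.0 document (runs[].results[])."""
    doc = json.loads(sarif_text)
    # "results" may be missing or null when a run produced no findings.
    return sum(len(run.get("results") or []) for run in doc.get("runs", []))


def classify_scan(exit_code: int, sarif_text: str) -> str:
    """Hypothetical classification of a scanner step outcome."""
    findings = count_sarif_results(sarif_text)
    if exit_code == 0:
        return "clean"
    if exit_code == 1 and findings > 0:
        # Non-zero exit backed by actual findings: a CI result, not an outage.
        return "findings-reported"
    # Non-zero exit without findings deserves a closer look.
    return "needs-investigation"


# Made-up sample report with a single finding, for illustration only.
sample = json.dumps({
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "gosec"}},
        "results": [{"ruleId": "G104", "message": {"text": "Errors unhandled"}}],
    }],
})

print(classify_scan(1, sample))
```

A non-zero exit with an empty report would fall into `needs-investigation`, which separates "the scanner found something" from "the scanner itself failed".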

Fix

No fix required in the GitOps repository.

The infrastructure is working correctly:

The Actions Runner Controller is functioning properly
Runner pods are being created and scheduled correctly
PVCs are being created and cleaned up as expected
The workflow step ran to completion and reported its findings correctly

Recommendations

For mechanic agent: Consider filtering out ephemeral Jobs created by the Actions Runner Controller, or specifically excluding workflow step Jobs from failure detection. These Jobs represent CI/CD workflow execution results, not infrastructure failures.
Alternative: Configure mechanic to only alert on Jobs that are managed by Flux/Helm (have specific labels or annotations) rather than all Jobs in the cluster.
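
Both recommendations can be sketched as predicates over a Job's metadata. This is a minimal sketch, not mechanic's actual API: the function names are hypothetical, the `-step-` name pattern is inferred from the single Job in this finding, and the Flux label keys are assumptions that should be verified against the cluster before use.

```python
from typing import Any, Mapping

# Assumed label keys applied by Flux's helm-controller and
# kustomize-controller to the objects they manage.
GITOPS_LABEL_KEYS = (
    "helm.toolkit.fluxcd.io/name",
    "kustomize.toolkit.fluxcd.io/name",
)

# Assumption: namespace that hosts ephemeral runner workloads.
CI_NAMESPACES = {"actions-runner-system"}


def is_ci_step_job(job: Mapping[str, Any]) -> bool:
    """First recommendation: recognise ephemeral workflow step Jobs."""
    meta = job.get("metadata", {})
    # Name pattern inferred from the one Job in this finding.
    return (
        meta.get("namespace") in CI_NAMESPACES
        and "-step-" in meta.get("name", "")
    )


def is_gitops_managed(job: Mapping[str, Any]) -> bool:
    """Alternative: only consider Jobs carrying a Flux ownership label."""
    labels = job.get("metadata", {}).get("labels") or {}
    return any(key in labels for key in GITOPS_LABEL_KEYS)


def should_alert(job: Mapping[str, Any]) -> bool:
    """Combine both checks: GitOps-managed and not a CI step Job."""
    return is_gitops_managed(job) and not is_ci_step_job(job)
```

Either predicate could be used on its own; combining them is the most conservative choice, since a Job must be both GitOps-managed and not a workflow step to raise a finding.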

Confidence

Medium - While I am confident there is no infrastructure issue, I recommend human review to determine whether flagging failed workflow step Jobs is intentional or whether the mechanic agent should be adjusted to handle ephemeral workflow Jobs differently.

Notes

The PVC was successfully created and used, indicating the storage configuration is correct
Other GitHub Actions runners in the same scale set are functioning normally
The TTLSecondsAfterFinished: 300 setting is standard for the Actions Runner Controller
This finding does not indicate any degradation of the runner infrastructure

For Human Reviewers

Please consider:

Should the mechanic agent detect failed ephemeral Jobs created by CI/CD systems?
If not, what criteria should be used to exclude these from monitoring?
Is there any GitOps configuration change that would help reduce false positives?

Opened automatically by mechanic

Commit ddf6d4c: investigation(Job/gokore-runner-9zx7q-runner-6cc6m-step-9c2d569c): document false positive - no infrastructure issue found

k8s-mendabot added the needs-human-review label (Requires human review before merging) on Apr 13, 2026

k8s-mendabot[bot] wants to merge 1 commit into main from fix/mechanic-5670d2926db6
