
Extract stdout/stderr/cmd/rc from failed tasks in step1 parser #20

Open
PalmPalm7 wants to merge 3 commits into redhat-et:main from PalmPalm7:fix/step1-parser-include-stdout-stderr-cmd

Conversation

@PalmPalm7
Contributor

@PalmPalm7 PalmPalm7 commented Mar 26, 2026

The job parser previously captured only res.msg from failed task events, which is often just a generic message like "non-zero return code". It now extracts res.stdout, res.stderr, res.cmd, and res.rc when present, falling back to event-level stdout when the res fields are missing or suppressed.
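
A minimal sketch of the extraction described above, assuming each failed event keeps its result under `event["event_data"]["res"]` (the path named in the review walkthrough). The helper name `collect_diagnostics` and the constant `DIAGNOSTIC_FIELDS` are hypothetical and are not taken from job_parser.py.

```python
from typing import Any

# Hypothetical constant: the four fields this PR adds alongside res.msg.
DIAGNOSTIC_FIELDS = ("stdout", "stderr", "cmd", "rc")


def collect_diagnostics(event: dict[str, Any]) -> dict[str, Any]:
    """Copy diagnostic fields from event_data.res, skipping missing or empty ones."""
    res = (event.get("event_data") or {}).get("res") or {}
    if not isinstance(res, dict):
        return {}
    found: dict[str, Any] = {}
    for field in DIAGNOSTIC_FIELDS:
        value = res.get(field)
        # rc == 0 is a legitimate value, so only None and "" count as absent.
        if value is not None and value != "":
            found[field] = value
    # Event-level stdout is the last resort when res carries no usable stdout.
    if "stdout" not in found and event.get("stdout"):
        found["stdout"] = event["stdout"]
    return found
```

In this sketch an empty `res.stdout` is treated the same as a missing one, so the event-level text is still picked up for tasks whose structured output was blanked.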

Ref: #15

Testing: in progress.

Summary by CodeRabbit

  • New Features

    • Enhanced task failure diagnostics now capture execution metadata including return codes, command arguments, and standard error/output streams for improved troubleshooting and root cause analysis.
  • Tests

    • Added comprehensive test coverage for diagnostic field extraction and fallback parsing logic.

The job parser previously only captured res.msg from failed task events,
which is often just a generic message like "non-zero return code". Now
extracts res.stdout, res.stderr, res.cmd, and res.rc when present, with
a fallback to event-level stdout when res fields are missing/suppressed.

Closes redhat-et#15

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai bot commented Mar 26, 2026

Walkthrough

This pull request extends the root-cause-analysis skill to capture additional diagnostic metadata from failed Ansible tasks. The schema now includes stdout, stderr, cmd, and return code fields, while the job parser implementation adds extraction logic with fallback chains to populate these fields from available event data sources.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Schema Updates: skills/root-cause-analysis/schemas/job_context.schema.json | Extended failed_tasks item schema with four new diagnostic properties: stdout (string or array), stderr (string), cmd (string or array), and rc (integer). Preserves existing duration field and constraints. |
| Parser Implementation: skills/root-cause-analysis/scripts/job_parser.py | Added _parse_result_from_rendered_output() helper to extract JSON from Ansible output. Enhanced _extract_failed_tasks() to populate diagnostic fields from event_data.res with fallback to parsed event stdout when direct fields unavailable. |
| Test Data & Tests: skills/root-cause-analysis/tests/data/sample_job.json, skills/root-cause-analysis/tests/scripts/test_job_parser.py | Enriched sample job fixture with complete diagnostic metadata (rc, cmd, stdout, stderr). Added comprehensive test cases validating extraction of all new fields and fallback behavior when fields are missing or empty. |
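
The four schema properties in the summary above can be pictured with the sketch below. It is an assumption-laden illustration: the dict `FAILED_TASK_DIAGNOSTICS` only mirrors the types listed in the summary (the actual wording of job_context.schema.json is not shown here), all four fields are treated as optional, and `type_errors` is a hypothetical stdlib-only checker rather than a full JSON Schema validator.

```python
from typing import Any

# Hypothetical fragment mirroring the types named in the summary; the real
# job_context.schema.json may express them differently.
FAILED_TASK_DIAGNOSTICS: dict[str, Any] = {
    "stdout": {"type": ["string", "array"]},
    "stderr": {"type": "string"},
    "cmd": {"type": ["string", "array"]},
    "rc": {"type": "integer"},
}

_PY_TYPES = {"string": str, "array": list, "integer": int}


def type_errors(task: dict[str, Any]) -> list[str]:
    """Return the names of diagnostic fields whose value has the wrong type."""
    bad = []
    for name, spec in FAILED_TASK_DIAGNOSTICS.items():
        if name not in task:
            continue  # assumption: every diagnostic field is optional
        declared = spec["type"]
        allowed = declared if isinstance(declared, list) else [declared]
        value = task[name]
        ok = any(isinstance(value, _PY_TYPES[t]) for t in allowed)
        # bool is a subclass of int in Python but is not a JSON integer.
        if isinstance(value, bool):
            ok = False
        if not ok:
            bad.append(name)
    return bad
```

Allowing both string and array for stdout and cmd matches how a command can appear either as a single shell string or as an argument list.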

Sequence Diagram(s)

sequenceDiagram
    participant AnsibleEvent as Ansible Event
    participant Parser as Job Parser
    participant FallbackLogic as Fallback Chain
    participant Output as Task Info Object

    AnsibleEvent->>Parser: Send event_data.res with diagnostics
    Parser->>Parser: Check res.stdout present?
    alt res.stdout exists and non-empty
        Parser->>Output: Use res.stdout
    else res.stdout missing/empty
        Parser->>FallbackLogic: Parse event["stdout"]
        FallbackLogic->>FallbackLogic: Strip ANSI codes
        FallbackLogic->>FallbackLogic: Extract trailing JSON
        alt JSON parsed successfully
            FallbackLogic->>Output: Populate stdout, stderr, cmd
        else JSON extraction fails
            FallbackLogic->>Output: Use raw event["stdout"]
        end
    end
    
    Parser->>Parser: Extract rc, cmd from res
    Parser->>Output: Populate rc, cmd
    Parser->>Parser: Extract stderr from res
    Parser->>Output: Populate stderr
    Output-->>AnsibleEvent: Return enriched task_info
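
The "strip ANSI codes, then extract trailing JSON" steps in the diagram can be sketched as follows. This is not the body of `_parse_result_from_rendered_output()`, which is not shown in this conversation; the name `parse_rendered_result`, the regular expression, and the choice to split on `=>` are all assumptions made for illustration.

```python
import json
import re
from typing import Any, Optional

# Matches ANSI CSI sequences such as "\x1b[0;31m" (colour) and "\x1b[0m" (reset).
ANSI_RE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")


def parse_rendered_result(rendered: str) -> Optional[dict[str, Any]]:
    """Return the JSON object that follows "=>" in rendered task output, else None."""
    text = ANSI_RE.sub("", rendered)
    # Rendered failures look like: fatal: [host]: FAILED! => {...}
    _, sep, tail = text.partition("=>")
    if not sep:
        return None
    try:
        parsed = json.loads(tail.strip())
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None
```

Returning `None` on any parse failure lets the caller fall through to the raw `event["stdout"]` branch shown in the diagram.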

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: extracting additional diagnostic fields (stdout, stderr, cmd, rc) from failed tasks in the job parser. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


PalmPalm7 and others added 2 commits March 26, 2026 22:13
When Ansible suppresses res fields (e.g. _ansible_no_log), stdout/stderr/cmd
are missing from the structured result but still embedded as a JSON blob in
the ANSI-rendered event.stdout. This adds a fallback parser that extracts
structured fields from the rendered output before falling back to raw stdout.

Fixes: redhat-et#15

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@PalmPalm7
Contributor Author

Hiding the detailed output since this is a public demo. After re-running Job 2035762, we have:

The parser fix is confirmed working: step1 now outputs structured stdout, stderr, cmd, and rc fields instead of a raw ANSI blob, directly addressing redhat-et/rhdp-rca-plugin#15.

@PalmPalm7 PalmPalm7 marked this pull request as ready for review March 26, 2026 14:31

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
skills/root-cause-analysis/scripts/job_parser.py (1)

13-33: ⚠️ Potential issue | 🟠 Major

Add type cast for json.load() returns to match declared return type.

The function declares dict[str, Any] but json.load() returns Any, triggering mypy's warn_return_any check. Cast all three return statements to the declared type.

Fix
 from typing import Any
+from typing import cast
@@
-                return json.load(f)
+                return cast(dict[str, Any], json.load(f))
@@
-            return json.load(f)
+            return cast(dict[str, Any], json.load(f))
@@
-            return json.load(f)
+            return cast(dict[str, Any], json.load(f))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/root-cause-analysis/scripts/job_parser.py` around lines 13 - 33, The
function load_job_log returns dict[str, Any] but calls json.load() which is
typed as Any; update all three return sites to cast the json.load() result to
dict[str, Any] (e.g., using typing.cast) so the return type matches the function
signature: cast the value returned inside the gzip branch (inside the gzip.open
try), the plain JSON branch (open try), and the fallback gzip branch (except
UnicodeDecodeError) to dict[str, Any].
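
A sketch of what the three cast return sites could look like. The body of `load_job_log` is reconstructed only from the description above (a gzip branch, a plain JSON branch, and a gzip fallback on `UnicodeDecodeError`); the `.gz` suffix test and the `Path` parameter are assumptions, not the code in job_parser.py.

```python
import gzip
import json
from pathlib import Path
from typing import Any, cast


def load_job_log(path: Path) -> dict[str, Any]:
    """Load a job log stored as plain or gzip-compressed JSON."""
    if path.suffix == ".gz":  # assumption: compression is detected by suffix
        with gzip.open(path, "rt", encoding="utf-8") as f:
            return cast(dict[str, Any], json.load(f))
    try:
        with open(path, encoding="utf-8") as f:
            return cast(dict[str, Any], json.load(f))
    except UnicodeDecodeError:
        # The bytes were not valid UTF-8; retry as gzip without a .gz suffix.
        with gzip.open(path, "rt", encoding="utf-8") as f:
            return cast(dict[str, Any], json.load(f))
```

`typing.cast` does nothing at runtime; it only tells the type checker to treat the `Any` from `json.load()` as `dict[str, Any]`, so a log whose top level is not an object would still slip through unchecked.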
🧹 Nitpick comments (2)
skills/root-cause-analysis/tests/scripts/test_job_parser.py (2)

208-210: Tighten the stdout absence assertion.

Line 209 is overly permissive (or), so unexpected non-empty stdout can still pass. In this fixture path, an exact absence check is clearer and safer.
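
To illustrate why an `or` in this kind of assertion is risky: the review does not quote the assertion at line 209, so `permissive_check` below is an assumed stand-in for it, and both function names are hypothetical.

```python
from typing import Any


def permissive_check(task: dict[str, Any]) -> bool:
    # Assumed stand-in for the original: passes when the key is absent OR truthy.
    return "stdout" not in task or bool(task["stdout"])


def strict_check(task: dict[str, Any]) -> bool:
    # The form the review recommends: the key must not be present at all.
    return "stdout" not in task


# A task that unexpectedly carries stdout satisfies the permissive form only.
unexpected = {"task": "Run script", "stdout": "leaked output"}
print(permissive_check(unexpected), strict_check(unexpected))
```

With the strict form, a regression that starts emitting stdout for this fixture fails the test instead of passing silently.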

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/root-cause-analysis/tests/scripts/test_job_parser.py` around lines 208
- 210, The current assertion for "stdout" uses an OR which allows a non-empty
stdout to slip through; update the check in the test (in
tests/scripts/test_job_parser.py where variable task is asserted) to require the
key is absent exactly — replace the permissive assertion with one that asserts
"stdout" not in task so the test fails if the key is present at all.

236-308: Add a direct regression test for rendered JSON fallback fields.

Current tests validate raw event.stdout fallback and res.stdout priority, but they don’t directly assert extraction of parsed rc/cmd/stderr from rendered output (FAILED! => {...}). Add one case to lock that branch.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/root-cause-analysis/tests/scripts/test_job_parser.py` around lines 236
- 308, Add a new unit test that covers the fallback where rendered JSON in the
event-level stdout contains the diagnostic fields (e.g., "FAILED! => { ... }")
and those parsed values should populate rc/cmd/stderr when res lacks them:
create a test (similar to existing tests) that constructs an event with event
"runner_on_failed", a res with only msg, and an event-level stdout containing
the "FAILED! => {\"rc\":...,\"cmd\":...,\"stderr\":\"...\"}" payload; call
_extract_failed_tasks and assert that the returned task includes rc, cmd, and
stderr populated from the parsed JSON; use the existing test names as a pattern
(e.g., test_extract_failed_tasks_parsed_rendered_stdout_fallback) and reference
_extract_failed_tasks to locate where behavior is validated.
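
A possible shape for such a regression test, written so it runs on its own. Because the signature of `_extract_failed_tasks` is not shown in this review, the sketch exercises a hypothetical stand-in, `fill_from_rendered`, that mimics only the fallback branch; the fixture values (host name, command, error text) are invented for illustration.

```python
import json
from typing import Any


def fill_from_rendered(task_info: dict[str, Any], event: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for the fallback branch of _extract_failed_tasks."""
    _, sep, tail = event.get("stdout", "").partition("=>")
    if not sep:
        return task_info
    try:
        parsed = json.loads(tail)
    except json.JSONDecodeError:
        return task_info
    if not isinstance(parsed, dict):
        return task_info
    for field in ("stdout", "stderr", "cmd", "rc"):
        value = parsed.get(field)
        # Only fill fields that res did not already provide.
        if field not in task_info and value is not None and value != "":
            task_info[field] = value
    return task_info


def test_parsed_rendered_stdout_fallback() -> None:
    event = {
        "event": "runner_on_failed",
        "event_data": {"res": {"msg": "non-zero return code"}},
        "stdout": 'fatal: [node1]: FAILED! => {"rc": 1, "cmd": ["false"], "stderr": "boom"}',
    }
    task = fill_from_rendered({}, event)
    assert task["rc"] == 1
    assert task["cmd"] == ["false"]
    assert task["stderr"] == "boom"
    assert "stdout" not in task  # the rendered payload carried no stdout
```

In the real test module the stand-in would be replaced by a call to `_extract_failed_tasks` with the same event fixture.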
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f7ac6c14-ce1e-43ee-9424-30dc9c770044

📥 Commits

Reviewing files that changed from the base of the PR and between 1b23196 and aa79fcb.

📒 Files selected for processing (4)
  • skills/root-cause-analysis/schemas/job_context.schema.json
  • skills/root-cause-analysis/scripts/job_parser.py
  • skills/root-cause-analysis/tests/data/sample_job.json
  • skills/root-cause-analysis/tests/scripts/test_job_parser.py

Comment on lines +209 to +216
            if "stdout" not in task_info or "stderr" not in task_info:
                event_stdout = event.get("stdout", "")
                if event_stdout:
                    parsed = _parse_result_from_rendered_output(event_stdout)
                    if parsed:
                        for field in ("stdout", "stderr", "cmd"):
                            if field not in task_info and field in parsed:
                                task_info[field] = parsed[field]

⚠️ Potential issue | 🟠 Major

Fallback extraction misses rc and can skip missing cmd/rc.

Line 214 copies only stdout/stderr/cmd, and Line 209 only enters fallback when stdout or stderr is missing. If cmd or rc are the only missing fields, they are never recovered from rendered output.

Proposed fix
-            if "stdout" not in task_info or "stderr" not in task_info:
+            if any(field not in task_info for field in ("stdout", "stderr", "cmd", "rc")):
                 event_stdout = event.get("stdout", "")
                 if event_stdout:
                     parsed = _parse_result_from_rendered_output(event_stdout)
                     if parsed:
-                        for field in ("stdout", "stderr", "cmd"):
-                            if field not in task_info and field in parsed:
-                                task_info[field] = parsed[field]
+                        for field in ("stdout", "stderr", "cmd", "rc"):
+                            value = parsed.get(field)
+                            if field not in task_info and value is not None and value != "":
+                                task_info[field] = value

Also applies to: 219-220

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/root-cause-analysis/scripts/job_parser.py` around lines 209 - 216, The
fallback that parses rendered output only runs when "stdout" or "stderr" are
missing and only copies ("stdout","stderr","cmd"), which means missing "cmd" or
"rc" can be ignored; update the conditional around task_info/event to run the
fallback when any of the expected fields are missing (at least
"stdout","stderr","cmd","rc"), and expand the loop that copies parsed values
from _parse_result_from_rendered_output(event.get("stdout","")) to include "rc"
as well as "cmd"/"stdout"/"stderr" so task_info gets recovered for any missing
field; adjust the identical logic in the later occurrence the same way
referencing task_info, event, and _parse_result_from_rendered_output.
