feat: improve pipeline log extraction for catchError and parallel failures by lidiams96 · Pull Request #99 · jenkinsci/explain-error-plugin

lidiams96 · 2026-02-19T06:08:44Z

Problem

The current PipelineLogExtractor has two limitations that cause it to miss real errors in production pipelines:

Single-match Strategy 1: The FlowGraph walk returns on the first ErrorAction+LogAction node found. In pipelines with parallel failures (e.g. multiple test runner pods crashing simultaneously), only the first error is captured and subsequent failures are dropped.
catchError + sh(returnStatus:true) + error() pattern not detected: This common pipeline pattern is invisible to Strategy 1. sh(returnStatus:true) does not throw so it has no ErrorAction. The subsequent error() step records an ErrorAction but has no LogAction (it only carries the message string, not the sh output). Strategy 1 finds the error() node, skips it (no LogAction), and falls back to run.getLog(maxLines) — which returns the last N lines of the full console log and misses the actual error output when the build has many subsequent steps.

Solution

Strategy 1 — multi-collect: Instead of returning on the first ErrorAction+LogAction match, accumulate logs from all matching nodes up to maxLines, using a HashSet to deduplicate by origin node ID (reverse-chronological walk may visit the same origin via multiple FlowGraph paths).

Strategy 2 (new) — WarningAction walk: Runs only when Strategy 1 finds nothing. Walks the FlowGraph looking for step nodes with a LogAction enclosed by a BlockStartNode carrying a WarningAction worse than SUCCESS. This handles pipeline variants where catchError records a WarningAction on the block's start node rather than an ErrorAction on the failing step. Requires ≥ 2 lines matching the error pattern to avoid capturing CI infrastructure steps (e.g. a curl command whose JSON payload happens to contain "failed") ahead of the actual failing step log.

Strategy 3 — always-on supplement: The error pattern scan previously ran only as a final fallback when all other strategies produced nothing. It now always runs after Strategies 1 and 2, filling the remaining maxLines budget with lines from the full console log that match error keywords (error, exception, failed, fatal) plus ±5 lines of surrounding context. Lines already captured by Strategies 1 or 2 are skipped via HashSet deduplication. This reliably surfaces the catchError+sh(returnStatus:true)+error() pattern and catches errors buried early in large build logs.

Alternatives

Keeping Strategy 3 as fallback only: The existing approach worked for simple pipelines but missed the catchError+error() case where Strategy 1 finds an ErrorAction node with no usable log. The supplement approach avoids redundant I/O when Strategies 1/2 already filled the budget and adds no overhead when Strategy 3 has nothing new to contribute.

Additional Context

Validated against a real production Jenkins pipeline (complex parallel build with thousands of tests across multiple parallel groups, ~30,000 line build log):

Before: the plugin fell back to the last N lines and surfaced unrelated warnings instead of the actual failing step
After: the plugin correctly surfaces the error from inside the catchError block

The 5 new integration tests in PipelineLogExtractorTest cover all three strategies and include an end-to-end test that verifies the AI provider receives the inner catchError content rather than the generic fallback log.

…lures - Strategy 1 now multi-collects: walks the FlowGraph and accumulates logs from ALL ErrorAction+LogAction nodes (not just the first match) using a HashSet to deduplicate origins. This captures parallel failures such as multiple RSpec pod crashes in a single pass. - Strategy 2 (new): WarningAction walk, runs only when Strategy 1 finds nothing. Looks for step nodes enclosed by a BlockStartNode with a WarningAction worse than SUCCESS (e.g. catchError with stageResult:FAILURE). Requires >= 2 lines matching the error pattern to avoid returning CI infrastructure steps ahead of the actual failing step log. - Strategy 3 (error pattern scan) now always runs as a supplement after Strategies 1 and 2, filling the remaining maxLines budget with deduplicated lines matching error patterns from the full console log. Previously it only ran as a final fallback. This fixes the catchError+sh(returnStatus:true)+ error() pattern where error() has an ErrorAction but no LogAction. - Added ERROR_PATTERN constant with universal keywords (error, exception, failed, fatal) and ERROR_CONTEXT_LINES = 5 context lines around each match. - Added getErrorPatternLines() method that reads the full build log, strips ConsoleNote annotations, marks matching lines with surrounding context, and returns up to maxLines results. - Extended PipelineLogExtractorTest with 5 integration tests covering: Strategy 1 standard failure, the catchError+returnStatus pattern (Strategy 3), early errors in large logs (Strategy 3), multi-error builds (Strategies 1 and 3), and an end-to-end test verifying the AI provider receives the inner catchError content. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Strategy 3 calls run.getLogInputStream() which may return null in some environments (e.g. when mocked). A null InputStream passed to InputStreamReader throws NullPointerException, which is not caught by the existing catch(IOException). Add an explicit null check to return an empty list instead. Also add the missing getLogInputStream() stub in testNullFlowExecutionFallsBackToBuildLog so the mock correctly models a build where the log stream is available but empty, allowing Strategy 3 to run without error and fall through to the getLog() fallback. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@nonnull

Run.getLogInputStream() is @nonnull per the Jenkins API contract, so the null check introduced in the previous commit is redundant and is correctly flagged by SpotBugs as RCN_REDUNDANT_NULLCHECK_OF_NONNULL_VALUE. Restore the original try-with-resources structure. The test stub (when(mockRun.getLogInputStream()).thenReturn(InputStream.nullInputStream())) remains to satisfy Mockito's default behaviour, which returns null for unstubbed methods regardless of @nonnull annotations. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

Improves PipelineLogExtractor so the plugin captures more accurate failure logs in complex pipelines (parallel failures and catchError + sh(returnStatus:true) + error() patterns), reducing reliance on the last-N-lines fallback and better surfacing real error output for AI analysis.

Changes:

Update pipeline FlowGraph extraction to collect logs from multiple failing nodes and add a WarningAction-based walk.
Add an always-on console-log error-pattern supplement to fill remaining maxLines.
Add new Jenkins integration tests covering key extraction scenarios and an end-to-end verification via TestProvider.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.

File	Description
`src/main/java/io/jenkins/plugins/explain_error/PipelineLogExtractor.java`	Implements multi-strategy log extraction (multi-collect, WarningAction walk, pattern-scan supplement) and deduping.
`src/test/java/io/jenkins/plugins/explain_error/PipelineLogExtractorTest.java`	Adds integration + end-to-end tests validating improved extraction behavior in realistic Pipeline scenarios.

src/main/java/io/jenkins/plugins/explain_error/PipelineLogExtractor.java

Copilot · 2026-02-19T09:35:59Z

src/test/java/io/jenkins/plugins/explain_error/PipelineLogExtractorTest.java

+        WorkflowJob job = jenkins.createProject(WorkflowJob.class, "test-strategy1");
+        job.setDefinition(new CpsFlowDefinition(
+                "node {\n"
+                + "    sh 'echo \"STANDARD_ERROR_OUTPUT\" && exit 1'\n"
+                + "}",


These integration tests rely on the Pipeline sh step, which fails on Windows ("sh is only supported on Unix"). If CI or downstream builds run tests on Windows, consider guarding with an assumption/skip when not Unix, or using a cross-platform approach (e.g., bat when on Windows) so the suite remains portable.

Fixed in commit 7cc4b8d. All five tests that use the Pipeline sh step are now annotated with @DisabledOnOs(OS.WINDOWS), which skips them on Windows CI while allowing them to run on all other Unix-like platforms (Linux, Mac, etc.).

Note: OS.UNIX was removed in JUnit Jupiter 6.x (only LINUX, MAC, WINDOWS, etc. remain), so @DisabledOnOs(OS.WINDOWS) is the correct portable alternative — it covers all non-Windows platforms without listing them explicitly.

Copilot · 2026-02-19T09:35:59Z

src/test/java/io/jenkins/plugins/explain_error/PipelineLogExtractorTest.java

+ *       has both ErrorAction and LogAction (e.g. {@code sh 'exit 1'}).</li>
+ *   <li>Strategy 2 — WarningAction walk: step nodes enclosed by a catchError block whose
+ *       BlockStartNode carries a WarningAction. Triggers when Jenkins CPS records the
+ *       WarningAction on the block's start node (pipeline-variant dependent).</li>
+ *   <li>Strategy 3 — Error pattern scan: reads the full console log and returns lines


Strategy 2 (WarningAction walk) is described here and implemented in PipelineLogExtractor, but this test suite doesn’t include a scenario that forces Strategy 1 to return empty and then asserts Strategy 2 behavior. Consider adding a dedicated integration test for the WarningAction-only variant so Strategy 2 is regression-protected.

Added in commit 808a249: strategy2_catchErrorWithWarningAction_extractsStepLog tests the scenario where catchError(buildResult:'FAILURE', stageResult:'FAILURE') wraps a direct sh failure, causing the BlockStartNode to receive a WarningAction worse than SUCCESS.

One note worth documenting here: Strategy 2 cannot be fully isolated from Strategy 1 in a black-box integration test. When sh 'exit 1' fails inside catchError, Jenkins CPS may or may not attach ErrorAction+LogAction to the sh FlowNode depending on the version and variant — if it does, Strategy 1 finds the content first (Strategy 2 never runs). The test therefore asserts that the content is extracted by Strategy 1 or Strategy 2, protecting against regressions in the WarningAction walk code path without making assumptions about which strategy fires in a given Jenkins version.

- Strategy 1: pass remaining budget to readLimitedLog() instead of always using maxLines, so later failure nodes still get log lines when earlier ones partially fill the budget - getErrorPatternLines(): replace load-all-then-scan with a streaming approach using a bounded circular pre-context buffer; avoids loading the full console log into memory for large builds - Strategy 3 dedup: change Set.add() to Set.contains() so repeated occurrences of the same error line (e.g. retries) are preserved when they appear in different contexts, not just deduplicated globally - Lower Strategy 3 completion log from INFO to FINE to reduce noise in Jenkins logs - Add @EnabledOnOs(OS.UNIX) to all Pipeline sh-step tests so they are skipped on Windows CI rather than failing with an unsupported step error Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

JUnit Jupiter 6.0.1 removed the OS.UNIX constant (only AIX, FREEBSD, LINUX, MAC, OPENBSD, SOLARIS, WINDOWS, OTHER remain). Replace all @EnabledOnOs(OS.UNIX) annotations with @DisabledOnOs(OS.WINDOWS) which is the correct equivalent for skipping sh-based Pipeline tests on Windows. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ion test - Expand ERROR_PATTERN to match plural forms: errors? and exceptions? so that common log lines like "2 errors found" or "exceptions thrown" are captured by Strategies 2 and 3 (Strategy 2 filter requires ≥2 error-pattern matches, so plurals directly affect its sensitivity) - Add strategy2_catchErrorWithWarningAction_extractsStepLog integration test: verifies that a direct sh failure inside catchError with buildResult/stageResult:'FAILURE' (which gives the BlockStartNode a WarningAction) is captured by Strategy 1 or 2, protecting against regressions in the WarningAction walk code path Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

lidiams96 requested a review from a team as a code owner February 19, 2026 06:08

lidiams96 force-pushed the feat/improve-pipeline-log-extraction branch from eec6a39 to 22999ea Compare February 19, 2026 06:29

lidiams96 and others added 2 commits February 19, 2026 07:54

shenxianpeng requested review from Copilot and removed request for a team February 19, 2026 09:30

shenxianpeng added the enhancement For changelog: Minor enhancement. use `major-rfe` for changes to be highlighted label Feb 19, 2026

Copilot started reviewing on behalf of shenxianpeng February 19, 2026 09:31 View session

Copilot AI reviewed Feb 19, 2026

View reviewed changes

lidiams96 and others added 4 commits February 19, 2026 10:52

ci: retrigger checks after @DisabledOnOs(OS.WINDOWS) fix

bb35e6a

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: improve pipeline log extraction for catchError and parallel failures#99

feat: improve pipeline log extraction for catchError and parallel failures#99
lidiams96 wants to merge 7 commits intojenkinsci:mainfrom
lidiams96:feat/improve-pipeline-log-extraction

lidiams96 commented Feb 19, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI Feb 19, 2026

Uh oh!

lidiams96 Feb 19, 2026

Uh oh!

Copilot AI Feb 19, 2026

Uh oh!

lidiams96 Feb 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Comments

Uh oh!

Conversation

lidiams96 commented Feb 19, 2026

Problem

Solution

Alternatives

Additional Context

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

lidiams96 Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

lidiams96 Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Comments