Commit d9dea62

Add ai summary report (#248)
* add ai summary report
1 parent c8567a7 commit d9dea62

21 files changed, +434 -123 lines changed

.github/workflows/build-and-test.yaml

Lines changed: 1 addition & 0 deletions
@@ -143,6 +143,7 @@ jobs:
         with:
           report-path: './ctrf-reports/ctrf-report.json'
           ai-report: true
+          ai-summary-report: true
           annotate: false
         if: always()
   skipped-reports-test:

README.md

Lines changed: 8 additions & 24 deletions
@@ -114,6 +114,7 @@ For more advanced usage, there are several inputs available.
     insights-report: false
     slowest-report: false
     ai-report: false
+    ai-summary-report: false
     skipped-report: false
     suite-folded-report: false
     suite-list-report: false
@@ -179,15 +180,15 @@ with the provider and any optional settings:
   uses: ctrf-io/github-test-reporter@v1
   with:
     report-path: './ctrf/*.json'
-    github-report: true
+    ai-summary-report: true
+    pull-request: true
     ai: |
       {
         "provider": "openai",
-        "model": "gpt-4",
-        "temperature": 0.7,
-        "maxTokens": 2000
+        "model": "gpt-4"
       }
   env:
+    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
     OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
   if: always()
 ```
@@ -223,30 +224,13 @@ All configuration parameters are specified at the root level (all optional excep
   "topP": 1, // Nucleus sampling
   "maxMessages": 10, // Max failed tests to analyze
   "consolidate": true, // Consolidate multiple failures
+  "additionalPromptContext": "...", // Additional prompt context
+  "additionalSystemPromptContext": "...", // Additional system prompt context
   "log": false, // Enable logging
   "deploymentId": "..." // Azure OpenAI deployment ID (Azure only)
 }
 ```
 
-### Example with Claude
-
-```yaml
-- name: Publish Test Report with Claude AI
-  uses: ctrf-io/github-test-reporter@v1
-  with:
-    report-path: './ctrf/*.json'
-    github-report: true
-    ai: |
-      {
-        "provider": "claude",
-        "model": "claude-3-5-sonnet-20241022",
-        "maxTokens": 3000
-      }
-  env:
-    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
-  if: always()
-```
-
 ## Pull Requests
 
 You can add a pull request comment by using the `pull-request-report` input:
@@ -514,4 +498,4 @@ analyzing test outcomes across multiple platforms becomes more straightforward.
 ## Support Us
 
 If you find this project useful, consider giving it a GitHub star ⭐ It means a
-lot to us.
+lot to us.
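
The README hunks above pass the `ai` input as a JSON object with keys such as `provider`, `model`, `temperature`, `maxTokens`, `topP`, `maxMessages`, `consolidate`, `additionalPromptContext`, `additionalSystemPromptContext`, `log`, and `deploymentId`. A minimal sketch of how such a string could be parsed and type-checked follows. `AiConfig`, `EXPECTED_TYPES`, and `parseAiConfig` are hypothetical names for illustration, not the action's actual code, and the sketch does not enforce which keys are required.

```typescript
// Hypothetical shape of the `ai` input, built only from keys visible in the diff.
interface AiConfig {
  provider?: string;
  model?: string;
  temperature?: number;
  maxTokens?: number;
  topP?: number;
  maxMessages?: number;
  consolidate?: boolean;
  additionalPromptContext?: string;
  additionalSystemPromptContext?: string;
  log?: boolean;
  deploymentId?: string;
}

// Expected primitive type for each known key.
const EXPECTED_TYPES: Record<keyof AiConfig, "string" | "number" | "boolean"> = {
  provider: "string",
  model: "string",
  temperature: "number",
  maxTokens: "number",
  topP: "number",
  maxMessages: "number",
  consolidate: "boolean",
  additionalPromptContext: "string",
  additionalSystemPromptContext: "string",
  log: "boolean",
  deploymentId: "string",
};

// Parse the raw JSON string and reject known keys that carry the wrong type.
// Unknown keys are passed through untouched.
function parseAiConfig(raw: string): AiConfig {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new TypeError("ai input must be a JSON object");
  }
  const record = parsed as Record<string, unknown>;
  for (const [key, expected] of Object.entries(EXPECTED_TYPES)) {
    const value = record[key];
    if (value !== undefined && typeof value !== expected) {
      throw new TypeError(`ai.${key} must be a ${expected}`);
    }
  }
  return record as AiConfig;
}
```

Note that `JSON.parse` rejects the `//` comments shown in the README's annotated parameter listing, so a real input has to omit them, as the workflow example does.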

__tests__/ctrf/report-preparation.test.ts

Lines changed: 1 addition & 0 deletions
@@ -502,6 +502,7 @@ function createSingleReportInputs(): Inputs {
     failedFoldedReport: false,
     previousResultsReport: false,
     aiReport: false,
+    aiSummaryReport: false,
     skippedReport: false,
     suiteFoldedReport: false,
     suiteListReport: false,

action.yml

Lines changed: 6 additions & 0 deletions
@@ -75,6 +75,12 @@ inputs:
     description: 'Include the AI analysis report.'
     required: false
     default: false
+  ai-summary-report:
+    description:
+      'Include the AI summary report with structured analysis (summary, code
+      issues, timeout issues, application issues, recommendations)'
+    required: false
+    default: false
   skipped-report:
     description: 'Include the skipped report.'
     required: false
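
The new `ai-summary-report` input is a boolean flag that defaults to `false`. As a minimal sketch of how such a flag could be read, the TypeScript below relies on the GitHub Actions runner convention that each input is exposed as an `INPUT_<NAME>` environment variable (name upper-cased, spaces replaced by underscores, hyphens kept). `Env` and `readBooleanInput` are hypothetical names for illustration; this commit's rendered diff does not show how the action itself reads the input.

```typescript
// Hypothetical helper, not the action's actual implementation.
type Env = Record<string, string | undefined>;

const TRUE_VALUES = ["true", "True", "TRUE"];
const FALSE_VALUES = ["false", "False", "FALSE"];

// Read a boolean action input from an environment map.
// The environment is passed in explicitly so the function is easy to test.
function readBooleanInput(name: string, env: Env, fallback = false): boolean {
  // Runner convention: INPUT_<NAME>, upper-cased, spaces become underscores.
  const key = `INPUT_${name.replace(/ /g, "_").toUpperCase()}`;
  const raw = (env[key] ?? "").trim();
  if (raw === "") return fallback; // input not set: use the default
  if (TRUE_VALUES.includes(raw)) return true;
  if (FALSE_VALUES.includes(raw)) return false;
  throw new TypeError(`Input "${name}" must be true or false, got "${raw}"`);
}
```

Under these assumptions, `readBooleanInput("ai-summary-report", process.env)` would look up `INPUT_AI-SUMMARY-REPORT` and fall back to `false` when the workflow leaves the input unset, which matches the `default: false` declared in `action.yml`.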

badges/coverage.svg

Lines changed: 1 addition & 1 deletion

ctrf-reports/ctrf-report.json

Lines changed: 8 additions & 1 deletion
@@ -96,7 +96,14 @@
       }
     ],
     "extra": {
-      "ai": "The test suite experienced failures due to issues related to timing and network configuration. The first test failed because the expected page title did not match the actual title within the specified timeout, suggesting a potential mismatch in the expected title pattern or insufficient timeout duration. The second test encountered a real network timeout instead of the simulated failure it was designed to test, indicating problems with the network setup or timeout settings in the test environment. These failures point to a need for reviewing and adjusting both the expected outcomes and the test environment configurations to better align with actual application behavior and network conditions."
+      "ai": "The test suite experienced failures due to issues related to timing and network configuration. The first test failed because the expected page title did not match the actual title within the specified timeout, suggesting a potential mismatch in the expected title pattern or insufficient timeout duration. The second test encountered a real network timeout instead of the simulated failure it was designed to test, indicating problems with the network setup or timeout settings in the test environment. These failures point to a need for reviewing and adjusting both the expected outcomes and the test environment configurations to better align with actual application behavior and network conditions.",
+      "aiSummary": {
+        "summary": "Three related test failures in the `addFooterDisplayFlags` function reveal inconsistent logic when handling the `includeFlakyReportAllFooter` flag across different flaky test scenarios with previous suite results. Two tests expect the flag to be `false` but receive `true`, while one expects `true` but receives `false`. These are not intermittent flakiness issues but consistent logic errors that have affected approximately 27% of test runs.",
+        "code_issues": "• The **addFooterDisplayFlags** function contains contradictory or inverted conditional logic when evaluating whether to set `includeFlakyReportAllFooter` based on flaky test presence across runs and previous results. The function appears to be setting the flag to the opposite of the expected value in multiple scenarios involving flaky test detection with previous suite results.\n• Logic for determining when flaky tests exist \"across all runs\" versus when they don't is either inverted or missing proper condition checks, causing the flag to be enabled when it should be disabled and vice versa in different test scenarios.\n• The combined scenario handling (flaky tests in current AND across all runs) is incorrectly evaluating conditions when merging current results with previous historical data, failing to properly suppress the footer flag when flaky tests are detected.",
+        "timeout_issues": "",
+        "application_issues": "• The test suite shows a consistent 27% failure rate across 52 runs for these specific flag-setting scenarios, indicating a persistent, reproducible bug rather than environmental or timing-related flakiness.",
+        "recommendations": "• Review the **addFooterDisplayFlags** function's conditional logic for setting `includeFlakyReportAllFooter`, specifically the conditions that check for flaky tests across all runs and in combination with previous results.\n• Verify all boolean comparisons and negations in the flaky test detection logic to ensure they are not inverted or contradictory.\n• Add explicit unit tests or debug traces to validate the flaky test count calculations when previous results are included to ensure accurate detection of flaky tests across runs.\n• Ensure the logic correctly distinguishes between three scenarios: (1) flaky tests exist across all runs with previous results, (2) no flaky tests exist across all runs with previous results, and (3) combined current and historical flaky tests, setting the flag appropriately for each case."
+      }
     }
   }
 }
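
The `aiSummary` object added to `extra` carries five string fields: `summary`, `code_issues`, `timeout_issues`, `application_issues`, and `recommendations`, any of which may be empty (as `timeout_issues` is here). A minimal sketch of that shape and of one way to turn it into Markdown is shown below. `AiSummary`, `SECTION_TITLES`, and `renderAiSummary` are hypothetical names, and the section headings are assumptions; the commit's real rendering lives in `dist/reports/ai-summary-report.hbs`, which is not displayed here.

```typescript
// Field names mirror the keys in ctrf-report.json; everything else is illustrative.
interface AiSummary {
  summary: string;
  code_issues: string;
  timeout_issues: string;
  application_issues: string;
  recommendations: string;
}

// Display order and (assumed) heading text for each field.
const SECTION_TITLES: Array<[keyof AiSummary, string]> = [
  ["summary", "Summary"],
  ["code_issues", "Code Issues"],
  ["timeout_issues", "Timeout Issues"],
  ["application_issues", "Application Issues"],
  ["recommendations", "Recommendations"],
];

// Render non-empty sections as Markdown, skipping blank or whitespace-only fields.
function renderAiSummary(ai: AiSummary): string {
  return SECTION_TITLES
    .filter(([key]) => ai[key].trim() !== "")
    .map(([key, title]) => `### ${title}\n\n${ai[key].trim()}`)
    .join("\n\n");
}
```

Skipping empty fields keeps a report like the one above from showing a bare "Timeout Issues" heading with nothing under it.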

dist/index.js

Lines changed: 50 additions & 0 deletions
Some generated files are not rendered by default.

dist/index.js.map

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.

dist/reports/ai-summary-report.hbs

Lines changed: 61 additions & 0 deletions
Some generated files are not rendered by default.
