This guide explains how the LISA MCP server uses Azure OpenAI to automatically analyze test failures, determine root causes, and generate rich reports.
- Overview — how it works
- Prerequisites
- The 4 analysis tools
- Step-by-step: analyze a test run
- Understanding the output
- Reading the HTML report
- Root cause categories
- Failure severity levels
- Log collection — how context is gathered
- End-to-end pipeline: run_and_analyze
- API cost guidance
- CI/CD integration
## Overview — how it works

```
LISA test run output
│
▼
┌─────────────────────────────────────┐
│ Step 1 — Parse results │
│ parse_results() → TestRunSummary │
│ JUnit XML or console output │
└──────────────┬──────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ Step 2 — Collect log context │
│ collect_run_logs() │
│ Tail-reads per-test .log files │
│ Extracts lines near ERROR/FAIL │
│ Hard cap: 256 KB per file │
│ 8,000 chars per test │
└──────────────┬──────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ Step 3 — Per-failure analysis │
│ For each failed test: │
│ analyze_failure() → Azure OpenAI API │
│ Uses tool calling for structured JSON │
│ Returns FailureAnalysis: │
│ • root_cause_category │
│ • root_cause_description │
│ • recommended_fix │
│ • severity (critical/high/…) │
│ • relevant_log_lines │
│ • confidence (0.0–1.0) │
└──────────────┬──────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ Step 4 — Run-level summary │
│ analyze_run() → Azure OpenAI API │
│ Sends compact digest (not logs) │
│ Returns RunAnalysisSummary: │
│ • overall_health │
│ • health_score │
│ • failure_patterns │
│ • top_priorities │
│ • recommendations │
│ • executive_summary │
└──────────────┬──────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ Step 5 — Report generation │
│ generate_html_report() │
│ generate_markdown_report() │
│ Self-contained HTML (no CDN) │
│ GitHub-flavored Markdown │
└─────────────────────────────────────┘
```
Token usage per run:
- Per failure: ~1,000–2,000 input tokens + ~500 output tokens
- Run summary: ~800 input tokens + ~800 output tokens
- 10 failures ≈ 15,000–25,000 tokens total ($0.05–0.15 at current pricing)
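The estimates above can be turned into a quick back-of-envelope calculator. A minimal sketch, assuming midpoint token counts from the list and placeholder per-1K prices (not official Azure OpenAI pricing):

```python
# Back-of-envelope cost estimator using the per-call token estimates above.
# The prices are illustrative placeholders, not official pricing.
INPUT_PRICE_PER_1K = 0.0025   # assumed $ per 1K input tokens
OUTPUT_PRICE_PER_1K = 0.01    # assumed $ per 1K output tokens

def estimate_run(num_failures: int) -> tuple[int, float]:
    """Return (total_tokens, approx_usd) for one analysis run."""
    input_tokens = num_failures * 1500 + 800    # ~1,500 per failure + run-summary input
    output_tokens = num_failures * 500 + 800    # ~500 per failure + run-summary output
    usd = (input_tokens * INPUT_PRICE_PER_1K + output_tokens * OUTPUT_PRICE_PER_1K) / 1000
    return input_tokens + output_tokens, usd

tokens, usd = estimate_run(10)   # 10 failures
```

With 10 failures this lands at roughly 21,600 tokens, inside the 15,000–25,000 range quoted above.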
## Prerequisites

1. Go to https://portal.azure.com
2. Create an account or sign in
3. Navigate to API Keys → Create Key
4. Copy the key (shown as `YOUR_AZURE_OPENAI_API_KEY` in the examples below)
The key is passed directly to the tools — it is never stored on disk by the MCP server.
If you only want to analyze existing results, no LISA installation is needed.
For end-to-end `run_and_analyze`, LISA must be installed:

```shell
cd ~/lisa
pip install -e .
lisa --version
```

## The 4 analysis tools

| Tool | Input | Output | Use case |
|---|---|---|---|
| `analyze_test_run_with_llm` | Results + optional run dir | JSON with all analyses | Analyze results you already have |
| `analyze_failure_root_cause` | Single test name + error | JSON with `FailureAnalysis` | Deep-dive on one specific failure |
| `generate_analysis_report` | Results + output dir | HTML + Markdown + JSON | Full report to share with team |
| `run_and_analyze` | Runbook + `lisa_path` | All of the above | One-shot: run tests and get report |
## Step-by-step: analyze a test run

Tell the AI where your results and logs live:

```
I have LISA test results at ~/lisa/lisa_results.xml and LISA logs at ~/lisa/runtime/latest.
My Azure OpenAI API key is YOUR_AZURE_OPENAI_API_KEY
Analyze the failures and generate a report at ~/reports/
```

The AI calls `generate_analysis_report`:

```python
generate_analysis_report(
    results_source="~/lisa/lisa_results.xml",
    api_key="YOUR_AZURE_OPENAI_API_KEY",
    output_dir="~/reports/",
    run_dir="~/lisa/runtime/latest",
    report_base_name="my_run",
)
```

Returns:

```json
{
  "html_path": "/home/user/reports/my_run.html",
  "markdown_path": "/home/user/reports/my_run.md",
  "report": { ... }
}
```

Open `my_run.html` in a browser for the full visual report.
You can also paste raw console output:

```
Here is my LISA run output. Analyze the failures with my API key YOUR_AZURE_OPENAI_API_KEY:

[PASS] Provisioning.smoke_test (12.3s)
[FAIL] StorageVerification.nvme_io_test (120.5s)
[FAIL] NetworkTest.verify_sriov (45.0s)
[PASS] CoreTest.verify_cpu_count (2.8s)
```

The AI calls `analyze_test_run_with_llm`:

```python
analyze_test_run_with_llm(
    results_source="[PASS] Provisioning...\n[FAIL] StorageVerification...",
    api_key="YOUR_AZURE_OPENAI_API_KEY",
)
```

To deep-dive a single failure:

```
The test StorageVerification.nvme_io_test failed with:
"Expected exit code 0 but got 1"
The log is at ~/lisa/logs/StorageVerification/nvme_io_test/console.log
Analyze with API key YOUR_AZURE_OPENAI_API_KEY
```
The AI calls `analyze_failure_root_cause`:

```python
analyze_failure_root_cause(
    test_name="StorageVerification.nvme_io_test",
    failure_message="Expected exit code 0 but got 1",
    api_key="YOUR_AZURE_OPENAI_API_KEY",
    log_file_path="~/lisa/logs/StorageVerification/nvme_io_test/console.log",
)
```

For a full end-to-end run:

```
Run the runbook at ~/runbooks/ubuntu22_t1.yml using LISA at ~/lisa.
Pass subscription_id=xxxx and admin_private_key_file=~/.ssh/lisa_key.
After running, analyze all failures with Azure OpenAI API key YOUR_AZURE_OPENAI_API_KEY
and save the report to ~/reports/ubuntu22_t1/
```

The AI calls `run_and_analyze` — fully automated end-to-end.
## Understanding the output

Per-failure analysis (`FailureAnalysis`):

```json
{
  "test_name": "StorageVerification.nvme_io_test",
  "root_cause_category": "disk_io_error",
  "root_cause_description": "The NVMe device /dev/nvme0 was not found during test execution. The PCIe enumeration log shows a timeout at boot, suggesting the Azure VM SKU (Standard_D4s_v3) does not provide NVMe storage.",
  "recommended_fix": "Switch to an Lsv3 or Lv3 series VM for NVMe tests. Verify with: az vm list-skus --location westus3 | grep nvme. Alternatively, check if the VM has NVMe: ls /dev/nvme*",
  "severity": "critical",
  "relevant_log_lines": [
    "[ 2.345] pcieport: PCIe Bus Error: severity=Corrected",
    "ERROR: NVMe device not found at /dev/nvme0",
    "FAILED: assert_that('/dev/nvme0').exists() -> False"
  ],
  "confidence": 0.88
}
```

Run-level summary (`RunAnalysisSummary`):

```json
{
  "overall_health": "critical",
  "health_score": 0.72,
  "failure_patterns": [
    "NVMe storage not enumerated on Standard_D4s_v3",
    "SR-IOV VF interface timeout on accelerated networking"
  ],
  "top_priorities": [
    "Switch VM SKU to Lsv3 series for all NVMe tests",
    "Investigate accelerated networking VF timeout",
    "Review kernel version for SR-IOV fixes"
  ],
  "environment_issues": [
    "Standard_D4s_v3 does not support NVMe — use Lsv3/Lv3"
  ],
  "recommendations": [
    "Update runbook to use Standard_L8s_v3 for storage tests",
    "Pin kernel to 5.15.0-1040-azure or later",
    "Add retry logic to SR-IOV VF detection (30s retry, 3 attempts)"
  ],
  "executive_summary": "This T1 test run achieved a 72% pass rate with 5 failures, 2 of which are critical blockers. The NVMe storage failures are caused by the wrong VM SKU being used — a simple runbook fix will resolve 8 tests. The SR-IOV network failure appears related to a known driver timing issue in kernel 5.14. Recommend updating the runbook VM configuration and upgrading the kernel before the next run."
}
```

## Reading the HTML report

The HTML report (`lisa_analysis.html`) has these sections:
**Header**

- Overall health badge: HEALTHY / DEGRADED / CRITICAL / UNKNOWN
- Health score percentage (weighted by severity)
- Run directory and generation timestamp

**Summary stats**

Total tests · Passed · Failed · Skipped · Duration, with a visual progress bar showing the pass rate.

**Executive summary**

A 3–5 sentence non-technical summary for stakeholders. Safe to paste into a Jira ticket or email.

**Failure cards**

One card per failed test, sorted by severity (critical first):

- Badge: severity level (CRITICAL / HIGH / MEDIUM / LOW)
- Category badge: root cause category
- Confidence: how certain the model is (0–100%)
- Root Cause: technical explanation
- Recommended Fix: specific actionable step
- Log Lines: most relevant log lines in a code block

**Top priorities**

Numbered list of the most important issues to fix first.

**Recommendations**

Specific team actions (updating runbooks, fixing code, changing configs).

**Failure patterns**

Tag cloud of recurring themes across multiple failures.

**Environment issues**

Problems that are not test code bugs — VM SKU, quota, network config.
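The header's health score is weighted by severity. A minimal sketch of one plausible weighting — the actual weights are internal to the report generator and the values below are assumptions:

```python
# Illustrative severity-weighted health score; the weights are assumed,
# not the report generator's actual values. Each failure subtracts a
# penalty proportional to its severity.
WEIGHTS = {"critical": 1.0, "high": 0.6, "medium": 0.3, "low": 0.1}

def health_score(total_tests: int, failure_severities: list[str]) -> float:
    """Return a 0.0–1.0 score; 1.0 means no failures."""
    if total_tests == 0:
        return 0.0
    penalty = sum(WEIGHTS.get(sev, 0.5) for sev in failure_severities)
    return max(0.0, 1.0 - penalty / total_tests)
```

For example, one critical failure in a 10-test run yields a score of 0.9 under these assumed weights.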
## Root cause categories

| Category | When used |
|---|---|
| `kernel_panic` | Kernel crash, oops, BUG() call |
| `network_timeout` | SSH timeout, ping failure, SR-IOV VF timeout |
| `disk_io_error` | NVMe not found, I/O error, filesystem unmount |
| `permission_denied` | chmod issues, sudo failures, SELinux denials |
| `package_not_found` | apt/dnf install failure, binary not in PATH |
| `service_crash` | systemd service failed, process died unexpectedly |
| `assertion_failure` | Test assertion failed (expected != actual) |
| `timeout` | Test exceeded timeout, hung process |
| `environment_setup` | cloud-init failure, wrong VM SKU, missing feature |
| `flaky_test` | Race condition, intermittent timing issue |
| `infrastructure` | Azure quota, VM provisioning failure, network glitch |
| `unknown` | The model couldn't determine the root cause from available data |
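The category is chosen by the model, but the intent of the table can be illustrated with a trivial keyword fallback. This is purely hypothetical — it is not the server's actual logic, and the keyword list is an assumption:

```python
# Hypothetical keyword-based fallback classifier mirroring the category
# table above. The real categorization is done by the model.
KEYWORD_RULES = [
    ("kernel panic", "kernel_panic"),
    ("connection refused", "network_timeout"),
    ("ssh timeout", "network_timeout"),
    ("nvme", "disk_io_error"),
    ("permission denied", "permission_denied"),
    ("unable to locate package", "package_not_found"),
    ("assertionerror", "assertion_failure"),
    ("timed out", "timeout"),
]

def fallback_category(message: str) -> str:
    """Return the first matching category, or 'unknown'."""
    msg = message.lower()
    for keyword, category in KEYWORD_RULES:
        if keyword in msg:
            return category
    return "unknown"
```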
## Failure severity levels

| Severity | Meaning | Action |
|---|---|---|
| `critical` | Blocks release; data loss or system instability risk | Fix before merge/publish |
| `high` | Major feature completely broken | Fix before next release |
| `medium` | Partial functionality impacted | Fix within sprint |
| `low` | Minor issue, informational | Fix when convenient |

Severity determines card color in the HTML report and sort order.
## Log collection — how context is gathered

When `run_dir` is provided, the log collector:

- Walks standard LISA output directories:
  - `<run_dir>/logs/<SuiteName>/<test_method>/`
  - `<run_dir>/runtime/<timestamp>/<test_name>/`
- Tail-reads each `.log` file (last 256 KB only — never loads the whole file)
- Scans for lines matching these error patterns:
  `ERROR`, `FAILED`, `EXCEPTION`, `Traceback`, `AssertionError`, `CRITICAL`, `PANIC`, exit code [non-zero], `returncode`, `timeout`, permission denied, connection refused, no such file
- Extracts 15 lines of context before and after each error signal
- Caps output at 8,000 characters per test (≈ 150 lines)
- Sends the context snippet to the AI alongside the failure message

If `run_dir` contains no log files:

- Analysis still works using just the failure message and stack trace
- Add `run_dir=None` explicitly to skip log collection
- The AI will note low confidence when logs are insufficient
## End-to-end pipeline: run_and_analyze

`run_and_analyze` ties everything together:

```
1. run_tests(lisa_path, runbook_path, variables)
        │
        ▼
2. Find results: look for lisa_results.xml in
     - <lisa_path>/runtime/<latest>/
     - <lisa_path>/runs/<latest>/
     - Current working directory
   Fallback: use stdout as results_source
        │
        ▼
3. generate_analysis_report(results_source, api_key, output_dir, run_dir)
        │
        ▼
4. Return: {
     run_result: { success, returncode, command },
     summary_line: "CRITICAL | 13/18 passed (72.2%) | 5 failed",
     html_path: "/path/to/lisa_analysis.html",
     markdown_path: "/path/to/lisa_analysis.md",
     report: { ... full AnalysisReport ... }
   }
```
```
Run the runbook ~/runbooks/rhel9_t1.yml with:
- LISA at ~/lisa
- subscription_id: xxxx
- admin_private_key_file: ~/.ssh/lisa_key
- Azure OpenAI API key: YOUR_AZURE_OPENAI_API_KEY
Save reports to ~/reports/rhel9_t1/
```

The AI calls `run_and_analyze` and returns:

```
✅ Run complete.
Health: DEGRADED | 15/18 passed (83.3%) | 3 failed

Reports:
  HTML: ~/reports/rhel9_t1/lisa_analysis.html
  Markdown: ~/reports/rhel9_t1/lisa_analysis.md

Top failures:
  1. [CRITICAL] CoreTest.verify_kdump — kdump service not found on RHEL 9.2
  2. [HIGH] NetworkTest.verify_sriov — SR-IOV VF timeout
  3. [MEDIUM] StorageTest.verify_swap — swap not enabled by default

Executive summary:
  83% of T1 tests passed with 3 failures. The kdump failure is critical for
  kernel crash collection and should be fixed before image publication...
```
## API cost guidance

Approximate costs (February 2026 pricing for gpt-4o):
| Scenario | Failures | Approx tokens | Approx cost |
|---|---|---|---|
| T0 smoke run | 1–3 | ~5,000 | ~$0.01 |
| T1 daily CI | 3–10 | ~15,000–30,000 | ~$0.05–0.10 |
| T2 weekly regression | 5–20 | ~25,000–60,000 | ~$0.08–0.20 |
| T4 full certification | 20+ | ~80,000+ | cap at 20 failures |
Use `max_failures_to_analyze` to cap the number of LLM calls:

```python
analyze_test_run_with_llm(
    results_source="lisa_results.xml",
    api_key="YOUR_AZURE_OPENAI_API_KEY",
    max_failures_to_analyze=5,  # analyze at most 5 failures
)
```

The tool analyzes failures in the order they appear in the results file. Sort by severity in post-processing if needed.
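A post-processing sort by severity can look like this. The field names follow the `FailureAnalysis` example shown earlier; the rank mapping mirrors the severity table:

```python
# Order analyses critical-first, matching the severity table above.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}

def sort_by_severity(analyses: list[dict]) -> list[dict]:
    """Return analyses sorted critical-first; unknown severities last."""
    return sorted(analyses, key=lambda a: SEVERITY_RANK.get(a.get("severity"), 4))

ordered = sort_by_severity([
    {"test_name": "a", "severity": "medium"},
    {"test_name": "b", "severity": "critical"},
])
```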
## CI/CD integration

GitHub Actions steps:

```yaml
- name: Run LISA T1 tests
  run: |
    lisa -r runbook.yml \
      -v subscription_id=${{ secrets.AZURE_SUB_ID }} \
      -v admin_private_key_file=/tmp/lisa_key

- name: Analyze failures with the AI
  if: always()  # run even if tests failed
  run: |
    python3 -c "
    import json
    from lisa_mcp.server import generate_analysis_report

    result = generate_analysis_report(
        results_source='./lisa_results.xml',
        api_key='${{ secrets.AZURE_OPENAI_API_KEY }}',
        output_dir='./reports/',
        run_dir='.',
    )
    data = json.loads(result)
    report = data['report']
    health = report['summary']['overall_health'].upper()
    print(f'Health: {health}')
    print(f'Summary: {report[\"summary\"][\"executive_summary\"]}')
    "

- name: Upload analysis report
  if: always()
  uses: actions/upload-artifact@v4
  with:
    name: lisa-analysis-${{ github.run_id }}
    path: reports/
```

To post the analysis as a PR comment:

```python
import json
import subprocess

from lisa_mcp.server import generate_analysis_report

# Get the analysis
result = json.loads(generate_analysis_report(...))
report = result["report"]
summary = report["summary"]

# Build the comment body
comment = f"""## LISA Test Analysis — {summary['overall_health'].upper()}

**Health score:** {int(summary['health_score'] * 100)}%
**Passed:** {report['passed']}/{report['total']}

### Executive Summary
{summary['executive_summary']}

### Top Priorities
""" + "\n".join(f"- {p}" for p in summary['top_priorities'][:3])

# Post via the gh CLI
subprocess.run(["gh", "pr", "comment", "--body", comment])
```
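You can also gate the pipeline on the run-level health. A minimal sketch, assuming `report` is the `report` object returned by `generate_analysis_report`:

```python
# Minimal CI gate (illustrative): return a non-zero exit code when the
# run-level health is critical, so the pipeline fails fast.
def health_gate(report: dict) -> int:
    """Return 0 if the run is acceptable, 1 if health is critical."""
    health = report["summary"]["overall_health"]
    print(f"Overall health: {health.upper()}")
    return 1 if health == "critical" else 0

# In CI: raise SystemExit(health_gate(data["report"]))
```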