fix(154): infographic quality — extract risk metrics, update Gemini model config

davidmatousek · claude · davidmatousek · commit 3cd5d27edde4 · 2026-04-12T00:47:04.000-04:00
Two fixes for degraded infographic quality observed in downstream projects.

1. Extract risk summary metrics from compensating-controls.md Section 1

   tachi_parsers.py: parse_compensating_controls_md() now extracts
   risk_reduction (e.g., 22.9), inherent_score (270.3), residual_score
   (208.5), and control_coverage_pct (26.0) from the Executive Summary
   "Risk Reduction" and "Coverage" lines.

   extract-infographic-data.py: pass these four fields through to
   template_data for both baseball-card and risk-funnel templates. The
   infographic agent now has the quantitative data needed to construct
   rich, specific Gemini prompts instead of the vague "0% reduction"
   prompts that produced flat, schematic images.

2. Update Gemini model config with fallback chain

   gemini-prompt-construction.md: change default_model from
   gemini-3-pro-image-preview (preview, may not be accessible) to
   gemini-2.5-flash-image (GA stable). Add a fallback_chain that tries
   gemini-3-pro-image-preview first (highest quality), then
   gemini-3.1-flash-image-preview, then gemini-2.5-flash-image. Document
   model aliases (nano-banana family) and clarify that -image suffix
   models are distinct from base text models.
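
The fallback behaviour described in item 2 can be sketched as below. This is a minimal illustration, not the agent's actual code: `generate_with_fallback`, `ModelNotFoundError`, and the injected `generate` callable are hypothetical names, and the sketch assumes the caller's HTTP client maps a 404 or model-not-found response onto that exception.

```python
from typing import Any, Callable, Sequence, Tuple

# Order taken from the fallback_chain in gemini-prompt-construction.md.
FALLBACK_CHAIN = (
    "gemini-3-pro-image-preview",      # highest quality (preview)
    "gemini-3.1-flash-image-preview",  # fast (preview)
    "gemini-2.5-flash-image",          # stable GA fallback
)


class ModelNotFoundError(Exception):
    """Hypothetical error: the client raises this on a 404 / model-not-found."""


def generate_with_fallback(
    generate: Callable[[str, str], Any],
    prompt: str,
    chain: Sequence[str] = FALLBACK_CHAIN,
) -> Tuple[str, Any]:
    """Try each model in order; return (model_id, result) for the first that works.

    `generate(model_id, prompt)` is supplied by the caller (assumption) and is
    expected to raise ModelNotFoundError when the model is not accessible.
    """
    last_error = None
    for model_id in chain:
        try:
            return model_id, generate(model_id, prompt)
        except ModelNotFoundError as exc:
            # Only model-availability errors trigger a fallback; anything
            # else (auth, quota, bad request) propagates to the caller.
            last_error = exc
    raise RuntimeError("no model in the fallback chain was available") from last_error
```

Injecting `generate` keeps the chain logic testable without network access; returning the model ID alongside the result lets the caller record which model actually produced the image.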

Golden baselines regenerated for baseball-card and risk-funnel templates.
47/47 tests pass.
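
For reference, the Section 1 extraction from item 1 can be exercised in isolation. The sketch below reuses the two regular expressions from the tachi_parsers.py change; the wrapper `extract_risk_metrics` and the sample text are illustrative assumptions, with the sample lines modelled on the format shown in the parser's comments.

```python
import re

# Patterns copied from the parse_compensating_controls_md() change.
RISK_REDUCTION_RE = re.compile(
    r"\*\*Risk Reduction\*\*:\s*([\d.]+)\s*inherent\s*->\s*([\d.]+)"
    r"\s*residual\s*\(\*\*([\d.]+)%\*\*"
)
COVERAGE_RE = re.compile(r"\*\*Coverage\*\*:\s*([\d.]+)%\s*Found")


def extract_risk_metrics(content: str) -> dict:
    """Hypothetical standalone wrapper; fields stay None when a line is absent."""
    metrics = {
        "risk_reduction": None,
        "inherent_score": None,
        "residual_score": None,
        "control_coverage_pct": None,
    }
    rr_match = RISK_REDUCTION_RE.search(content)
    if rr_match:
        metrics["inherent_score"] = float(rr_match.group(1))
        metrics["residual_score"] = float(rr_match.group(2))
        metrics["risk_reduction"] = float(rr_match.group(3))
    cov_match = COVERAGE_RE.search(content)
    if cov_match:
        metrics["control_coverage_pct"] = float(cov_match.group(1))
    return metrics


# Sample Executive Summary lines (assumed format, taken from the parser comments).
sample = (
    "**Risk Reduction**: 270.3 inherent -> 208.5 residual (**22.9%** reduction)\n"
    "**Coverage**: 26% Found | 34% Partial | 40% Missing\n"
)
metrics = extract_risk_metrics(sample)
```

With this sample, `metrics` holds 270.3, 208.5, 22.9, and 26.0 for the four fields. Input that lacks either line leaves the corresponding fields as `None`, which matches the defaults the parser initialises.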

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
diff --git a/.claude/skills/tachi-infographics/references/gemini-prompt-construction.md b/.claude/skills/tachi-infographics/references/gemini-prompt-construction.md
@@ -121,13 +121,28 @@ Apply these labels in the Gemini prompt based on `metadata.data_source_type`:
 
 ```yaml
 gemini_config:
-  default_model: "gemini-3-pro-image-preview"
+  default_model: "gemini-2.5-flash-image"
+  fallback_chain:
+    - "gemini-3-pro-image-preview"      # Highest quality (preview — may not be available on all API keys)
+    - "gemini-3.1-flash-image-preview"   # Fast, production-scale (preview)
+    - "gemini-2.5-flash-image"           # Stable GA — broadest availability, reliable fallback
   resolution: "2K"
 ```
 
-- **default_model**: The primary Gemini model for image generation. Configurable -- do not hardcode.
+- **default_model**: The GA-stable Gemini model for image generation. Use this when preview models are unavailable. Configurable -- do not hardcode.
+- **fallback_chain**: Try models in order. Preview models (`-preview` suffix) produce higher quality output but may not be accessible on all API keys or regions. The agent should attempt the first available model and fall back through the chain on 404 or model-not-found errors.
 - **resolution**: Target output resolution. "2K" produces images at approximately 1920x1080 for 16:9 aspect ratio.
 
+**Model aliases** (for reference — these are NOT model IDs, just human-friendly names):
+
+| Alias | Model ID | Status | Best For |
+|-------|----------|--------|----------|
+| nano-banana | `gemini-2.5-flash-image` | **Stable (GA)** | Reliable fallback, broad availability |
+| nano-banana-2 | `gemini-3.1-flash-image-preview` | Preview | Speed-optimized production workflows |
+| nano-banana-pro | `gemini-3-pro-image-preview` | Preview | Highest quality, best text rendering |
+
+**IMPORTANT**: The `-image` and `-image-preview` suffixed models are DIFFERENT model IDs from the base text models. `gemini-2.5-flash` (text) does NOT support image generation output — you must use `gemini-2.5-flash-image` (with the `-image` suffix). The standard `models.list` API endpoint may not show preview models; their absence does not mean they are unavailable for `generateContent` calls.
+
 ---
 
 ## Image Generation Parameters
@@ -139,7 +154,7 @@ gemini_config:
 POST https://generativelanguage.googleapis.com/v1beta/models/{model_id}:generateContent
 ```
 
-Where `{model_id}` is the configured model (default: `gemini-3-pro-image-preview`).
+Where `{model_id}` is the configured model. Try the fallback chain in order: `gemini-3-pro-image-preview` first (highest quality), then `gemini-3.1-flash-image-preview`, then `gemini-2.5-flash-image` (GA stable). On a 404 or model-not-found error, move to the next model in the chain.
 
 **Request Headers**:
 ```
diff --git a/scripts/extract-infographic-data.py b/scripts/extract-infographic-data.py
@@ -1042,10 +1042,24 @@ def compute_risk_funnel(tier, severity, threats_content, artifacts,
     # --- T031: Missing enrichments ---
     missing_enrichments = _compute_missing_enrichments(artifacts)
 
+    # --- Score-based risk metrics from compensating-controls Section 1 ---
+    risk_metrics = {
+        "risk_reduction": None,
+        "inherent_score": None,
+        "residual_score": None,
+        "control_coverage_pct": None,
+    }
+    if cc_data:
+        risk_metrics["risk_reduction"] = cc_data.get("risk_reduction")
+        risk_metrics["inherent_score"] = cc_data.get("inherent_score")
+        risk_metrics["residual_score"] = cc_data.get("residual_score")
+        risk_metrics["control_coverage_pct"] = cc_data.get("control_coverage_pct")
+
     return {
         "funnel_tiers": funnel_tiers,
         "reduction_percentages": reduction_percentages,
         "missing_enrichments": missing_enrichments,
+        **risk_metrics,
     }
 
 
@@ -1521,7 +1535,14 @@ def main():
     # Build template-specific data
     template_data = {}
     if args.template == "baseball-card":
-        template_data = {"risk_weights": risk_weights}
+        # Risk metrics from compensating-controls Section 1 (Tier 1 only)
+        template_data = {
+            "risk_weights": risk_weights,
+            "risk_reduction": cc_data.get("risk_reduction") if cc_data else None,
+            "inherent_score": cc_data.get("inherent_score") if cc_data else None,
+            "residual_score": cc_data.get("residual_score") if cc_data else None,
+            "control_coverage_pct": cc_data.get("control_coverage_pct") if cc_data else None,
+        }
     elif args.template == "system-architecture":
         arch_overlay = compute_architecture_overlay(scope, findings, tier, heat_map)
         template_data = {
@@ -1538,6 +1559,10 @@ def main():
             "funnel_tiers": funnel["funnel_tiers"],
             "reduction_percentages": funnel["reduction_percentages"],
             "missing_enrichments": funnel["missing_enrichments"],
+            "risk_reduction": funnel.get("risk_reduction"),
+            "inherent_score": funnel.get("inherent_score"),
+            "residual_score": funnel.get("residual_score"),
+            "control_coverage_pct": funnel.get("control_coverage_pct"),
         }
     elif args.template == "maestro-stack":
         # Per-layer finding summaries: up to 2 top findings per layer
diff --git a/scripts/tachi_parsers.py b/scripts/tachi_parsers.py
@@ -647,13 +647,36 @@ def parse_compensating_controls_md(content: str) -> dict:
         "controls": [],
         "coverage_summary": {"total-found": 0, "total-partial": 0, "total-missing": 0},
         "severity": {"critical": 0, "high": 0, "medium": 0, "low": 0, "note": 0, "total": 0},
+        "risk_reduction": None,      # e.g. 22.9
+        "inherent_score": None,      # e.g. 270.3
+        "residual_score": None,      # e.g. 208.5
+        "control_coverage_pct": None, # e.g. 26.0 (Found percentage)
     }
 
     if not content or not content.strip():
         return result
 
     lines = content.split("\n")
 
+    # ---- Section 1: Executive Summary risk metrics ----
+    # Parse: **Risk Reduction**: 270.3 inherent -> 208.5 residual (**22.9%** reduction)
+    rr_match = re.search(
+        r"\*\*Risk Reduction\*\*:\s*([\d.]+)\s*inherent\s*->\s*([\d.]+)\s*residual\s*\(\*\*([\d.]+)%\*\*",
+        content,
+    )
+    if rr_match:
+        result["inherent_score"] = float(rr_match.group(1))
+        result["residual_score"] = float(rr_match.group(2))
+        result["risk_reduction"] = float(rr_match.group(3))
+
+    # Parse: **Coverage**: 26% Found | 34% Partial | 40% Missing
+    cov_match = re.search(
+        r"\*\*Coverage\*\*:\s*([\d.]+)%\s*Found",
+        content,
+    )
+    if cov_match:
+        result["control_coverage_pct"] = float(cov_match.group(1))
+
     # ---- Section 4: Recommendations (parse first to merge into findings) ----
     recommendations = {}  # threat_id -> recommendation text
     sec4_start = None
diff --git a/tests/scripts/fixtures/golden/baseball-card.json b/tests/scripts/fixtures/golden/baseball-card.json
@@ -96,6 +96,10 @@
     }
   ],
   "template_data": {
+    "control_coverage_pct": 23.5,
+    "inherent_score": 214.5,
+    "residual_score": 156.8,
+    "risk_reduction": 26.9,
     "risk_weights": [
       {
         "annotation": "4 High + 5 Medium + 2 Low findings",
diff --git a/tests/scripts/fixtures/golden/risk-funnel.json b/tests/scripts/fixtures/golden/risk-funnel.json
@@ -96,6 +96,7 @@
     }
   ],
   "template_data": {
+    "control_coverage_pct": 23.5,
     "funnel_tiers": [
       {
         "count": 34,
@@ -122,6 +123,7 @@
         "tier": 3
       }
     ],
+    "inherent_score": 214.5,
     "missing_enrichments": [],
     "reduction_percentages": [
       {
@@ -140,6 +142,8 @@
         "to_tier": 3
       }
     ],
+    "residual_score": 156.8,
+    "risk_reduction": 26.9,
     "risk_weights": [
       {
         "annotation": "4 High + 5 Medium + 2 Low findings",
