Evals CI/CD UX overview improves pt. 1 by prathmeshpatel · Pull Request #1602 · MCPJam/inspector

prathmeshpatel · 2026-03-13T07:37:31Z

Summary

Enhanced Sidebar

Added status labels below status dots in the suite list sidebar
Added suite count display in the sidebar header
Added failure count badge on the Overview button when suites have failed runs
Improved status information display with color-coded labels

Overview Panel Enhancements

Renamed TagAggregationPanel to OverviewPanel and updated all references
Added color-coded card borders based on pass rates (green for ≥95%, amber for ≥75%, red for <75%)
Enhanced "No data" state handling with alert icons and appropriate messaging
Improved trend indicators with color coding (green for positive, red for negative trends)
Added search functionality for filtering suites in the breakdown section
Enhanced progress bars with color coding matching pass rate thresholds
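
The pass-rate thresholds listed above can be expressed as a small pure function. This is a minimal sketch, not the PR's code: `HealthColor` and `passRateColor` are hypothetical names, and the pass rate is assumed to be a number in the 0-100 range.

```typescript
// Hypothetical helper: maps a pass rate (assumed 0-100) to a color name using
// the thresholds stated in the summary. Names are illustrative, not from the PR.
type HealthColor = "green" | "amber" | "red";

function passRateColor(passRatePercent: number): HealthColor {
  if (passRatePercent >= 95) return "green"; // ≥95%
  if (passRatePercent >= 75) return "amber"; // ≥75% and <95%
  return "red"; // <75%
}
```

Keeping the thresholds in one function would let card borders and progress bars share the same mapping.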

chelojimenez · 2026-03-13T07:37:43Z

✅ Snyk checks have passed. No issues have been found so far.

Status	Scan Engine	Critical	High	Medium	Low	Total (0)
✅	Open Source Security	0	0	0	0	0 issues
✅	Code Security	0	0	0	0	0 issues

prathmeshpatel · 2026-03-13T07:37:44Z

graphite-app · 2026-03-16T08:50:13Z

Merge activity

Mar 16, 8:50 AM UTC: Graphite rebased this pull request, because this pull request is set to merge when ready.

coderabbitai · 2026-03-16T08:57:04Z

Walkthrough

This pull request restructures the evaluation suite display interface by introducing a new OverviewPanel component that consolidates suite data visualization. The changes replace the previous TagAggregationPanel with the new component, enhance status indicator functionality in the sidebar, and update the tag aggregation panel with improved data visibility logic and search filtering capabilities. The refactoring extends the props and styling system to support labelClass fields and adds UI elements for filtering, searching, and displaying suite health metrics.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

mcpjam-inspector/client/src/components/evals/tag-aggregation-panel.tsx (1)

206-206: ⚠️ Potential issue | 🔴 Critical

Critical: useState called after early return violates Rules of Hooks.

The useState declaration on line 220 occurs after the early return on line 206. React requires hooks to be called unconditionally and in the same order on every render. When tagGroups.length transitions from 0 to non-zero, this will cause runtime errors.

🐛 Required fix: Move state declaration before early return

  const multiLineChartConfig = useMemo(() => {
    const config: Record<string, { label: string; color: string }> = {};
    visibleGroups.forEach((g, i) => {
      config[g.tag] = {
        label: g.tag,
        color: GROUP_COLORS[i % GROUP_COLORS.length],
      };
    });
    return config;
  }, [visibleGroups]);

  // Single-tag trend data (for AccuracyChart)
  const singleTagTrendData = useMemo(() => {
    if (visibleGroups.length !== 1) return [];
    const trend = groupTrends.get(visibleGroups[0].tag) ?? [];
    return trend.map((value, i) => ({
      runIdDisplay: `#${i + 1}`,
      passRate: toPercent(value),
    }));
  }, [visibleGroups, groupTrends]);

  // Bar chart fallback data — exclude groups with no data
  const passRateBarData = useMemo(
    () =>
      visibleGroups
        .filter(
          (g) => g.totals.passed + g.totals.failed > 0 || g.totals.runs > 0,
        )
        .map((g) => ({
          tag: g.tag,
          passRate: g.passRate,
          suiteCount: g.suiteCount,
          passed: g.totals.passed,
          failed: g.totals.failed,
        })),
    [visibleGroups],
  );

+ // Search state for suite breakdown
+ const [suiteSearch, setSuiteSearch] = useState("");
+
  if (tagGroups.length === 0) return null;

  const toggleTag = (tag: string) => {
    // ...
  };

  const isMultiGroup = visibleGroups.length > 1;

- // Search state for suite breakdown
- const [suiteSearch, setSuiteSearch] = useState("");

Also applies to: 219-220
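
To illustrate why the hook position matters, here is a toy model of call-order-based hook storage. It is a sketch for intuition only and is not React's real implementation: `ToyRenderer`, `buggyPanel`, `fixedPanel`, and `rendersWithoutError` are hypothetical names, and the error text only mirrors the kind of message React reports when a render calls more hooks than the previous one.

```typescript
// Toy model of call-order-based hook storage. NOT React's real implementation;
// ToyRenderer and its methods are hypothetical and exist only for illustration.
class ToyRenderer {
  private slots: unknown[] = [];
  private cursor = 0;
  private mounted = false;

  // Returns the value stored in the slot for this call position.
  useState<T>(initial: T): T {
    if (this.cursor === this.slots.length) {
      if (this.mounted) {
        // A later render asked for a slot the first render never created.
        throw new Error("Rendered more hooks than during the previous render.");
      }
      this.slots.push(initial);
    }
    return this.slots[this.cursor++] as T;
  }

  render<R>(component: (r: ToyRenderer) => R): R {
    this.cursor = 0; // every render starts again at the first slot
    const output = component(this);
    this.mounted = true;
    return output;
  }
}

type Panel = (tagGroups: string[]) => (r: ToyRenderer) => string | null;

// Buggy shape: the hook sits after the early return.
const buggyPanel: Panel = (tagGroups) => (r) => {
  if (tagGroups.length === 0) return null;
  const suiteSearch = r.useState("");
  return `search="${suiteSearch}" groups=${tagGroups.length}`;
};

// Fixed shape: the hook runs on every render, then the early return.
const fixedPanel: Panel = (tagGroups) => (r) => {
  const suiteSearch = r.useState("");
  if (tagGroups.length === 0) return null;
  return `search="${suiteSearch}" groups=${tagGroups.length}`;
};

// Renders once with no tag groups, then again after data arrives.
function rendersWithoutError(panel: Panel): boolean {
  const renderer = new ToyRenderer();
  try {
    renderer.render(panel([]));
    renderer.render(panel(["smoke"]));
    return true;
  } catch {
    return false;
  }
}
```

In this model `rendersWithoutError(buggyPanel)` is `false` and `rendersWithoutError(fixedPanel)` is `true`, which matches the 0-to-non-zero transition described in the finding.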

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@mcpjam-inspector/client/src/components/evals/tag-aggregation-panel.tsx` at
line 206, the early return "if (tagGroups.length === 0) return null;" causes
the useState hook declared after it (line 220, the hook that declares
suiteSearch / setSuiteSearch) to be conditionally invoked; move that useState
hook to before this early return so all React hooks in this component are
called unconditionally and in the same order on every render; ensure any derived
values that depend on tagGroups remain computed after the hook declarations and
adjust the early return to occur only after hooks are declared.

🧹 Nitpick comments (4)

mcpjam-inspector/client/src/components/CiEvalsTab.tsx (1)

264-267: Hardcoding hasTags={true} may mask edge cases.

The isOverviewSelected now solely depends on !selectedSuiteId, and hasTags is hardcoded to true. This ensures the Overview button always renders, but the actual hasTags variable (line 90) still exists and is computed. If the intent is to always display Overview, consider removing the prop or documenting this decision.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mcpjam-inspector/client/src/components/CiEvalsTab.tsx` around lines 264 -
267, The Overview button prop hasTags is hardcoded to true which hides the
computed hasTags variable defined earlier; update the usage in CiEvalsTab to
either pass the computed hasTags (replace hasTags={true} with hasTags={hasTags})
or remove the hasTags prop entirely if the Overview should always render, and
ensure isOverviewSelected still uses !selectedSuiteId and other props
(isLoading=queries.isOverviewLoading, filterTag) remain unchanged.

mcpjam-inspector/client/src/components/evals/overview-panel.tsx (2)

56-77: Sparkline height calculation can be simplified.

The expression (toPercent(value) / 100) * 100 is mathematically equivalent to toPercent(value). Since toPercent already returns a value in the 0-100 range, using it directly is easier to read.

♻️ Suggested simplification

  <div
    key={idx}
    className="w-1.5 rounded-sm bg-primary/70"
    style={{
-     height: `${Math.max(3, (toPercent(value) / 100) * 100)}%`,
+     height: `${Math.max(3, toPercent(value))}%`,
    }}
  />
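
As a standalone illustration of the simplified expression, the sketch below computes the same CSS height string outside JSX. `barHeight` is a hypothetical name, and its `percent` argument is assumed to already be in the 0-100 range, as the comment above says of `toPercent`.

```typescript
// Hypothetical standalone version of the sparkline bar-height expression.
// `percent` is assumed to already be in the 0-100 range.
function barHeight(percent: number): string {
  // Math.max keeps a 3% floor, so a value of 0 still yields a visible bar.
  return `${Math.max(3, percent)}%`;
}
```

Dropping the `/ 100 * 100` round trip also avoids a divide-then-multiply step that, in floating-point arithmetic, is not guaranteed to return exactly the original value.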

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@mcpjam-inspector/client/src/components/evals/overview-panel.tsx` around lines
56 - 77, The Sparkline component's bar height uses an unnecessary calculation:
replace the style height expression in Sparkline (currently Math.max(3,
(toPercent(value) / 100) * 100)) with Math.max(3, toPercent(value)) to simplify
and make intent clearer; update the inline style in the mapped div inside
function Sparkline to use toPercent(value) directly.

602-611: Table column header "St" is cryptic.

Consider using "Status" or a status icon for the column header instead of the abbreviation "St", which may confuse users.

♻️ Suggested change

- <div>St</div>
+ <div className="sr-only">Status</div>

Or simply use a small icon or tooltip to convey meaning.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@mcpjam-inspector/client/src/components/evals/overview-panel.tsx` around lines
602 - 611, Replace the cryptic header label "St" in the header row (the JSX div
that currently contains the text "St" inside the grid header block) with a
clearer affordance: use "Status" as the visible label or swap it for a small
status icon with an accessible tooltip/title attribute ("Status") so screen
readers and hover users understand the column purpose; update the corresponding
JSX element in the overview panel header row accordingly and ensure any
CSS/className (the grid header div) still lays out correctly with the new text
or icon.

mcpjam-inspector/client/src/components/evals/ci-suite-list-sidebar.tsx (1)

134-143: Consider extracting the IIFE into a useMemo for clarity.

While the immediately-invoked function expression works, extracting the failure count computation into a memoized value would improve readability and align with React's declarative patterns.

♻️ Suggested refactor

+ const failCount = useMemo(
+   () => suites.filter((e) => e.latestRun?.result === "failed").length,
+   [suites]
+ );

  // Then in JSX:
- {(() => {
-   const failCount = suites.filter(
-     (e) => e.latestRun?.result === "failed",
-   ).length;
-   return failCount > 0 ? (
-     <span className="shrink-0 flex h-5 min-w-[20px] items-center justify-center rounded-full bg-destructive px-1.5 text-[10px] font-bold text-destructive-foreground">
-       {failCount}
-     </span>
-   ) : null;
- })()}
+ {failCount > 0 && (
+   <span className="shrink-0 flex h-5 min-w-[20px] items-center justify-center rounded-full bg-destructive px-1.5 text-[10px] font-bold text-destructive-foreground">
+     {failCount}
+   </span>
+ )}
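
The counting logic in the suggestion can also be pulled into a pure function that is easy to test without rendering anything. This is a sketch under assumptions: `SuiteEntry` is a hypothetical minimal shape (the real suite type in the PR likely has more fields), and `countFailedSuites` is an illustrative name.

```typescript
// Hypothetical minimal shape; only the field used by the filter is modeled.
interface SuiteEntry {
  latestRun?: { result: string } | null;
}

// Counts suites whose latest run failed. Suites with no latest run are
// skipped because optional chaining yields undefined for them.
function countFailedSuites(suites: SuiteEntry[]): number {
  return suites.filter((e) => e.latestRun?.result === "failed").length;
}
```

A component could then pass this function to `useMemo` with `[suites]` as the dependency list, as the refactor above proposes.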

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@mcpjam-inspector/client/src/components/evals/ci-suite-list-sidebar.tsx`
around lines 134 - 143, Extract the inline IIFE that computes failCount into a
memoized value using React's useMemo: create const failCount = useMemo(() =>
suites.filter(e => e.latestRun?.result === "failed").length, [suites]) (ensure
useMemo is imported), then render the badge conditionally with {failCount > 0 &&
<span ...>{failCount}</span>} instead of the IIFE; this improves readability and
prevents re-computation on every render.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@mcpjam-inspector/client/src/components/evals/tag-aggregation-panel.tsx`:
- Line 206: The early return "if (tagGroups.length === 0) return null;" causes
the useState hook declared after it (line 220, the hook that declares
suiteSearch / setSuiteSearch) to be conditionally invoked; move that useState
hook to before this early return so all React hooks in this component are
called unconditionally and in the same order on every render; ensure any derived
values that depend on tagGroups remain computed after the hook declarations and
adjust the early return to occur only after hooks are declared.

---

Nitpick comments:
In `@mcpjam-inspector/client/src/components/CiEvalsTab.tsx`:
- Around line 264-267: The Overview button prop hasTags is hardcoded to true
which hides the computed hasTags variable defined earlier; update the usage in
CiEvalsTab to either pass the computed hasTags (replace hasTags={true} with
hasTags={hasTags}) or remove the hasTags prop entirely if the Overview should
always render, and ensure isOverviewSelected still uses !selectedSuiteId and
other props (isLoading=queries.isOverviewLoading, filterTag) remain unchanged.

In `@mcpjam-inspector/client/src/components/evals/ci-suite-list-sidebar.tsx`:
- Around line 134-143: Extract the inline IIFE that computes failCount into a
memoized value using React's useMemo: create const failCount = useMemo(() =>
suites.filter(e => e.latestRun?.result === "failed").length, [suites]) (ensure
useMemo is imported), then render the badge conditionally with {failCount > 0 &&
<span ...>{failCount}</span>} instead of the IIFE; this improves readability and
prevents re-computation on every render.

In `@mcpjam-inspector/client/src/components/evals/overview-panel.tsx`:
- Around line 56-77: The Sparkline component's bar height uses an unnecessary
calculation: replace the style height expression in Sparkline (currently
Math.max(3, (toPercent(value) / 100) * 100)) with Math.max(3, toPercent(value))
to simplify and make intent clearer; update the inline style in the mapped div
inside function Sparkline to use toPercent(value) directly.
- Around line 602-611: Replace the cryptic header label "St" in the header row
(the JSX div that currently contains the text "St" inside the grid header block)
with a clearer affordance: use "Status" as the visible label or swap it for a
small status icon with an accessible tooltip/title attribute ("Status") so
screen readers and hover users understand the column purpose; update the
corresponding JSX element in the overview panel header row accordingly and
ensure any CSS/className (the grid header div) still lays out correctly with the
new text or icon.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2ec55c23-77ba-4aed-a250-9da6c97533b6

📥 Commits

Reviewing files that changed from the base of the PR and between f7c6a0b and eed35b3.

📒 Files selected for processing (4)

mcpjam-inspector/client/src/components/CiEvalsTab.tsx
mcpjam-inspector/client/src/components/evals/ci-suite-list-sidebar.tsx
mcpjam-inspector/client/src/components/evals/overview-panel.tsx
mcpjam-inspector/client/src/components/evals/tag-aggregation-panel.tsx

prathmeshpatel mentioned this pull request Mar 13, 2026

[SDK] Add group filtering #1595

Merged

prathmeshpatel changed the title ~~basic improves, status labels, suite count, progress bars, no data~~ Evals CI/CD UX overview improves Mar 13, 2026

prathmeshpatel changed the title ~~Evals CI/CD UX overview improves~~ Evals CI/CD UX overview improves pt. 1 Mar 13, 2026

prathmeshpatel requested review from chelojimenez and ignaciojimenezr March 13, 2026 07:38

railway-app bot temporarily deployed to triumphant-alignment / staging March 13, 2026 21:46 Inactive

prathmeshpatel force-pushed the evals-ux-improves-1 branch from 35b3df7 to e85ec86 Compare March 14, 2026 07:18

prathmeshpatel force-pushed the sdk-group-charts branch from 69b9b2b to 9a0a8f8 Compare March 14, 2026 07:18

railway-app bot temporarily deployed to triumphant-alignment / staging March 14, 2026 07:18 Inactive

This was referenced Mar 14, 2026

Improve commit detail view #1607

Merged

Frontend AI triage insights #1608

Merged

Group runs by commit SHA, update chip labels with commit time and pass fail overview #1609

Merged

Remove AI triage frontend call logic and cleanup AI look #1610

Merged

railway-app bot temporarily deployed to triumphant-alignment / staging-prathmesh March 14, 2026 07:21 Inactive

This was referenced Mar 14, 2026

Hide AI triage panel when unavailable and prevent auto-requests #1611

Merged

Clean up sidebar and other top improves #1612

Merged

Simplify evals dash cont. #1613

Merged

prathmeshpatel force-pushed the evals-ux-improves-1 branch from e85ec86 to c6011fe Compare March 14, 2026 21:56

prathmeshpatel force-pushed the sdk-group-charts branch from 9a0a8f8 to 4097e83 Compare March 14, 2026 21:57

prathmeshpatel force-pushed the evals-ux-improves-1 branch from c6011fe to 48371ee Compare March 14, 2026 22:24

prathmeshpatel mentioned this pull request Mar 14, 2026

Auto-switch to suite view for manual runs + add progress bars and pagination to failures #1614

Merged

prathmeshpatel force-pushed the evals-ux-improves-1 branch from 48371ee to b134082 Compare March 16, 2026 06:27

prathmeshpatel force-pushed the sdk-group-charts branch from 50a98ed to 6907244 Compare March 16, 2026 06:27

prathmeshpatel mentioned this pull request Mar 16, 2026

Remove CI evals overview panel and tag filtering #1623

Merged

prathmeshpatel marked this pull request as ready for review March 16, 2026 08:47

dosubot bot added the size:XXL label (This PR changes 1000+ lines, ignoring generated files.) Mar 16, 2026

Base automatically changed from sdk-group-charts to main March 16, 2026 08:48

dosubot bot added the enhancement label (New feature or request) Mar 16, 2026

prathmeshpatel force-pushed the evals-ux-improves-1 branch from b134082 to 47f3ded Compare March 16, 2026 08:49

prathmeshpatel added 3 commits March 16, 2026 08:49

basic improves, status labels, suite count, progress bars, no data

148543c

overview improves

ad3d3c3

overview

5a3f4bb

prathmeshpatel force-pushed the evals-ux-improves-1 branch from 47f3ded to 5a3f4bb Compare March 16, 2026 08:49

style: auto-fix prettier formatting

eed35b3

coderabbitai bot reviewed Mar 16, 2026

prathmeshpatel merged commit 778405b into main Mar 16, 2026
5 checks passed

prathmeshpatel deleted the evals-ux-improves-1 branch March 16, 2026 09:02

prathmeshpatel merged 4 commits into main from evals-ux-improves-1
