From fdaf2f9b534ad8742e4e6379751a786eb398e7eb Mon Sep 17 00:00:00 2001
From: sadlilas <11658960+sadlilas@users.noreply.github.com>
Date: Thu, 18 Jun 2026 10:48:59 -0700
Subject: [PATCH] fix(baa-dev): route session analysis to session-analyst; ground architect reviews in tool evidence
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Two targeted, structural fixes to the behavioral-anchor-amplifier-dev
experiment bundle, ahead of a verifying measurement sweep.

FIX 1 (session-analyst routing): register `foundation:session-analyst`
as a delegable agent and add a "Delegate session analysis" principle to
the system prompt. Closes a model-agnostic defect where
session-analysis / events.jsonl tasks were not routed to the specialist
(base BAA scored 2/2/2 vs amplifier-dev 10/9/9 across
opus-4.7/opus-4.8/gpt-5.5).

FIX 2 (architect verification): grant the architect agent `tool-web`
and turn its cite-evidence rule into an evidence gate, so PR/code
reviews must fetch and read what they assert rather than confabulating
(addresses the pr185 fabricated-review failure on opus-4.7).

Deliberately scoped: kernel-vocab phrasing, anti-paralysis, and
DTU-persistence changes were evaluated and EXCLUDED as not worth the
cost / not reproducible defects. Impact to be confirmed by a
measurement run after merge.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
---
 .../behavioral-anchor-amplifier-dev/agents/architect.md       | 4 +++-
 .../behavioral-anchor-amplifier-dev.md                        | 2 ++
 experiments/behavioral-anchor-amplifier-dev/context/system.md | 2 ++
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/experiments/behavioral-anchor-amplifier-dev/agents/architect.md b/experiments/behavioral-anchor-amplifier-dev/agents/architect.md
index ee741c8..83ebfdf 100644
--- a/experiments/behavioral-anchor-amplifier-dev/agents/architect.md
+++ b/experiments/behavioral-anchor-amplifier-dev/agents/architect.md
@@ -20,6 +20,8 @@ tools:
     source: git+https://github.com/microsoft/amplifier-module-tool-filesystem@main
   - module: tool-search
     source: git+https://github.com/microsoft/amplifier-module-tool-search@main
+  - module: tool-web
+    source: git+https://github.com/microsoft/amplifier-module-tool-web@main
 ---
 
 # Architect
@@ -37,4 +39,4 @@ You produce actionable specifications and design reviews.
 1. Every abstraction must justify its existence.
 2. Start with the simplest viable design.
 3. Specs must include: file paths, interfaces with types, success criteria.
-4. Reviews must cite specific `file_path:line_number` evidence.
+4. Reviews must cite specific `file_path:line_number` evidence read via a tool call in THIS session. Never assert line counts, file contents, or duplication you have not actually read or fetched (use `tool-web` to fetch a PR/diff before reviewing it). If you could not read it, say so — do not describe it.
diff --git a/experiments/behavioral-anchor-amplifier-dev/behavioral-anchor-amplifier-dev.md b/experiments/behavioral-anchor-amplifier-dev/behavioral-anchor-amplifier-dev.md
index 4bae019..e4e0dbc 100644
--- a/experiments/behavioral-anchor-amplifier-dev/behavioral-anchor-amplifier-dev.md
+++ b/experiments/behavioral-anchor-amplifier-dev/behavioral-anchor-amplifier-dev.md
@@ -128,6 +128,8 @@ agents:
     - behavioral-anchor-amplifier-dev:git-ops
     - behavioral-anchor-amplifier-dev:researcher
     - behavioral-anchor-amplifier-dev:amplifier-dev-expert
+    # Session analysis/repair: required route for events.jsonl + broken-session work
+    - foundation:session-analyst
 ---
 
 # Behavioral Anchor
diff --git a/experiments/behavioral-anchor-amplifier-dev/context/system.md b/experiments/behavioral-anchor-amplifier-dev/context/system.md
index 14d7af3..7b070ad 100644
--- a/experiments/behavioral-anchor-amplifier-dev/context/system.md
+++ b/experiments/behavioral-anchor-amplifier-dev/context/system.md
@@ -46,4 +46,6 @@ Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.co
 
 **Delegate dev-ecosystem questions.** "How does Amplifier work?" and "how do I author a bundle?" both go to the amplifier-dev-expert agent — it holds the authoritative knowledge.
 
+**Delegate session analysis.** Analyzing, debugging, searching, or repairing Amplifier sessions — and any reading of `events.jsonl` — goes to the `foundation:session-analyst` agent. Never read `events.jsonl` directly; its lines can exceed 100k tokens and will crash the session.
+
 Ecosystem and bundle-authoring knowledge lives in the **amplifier-dev-expert** agent.