Commit 70f86ca
feat(agent): MVE Experiment Designer (#976)
# Pull Request
## Description
Adds a new conversational coaching agent that guides users through
designing a Minimum Viable Experiment (MVE). The agent follows a
structured, phase-based process — from problem discovery and hypothesis
formation through viability vetting to a complete experiment plan. It
helps users translate unknowns and assumptions into crisp, testable
hypotheses, evaluates experiment feasibility, and produces actionable
MVE plans with session tracking via .copilot-tracking. Includes the
agent definition (experiment-designer.agent.md) and companion
instructions (experiment-designer.instructions.md) covering MVE domain
knowledge, vetting criteria, and experiment type reference.
## Related Issue(s)
Closes #973
## Type of Change
Select all that apply:
**Code & Documentation:**
* [ ] Bug fix (non-breaking change fixing an issue)
* [x] New feature (non-breaking change adding functionality)
* [ ] Breaking change (fix or feature causing existing functionality to
change)
* [ ] Documentation update
**Infrastructure & Configuration:**
* [ ] GitHub Actions workflow
* [ ] Linting configuration (markdown, PowerShell, etc.)
* [ ] Security configuration
* [ ] DevContainer configuration
* [ ] Dependency update
**AI Artifacts:**
* [x] Reviewed contribution with `prompt-builder` agent and addressed
all feedback
* [x] Copilot instructions (`.github/instructions/*.instructions.md`)
* [ ] Copilot prompt (`.github/prompts/*.prompt.md`)
* [x] Copilot agent (`.github/agents/*.agent.md`)
* [ ] Copilot skill (`.github/skills/*/SKILL.md`)
> Note for AI Artifact Contributors:
>
> * Agents: Research, indexing/referencing other projects (using standard
VS Code GitHub Copilot/MCP tools), planning, and general implementation
agents likely already exist. Review `.github/agents/` before creating
new ones.
> * Skills: Must include both bash and PowerShell scripts. See
[Skills](../docs/contributing/skills.md).
> * Model Versions: Only contributions targeting the **latest Anthropic
and OpenAI models** will be accepted. Older model versions (e.g.,
GPT-3.5, Claude 3) will be rejected.
> * See [Agents Not
Accepted](../docs/contributing/custom-agents.md#agents-not-accepted) and
[Model Version
Requirements](../docs/contributing/ai-artifacts-common.md#model-version-requirements).
**Other:**
* [ ] Script/automation (`.ps1`, `.sh`, `.py`)
* [ ] Other (please describe):
## Sample Prompts (for AI Artifact Contributions)
<!-- If you checked any boxes under "AI Artifacts" above, provide a
sample prompt showing how to use your contribution -->
<!-- Delete this section if not applicable -->
**User Request:**
<!-- What natural language request would trigger this
agent/prompt/instruction? -->
- "I have an idea for [feature/product/approach] but I'm not sure if it
will work. Help me design an experiment to validate it before we commit
to building it."
- "We need to test whether [assumption] is true before starting
development"
- "Help me design an MVE for [project/feature]"
- "Our customer wants us to build X, but there are unknowns around data
feasibility / architecture / LLM capability — can we experiment first?"
- "I want to validate my hypothesis about [topic] with a structured
experiment"
**Execution Flow:**
<!-- Step-by-step: what happens when invoked? Include tool usage,
decision points -->
- **Phase 1: Problem & Context Discovery.** The agent asks probing questions about the problem statement, customer context, business case, unknowns, and constraints. It creates a tracking directory at `.copilot-tracking/mve/{date}/{experiment-name}/` and writes `context.md`.
- **Phase 2: Hypothesis Formation.** The agent guides the user to translate unknowns into testable hypotheses using the format "We believe [assumption]. We will test this by [method]. We will know we are right/wrong when [measurable outcome]." It prioritizes hypotheses by risk and impact and writes `hypotheses.md`.
- **Phase 3: MVE Vetting & Red Flag Check.** The agent applies four vetting criteria (business sense, crisp problem statement, Responsible AI, clear next steps) and checks against nine red flag patterns (demos, skipping ahead, solved problems, mini-MVP, etc.). It writes `vetting.md`. If fundamental problems are found, the agent returns to Phase 1 or 2.
- **Phase 4: Experiment Design.** The agent helps choose the experiment type, define the technical approach, set measurable success/failure criteria per hypothesis, scope the timeline to weeks, and plan post-experiment evaluation. It writes `experiment-design.md`.
- **Phase 5: MVE Plan Output.** The agent consolidates all phase outputs into a single `mve-plan.md` document for stakeholder review. It iterates based on user feedback, returning to earlier phases if needed.
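To make the Phase 2 hypothesis format concrete, here is a minimal Python sketch of how a hypothesis could be represented, rendered, and prioritized. It is illustrative only and not part of the agent: the `Hypothesis` class, the 1-5 `risk` and `impact` scales, and the risk-times-impact ordering are all assumptions, since the agent definition only says hypotheses are prioritized "by risk and impact".

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One testable hypothesis in the three-part format (hypothetical structure)."""

    assumption: str
    method: str
    outcome: str
    risk: int    # assumed scale: 1 (low) to 5 (high)
    impact: int  # assumed scale: 1 (low) to 5 (high)

    def render(self) -> str:
        """Render the hypothesis using the sentence template from Phase 2."""
        return (
            f"We believe {self.assumption}. "
            f"We will test this by {self.method}. "
            f"We will know we are right/wrong when {self.outcome}."
        )


def prioritize(hypotheses):
    """Order hypotheses so the highest risk x impact comes first (assumed rule)."""
    return sorted(hypotheses, key=lambda h: h.risk * h.impact, reverse=True)
```

A caller would build a few `Hypothesis` objects, pass them to `prioritize`, and write each `render()` result into `hypotheses.md`. Any other scoring rule could replace the product without changing the format.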
**Output Artifacts:**
<!-- What files/content are created? Show first 10-20 lines as preview
-->
- `context.md`: Problem statement, customer context, business justification
- `hypotheses.md`: Prioritized testable hypotheses with assumption/method/outcome
- `vetting.md`: Vetting criteria results and red flag assessment
- `experiment-design.md`: Approach, scope, timeline, resources, success criteria
- `mve-plan.md`: Consolidated plan document for stakeholder review
```plain-text
<!-- markdownlint-disable-file -->
# MVE Context: {experiment-name}
## Problem Statement
{User's refined problem statement}
## Customer & Stakeholder Context
{Customer details, priority level, sponsors}
## Known Constraints
{IP, data access, timeline constraints}
## Assumptions & Unknowns
- Unknown 1: ...
- Assumption 1: ...
## Business Case
{Why this experiment matters, what decision it informs}
```
**Success Indicators:**
<!-- How does user know it worked correctly? What validation should they
perform? -->
- The `.copilot-tracking/mve/{date}/{experiment-name}/` directory contains all five markdown artifacts (`context.md`, `hypotheses.md`, `vetting.md`, `experiment-design.md`, `mve-plan.md`).
- Each hypothesis follows the three-part format: assumption, test method, measurable outcome.
- Hypotheses are prioritized by risk and impact with clear rationale.
- Vetting results explicitly address all four criteria and flag any red flags encountered.
- Success and failure criteria are defined per hypothesis with quantitative thresholds.
- The experiment is scoped to weeks (not months) with explicit out-of-scope boundaries.
- `mve-plan.md` includes next steps for both validated and invalidated outcomes.
- The agent challenged vague problem statements or untestable hypotheses rather than accepting them uncritically.
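The first indicator above can be checked mechanically. The following is a rough Python sketch, not something shipped with this PR: the helper names `tracking_dir` and `missing_artifacts` are hypothetical, and the date string format is an assumption because the description only gives the `{date}` placeholder.

```python
from pathlib import Path

# The five artifacts listed under Output Artifacts.
EXPECTED_ARTIFACTS = (
    "context.md",
    "hypotheses.md",
    "vetting.md",
    "experiment-design.md",
    "mve-plan.md",
)


def tracking_dir(root, date, experiment_name) -> Path:
    """Build .copilot-tracking/mve/{date}/{experiment-name}/ under root.

    The format of `date` is an assumption; pass whatever the agent produced.
    """
    return Path(root) / ".copilot-tracking" / "mve" / date / experiment_name


def missing_artifacts(directory):
    """Return the expected artifact names that are not present as files."""
    directory = Path(directory)
    return [name for name in EXPECTED_ARTIFACTS if not (directory / name).is_file()]
```

An empty list from `missing_artifacts(tracking_dir(".", date, name))` means all five files exist. It says nothing about their content, so the remaining indicators still need a manual read.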
For detailed contribution requirements, see:
* Common Standards: [docs/contributing/ai-artifacts-common.md](../docs/contributing/ai-artifacts-common.md) - Shared standards for XML blocks, markdown quality, RFC 2119, validation, and testing
* Agents: [docs/contributing/custom-agents.md](../docs/contributing/custom-agents.md) - Agent configurations with tools and behavior patterns
* Prompts: [docs/contributing/prompts.md](../docs/contributing/prompts.md) - Workflow-specific guidance with template variables
* Instructions: [docs/contributing/instructions.md](../docs/contributing/instructions.md) - Technology-specific standards with glob patterns
* Skills: [docs/contributing/skills.md](../docs/contributing/skills.md) - Task execution utilities with cross-platform scripts
## Testing
<!-- Describe how you tested these changes -->
I've used it for a few MVE opportunities to help refine our hypotheses
and plan our MVE.
## Checklist
### Required Checks
* [x] Documentation is updated (if applicable)
* [x] Files follow existing naming conventions
* [x] Changes are backwards compatible (if applicable)
* [ ] Tests added for new functionality (if applicable): N/A
### AI Artifact Contributions
<!-- If contributing an agent, prompt, instruction, or skill, complete
these checks -->
* [ ] Used `/prompt-analyze` to review contribution
* [x] Addressed all feedback from `prompt-builder` review
* [x] Verified contribution follows common standards and type-specific
requirements
### Required Automated Checks
The following validation commands must pass before merging:
* [ ] Markdown linting: `npm run lint:md`
* [ ] Spell checking: `npm run spell-check`
* [ ] Frontmatter validation: `npm run lint:frontmatter`
* [ ] Skill structure validation: `npm run validate:skills`
* [ ] Link validation: `npm run lint:md-links`
* [ ] PowerShell analysis: `npm run lint:ps`
* [ ] Plugin freshness: `npm run plugin:generate`
(can't run dev container, hoping ci/cd pipeline checks these :) )
## Security Considerations
1 parent c2b806f commit 70f86ca
File tree
15 files changed, +640 -43 lines changed
- .github
- agents
- ado
- experimental
- github
- instructions
- experimental
- github
- collections
- docs/hve-guide/lifecycle
- plugins
- experimental
- agents
- instructions
- hve-core-all
- agents
- instructions
Lines changed: 221 additions & 0 deletions