Commit 89d2c00

feat(decision): replace model.output.generate with dedicated capabilities
- Add eval.option.analyze: qualitative analysis (pros, cons, risks, assumptions)
- Add decision.option.justify: select + justify final recommendation
- Update decision.make pipeline: 5 steps, zero oracle calls
- Pipeline: generate → analyze → score → justify → quality-gate
- All capability names use existing vocabulary (0 new terms)
1 parent 2af76d0 commit 89d2c00
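
The five-step order in the commit message can be sketched as a simple sequential runner. This is a minimal sketch, not the repository's executor: `run_pipeline`, `make_stub`, and the shared-state dictionary are hypothetical, and pairing "generate" with `agent.option.generate` and "quality-gate" with `eval.output.score` is inferred from the `decision.make` entry in catalog/graph.json rather than stated in the commit.

```python
from typing import Any, Callable, Dict, List

# A step takes the accumulated state and returns new outputs to merge in.
Step = Callable[[Dict[str, Any]], Dict[str, Any]]

# Capability ids taken from the commit; the step labels in the comments
# are an inferred mapping, not confirmed by the source.
PIPELINE: List[str] = [
    "agent.option.generate",    # generate
    "eval.option.analyze",      # analyze
    "eval.option.score",        # score
    "decision.option.justify",  # justify
    "eval.output.score",        # quality-gate
]


def run_pipeline(registry: Dict[str, Step], state: Dict[str, Any]) -> Dict[str, Any]:
    """Run each capability in order, merging its outputs into the state."""
    for capability_id in PIPELINE:
        outputs = registry[capability_id](state)
        state = {**state, **outputs}
    return state


def make_stub(capability_id: str, trace: List[str]) -> Step:
    """Hypothetical stand-in for a real capability; it only records the call."""
    def step(state: Dict[str, Any]) -> Dict[str, Any]:
        trace.append(capability_id)
        return {capability_id: "done"}
    return step


trace: List[str] = []
registry = {cid: make_stub(cid, trace) for cid in PIPELINE}
final_state = run_pipeline(registry, {"goal": "choose a database"})
```

After the run, `trace` lists the five capability ids in pipeline order, and `model.output.generate` never appears, which matches the "zero oracle calls" claim.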

8 files changed

+443
-240
lines changed

capabilities/_index.yaml

Lines changed: 9 additions & 1 deletion
@@ -28,4 +28,12 @@ capabilities:
 
   - id: eval.option.score
     status: experimental
-    description: Evaluate options against weighted criteria with comparative scoring and trade-offs.
+    description: Evaluate options against weighted criteria with comparative scoring and trade-offs.
+
+  - id: eval.option.analyze
+    status: experimental
+    description: Qualitative analysis of options — pros, cons, risks, and assumptions per option.
+
+  - id: decision.option.justify
+    status: experimental
+    description: Select a final recommendation from evaluated options with structured justification.

capabilities/decision.option.justify.yaml

Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
+id: decision.option.justify
+version: 1.0.0
+description: >
+  Select a final recommendation from evaluated options and produce a structured
+  justification. Consumes qualitative analysis (from eval.option.analyze) and
+  quantitative scores (from eval.option.score) to make an explicit, committed
+  choice. Outputs the recommendation, justification narrative, failure modes,
+  uncertainties, confidence assessment, and concrete next steps.
+
+  This capability is the "closing" step — it resolves ambiguity into action.
+  It must never hedge with "it depends" or "consider both".
+
+inputs:
+  scored_options:
+    type: array
+    required: true
+    description: >
+      Options with quantitative scores from eval.option.score. Each entry
+      includes option_id, overall_score, per_criterion_scores, strengths,
+      weaknesses.
+
+  analyzed_options:
+    type: array
+    required: true
+    description: >
+      Options with qualitative analysis from eval.option.analyze. Each entry
+      includes option_id, pros, cons, risks, assumptions.
+
+  tradeoffs:
+    type: array
+    required: false
+    description: >
+      Explicit trade-offs between top candidates from eval.option.score.
+
+  goal:
+    type: string
+    required: true
+    description: >
+      The original decision goal for contextual grounding.
+
+  constraints:
+    type: object
+    required: false
+    description: >
+      Hard constraints the decision must respect (budget, time, regulation).
+
+  risk_tolerance:
+    type: string
+    required: false
+    description: >
+      Risk appetite: low | medium | high. Modulates how uncertainty and
+      downside scenarios influence the final selection.
+
+outputs:
+  recommendation:
+    type: string
+    required: true
+    description: >
+      The final recommendation. Clear, concrete, actionable. Never hedged.
+
+  alternatives_considered:
+    type: array
+    required: true
+    description: >
+      All options that were evaluated, with a brief note on why each was
+      or was not selected.
+
+  confidence_score:
+    type: number
+    required: true
+    description: >
+      Numeric confidence in the recommendation (0.0-1.0). Calibrated:
+      0.0-0.3 low, 0.3-0.6 medium, 0.6-0.9 high, 0.9-1.0 very high.
+
+  confidence_level:
+    type: string
+    required: true
+    description: >
+      Human-readable confidence: low | medium | high.
+
+  uncertainties:
+    type: array
+    required: true
+    description: >
+      What is unknown or weakly supported that could change the outcome.
+
+  failure_modes:
+    type: array
+    required: true
+    description: >
+      Concrete conditions under which the recommendation could fail.
+      Specific, not generic.
+
+  next_steps:
+    type: array
+    required: true
+    description: >
+      Concrete follow-up actions: validations, pilots, execution steps.
+      Prefers action over "more analysis".
+
+  human_readable:
+    type: string
+    required: true
+    description: >
+      3-6 paragraph narrative of the decision suitable for direct human
+      consumption. Includes what was decided, why, key trade-offs, risks,
+      and next steps.
+
+properties:
+  deterministic: false
+  side_effects: false
+  idempotent: true
+
+metadata:
+  status: experimental
+  tags:
+    - decision-support
+    - justification
+    - recommendation
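
The `confidence_score` description names four bands (low, medium, high, very high), while `confidence_level` lists only three values. One way a caller might reconcile the two is sketched below. The helper name `confidence_level_for`, the half-open band boundaries, and folding "very high" into "high" are all assumptions made for illustration; the spec does not say how shared boundaries such as 0.3 are resolved.

```python
def confidence_level_for(score: float) -> str:
    """Map a confidence_score in [0.0, 1.0] to low | medium | high.

    Assumed conventions (not stated in the spec):
    - bands are lower-inclusive: [0.0, 0.3) low, [0.3, 0.6) medium, [0.6, 1.0] high
    - the "very high" band (0.9-1.0) is reported as "high"
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence_score out of range: {score}")
    if score < 0.3:
        return "low"
    if score < 0.6:
        return "medium"
    return "high"


levels = [confidence_level_for(s) for s in (0.0, 0.3, 0.6, 0.95)]
```

If a consumer needs to tell "high" from "very high", it would have to read `confidence_score` directly, since the three-valued `confidence_level` cannot carry that distinction.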

capabilities/eval.option.analyze.yaml

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
+id: eval.option.analyze
+version: 1.0.0
+description: >
+  Perform qualitative analysis of each option in a decision set. For every
+  option, identify pros, cons, risks, and underlying assumptions. Does NOT
+  assign numeric scores — that is the responsibility of eval.option.score.
+  This capability produces the structured qualitative evidence that scoring
+  and justification steps consume downstream.
+
+inputs:
+  options:
+    type: array
+    required: true
+    description: >
+      List of options to analyze. Each option should have at minimum: id, label,
+      description. May include key_attributes from agent.option.generate.
+
+  context:
+    type: string
+    required: false
+    description: >
+      Background information, prior analysis, or domain knowledge relevant to
+      the analysis. Grounds the pros/cons in evidence rather than speculation.
+
+  goal:
+    type: string
+    required: true
+    description: >
+      The decision goal that frames which pros, cons, and risks matter.
+
+outputs:
+  analyzed_options:
+    type: array
+    required: true
+    description: >
+      Options enriched with qualitative analysis. Each entry includes: option_id,
+      pros (array of strings), cons (array of strings), risks (array of objects
+      with description and severity), assumptions (array of strings — premises
+      that must hold for this option to work as expected).
+
+  analysis_notes:
+    type: string
+    required: false
+    description: >
+      Brief note on methodology, gaps in available evidence, or caveats
+      about the analysis.
+
+properties:
+  deterministic: false
+  side_effects: false
+  idempotent: true
+
+metadata:
+  status: experimental
+  tags:
+    - decision-support
+    - evaluation
+    - qualitative-analysis
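
The shape of one `analyzed_options` entry, as described in the outputs above, can be sketched with dataclasses. The class names and `parse_analyzed_option` are hypothetical, the sample values are invented for illustration, and `severity` is kept as a free-form string because the spec does not define a severity scale.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Risk:
    description: str
    severity: str  # scale is not defined in the spec; treated as free-form


@dataclass
class AnalyzedOption:
    option_id: str
    pros: List[str]
    cons: List[str]
    risks: List[Risk]
    assumptions: List[str]


REQUIRED_FIELDS = ("option_id", "pros", "cons", "risks", "assumptions")


def parse_analyzed_option(raw: Dict[str, Any]) -> AnalyzedOption:
    """Build an AnalyzedOption, rejecting entries with missing fields."""
    missing = [name for name in REQUIRED_FIELDS if name not in raw]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return AnalyzedOption(
        option_id=raw["option_id"],
        pros=list(raw["pros"]),
        cons=list(raw["cons"]),
        risks=[Risk(r["description"], r["severity"]) for r in raw["risks"]],
        assumptions=list(raw["assumptions"]),
    )


# Invented sample entry, for illustration only.
entry = parse_analyzed_option({
    "option_id": "opt-1",
    "pros": ["low upfront cost"],
    "cons": ["limited vendor support"],
    "risks": [{"description": "vendor exits the market", "severity": "high"}],
    "assumptions": ["current traffic levels hold for 12 months"],
})
```

Note that no numeric score appears anywhere in the entry, consistent with the description's statement that scoring belongs to `eval.option.score`.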

catalog/capabilities.json

Lines changed: 145 additions & 0 deletions
@@ -478,6 +478,101 @@
         "idempotent": true
       }
     },
+    {
+      "id": "decision.option.justify",
+      "version": "1.0.0",
+      "description": "Select a final recommendation from evaluated options and produce a structured justification. Consumes qualitative analysis (from eval.option.analyze) and quantitative scores (from eval.option.score) to make an explicit, committed choice. Outputs the recommendation, justification narrative, failure modes, uncertainties, confidence assessment, and concrete next steps.\nThis capability is the \"closing\" step — it resolves ambiguity into action. It must never hedge with \"it depends\" or \"consider both\".\n",
+      "file": "capabilities/decision.option.justify.yaml",
+      "inputs": {
+        "scored_options": {
+          "type": "array",
+          "required": true,
+          "description": "Options with quantitative scores from eval.option.score. Each entry includes option_id, overall_score, per_criterion_scores, strengths, weaknesses.\n"
+        },
+        "analyzed_options": {
+          "type": "array",
+          "required": true,
+          "description": "Options with qualitative analysis from eval.option.analyze. Each entry includes option_id, pros, cons, risks, assumptions.\n"
+        },
+        "tradeoffs": {
+          "type": "array",
+          "required": false,
+          "description": "Explicit trade-offs between top candidates from eval.option.score.\n"
+        },
+        "goal": {
+          "type": "string",
+          "required": true,
+          "description": "The original decision goal for contextual grounding.\n"
+        },
+        "constraints": {
+          "type": "object",
+          "required": false,
+          "description": "Hard constraints the decision must respect (budget, time, regulation).\n"
+        },
+        "risk_tolerance": {
+          "type": "string",
+          "required": false,
+          "description": "Risk appetite: low | medium | high. Modulates how uncertainty and downside scenarios influence the final selection.\n"
+        }
+      },
+      "outputs": {
+        "recommendation": {
+          "type": "string",
+          "required": true,
+          "description": "The final recommendation. Clear, concrete, actionable. Never hedged.\n"
+        },
+        "alternatives_considered": {
+          "type": "array",
+          "required": true,
+          "description": "All options that were evaluated, with a brief note on why each was or was not selected.\n"
+        },
+        "confidence_score": {
+          "type": "number",
+          "required": true,
+          "description": "Numeric confidence in the recommendation (0.0-1.0). Calibrated: 0.0-0.3 low, 0.3-0.6 medium, 0.6-0.9 high, 0.9-1.0 very high.\n"
+        },
+        "confidence_level": {
+          "type": "string",
+          "required": true,
+          "description": "Human-readable confidence: low | medium | high.\n"
+        },
+        "uncertainties": {
+          "type": "array",
+          "required": true,
+          "description": "What is unknown or weakly supported that could change the outcome.\n"
+        },
+        "failure_modes": {
+          "type": "array",
+          "required": true,
+          "description": "Concrete conditions under which the recommendation could fail. Specific, not generic.\n"
+        },
+        "next_steps": {
+          "type": "array",
+          "required": true,
+          "description": "Concrete follow-up actions: validations, pilots, execution steps. Prefers action over \"more analysis\".\n"
+        },
+        "human_readable": {
+          "type": "string",
+          "required": true,
+          "description": "3-6 paragraph narrative of the decision suitable for direct human consumption. Includes what was decided, why, key trade-offs, risks, and next steps.\n"
+        }
+      },
+      "metadata": {
+        "tags": [
+          "decision-support",
+          "justification",
+          "recommendation"
+        ],
+        "category": null,
+        "status": "experimental",
+        "examples": []
+      },
+      "properties": {
+        "deterministic": false,
+        "side_effects": false,
+        "idempotent": true
+      }
+    },
     {
       "id": "doc.chunk",
       "version": "1.0.0",
@@ -591,6 +686,56 @@
         "idempotent": true
       }
     },
+    {
+      "id": "eval.option.analyze",
+      "version": "1.0.0",
+      "description": "Perform qualitative analysis of each option in a decision set. For every option, identify pros, cons, risks, and underlying assumptions. Does NOT assign numeric scores — that is the responsibility of eval.option.score. This capability produces the structured qualitative evidence that scoring and justification steps consume downstream.\n",
+      "file": "capabilities/eval.option.analyze.yaml",
+      "inputs": {
+        "options": {
+          "type": "array",
+          "required": true,
+          "description": "List of options to analyze. Each option should have at minimum: id, label, description. May include key_attributes from agent.option.generate.\n"
+        },
+        "context": {
+          "type": "string",
+          "required": false,
+          "description": "Background information, prior analysis, or domain knowledge relevant to the analysis. Grounds the pros/cons in evidence rather than speculation.\n"
+        },
+        "goal": {
+          "type": "string",
+          "required": true,
+          "description": "The decision goal that frames which pros, cons, and risks matter.\n"
+        }
+      },
+      "outputs": {
+        "analyzed_options": {
+          "type": "array",
+          "required": true,
+          "description": "Options enriched with qualitative analysis. Each entry includes: option_id, pros (array of strings), cons (array of strings), risks (array of objects with description and severity), assumptions (array of strings — premises that must hold for this option to work as expected).\n"
+        },
+        "analysis_notes": {
+          "type": "string",
+          "required": false,
+          "description": "Brief note on methodology, gaps in available evidence, or caveats about the analysis.\n"
+        }
+      },
+      "metadata": {
+        "tags": [
+          "decision-support",
+          "evaluation",
+          "qualitative-analysis"
+        ],
+        "category": null,
+        "status": "experimental",
+        "examples": []
+      },
+      "properties": {
+        "deterministic": false,
+        "side_effects": false,
+        "idempotent": true
+      }
+    },
     {
       "id": "eval.option.score",
       "version": "1.0.0",
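
Each catalog entry carries `type` and `required` flags per input, so a caller could check an invocation against an entry before running it. The sketch below assumes that use; `check_inputs` and `TYPE_MAP` are hypothetical names, the entry is trimmed to the three inputs of `eval.option.analyze`, and nothing in the commit says the catalog is actually consumed this way.

```python
from typing import Any, Dict, List

# Assumed mapping from catalog type names to Python types.
TYPE_MAP = {"array": list, "string": str, "object": dict, "number": (int, float)}

# Trimmed copy of the eval.option.analyze inputs from the catalog entry.
ANALYZE_ENTRY: Dict[str, Any] = {
    "id": "eval.option.analyze",
    "inputs": {
        "options": {"type": "array", "required": True},
        "context": {"type": "string", "required": False},
        "goal": {"type": "string", "required": True},
    },
}


def check_inputs(entry: Dict[str, Any], supplied: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the inputs look valid."""
    problems: List[str] = []
    for name, spec in entry["inputs"].items():
        if name not in supplied:
            if spec.get("required"):
                problems.append(f"missing required input: {name}")
            continue
        expected = TYPE_MAP.get(spec["type"])
        if expected is not None and not isinstance(supplied[name], expected):
            problems.append(f"{name}: expected {spec['type']}")
    return problems


ok = check_inputs(ANALYZE_ENTRY, {"options": [], "goal": "choose a database"})
bad = check_inputs(ANALYZE_ENTRY, {"options": "not-a-list"})
```

Here `ok` is empty, while `bad` reports both the wrongly typed `options` and the missing `goal`; the optional `context` is skipped without complaint.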

catalog/graph.json

Lines changed: 3 additions & 2 deletions
@@ -52,9 +52,10 @@
     "decision.make": {
       "capabilities": [
         "agent.option.generate",
+        "decision.option.justify",
+        "eval.option.analyze",
         "eval.option.score",
-        "eval.output.score",
-        "model.output.generate"
+        "eval.output.score"
       ],
       "skills": []
     },
},
