Commit 49e04d2

feat: add research.generate-briefing skill with 4 new capabilities

New capabilities:
- analysis.theme.cluster: group items by semantic theme
- text.merge: merge item array into single text block
- web.source.normalize: reshape search results to corpus items (quick/deep)
- web.source.search: renamed from web.search (3-segment consistency)

New skill (experimental):
- research.generate-briefing: 8-step market intelligence pipeline (search → normalize → resolve → merge → extract/summarize → cluster → risks → quality)

Other changes:
- vocabulary: added 'theme' (noun) and 'cluster' (verb)
- docs: CAPABILITY_GAP_ANALYSIS.md documenting binding audit pattern
- catalog: regenerated (111 capabilities, 35 skills)

1 parent 9ab5356 commit 49e04d2

15 files changed: +1162 -114 lines changed

capabilities/_index.yaml

Lines changed: 13 additions & 1 deletion
@@ -44,4 +44,16 @@ capabilities:
 
   - id: analysis.risk.extract
     status: experimental
-    description: Extract risks, fragile assumptions, failure modes, and mitigation ideas from a target artifact.
+    description: Extract risks, fragile assumptions, failure modes, and mitigation ideas from a target artifact.
+
+  - id: analysis.theme.cluster
+    status: experimental
+    description: Group text items into coherent thematic clusters with summaries and signal strength.
+
+  - id: web.source.normalize
+    status: experimental
+    description: Normalize web search results into corpus item format for downstream analysis.
+
+  - id: text.merge
+    status: experimental
+    description: Merge multiple text items into a single consolidated text block.

capabilities/analysis.theme.cluster.yaml

Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
+id: analysis.theme.cluster
+version: 1.0.0
+description: >
+  Group a collection of text items into coherent thematic clusters.
+  Identifies dominant themes, assigns each item to one or more clusters,
+  and produces a summary per cluster with signal strength. Accepts optional
+  hint labels to guide (not force) the thematic structure.
+
+inputs:
+  items:
+    type: array
+    required: true
+    description: >
+      Items to cluster. Each item must include id and content (text).
+      May also include title, type, source, metadata.
+  hint_labels:
+    type: array
+    required: false
+    description: >
+      Suggested theme labels to guide clustering. The implementation may
+      merge, split, rename, or add themes beyond hints. Example:
+      ["market_overview", "key_players", "trends", "risks", "opportunities"].
+  max_clusters:
+    type: number
+    required: false
+    description: Maximum number of clusters to produce (default 8, hard cap 15).
+  context:
+    type: string
+    required: false
+    description: Background context informing how items should be grouped.
+
+outputs:
+  clusters:
+    type: array
+    required: true
+    description: >
+      Themed clusters. Each cluster contains: theme (label), description,
+      item_ids (list of assigned item ids), summary (text summary of the
+      cluster content), signal_strength (0-1 estimate of how well-supported
+      the theme is by the corpus).
+  unclustered:
+    type: array
+    required: false
+    description: >
+      Items that did not fit any cluster. Each entry contains id and reason.
+  cluster_quality:
+    type: object
+    required: false
+    description: >
+      Self-assessment of clustering quality: coherence_score (0-1),
+      coverage_ratio (fraction of items assigned), overlap_warnings
+      (list of items assigned to multiple clusters).
+
+properties:
+  deterministic: false
+  side_effects: false
+  idempotent: true
+
+metadata:
+  status: experimental
+  tags: [analysis, clustering, themes, structuring]
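
Note on the spec above: two of the cluster_quality fields (coverage_ratio and overlap_warnings) and the max_clusters limits can be derived mechanically from the clusters output, without a model call. The following is a minimal Python sketch of that derivation, not the binding shipped in this commit. The function names are hypothetical, and clamping an out-of-range max_clusters (rather than rejecting it) is an assumption. coherence_score is left out because the spec does not say how it is computed.

```python
from __future__ import annotations

from typing import Any, Optional

DEFAULT_MAX_CLUSTERS = 8   # "default 8" from the spec
HARD_CAP_CLUSTERS = 15     # "hard cap 15" from the spec


def effective_max_clusters(requested: Optional[int]) -> int:
    """Resolve max_clusters; clamping (not erroring) is an assumption."""
    if requested is None:
        return DEFAULT_MAX_CLUSTERS
    return max(1, min(int(requested), HARD_CAP_CLUSTERS))


def cluster_quality(item_ids: list[str], clusters: list[dict[str, Any]]) -> dict[str, Any]:
    """Derive coverage_ratio and overlap_warnings from a clusters output."""
    # Count how many distinct clusters each item id appears in.
    counts: dict[str, int] = {}
    for cluster in clusters:
        for item_id in set(cluster.get("item_ids", [])):
            counts[item_id] = counts.get(item_id, 0) + 1

    known = set(item_ids)
    # Ignore ids that clusters mention but the input never contained.
    assigned = known & set(counts)
    coverage = len(assigned) / len(known) if known else 0.0
    overlaps = sorted(i for i in assigned if counts[i] > 1)
    return {"coverage_ratio": coverage, "overlap_warnings": overlaps}
```

Items that end up in no cluster are the complement of the assigned set, which is what the unclustered output is meant to list (with a reason per entry).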

capabilities/text.merge.yaml

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+id: text.merge
+version: 1.0.0
+description: >
+  Merge multiple text items into a single consolidated text block.
+  Accepts an array of items with text content and produces a single
+  string with configurable separator. Deterministic, no LLM required.
+
+inputs:
+  items:
+    type: array
+    required: true
+    description: >
+      Items to merge. Each item must include a content field (string).
+      May also include id and title which are used as section headers
+      when include_headers is true.
+  separator:
+    type: string
+    required: false
+    description: >
+      Separator between merged items. Defaults to double newline.
+  include_headers:
+    type: boolean
+    required: false
+    description: >
+      Whether to include item titles as section headers in the merged
+      text. Defaults to true when items have titles.
+
+outputs:
+  text:
+    type: string
+    required: true
+    description: Merged text block from all items.
+  item_count:
+    type: number
+    required: true
+    description: Number of items that contributed content to the merged text.
+
+properties:
+  deterministic: true
+  side_effects: false
+  idempotent: true
+
+metadata:
+  status: experimental
+  tags: [text, merge, preprocessing, utility]
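
Note on the spec above: because text.merge is declared deterministic with no LLM required, its contract can be illustrated in a few lines of Python. This is a sketch under stated assumptions, not the binding from this commit. The function name merge_text is hypothetical. Three behaviours are assumptions the spec does not state: headers are rendered as a plain line above the content, include_headers defaults to true when at least one item has a title, and items with empty content are skipped and not counted.

```python
from __future__ import annotations

from typing import Any, Optional


def merge_text(
    items: list[dict[str, Any]],
    separator: str = "\n\n",                 # "defaults to double newline"
    include_headers: Optional[bool] = None,
) -> dict[str, Any]:
    """Merge item contents into a single text block (sketch of text.merge)."""
    # Assumption: "defaults to true when items have titles" is read as
    # "true if any item carries a non-empty title".
    if include_headers is None:
        include_headers = any(item.get("title") for item in items)

    parts: list[str] = []
    for item in items:
        content = (item.get("content") or "").strip()
        if not content:
            # Assumption: empty items contribute nothing and are not counted.
            continue
        # The spec says id and title are used as headers; title is preferred here.
        header = item.get("title") or item.get("id")
        if include_headers and header:
            parts.append(f"{header}\n{content}")
        else:
            parts.append(content)

    return {"text": separator.join(parts), "item_count": len(parts)}
```

Counting only the parts that were actually emitted keeps item_count aligned with its description ("items that contributed content"), so it can be smaller than the length of the input array.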

capabilities/web.source.normalize.yaml

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+id: web.source.normalize
+version: 1.0.0
+description: >
+  Normalize web search results into corpus item format suitable for
+  downstream research and analysis capabilities. Converts raw search
+  result objects (url, title, snippet) into structured corpus items
+  with source_ref for lazy content resolution. Supports quick mode
+  (snippet as content) and deep mode (source_ref for full page fetch).
+
+inputs:
+  results:
+    type: array
+    required: true
+    description: >
+      Web search results. Each result should contain at minimum a url field.
+      May also include title, snippet, rank, domain, date.
+  mode:
+    type: string
+    required: false
+    description: >
+      Processing mode. "quick" uses snippets as content (default).
+      "deep" leaves content empty and sets source_ref for downstream
+      resolution via research.source.retrieve.
+
+outputs:
+  items:
+    type: array
+    required: true
+    description: >
+      Normalized corpus items. Each item contains: id, title, content
+      (populated in quick mode, empty in deep mode), source_ref
+      (always present, type=url), type (web_page), source (original URL),
+      metadata (original snippet, rank, domain).
+
+properties:
+  deterministic: true
+  side_effects: false
+  idempotent: true
+
+metadata:
+  status: experimental
+  tags: [web, normalization, corpus, preprocessing]
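
Note on the spec above: the quick/deep distinction only changes whether content is filled from the snippet, while source_ref is always present. A minimal Python sketch of that mapping follows; it is not the binding from this commit. The function name normalize_results is hypothetical. The spec leaves several details open, so these are assumptions: the id is derived from a hash of the URL, source_ref stores the URL under a "value" key, results without a url are skipped, and rank and domain fall back to list position and the URL host.

```python
from __future__ import annotations

import hashlib
from typing import Any
from urllib.parse import urlparse


def normalize_results(results: list[dict[str, Any]], mode: str = "quick") -> list[dict[str, Any]]:
    """Reshape raw search results into corpus items (sketch of web.source.normalize)."""
    if mode not in ("quick", "deep"):
        raise ValueError(f"unsupported mode: {mode!r}")

    items: list[dict[str, Any]] = []
    for position, result in enumerate(results, start=1):
        url = result.get("url")
        if not url:
            # Assumption: a result without a url cannot be resolved later, so skip it.
            continue
        snippet = result.get("snippet") or ""
        items.append({
            # Assumption: a URL hash keeps ids stable, matching "deterministic: true".
            "id": "web-" + hashlib.sha1(url.encode("utf-8")).hexdigest()[:12],
            "title": result.get("title") or url,
            # quick: snippet becomes the content; deep: left empty for later retrieval.
            "content": snippet if mode == "quick" else "",
            # type=url is from the spec; the "value" key is an assumption.
            "source_ref": {"type": "url", "value": url},
            "type": "web_page",
            "source": url,
            "metadata": {
                "snippet": snippet,
                "rank": result.get("rank", position),
                "domain": result.get("domain") or urlparse(url).netloc,
            },
        })
    return items
```

In deep mode the items carry no content until a later step resolves source_ref, which per the spec is research.source.retrieve.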

capabilities/web.search.yaml → capabilities/web.source.search.yaml

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
-id: web.search
+id: web.source.search
 version: 1.0.0
-description: Search the web for results matching a query.
+description: Search the web for sources matching a query.
 inputs:
   query:
     type: string

catalog/capabilities.json

Lines changed: 156 additions & 3 deletions
@@ -346,6 +346,67 @@
         "idempotent": true
       }
     },
+    {
+      "id": "analysis.theme.cluster",
+      "version": "1.0.0",
+      "description": "Group a collection of text items into coherent thematic clusters. Identifies dominant themes, assigns each item to one or more clusters, and produces a summary per cluster with signal strength. Accepts optional hint labels to guide (not force) the thematic structure.\n",
+      "file": "capabilities/analysis.theme.cluster.yaml",
+      "inputs": {
+        "items": {
+          "type": "array",
+          "required": true,
+          "description": "Items to cluster. Each item must include id and content (text). May also include title, type, source, metadata.\n"
+        },
+        "hint_labels": {
+          "type": "array",
+          "required": false,
+          "description": "Suggested theme labels to guide clustering. The implementation may merge, split, rename, or add themes beyond hints. Example: [\"market_overview\", \"key_players\", \"trends\", \"risks\", \"opportunities\"].\n"
+        },
+        "max_clusters": {
+          "type": "number",
+          "required": false,
+          "description": "Maximum number of clusters to produce (default 8, hard cap 15)."
+        },
+        "context": {
+          "type": "string",
+          "required": false,
+          "description": "Background context informing how items should be grouped."
+        }
+      },
+      "outputs": {
+        "clusters": {
+          "type": "array",
+          "required": true,
+          "description": "Themed clusters. Each cluster contains: theme (label), description, item_ids (list of assigned item ids), summary (text summary of the cluster content), signal_strength (0-1 estimate of how well-supported the theme is by the corpus).\n"
+        },
+        "unclustered": {
+          "type": "array",
+          "required": false,
+          "description": "Items that did not fit any cluster. Each entry contains id and reason.\n"
+        },
+        "cluster_quality": {
+          "type": "object",
+          "required": false,
+          "description": "Self-assessment of clustering quality: coherence_score (0-1), coverage_ratio (fraction of items assigned), overlap_warnings (list of items assigned to multiple clusters).\n"
+        }
+      },
+      "metadata": {
+        "tags": [
+          "analysis",
+          "clustering",
+          "themes",
+          "structuring"
+        ],
+        "category": null,
+        "status": "experimental",
+        "examples": []
+      },
+      "properties": {
+        "deterministic": false,
+        "side_effects": false,
+        "idempotent": true
+      }
+    },
     {
       "id": "audio.transcribe",
       "version": "1.0.0",
@@ -2874,6 +2935,57 @@
         "idempotent": true
       }
     },
+    {
+      "id": "text.merge",
+      "version": "1.0.0",
+      "description": "Merge multiple text items into a single consolidated text block. Accepts an array of items with text content and produces a single string with configurable separator. Deterministic, no LLM required.\n",
+      "file": "capabilities/text.merge.yaml",
+      "inputs": {
+        "items": {
+          "type": "array",
+          "required": true,
+          "description": "Items to merge. Each item must include a content field (string). May also include id and title which are used as section headers when include_headers is true.\n"
+        },
+        "separator": {
+          "type": "string",
+          "required": false,
+          "description": "Separator between merged items. Defaults to double newline.\n"
+        },
+        "include_headers": {
+          "type": "boolean",
+          "required": false,
+          "description": "Whether to include item titles as section headers in the merged text. Defaults to true when items have titles.\n"
+        }
+      },
+      "outputs": {
+        "text": {
+          "type": "string",
+          "required": true,
+          "description": "Merged text block from all items."
+        },
+        "item_count": {
+          "type": "number",
+          "required": true,
+          "description": "Number of items that contributed content to the merged text."
+        }
+      },
+      "metadata": {
+        "tags": [
+          "text",
+          "merge",
+          "preprocessing",
+          "utility"
+        ],
+        "category": null,
+        "status": "experimental",
+        "examples": []
+      },
+      "properties": {
+        "deterministic": true,
+        "side_effects": false,
+        "idempotent": true
+      }
+    },
     {
       "id": "text.summarize",
       "version": "1.0.0",
@@ -3063,10 +3175,51 @@
       }
     },
     {
-      "id": "web.search",
+      "id": "web.source.normalize",
+      "version": "1.0.0",
+      "description": "Normalize web search results into corpus item format suitable for downstream research and analysis capabilities. Converts raw search result objects (url, title, snippet) into structured corpus items with source_ref for lazy content resolution. Supports quick mode (snippet as content) and deep mode (source_ref for full page fetch).\n",
+      "file": "capabilities/web.source.normalize.yaml",
+      "inputs": {
+        "results": {
+          "type": "array",
+          "required": true,
+          "description": "Web search results. Each result should contain at minimum a url field. May also include title, snippet, rank, domain, date.\n"
+        },
+        "mode": {
+          "type": "string",
+          "required": false,
+          "description": "Processing mode. \"quick\" uses snippets as content (default). \"deep\" leaves content empty and sets source_ref for downstream resolution via research.source.retrieve.\n"
+        }
+      },
+      "outputs": {
+        "items": {
+          "type": "array",
+          "required": true,
+          "description": "Normalized corpus items. Each item contains: id, title, content (populated in quick mode, empty in deep mode), source_ref (always present, type=url), type (web_page), source (original URL), metadata (original snippet, rank, domain).\n"
+        }
+      },
+      "metadata": {
+        "tags": [
+          "web",
+          "normalization",
+          "corpus",
+          "preprocessing"
+        ],
+        "category": null,
+        "status": "experimental",
+        "examples": []
+      },
+      "properties": {
+        "deterministic": true,
+        "side_effects": false,
+        "idempotent": true
+      }
+    },
+    {
+      "id": "web.source.search",
       "version": "1.0.0",
-      "description": "Search the web for results matching a query.",
-      "file": "capabilities/web.search.yaml",
+      "description": "Search the web for sources matching a query.",
+      "file": "capabilities/web.source.search.yaml",
       "inputs": {
         "query": {
           "type": "string",
