Add a `taxonomy` manifest field with a `vector_collapse` option for collapsing content-equivalent search results across variant axes (e.g. the same API operation documented in multiple SDK languages). At search time, results sharing the same content identity are collapsed to the highest-scoring variant. On a realistic 30 MB multi-language corpus, this improved facet precision by 27%, MRR@5 by 10%, and NDCG@5 by 15%.
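For illustration, the collapse step described above might look roughly like the sketch below. This is not the code from this PR: the `SearchResult` shape, the `contentIdentity` and `collapseVariants` helpers, and the sample paths are all hypothetical, and it assumes the variant value (for example the SDK language) appears as a whole segment of the filepath.

```typescript
// Hypothetical result shape; the real search result type is not shown in this diff.
interface SearchResult {
  filepath: string;
  score: number;
  metadata: Record<string, string>;
}

// Assumption: a variant value occupies a whole path segment, so
// "sdk/python/users/create.md" and "sdk/go/users/create.md" both
// normalize to "sdk/*/users/create.md" when collapsing on "language".
function contentIdentity(result: SearchResult, collapseFields: string[]): string {
  const variantValues = new Set(
    collapseFields
      .map((field) => result.metadata[field])
      .filter((value): value is string => value !== undefined)
  );
  return result.filepath
    .split("/")
    .map((segment) => (variantValues.has(segment) ? "*" : segment))
    .join("/");
}

function collapseVariants(
  results: SearchResult[],
  collapseFields: string[],
  activeFilters: Record<string, string> = {}
): SearchResult[] {
  // A field with an active filter is already restricted to one value, so skip it.
  const fields = collapseFields.filter((field) => !(field in activeFilters));
  if (fields.length === 0) return results;

  // Keep only the highest-scoring result for each content identity.
  const best = new Map<string, SearchResult>();
  for (const result of results) {
    const key = contentIdentity(result, fields);
    const current = best.get(key);
    if (current === undefined || result.score > current.score) {
      best.set(key, result);
    }
  }
  return Array.from(best.values()).sort((a, b) => b.score - a.score);
}

// Made-up sample data: two language variants of one operation, plus one other page.
const sample: SearchResult[] = [
  { filepath: "sdk/python/users/create.md", score: 0.91, metadata: { language: "python" } },
  { filepath: "sdk/go/users/create.md", score: 0.88, metadata: { language: "go" } },
  { filepath: "sdk/go/users/delete.md", score: 0.7, metadata: { language: "go" } },
];
const collapsed = collapseVariants(sample, ["language"]);
```

With this sample, the two `users/create.md` variants share one identity, so only the higher-scoring Python variant survives alongside the unrelated `users/delete.md` result. Passing an active `language` filter returns the input unchanged, mirroring the "has no effect when a filter for this field is active" note in the schema description.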
packages/core/src/manifest-schema.ts
19 additions & 0 deletions
@@ -51,6 +51,23 @@ export const ManifestOverrideSchema = z
     "Overrides the default chunking strategy and/or metadata for files matching a glob pattern. Within the overrides array, later matches take precedence."
   );
 
+export const TaxonomyFieldConfigSchema = z
+  .object({
+    vector_collapse: z
+      .boolean()
+      .default(false)
+      .describe(
+        "When true, this taxonomy dimension identifies content variants that are near-identical in vector space (e.g. the same API operation documented in multiple SDK languages). At search time, results sharing the same content identity — determined by normalizing this field's value out of the filepath — are collapsed to the highest-scoring result. Has no effect when a filter for this field is active, since the filter already restricts to a single value."
+      ),
+  })
+  .describe("Configuration for a taxonomy field's search-time behavior.");
+
+export const ManifestTaxonomyConfigSchema = z
+  .record(z.string(), TaxonomyFieldConfigSchema)
+  .describe(
+    "Per-field configuration for taxonomy dimensions. Controls search-time behavior such as cross-language result collapsing."
+  );
+
 export const ManifestSchema = z
   .object({
     version: z
@@ -67,6 +84,8 @@ export const ManifestSchema = z
         "Key-value pairs attached to every chunk produced from this directory tree. Each key becomes a filterable taxonomy dimension exposed as an enum parameter on the search tool."