Commit 03db13f

Merge pull request #73 from gmickel/feat/fn-40-structured-query-documents
feat: add structured query documents
2 parents d8b4e97 + 88ee900 commit 03db13f

24 files changed: +960 -45 lines changed

.flow/tasks/fn-40-structured-query-document-syntax.1.json

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -8,7 +8,7 @@
88
"id": "fn-40-structured-query-document-syntax.1",
99
"priority": 1,
1010
"spec_path": ".flow/tasks/fn-40-structured-query-document-syntax.1.md",
11-
"status": "todo",
11+
"status": "done",
1212
"title": "Design and implement first-class structured query documents",
13-
"updated_at": "2026-03-10T15:34:15.453535Z"
13+
"updated_at": "2026-03-10T18:10:00.000000Z"
1414
}

.flow/tasks/fn-40-structured-query-document-syntax.1.md

Lines changed: 9 additions & 2 deletions
@@ -20,10 +20,17 @@ Design and implement first-class multi-line structured query documents using exi
2020

2121
## Done summary
2222

23-
TBD
23+
Implemented first-class structured multi-line query documents using GNO naming only: `term`, `intent`, and `hyde`. Added a shared parser/normalizer, rolled it through CLI `query`/`ask`, REST `query`/`ask`, MCP `gno_query`, SDK `query`/`ask`, and Web Search/Ask text boxes, then added parser/CLI/API/SDK coverage plus full docs/website updates including a dedicated syntax reference page.
24+
25+
Key decisions:
26+
27+
- structured syntax only activates for multi-line query input, so single-line queries remain unchanged
28+
- plain untyped lines become the base query; if absent, GNO derives the base query from `term:` lines first, then `intent:` lines
29+
- `hyde:` is never searched directly and hyde-only documents are rejected
30+
- explicit `queryModes` and document-derived modes merge, with shared validation across the combined set
2431

2532
## Evidence
2633

2734
- Commits:
28-
- Tests:
35+
- Tests: bun run lint:check, bun test, bun run docs:verify, cd website && mise x -- make build, bun test test/core/structured-query.test.ts test/serve/routes/query.test.ts test/sdk/client.test.ts test/cli/structured-query-document.test.ts --timeout 60000
2936
- PRs:

CHANGELOG.md

Lines changed: 7 additions & 0 deletions
@@ -9,6 +9,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
99

1010
### Added
1111

12+
## [0.24.0] - 2026-03-10
13+
14+
### Added
15+
16+
- First-class structured multi-line query documents using `term:`, `intent:`, and `hyde:` across CLI `query`/`ask`, REST `/api/query` and `/api/ask`, MCP `gno_query`, SDK `query`/`ask`, and Web Search/Ask text boxes.
17+
- Dedicated structured syntax reference doc plus updated CLI/API/MCP/SDK/Web docs.
18+
1219
## [0.23.0] - 2026-03-10
1320

1421
### Added

README.md

Lines changed: 11 additions & 1 deletion
@@ -34,7 +34,13 @@ GNO is a local knowledge engine that turns your documents into a searchable, con
3434

3535
---
3636

37-
## What's New in v0.23
37+
## What's New in v0.24
38+
39+
- **Structured Query Documents**: first-class multi-line query syntax using `term:`, `intent:`, and `hyde:`
40+
- **Cross-Surface Rollout**: works across CLI, API, MCP, SDK, and Web Search/Ask
41+
- **Portable Retrieval Prompts**: save/share advanced retrieval intent as one text payload instead of repeated flags or JSON arrays
42+
43+
### v0.23
3844

3945
- **SDK / Library Mode**: package-root importable SDK with `createGnoClient(...)` for direct retrieval, document access, and indexing flows
4046
- **Inline Config Support**: embed GNO in another app without writing YAML config files
@@ -288,11 +294,15 @@ gno query "auth flow" \
288294
--query-mode intent:"how refresh token rotation works" \
289295
--query-mode hyde:"Refresh tokens rotate on each use and previous tokens are revoked." \
290296
--explain
297+
298+
# Multi-line structured query document
299+
gno query $'auth flow\nterm: "refresh token" -oauth1\nintent: how refresh token rotation works\nhyde: Refresh tokens rotate on each use and previous tokens are revoked.' --fast
291300
```
292301

293302
- Modes: `term` (BM25-focused), `intent` (semantic-focused), `hyde` (single hypothetical passage)
294303
- Explain includes stage timings, fallback/cache counters, and per-result score components
295304
- `gno ask --json` includes `meta.answerContext` for adaptive source selection traces
305+
- Search and Ask web text boxes also accept multi-line structured query documents; press `Shift+Enter` to insert a new line
296306

297307
---
298308

docs/API.md

Lines changed: 6 additions & 0 deletions
@@ -1197,6 +1197,7 @@ Combined BM25 + vector search with optional reranking. **Recommended for best re
11971197
- `intent` is orthogonal to `queryModes`: intent steers scoring/prompting, while query modes inject caller-provided retrieval expansions.
11981198
- `queryModes` is optional and only needed for explicit retrieval intent control.
11991199
- If `queryModes` is provided, generated expansion is skipped and provided entries are used directly.
1200+
- `query` can also be a multi-line structured query document using `term:`, `intent:`, and `hyde:` lines. See [Structured Query Syntax](./SYNTAX.md).
12001201

12011202
**Response**:
12021203

@@ -1234,6 +1235,10 @@ Combined BM25 + vector search with optional reranking. **Recommended for best re
12341235
curl -X POST http://localhost:3000/api/query \
12351236
-H "Content-Type: application/json" \
12361237
-d '{"query": "error handling best practices", "limit": 10}'
1238+
1239+
curl -X POST http://localhost:3000/api/query \
1240+
-H "Content-Type: application/json" \
1241+
-d '{"query": "auth flow\nterm: \"refresh token\"\nintent: token rotation"}'
12371242
```
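When the request body is built in code rather than typed by hand, the line breaks of a structured query document have to reach the server as JSON `\n` escapes inside the `query` string. A minimal sketch of that encoding follows; `buildQueryBody` is a hypothetical helper and not part of GNO, while the `query` and `limit` fields are the ones used in the curl examples above.

```typescript
// Hypothetical helper (not part of GNO): joins the lines of a structured
// query document and serializes them as the JSON body for POST /api/query.
function buildQueryBody(lines: string[], limit?: number): string {
  const payload: { query: string; limit?: number } = {
    // Real newline characters separate the document lines.
    query: lines.join("\n"),
  };
  if (limit !== undefined) {
    payload.limit = limit;
  }
  // JSON.stringify escapes each newline as the two characters "\" and "n".
  return JSON.stringify(payload);
}

const body = buildQueryBody(
  ["auth flow", 'term: "refresh token"', "intent: token rotation"],
  10
);
```

The resulting string can be sent as the request body with a `Content-Type: application/json` header, exactly like the `-d` argument in the curl call.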
12381243

12391244
---
@@ -1298,6 +1303,7 @@ Get an AI-generated answer with citations from your documents.
12981303
- Existing `/api/ask` payloads remain valid.
12991304
- `queryModes` is optional and only needed for explicit retrieval steering during Q&A.
13001305
- If `queryModes` is provided, generated expansion is skipped and provided entries are used directly.
1306+
- `query` can also be a multi-line structured query document using `term:`, `intent:`, and `hyde:` lines. See [Structured Query Syntax](./SYNTAX.md).
13011307

13021308
**Response**:
13031309

docs/CLI.md

Lines changed: 7 additions & 0 deletions
@@ -117,6 +117,7 @@ gno query "auth" --thorough # Full pipeline: ~5-8s
117117
gno query "auth" --tags-all work,backend # Filter by tags
118118
gno query "performance" --intent "web performance and latency"
119119
gno query "auth flow" --query-mode term:"jwt refresh token" --query-mode intent:"how refresh token rotation works"
120+
gno query $'auth flow\nterm: "refresh token"\nintent: token rotation'
120121
```
121122

122123
**Search modes**:
@@ -157,6 +158,7 @@ Additional options:
157158
- `--intent` is orthogonal to `--query-mode`: intent steers scoring/prompting, while query modes inject caller-provided retrieval expansions.
158159
- `--query-mode` is opt-in for explicit intent control and replaces generated expansion for that query.
159160
- Use `term` for exact lexical constraints, `intent` for semantic reformulations, and `hyde` for one hypothetical answer passage.
161+
- Multi-line structured query documents are also supported. See [Structured Query Syntax](./SYNTAX.md).
160162

161163
```bash
162164
# Existing call (still valid)
@@ -167,6 +169,9 @@ gno query "auth flow" \
167169
--query-mode term:"jwt refresh token -oauth1" \
168170
--query-mode intent:"how refresh token rotation works" \
169171
--query-mode hyde:"Refresh tokens rotate on each use and previous tokens are revoked."
172+
173+
# Multi-line structured query document
174+
gno query $'auth flow\nterm: "refresh token" -oauth1\nintent: how refresh token rotation works\nhyde: Refresh tokens rotate on each use and previous tokens are revoked.'
170175
```
171176

172177
The `--explain` flag outputs:
@@ -194,6 +199,7 @@ gno ask "quick lookup" --fast # Fastest retrieval
194199
gno ask "complex topic" --thorough # Best recall
195200
gno ask "performance" --intent "web latency and vitals"
196201
gno ask "performance" --query-mode term:"web performance budgets" --query-mode intent:"latency and vitals" --no-answer
202+
gno ask $'term: web performance budgets\nintent: latency and vitals' --no-answer
197203
```
198204

199205
**Full-document context**: When `--answer` is used, GNO passes complete document content to the generation model, not truncated snippets. This ensures the LLM sees tables, code examples, and full context needed for accurate answers.
@@ -209,6 +215,7 @@ Options:
209215
- `--intent <text>` - Disambiguating context for ambiguous questions without searching on that text
210216
- `--exclude <values>` - Hard-prune docs containing any comma-separated term in title/path/body
211217
- `--query-mode <mode:text>` - Structured expansion hints; repeat for multiple entries. Modes: `term`, `intent`, `hyde`
218+
- Multi-line structured query documents are also supported. See [Structured Query Syntax](./SYNTAX.md).
212219
- `-C, --candidate-limit <n>` - Max candidates passed to reranking (default: 20)
213220
- `--answer` - Generate grounded AI answer (requires gen model)
214221
- `--no-answer` - Force retrieval-only output

docs/MCP.md

Lines changed: 7 additions & 0 deletions
@@ -578,6 +578,7 @@ Optional steering controls:
578578
- `intent` is complementary to `queryModes`: use intent for background context, `queryModes` for caller-supplied lexical/semantic expansions.
579579
- `queryModes` is optional and only needed when your client wants explicit retrieval intent control.
580580
- If `queryModes` is set, generated expansion is skipped for that query and the provided entries are used directly.
581+
- The `query` string itself may also be a multi-line structured query document using `term:`, `intent:`, and `hyde:` lines. See [Structured Query Syntax](./SYNTAX.md).
581582

582583
```yaml
583584
# Existing payload (still valid)
@@ -590,6 +591,12 @@ queryModes:
590591
- { mode: "term", text: "\"refresh token\" -oauth1" }
591592
- { mode: "intent", text: "how token rotation is implemented" }
592593
- { mode: "hyde", text: "Refresh tokens rotate on each use and old tokens are revoked." }
594+
595+
# Or put the structure directly into the query field:
596+
query: |
597+
auth flow
598+
term: "refresh token" -oauth1
599+
intent: how token rotation is implemented
593600
```
594601
595602
### gno_get

docs/SDK.md

Lines changed: 19 additions & 0 deletions
@@ -108,6 +108,14 @@ const results = await client.query("performance", {
108108
noExpand: true,
109109
noRerank: true,
110110
});
111+
112+
const structured = await client.query(
113+
'auth flow\nterm: "refresh token"\nintent: token rotation',
114+
{
115+
noExpand: true,
116+
noRerank: true,
117+
}
118+
);
111119
```
112120

113121
### Ask
@@ -124,6 +132,15 @@ const retrievalOnly = await client.ask("JWT token", {
124132
const answered = await client.ask("What is our auth flow?", {
125133
answer: true,
126134
});
135+
136+
const retrievalOnlyStructured = await client.ask(
137+
"term: web performance budgets\nintent: latency and vitals",
138+
{
139+
noAnswer: true,
140+
noExpand: true,
141+
noRerank: true,
142+
}
143+
);
127144
```
128145

129146
### Vector Search
@@ -211,12 +228,14 @@ The package root is the SDK entrypoint. The CLI remains available through the `g
211228
- `query` and `ask` degrade gracefully if vector/rerank/generation models are unavailable, except when answer generation is explicitly requested.
212229
- `vsearch` requires embeddings plus vector search support.
213230
- Inline config is supported; writing YAML is optional.
231+
- `query` and `ask` accept multi-line structured query documents. See [Structured Query Syntax](./SYNTAX.md).
214232

215233
---
216234

217235
## Related Docs
218236

219237
- [CLI](./CLI.md)
220238
- [REST API](./API.md)
239+
- [Structured Query Syntax](./SYNTAX.md)
221240
- [Architecture](./ARCHITECTURE.md)
222241
- [Configuration](./CONFIGURATION.md)

docs/SYNTAX.md

Lines changed: 107 additions & 0 deletions
@@ -0,0 +1,107 @@
1+
# Structured Query Syntax
2+
3+
GNO supports a first-class multi-line query document syntax for `query` and `ask` flows.
4+
5+
Use existing GNO naming only:
6+
7+
```text
8+
auth flow
9+
term: "refresh token" -oauth1
10+
intent: how token rotation works
11+
hyde: Refresh tokens rotate on each use and previous tokens are invalidated.
12+
```
13+
14+
## Rules
15+
16+
- Structured syntax is only activated for multi-line input.
17+
- Blank lines are ignored.
18+
- Recognized typed lines are:
19+
- `term:`
20+
- `intent:`
21+
- `hyde:`
22+
- At most one `hyde:` line is allowed.
23+
- Unknown typed prefixes like `vector:` are rejected.
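The rules above can be sketched as a small line classifier. This is an illustration under stated assumptions, not GNO's actual parser: `parseQueryDocument`, `ParsedDocument`, and `TypedLine` are hypothetical names, and the sketch assumes a typed prefix is a single word immediately followed by a colon.

```typescript
// Hypothetical types and names; a sketch of the rules, not GNO's implementation.
type QueryMode = "term" | "intent" | "hyde";

interface TypedLine {
  mode: QueryMode;
  text: string;
}

interface ParsedDocument {
  plainLines: string[];
  typedLines: TypedLine[];
}

function isQueryMode(value: string): value is QueryMode {
  return value === "term" || value === "intent" || value === "hyde";
}

// Returns null for single-line input, which stays an ordinary query.
function parseQueryDocument(input: string): ParsedDocument | null {
  if (!input.includes("\n")) return null;

  const plainLines: string[] = [];
  const typedLines: TypedLine[] = [];

  for (const rawLine of input.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue; // blank lines are ignored

    // Assumption: "word:" at the start of a line marks a typed line.
    const match = /^([A-Za-z][\w-]*):\s*(.*)$/.exec(line);
    if (match === null) {
      plainLines.push(line);
      continue;
    }

    const prefix = match[1] ?? "";
    const text = (match[2] ?? "").trim();
    if (!isQueryMode(prefix)) {
      throw new Error(`Unknown typed prefix "${prefix}:"`);
    }
    if (prefix === "hyde" && typedLines.some((l) => l.mode === "hyde")) {
      throw new Error("At most one hyde: line is allowed");
    }
    typedLines.push({ mode: prefix, text });
  }

  return { plainLines, typedLines };
}
```

Under this prefix assumption a plain line such as a URL would be read as a typed line and rejected; how the real parser treats such input is not stated here.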
24+
25+
## Base Query
26+
27+
The query document still needs a base search query.
28+
29+
GNO resolves it in this order:
30+
31+
1. plain untyped lines joined together
32+
2. otherwise all `term:` lines joined together
33+
3. otherwise all `intent:` lines joined together
34+
35+
`hyde:` is never searched directly.
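The resolution order above can be expressed as a short fallback chain. The sketch below is illustrative only: `resolveBaseQuery` and `TypedLine` are hypothetical names, and joining lines with a single space is an assumption, since the separator is not specified here.

```typescript
// Hypothetical names; a sketch of the documented fallback order.
type QueryMode = "term" | "intent" | "hyde";

interface TypedLine {
  mode: QueryMode;
  text: string;
}

function resolveBaseQuery(plainLines: string[], typedLines: TypedLine[]): string {
  const textsFor = (mode: QueryMode): string[] =>
    typedLines.filter((line) => line.mode === mode).map((line) => line.text);

  // Order matters: plain lines first, then term: lines, then intent: lines.
  // hyde: lines are deliberately absent because hyde is never searched directly.
  const candidates: string[][] = [plainLines, textsFor("term"), textsFor("intent")];

  for (const lines of candidates) {
    if (lines.length > 0) {
      return lines.join(" "); // assumption: joined with a single space
    }
  }
  throw new Error("Query document has no base query (hyde-only documents are rejected)");
}
```

A document with plain lines never falls through to the typed lines, and a hyde-only document reaches the final `throw`.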
36+
37+
That means these are both valid:
38+
39+
```text
40+
auth flow
41+
term: "refresh token"
42+
intent: token rotation
43+
```
44+
45+
```text
46+
term: "refresh token"
47+
intent: token rotation
48+
```
49+
50+
This is invalid:
51+
52+
```text
53+
hyde: hypothetical answer only
54+
```
55+
56+
## Compatibility
57+
58+
Structured query documents are additive:
59+
60+
- existing plain single-line queries still work
61+
- existing `--query-mode` CLI flags still work
62+
- existing API/MCP `queryModes` arrays still work
63+
64+
If both are supplied, GNO merges:
65+
66+
- query-document typed lines
67+
- explicit `queryModes`
68+
69+
Validation still applies across the combined set, including the single-`hyde` rule.
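As an illustration of the merge behaviour described above, the sketch below combines both sources and validates the result as one set. `mergeQueryModes` is a hypothetical name, the `{ mode, text }` entry shape follows the `queryModes` examples in the MCP docs, and placing document-derived entries before explicit ones is an assumption.

```typescript
// Hypothetical helper; a sketch of merge-then-validate, not GNO's implementation.
type QueryMode = "term" | "intent" | "hyde";

interface QueryModeEntry {
  mode: QueryMode;
  text: string;
}

function mergeQueryModes(
  fromDocument: QueryModeEntry[],
  explicit: QueryModeEntry[]
): QueryModeEntry[] {
  // Assumption: document-derived entries come first, explicit entries after.
  const merged = [...fromDocument, ...explicit];

  // The single-hyde rule is checked on the combined set, so one hyde in the
  // document plus one in explicit queryModes is still an error.
  const hydeCount = merged.filter((entry) => entry.mode === "hyde").length;
  if (hydeCount > 1) {
    throw new Error("Only one hyde entry is allowed across the query document and queryModes");
  }
  return merged;
}
```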
70+
71+
## Supported Surfaces
72+
73+
Current rollout:
74+
75+
- CLI: `gno query`, `gno ask`
76+
- REST API: `/api/query`, `/api/ask`
77+
- MCP: `gno_query`
78+
- Web UI: Search and Ask text boxes
79+
- SDK: `client.query(...)`, `client.ask(...)`
80+
81+
## Examples
82+
83+
### CLI
84+
85+
```bash
86+
gno query $'auth flow\nterm: "refresh token"\nintent: token rotation'
87+
```
88+
89+
```bash
90+
gno ask $'term: web performance budgets\nintent: latency and vitals' --no-answer
91+
```
92+
93+
### REST API
94+
95+
```json
96+
{
97+
"query": "auth flow\nterm: \"refresh token\"\nintent: token rotation"
98+
}
99+
```
100+
101+
### SDK
102+
103+
```ts
104+
const result = await client.query(
105+
'auth flow\nterm: "refresh token"\nintent: token rotation'
106+
);
107+
```

docs/WEB-UI.md

Lines changed: 9 additions & 0 deletions
@@ -81,6 +81,15 @@ Choose retrieval mode:
8181

8282
Click **Ask** for AI-powered answers. Use **Advanced Retrieval** to scope by collection/date/category/author/tags and add optional `intent` / candidate-limit / exclude / query-mode controls for ambiguous questions.
8383

84+
Both **Search** and **Ask** accept multi-line structured query documents. Press `Shift+Enter` to add a new line, then use:
85+
86+
```text
87+
auth flow
88+
term: "refresh token"
89+
intent: token rotation
90+
hyde: Refresh tokens rotate on each use.
91+
```
92+
8493
> **Note**: Models auto-download on first use. Cold start can take longer on first launch while local models download. For instant startup, set `GNO_NO_AUTO_DOWNLOAD=1` and download explicitly with `gno models pull`.
8594
8695
---
