
Commit 22acfc0

v0.7.3
# v0.7.3 — OAuth Stability, Background Tasks & Title Generation

---

## Features

- **Background task UI** — tool cards now show a spinner while background tasks (Agent, Task) are running, giving clear visual feedback on active work (20352706)
- **Improved title generation** — session titles are now generated using spread message selection and language awareness, producing more accurate and natural titles across languages (835ea942, 55b7830c, 46975ac7)
- **Exclude filter badges** — Alt/Option-click on filter items now adds them as exclusion filters, with a live badge preview when Alt is held (fce25e23, 8a2ebbee)
- **Automation lifecycle tests** — `--validate-server` now includes automation lifecycle tests, catching regressions in the automation pipeline (19fe11e9)

## Improvements

- **MCP schema conversion** — `oneOf`, `anyOf`, `allOf`, and nested objects are now correctly handled when converting MCP tool schemas, reducing "unexpected parameter" errors from providers. Partially addresses [#308](#308) (64ae9d69)
- **Minimax preset split** — Minimax provider is now split into separate Global and CN (China) variants with correct regional endpoints. Fixes [#386](#386) (d626f732, 5ba42ca3)
- **@file mention resolution** — file mentions in chat input are now wrapped in semantic markers so the agent can properly read and resolve them. Fixes [#293](#293) (15d20c1d)
- **Auto-create labels** — labels referenced by automations are now auto-created if they don't exist, preventing silent failures (76306c0b)
- **Multi-OS CI validation** — `validate-server` workflow now runs on a macOS + Windows + Linux matrix with fail-fast enabled (32f4c91e)
- **Title generation language awareness** — titles better reflect the language of the conversation. Partially addresses [#286](#286) (835ea942, 1892b4c9)

## Bug Fixes

- **Spurious OAuth re-authentication** — fixed a race condition where sources would trigger unnecessary re-auth flows after a successful token refresh, causing connection interruptions and duplicate auth prompts (b98a2b2d, e101301b, a3264b02, 76bb5388, d8632e30)
- **MCP source disconnect on token refresh** — MCP sources now properly reconnect when an OAuth token is refreshed, instead of staying in a disconnected state (b1b515b2)
- **Re-auth menu interaction lock** — fixed a race condition where the re-authentication menu could get stuck in a locked state, blocking further user interaction (34dfc91f)
- **MCP transport race condition** — fixed a race on back-to-back SDK queries that could corrupt the MCP transport layer (f7d3f902)
- **Agent/Task activities stuck running** — background task tool activities (Agent, Task) no longer get stuck in "running" state when the underlying task completes or errors (6d49ae82)
- **Background task memory leak** — extracted tool helpers and fixed a memory leak in background task notification handling (6979d063)
- **Preamble stripping regression** — iterative preamble stripping now handles edge cases in language sanitization and filters out low-signal content more reliably (1892b4c9)
- **Duplicate import** — removed a duplicate `Message` import in `turn-utils` left over from merge of #238 (0aa7045f)

---
1 parent 2525f37 commit 22acfc0

65 files changed: +2290 -824 lines changed


.github/workflows/validate-server.yml

Lines changed: 6 additions & 1 deletion
@@ -5,7 +5,12 @@ on:

 jobs:
   validate-server:
-    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: true
+      matrix:
+        os: [ubuntu-latest, macos-latest, windows-latest]
+    runs-on: ${{ matrix.os }}
+    name: validate-server (${{ matrix.os }})
     timeout-minutes: 15

     steps:

apps/cli/package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "@craft-agent/cli",
-  "version": "0.7.2",
+  "version": "0.7.3",
   "description": "Terminal client for Craft Agent server",
   "type": "module",
   "main": "src/index.ts",

apps/cli/src/commands.test.ts

Lines changed: 1 addition & 1 deletion
@@ -236,7 +236,7 @@ import { getValidateSteps } from './index.ts'
 describe('getValidateSteps', () => {
   it('returns 21 steps', () => {
     const steps = getValidateSteps()
-    expect(steps.length).toBe(21)
+    expect(steps.length).toBe(27)
   })

   it('first step is handshake', () => {

apps/cli/src/index.ts

Lines changed: 238 additions & 2 deletions
@@ -30,6 +30,7 @@ export interface CliArgs {
   outputFormat: string
   noCleanup: boolean
   noSpinner: boolean
+  verbose: boolean
   serverEntry?: string
   workspaceDir?: string
   // LLM configuration
@@ -55,6 +56,7 @@ export function parseArgs(argv: string[]): CliArgs {
   let outputFormat = 'text'
   let noCleanup = false
   let noSpinner = false
+  let verbose = false
   let serverEntry: string | undefined
   let workspaceDir: string | undefined
   let provider = ''
@@ -102,6 +104,10 @@ export function parseArgs(argv: string[]): CliArgs {
       case '--no-spinner':
         noSpinner = true
         break
+      case '--verbose':
+      case '-v':
+        verbose = true
+        break
       case '--server-entry':
         serverEntry = args[++i]
         break
@@ -148,7 +154,7 @@ export function parseArgs(argv: string[]): CliArgs {
   if (!apiKey) apiKey = process.env.LLM_API_KEY ?? ''
   if (!baseUrl) baseUrl = process.env.LLM_BASE_URL ?? ''

-  return { url, token, workspace, timeout, json, tlsCa, sendTimeout, command, rest, sources, mode, outputFormat, noCleanup, noSpinner, serverEntry, workspaceDir, provider, model, apiKey, baseUrl }
+  return { url, token, workspace, timeout, json, tlsCa, sendTimeout, command, rest, sources, mode, outputFormat, noCleanup, noSpinner, verbose, serverEntry, workspaceDir, provider, model, apiKey, baseUrl }
 }

 // ---------------------------------------------------------------------------
@@ -672,7 +678,7 @@ async function cmdValidate(args: CliArgs): Promise<void> {
       connectTimeout: args.timeout,
     })
   } else {
-    server = await spawnLocalServer(args, { quiet: true })
+    server = await spawnLocalServer(args, { quiet: !args.verbose })
     client = server.client
   }

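The `--verbose` / `-v` handling in the hunks above relies on two small patterns: `switch` fallthrough so both spellings set one flag, and inverting that flag into the `quiet` option passed to the server spawner. A minimal sketch of both, assuming a much simplified parser (`parseFlags` and `spawnOptions` are hypothetical names for illustration, not functions from this commit):

```typescript
// Hypothetical, simplified stand-in for the CLI's parseArgs; only two flags are modelled.
interface Flags {
  noSpinner: boolean
  verbose: boolean
}

function parseFlags(argv: string[]): Flags {
  let noSpinner = false
  let verbose = false
  for (const arg of argv) {
    switch (arg) {
      case '--no-spinner':
        noSpinner = true
        break
      // Two case labels share one body, so both spellings set the same flag.
      case '--verbose':
      case '-v':
        verbose = true
        break
    }
  }
  return { noSpinner, verbose }
}

// Server stderr stays suppressed unless the user asked for verbose output.
function spawnOptions(flags: Flags): { quiet: boolean } {
  return { quiet: !flags.verbose }
}

const quietByDefault = spawnOptions(parseFlags([]))
const loudWithShortFlag = spawnOptions(parseFlags(['-v']))
```

Keeping the default at `quiet: true` means existing invocations behave as before; only an explicit flag changes what reaches the terminal.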
@@ -759,9 +765,49 @@ export interface ValidateContext {
   createdSessionId?: string
   createdSourceSlug?: string
   createdSkillSlug?: string
+  createdAutomation?: boolean
+  automationTestSessionId?: string
+  automationName?: string
+  createdLabelId?: string
+  /** Backup of existing automations.json before overwrite (undefined = didn't exist) */
+  automationsJsonBackup?: string | null
+  /** Backup of existing automations-history.jsonl before overwrite (undefined = didn't exist) */
+  automationsHistoryBackup?: string | null
   onEvent?: (ev: { type: string; [key: string]: unknown }) => void
 }

+/** Minimal shapes for RPC responses used in validation steps. */
+interface ValidateStatus {
+  id?: string
+  label?: string
+}
+
+interface ValidateSession {
+  id: string
+  name?: string
+  labels?: string[]
+}
+
+interface ValidateLabel {
+  id?: string
+  name?: string
+}
+
+interface ValidateMessageBlock {
+  type: string
+  text?: string
+}
+
+interface ValidateMessage {
+  role: string
+  content: string | ValidateMessageBlock[]
+}
+
+interface ValidateMessagesResponse {
+  messages?: ValidateMessage[]
+  conversation?: ValidateMessage[]
+}
+
 /**
  * Send a message and wait for streaming events.
  * Returns a summary of received event types.
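`ValidateMessage.content` is declared as either a plain string or a list of blocks, so any caller has to normalise it before inspecting the text. The commit inlines that logic in the `automation:verify session` step; the sketch below pulls it into helpers purely for illustration (`messageText` and `lastAssistantText` are hypothetical names, and the `tool_use` block type in the sample data is an assumption):

```typescript
interface ValidateMessageBlock {
  type: string
  text?: string
}

interface ValidateMessage {
  role: string
  content: string | ValidateMessageBlock[]
}

// Hypothetical helper: flatten either content shape into one string.
function messageText(message: ValidateMessage | undefined): string {
  if (!message) return ''
  if (typeof message.content === 'string') return message.content
  return message.content
    .filter((block) => block.type === 'text')
    .map((block) => block.text ?? '')
    .join(' ')
}

// Find the most recent assistant message without mutating the input array.
function lastAssistantText(messages: ValidateMessage[]): string {
  const last = [...messages].reverse().find((m) => m.role === 'assistant')
  return messageText(last)
}

const sample: ValidateMessage[] = [
  { role: 'user', content: 'Reply with exactly: AUTOMATION_TRIGGERED' },
  {
    role: 'assistant',
    content: [
      { type: 'text', text: 'AUTOMATION' },
      { type: 'tool_use' }, // non-text blocks are skipped
      { type: 'text', text: 'TRIGGERED' },
    ],
  },
]
const sampleText = lastAssistantText(sample)
```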
@@ -821,6 +867,59 @@ async function waitForSendEvents(
   }
 }

+/**
+ * Clean up automation test artifacts (config files, session, label).
+ * Shared between the automation:cleanup test step and runValidation error recovery.
+ */
+async function cleanupAutomationArtifacts(
+  client: CliRpcClient,
+  ctx: ValidateContext,
+): Promise<string[]> {
+  const cleaned: string[] = []
+
+  // Restore or remove automation config files
+  if (ctx.workspaceRootPath && ctx.createdAutomation) {
+    try {
+      const { writeFile, unlink } = await import('fs/promises')
+      const configPath = `${ctx.workspaceRootPath}/automations.json`
+      const historyPath = `${ctx.workspaceRootPath}/automations-history.jsonl`
+      if (ctx.automationsJsonBackup != null) {
+        await writeFile(configPath, ctx.automationsJsonBackup).catch(() => {})
+        cleaned.push('automations.json (restored)')
+      } else {
+        await unlink(configPath).catch(() => {})
+        cleaned.push('automations.json (removed)')
+      }
+      if (ctx.automationsHistoryBackup != null) {
+        await writeFile(historyPath, ctx.automationsHistoryBackup).catch(() => {})
+      } else {
+        await unlink(historyPath).catch(() => {})
+      }
+      ctx.createdAutomation = false
+    } catch { /* best effort */ }
+  }
+
+  // Delete automation-triggered session
+  if (ctx.automationTestSessionId && client.isConnected) {
+    try {
+      await client.invoke('sessions:delete', ctx.automationTestSessionId)
+      cleaned.push(`session ${ctx.automationTestSessionId}`)
+      ctx.automationTestSessionId = undefined
+    } catch { /* best effort */ }
+  }
+
+  // Delete test label
+  if (ctx.workspaceId && ctx.createdLabelId && client.isConnected) {
+    try {
+      await client.invoke('labels:delete', ctx.workspaceId, ctx.createdLabelId)
+      cleaned.push(`label ${ctx.createdLabelId}`)
+      ctx.createdLabelId = undefined
+    } catch { /* best effort */ }
+  }
+
+  return cleaned
+}
+
 export function getValidateSteps(): ValidateStep[] {
   return [
     {
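The cleanup helper above decides between restoring and removing each file with a `!= null` check on the backup. That check treats both `null` and `undefined` as "nothing to restore", while an empty string still counts as a backup and is written back. A minimal sketch of that decision, using an in-memory `Map` as a stand-in for the workspace directory so it needs no I/O (`FakeFs`, `readBackup` and `restoreOrRemove` are hypothetical names, not part of the commit):

```typescript
// In-memory stand-in for the workspace directory: path -> file content.
type FakeFs = Map<string, string>

// Capture the current content, or null when the file does not exist.
function readBackup(fs: FakeFs, path: string): string | null {
  return fs.has(path) ? (fs.get(path) as string) : null
}

// Put the original content back, or delete the file the test created.
function restoreOrRemove(fs: FakeFs, path: string, backup: string | null | undefined): string {
  if (backup != null) {
    fs.set(path, backup)
    return `${path} (restored)`
  }
  fs.delete(path)
  return `${path} (removed)`
}

const files: FakeFs = new Map([['automations.json', '']])
const configBackup = readBackup(files, 'automations.json') // '' (file existed but was empty)
const historyBackup = readBackup(files, 'automations-history.jsonl') // null (file did not exist)

// Simulate the test overwriting both files.
files.set('automations.json', '{"version":2}')
files.set('automations-history.jsonl', 'test run')

const results = [
  restoreOrRemove(files, 'automations.json', configBackup),
  restoreOrRemove(files, 'automations-history.jsonl', historyBackup),
]
```

A truthiness check (`if (backup)`) would have deleted the empty-but-existing file instead of restoring it, which is why the looser-looking `!= null` is the safer comparison here.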
@@ -1034,6 +1133,139 @@ SKILLEOF`, 90_000, true, undefined, ctx.onEvent)
         return `deleted skill: ${ctx.createdSkillSlug}`
       },
     },
+    // ----- Automation lifecycle -----
+    {
+      name: 'automation:create',
+      fn: async (client, ctx) => {
+        if (!ctx.createdSessionId || !ctx.workspaceRootPath) return 'skipped (no session or workspace)'
+        ctx.automationName = `CLI Validate Automation ${Date.now()}`
+        const configPath = `${ctx.workspaceRootPath}/automations.json`
+        const historyPath = `${ctx.workspaceRootPath}/automations-history.jsonl`
+        // Backup existing files before overwriting (protects real workspace data)
+        const { readFile } = await import('fs/promises')
+        ctx.automationsJsonBackup = await readFile(configPath, 'utf-8').catch(() => null)
+        ctx.automationsHistoryBackup = await readFile(historyPath, 'utf-8').catch(() => null)
+        const config = JSON.stringify({
+          version: 2,
+          automations: {
+            SessionStatusChange: [{
+              name: ctx.automationName,
+              matcher: 'in-progress',
+              labels: ['cli-validate-label'],
+              actions: [{ type: 'prompt', prompt: 'Reply with exactly: AUTOMATION_TRIGGERED' }],
+            }],
+          },
+        }, null, 2)
+        return await waitForSendEvents(client, ctx.createdSessionId,
+          `Use the Bash tool to run this exact command:
+cat > "${configPath}" << 'AUTOMATIONEOF'
+${config}
+AUTOMATIONEOF`, 90_000, true, undefined, ctx.onEvent)
+          .then((r) => { ctx.createdAutomation = true; return r })
+      },
+    },
+    {
+      name: 'automation:trigger (status change)',
+      fn: async (client, ctx) => {
+        if (!ctx.createdSessionId || !ctx.workspaceId) return 'skipped (no session or workspace)'
+        // Get available statuses to find one containing "in-progress"
+        const statuses = (await client.invoke('statuses:list', ctx.workspaceId)) as ValidateStatus[]
+        const inProgress = statuses?.find((s) =>
+          (s.id ?? '').toLowerCase().includes('in-progress') ||
+          (s.label ?? '').toLowerCase().includes('in progress')
+        )
+        const statusValue = inProgress?.id ?? 'in-progress'
+
+        // Change session status to trigger the automation
+        await client.invoke('sessions:command', ctx.createdSessionId, {
+          type: 'setSessionStatus',
+          state: statusValue,
+        })
+
+        // Poll for the automation-created session (automation fires asynchronously)
+        let delay = 1000
+        const deadline = Date.now() + 60_000
+        while (Date.now() < deadline) {
+          await new Promise((r) => setTimeout(r, delay))
+          delay = Math.min(delay * 1.5, 10_000)
+          const sessions = (await client.invoke('sessions:get', ctx.workspaceId)) as ValidateSession[]
+          const automationSession = sessions?.find((s) =>
+            s.name === ctx.automationName && s.id !== ctx.createdSessionId
+          )
+          if (automationSession) {
+            ctx.automationTestSessionId = automationSession.id
+            return `triggered → session ${automationSession.id} (status=${statusValue})`
+          }
+        }
+        throw new Error('Automation-created session not found within 60s')
+      },
+    },
+    {
+      name: 'automation:verify session',
+      fn: async (client, ctx) => {
+        if (!ctx.automationTestSessionId) return 'skipped (no automation session)'
+        // Wait for the automation session to complete
+        let delay = 1000
+        const deadline = Date.now() + 90_000
+        while (Date.now() < deadline) {
+          const session = (await client.invoke('sessions:getMessages', ctx.automationTestSessionId)) as ValidateMessagesResponse
+          const messages = session?.messages ?? session?.conversation ?? []
+          const hasAssistant = messages.some((m) => m.role === 'assistant')
+          if (hasAssistant) {
+            const lastAssistant = [...messages].reverse().find((m) => m.role === 'assistant')
+            const text = typeof lastAssistant?.content === 'string'
+              ? lastAssistant.content
+              : Array.isArray(lastAssistant?.content)
+                ? lastAssistant.content.filter((b) => b.type === 'text').map((b) => b.text ?? '').join(' ')
+                : ''
+            return `session has assistant response (${text.slice(0, 80).trim()})`
+          }
+          await new Promise((r) => setTimeout(r, delay))
+          delay = Math.min(delay * 1.5, 10_000)
+        }
+        throw new Error('Automation session did not complete within 90s')
+      },
+    },
+    {
+      name: 'automation:verify labels',
+      fn: async (client, ctx) => {
+        if (!ctx.automationTestSessionId || !ctx.workspaceId) return 'skipped (no automation session)'
+        // Verify label was auto-created
+        const labels = (await client.invoke('labels:list', ctx.workspaceId)) as ValidateLabel[]
+        const found = labels?.find((l) => (l.id ?? l.name ?? '') === 'cli-validate-label')
+        if (!found) throw new Error('Label cli-validate-label was not auto-created')
+        ctx.createdLabelId = found.id ?? 'cli-validate-label'
+
+        // Verify the automation session has the label
+        const sessions = (await client.invoke('sessions:get', ctx.workspaceId)) as ValidateSession[]
+        const automationSession = sessions?.find((s) => s.id === ctx.automationTestSessionId)
+        const sessionLabels: string[] = automationSession?.labels ?? []
+        const hasLabel = sessionLabels.some((l: string) => l.includes('cli-validate-label'))
+        if (!hasLabel) throw new Error(`Automation session missing label (has: ${sessionLabels.join(', ')})`)
+        return `label created and assigned: ${ctx.createdLabelId}`
+      },
+    },
+    {
+      name: 'automations:getLastExecuted',
+      fn: async (client, ctx) => {
+        if (!ctx.workspaceId) return 'skipped (no workspace)'
+        const history = (await client.invoke('automations:getLastExecuted', ctx.workspaceId)) as Record<string, number>
+        const entries = Object.entries(history)
+        if (entries.length === 0) throw new Error('No automation execution history found')
+        // Verify at least one automation ran recently (within last 2 minutes)
+        const recentThreshold = Date.now() - 120_000
+        const recent = entries.find(([, ts]) => ts > recentThreshold)
+        if (!recent) throw new Error(`No recent automation execution (latest: ${Math.max(...entries.map(([, ts]) => ts))})`)
+        return `${entries.length} automation(s), latest ran ${Math.round((Date.now() - recent[1]) / 1000)}s ago`
+      },
+    },
+    {
+      name: 'automation:cleanup',
+      fn: async (client, ctx) => {
+        const cleaned = await cleanupAutomationArtifacts(client, ctx)
+        return cleaned.length > 0 ? `cleaned: ${cleaned.join(', ')}` : 'nothing to clean'
+      },
+    },
     {
       name: 'sources:delete',
       fn: async (client, ctx) => {
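Both polling loops in the automation steps above start with a 1000 ms wait, multiply it by 1.5 after each attempt and cap it at 10 000 ms. To see how many polls that allows inside a time budget, the schedule can be computed without any timers. This is a rough model only: it ignores the time spent in the RPC calls themselves, and `backoffSchedule` is a hypothetical helper, not part of the commit.

```typescript
// Hypothetical helper: list the waits a polling loop would use before the
// budget is spent, assuming each poll itself takes no time.
function backoffSchedule(
  budgetMs: number,
  initialMs = 1000,
  factor = 1.5,
  capMs = 10_000,
): number[] {
  const delays: number[] = []
  let delay = initialMs
  let elapsed = 0
  // Mirrors `while (Date.now() < deadline)`: the check happens before each wait.
  while (elapsed < budgetMs) {
    delays.push(delay)
    elapsed += delay
    delay = Math.min(delay * factor, capMs)
  }
  return delays
}

const triggerSchedule = backoffSchedule(60_000)
const firstThree = triggerSchedule.slice(0, 3) // 1000, 1500, 2250
```

Under these assumptions the 60 s budget of the trigger step allows ten polls, with the cap reached on the seventh wait. Because the deadline is only checked before sleeping, the last wait can run slightly past the nominal budget.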
@@ -1186,6 +1418,9 @@ export async function runValidation(client: CliRpcClient, jsonMode: boolean, noS
     }
   }

+  // Cleanup: automation artifacts
+  await cleanupAutomationArtifacts(client, ctx)
+
   // Cleanup: if we auto-created a temp workspace, remove it
   if (ctx.createdWorkspace && ctx.workspaceId && client.isConnected) {
     try {
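Calling `cleanupAutomationArtifacts` again here, after the `automation:cleanup` step may already have run, is safe because the helper clears each marker on the context once the matching artifact has been handled. A synchronous sketch of that idempotency, with the RPC and file operations replaced by plain bookkeeping (`CleanupContext` and `cleanupOnce` are hypothetical names used only for this illustration):

```typescript
interface CleanupContext {
  createdAutomation?: boolean
  automationTestSessionId?: string
  createdLabelId?: string
}

// Hypothetical stand-in for the async helper: each artifact is released once
// and its marker is cleared, so a second call finds nothing left to do.
function cleanupOnce(ctx: CleanupContext): string[] {
  const cleaned: string[] = []
  if (ctx.createdAutomation) {
    cleaned.push('automations.json')
    ctx.createdAutomation = false
  }
  if (ctx.automationTestSessionId) {
    cleaned.push(`session ${ctx.automationTestSessionId}`)
    ctx.automationTestSessionId = undefined
  }
  if (ctx.createdLabelId) {
    cleaned.push(`label ${ctx.createdLabelId}`)
    ctx.createdLabelId = undefined
  }
  return cleaned
}

const ctx: CleanupContext = {
  createdAutomation: true,
  automationTestSessionId: 's-1',
  createdLabelId: 'cli-validate-label',
}
const firstPass = cleanupOnce(ctx) // the test step
const secondPass = cleanupOnce(ctx) // the safety net after the run
```

The second call matters when a step throws before `automation:cleanup` is reached; in that case the safety net is the only pass that runs.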
@@ -1265,6 +1500,7 @@ Commands:
   invoke <channel> [...]     Raw RPC call with JSON args
   listen <channel>           Subscribe to push events (Ctrl+C to stop)
   --validate-server          Multi-step server integration test
+  --verbose, -v              Show server stderr output

 Examples:
   craft-cli run "What files are in the current directory?"
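The `automations:getLastExecuted` step earlier in this file checks that at least one entry in a `Record<string, number>` of timestamps falls within the last two minutes. It uses `find`, which returns the first entry newer than the threshold, not necessarily the newest. The variant below picks the newest recent entry instead, which would match the "latest ran … ago" wording more closely. It is a suggestion under that reading, not what the commit does, and `latestRecentRun` is a hypothetical name.

```typescript
// Hypothetical helper: return the newest entry inside the window, if any.
function latestRecentRun(
  history: Record<string, number>,
  now: number,
  windowMs = 120_000,
): [string, number] | undefined {
  const threshold = now - windowMs
  let best: [string, number] | undefined
  for (const [key, ts] of Object.entries(history)) {
    if (ts > threshold && (best === undefined || ts > best[1])) {
      best = [key, ts]
    }
  }
  return best
}

// `now` is passed in so the check is deterministic in tests.
const now = 1_000_000
const newest = latestRecentRun({ a: 800_000, b: 950_000, c: 990_000 }, now)
const secondsAgo = newest ? Math.round((now - newest[1]) / 1000) : undefined
```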

apps/cli/src/server-spawner.ts

Lines changed: 4 additions & 2 deletions
@@ -72,7 +72,8 @@ export async function spawnServer(opts?: SpawnServerOptions): Promise<SpawnedSer
   // Pipe server stderr to our stderr so --debug logs are visible (unless quiet)
   if (proc.stderr && !opts?.quiet) {
     ;(async () => {
-      const reader = proc.stderr!.getReader()
+      // @ts-expect-error — Bun Subprocess types don't narrow stderr to ReadableStream when stderr: 'pipe'
+      const reader = proc.stderr.getReader()
       try {
         while (true) {
           const { done, value } = await reader.read()
@@ -122,7 +123,8 @@ export async function spawnServer(opts?: SpawnServerOptions): Promise<SpawnedSer
   }

   ;(async () => {
-    const reader = proc.stdout!.getReader()
+    // @ts-expect-error — Bun Subprocess types don't narrow stdout to ReadableStream when stdout: 'pipe'
+    const reader = proc.stdout.getReader()
     const decoder = new TextDecoder()
     try {
       while (true) {
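The stdout loop above reads byte chunks from a stream reader and has a `TextDecoder` at hand; the diff does not show how the chunks are decoded. One detail worth keeping in mind with this pattern is that a multi-byte UTF-8 character can be split across two chunks. A minimal sketch of chunk-safe decoding, assuming the chunks have already been collected into an array (`decodeChunks` is a hypothetical helper, not code from the commit):

```typescript
// Decode byte chunks in the order they would arrive from reader.read().
function decodeChunks(chunks: Uint8Array[]): string {
  const decoder = new TextDecoder()
  let text = ''
  for (const chunk of chunks) {
    // stream: true keeps an incomplete multi-byte sequence buffered until the next chunk.
    text += decoder.decode(chunk, { stream: true })
  }
  // A final call without `stream` flushes anything still buffered.
  return text + decoder.decode()
}

// "hé" in UTF-8 is 0x68 0xC3 0xA9; the two-byte "é" is split across chunks here.
const split = decodeChunks([new Uint8Array([0x68, 0xc3]), new Uint8Array([0xa9])])
```

Decoding each chunk independently, without `stream: true`, would turn the split character into replacement characters.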
