Skip to content

Commit f0cd3ac

Browse files
jqnatividadclaude
andauthored
feat(mcp): comprehensive logging (#3555)
* feat(mcp): add qsv_log core tool for agent-initiated reproducibility logging Enable agents to write structured entries (user_prompt, agent_reasoning, agent_action, result_summary, note) to the qsv audit log (qsvmcp.log) with u- prefixed UUIDs, distinct from automatic s-/e- audit entries. Automatic audit logging is skipped for qsv_log calls to avoid recursion. Messages are truncated at 4096 chars and logging failures never break the workflow. Server instructions updated to guide agents on when/how to log for third-party reproducibility. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Address review findings (job 619) All 421 tests pass, 0 failures. The change is correct. Changes: - Fix Unicode truncation fast-path to use UTF-16 length as a cheap guard (strings shorter in UTF-16 are guaranteed shorter in codepoints), only performing expensive `Array.from()` codepoint conversion when the string exceeds the limit Address review findings (job 606) All 72 tests pass (0 failures), including the 3 new tests for missing params. Changes: - Check for `null`/`undefined` params explicitly before string coercion in `handleLogCall`, returning clear "is required" error messages (finding #1) - Trim and strip newlines from log messages before writing, preventing multi-line log entries and inconsistent whitespace (findings #2, #3) - Added tests for missing `entry_type`, missing `message`, and entirely empty params (finding #5) Address review findings (job 607) All 418 tests pass, including the new one. Changes: - Add test for newline-only message (`'\n\n'`) confirming it's rejected as non-empty string Address review findings (job 609) All 74 tests pass (0 failures), including all the new and existing `handleLogCall` tests. Changes: - Log `catch` block now writes error details to stderr via `console.error` instead of silently swallowing - Added `--` separator before the message argument in `qsv log` CLI call to prevent messages starting with `-` from being misinterpreted as flags - Documented newline collapsing behavior in the tool description ("Newlines are collapsed to spaces") - Added test for non-string type coercion (`{ entry_type: 123, message: true }`) confirming `String()` coercion behavior Address review findings (job 610) All 420 tests pass. Changes: - Include truncated error message in the success result returned to the agent (not just stderr), so the agent has actionable context when `qsv_log` write fails - Add test for non-string message coercion with valid `entry_type` to verify `String()` coercion works for the message path Address review findings (job 611) All 420 tests pass. The Rust diagnostics are pre-existing and unrelated to this change. Changes: - Added `assert.ok(!result.isError)` to the `handleLogCall` non-string message coercion test to explicitly verify the result is not an error, making the test intent clearer Address review findings (job 613) All 420 tests pass, 0 failures. The changes are verified. Changes: - Added comment on `--` separator in `handleLogCall` args explaining it guards against messages starting with `-` being parsed as flags (addresses medium finding) - Added `config.qsvValidation.valid` skip guard to `handleLogCall coerces non-string message` test so it properly tests the success path instead of passing accidentally via error swallowing (addresses low finding #4) - Added assertion that success response doesn't contain "warning" to confirm actual success vs swallowed error Address review findings (job 615) No CLAUDE.md changes needed for the `--` removal. All changes are complete and tests pass. Changes: - Remove unnecessary `--` end-of-options sentinel from `qsv log` args — `qsv log` uses docopt variadic `[<message>...]` which handles this correctly, and messages always start with `[entry_type]` so they can never be misinterpreted as flags - Fix Unicode-safe truncation using `Array.from()` instead of `String.slice()` to avoid splitting surrogate pairs in non-ASCII messages - Add throttling guidance to server instructions ("Avoid excessive logging — for simple interactions, a single user_prompt + result_summary pair is enough") - Add test for the `handleLogCall` error-swallowing catch path using a non-existent working directory Address review findings (job 616) The change looks correct. The length check and truncation now both operate on codepoints consistently. Changes: - Fix Unicode truncation length mismatch: use codepoint count (`Array.from(sanitized).length`) for both the gate condition and the truncation, avoiding inconsistency between UTF-16 `.length` and codepoint-aware `Array.from().slice()` Address review findings (job 618) All 421 tests pass, 0 failures. All `handleLogCall` tests pass including the updated write-failure test. Changes: - Reworded catch-path message from misleading `"Logged ... (warning: write failed: ...)"` to clearer `"Log write failed (non-fatal): ... Workflow continues."` (issue 1) - Added fast-path optimization for Unicode truncation: only call `Array.from()` when `sanitized.length > MAX_LOG_MESSAGE_LEN`, avoiding unnecessary codepoint conversion on short messages (issue 3) - Updated test assertions to match the new error message wording * fix(mcp): address Copilot review findings for qsv_log - Move skipAuditLog from "Key Constants" to a behavior note in CLAUDE.md (it's a local variable, not a module-level constant) - Reorder enum and LOG_ENTRY_TYPES Set to match description order (reasoning before action) - Add unique temp dir + cleanup to coercion test to prevent log file accumulation in OS temp root Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 83e3ef9 commit f0cd3ac

File tree

6 files changed

+369
-10
lines changed

6 files changed

+369
-10
lines changed

.claude/skills/CLAUDE.md

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -128,7 +128,7 @@ npm run mcpb:package
128128
- Auto-enables `--stats-jsonl` for stats command
129129
- Integrates update checker for background version monitoring
130130
- **Server instructions**: Provides cross-tool workflow guidance via MCP `initialize` response
131-
- **Deferred tool loading**: Only 9 core tools loaded initially (~80% token reduction)
131+
- **Deferred tool loading**: Only 10 core tools loaded initially (~80% token reduction)
132132
- **Environment-controlled exposure**: Use `QSV_MCP_EXPOSE_ALL_TOOLS=true` for all tools
133133
- **Roots auto-sync**: `syncWorkingDirFromRoots()` runs on startup and on `RootsListChangedNotification`; manual override via `qsv_set_working_dir`, passing `"auto"` re-enables sync
134134

@@ -137,6 +137,8 @@ npm run mcpb:package
137137
- `SHUTDOWN_TIMEOUT_MS`: 2000 — graceful shutdown timeout (ms)
138138
- `UPDATE_CHECK_TIMEOUT_MS`: 30_000 — background update check timeout (ms)
139139

140+
**Recursive logging prevention**: The `CallToolRequestSchema` handler sets a local `skipAuditLog` flag for `qsv_log` calls, guarding all three `s-`/`e-` audit log write points to prevent recursive noise.
141+
140142
**Key Functions**:
141143
```typescript
142144
server.setRequestHandler(ListToolsRequestSchema, async () => { ... })
@@ -160,6 +162,7 @@ server.setRequestHandler(ListResourcesRequestSchema, async () => { ... })
160162
- `COMMAND_GUIDANCE`: `Record<string, CommandGuidance>` — Unified per-command guidance map consolidating when-to-use, common patterns, error prevention, complementary servers, and memory/index/mistake warnings into a single structure
161163
- `LARGE_FILE_THRESHOLD_BYTES`: 10MB — files larger than this are auto-indexed (replaces `AUTO_INDEX_SIZE_MB`)
162164
- `MAX_MCP_RESPONSE_SIZE`: 850KB — responses exceeding this are saved to file instead of returned inline
165+
- `MAX_LOG_MESSAGE_LEN`: 4096 — max characters for `qsv_log` messages (truncated silently)
163166

164167
**Key Exported Functions**:
165168
- `isBinaryOutputFormat(commandName, params)` - Detect if command output is binary (parquet/arrow/avro)

.claude/skills/src/mcp-server.ts

Lines changed: 19 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -35,9 +35,11 @@ import {
3535
createListFilesTool,
3636
createSetWorkingDirTool,
3737
createGetWorkingDirTool,
38+
createLogTool,
3839
handleConfigTool,
3940
handleSearchToolsCall,
4041
handleToParquetCall,
42+
handleLogCall,
4143
initiateShutdown,
4244
killAllProcesses,
4345
getActiveProcessCount,
@@ -56,6 +58,7 @@ const CORE_TOOLS = [
5658
"qsv_set_working_dir",
5759
"qsv_get_working_dir",
5860
"qsv_list_files",
61+
"qsv_log",
5962
"qsv_command",
6063
"qsv_to_parquet",
6164
"qsv_index",
@@ -90,7 +93,9 @@ CACHE AWARENESS: Before running commands, check for existing caches to save time
9093
9194
MEMORY LIMITS: Commands dedup, sort, reverse, table, transpose, pragmastat load entire files into memory. For files >1GB, prefer extdedup/extsort alternatives via qsv_command. Check column cardinality with qsv_stats before running frequency or pivotp to avoid huge output.
9295
93-
OPERATION TIMEOUT: qsv operations can take significant time, especially on larger files. The MCP server's default operation timeout is 10 minutes (configurable via QSV_MCP_OPERATION_TIMEOUT_MS, max 30 minutes). Do NOT use a shorter client-side timeout — allow operations to run to completion or until the server's configured timeout. Check the current timeout setting with qsv_config. CONCURRENT OPERATIONS: Parallel tool calls are automatically queued. For optimal throughput in Claude Cowork, execute pipeline steps sequentially (index → stats → analysis).`;
96+
OPERATION TIMEOUT: qsv operations can take significant time, especially on larger files. The MCP server's default operation timeout is 10 minutes (configurable via QSV_MCP_OPERATION_TIMEOUT_MS, max 30 minutes). Do NOT use a shorter client-side timeout — allow operations to run to completion or until the server's configured timeout. Check the current timeout setting with qsv_config. CONCURRENT OPERATIONS: Parallel tool calls are automatically queued. For optimal throughput in Claude Cowork, execute pipeline steps sequentially (index → stats → analysis).
97+
98+
REPRODUCIBILITY LOG: Use qsv_log to create a verifiable audit trail. Log each user prompt (entry_type: "user_prompt") as it arrives, key decisions (entry_type: "agent_reasoning"), actions taken (entry_type: "agent_action"), and outcomes (entry_type: "result_summary"). The log file (qsvmcp.log) records both automatic tool invocations (s-/e- prefixed) and your explicit entries (u- prefixed). Keep entries concise but sufficient for a third party to reproduce your workflow. Avoid excessive logging — for simple interactions, a single user_prompt + result_summary pair is enough. Reserve agent_reasoning for non-obvious decisions.`;
9499

95100
/**
96101
* Resolved server instructions: uses custom instructions from
@@ -513,6 +518,9 @@ class QsvMcpServer {
513518
tools.push(createSetWorkingDirTool());
514519
tools.push(createGetWorkingDirTool());
515520

521+
// Add logging tool
522+
tools.push(createLogTool());
523+
516524
console.error(`[Server] Registered ${tools.length} tools`);
517525
if (!this.toolsListedOnce) {
518526
console.error(
@@ -566,7 +574,9 @@ class QsvMcpServer {
566574
// Start entries only at "info" level; "error" level only logs failures
567575
// Safe to capture once: config is immutable after initialization
568576
const auditLogEnabled = config.mcpLogLevel !== "off";
569-
if (config.mcpLogLevel === "info") {
577+
// Skip automatic audit logging for qsv_log to avoid recursive noise
578+
const skipAuditLog = name === "qsv_log";
579+
if (config.mcpLogLevel === "info" && !skipAuditLog) {
570580
runQsvSimple(config.qsvBinPath, ["log", name, `s-${invocationId}`, startMsg], {
571581
timeoutMs: 5_000,
572582
cwd: this.filesystemProvider.getWorkingDirectory(),
@@ -581,6 +591,11 @@ class QsvMcpServer {
581591
// Order: filesystem → generic command → config → search →
582592
// to_parquet → skill-based (qsv_*) → unknown tool error.
583593

594+
// Handle log tool (before filesystem tools so it's fast)
595+
if (name === "qsv_log") {
596+
return await handleLogCall(toolArgs || {}, this.filesystemProvider.getWorkingDirectory());
597+
}
598+
584599
// Handle filesystem tools
585600
if (name === "qsv_list_files") {
586601
const directory = typeof toolArgs?.directory === "string" ? toolArgs.directory : undefined;
@@ -705,7 +720,7 @@ class QsvMcpServer {
705720
// Log end with elapsed time
706721
const elapsedSecs = ((Date.now() - startTime) / 1000).toFixed(2);
707722
const isError = "isError" in result && result.isError === true;
708-
if (auditLogEnabled && (config.mcpLogLevel === "info" || isError)) {
723+
if (auditLogEnabled && !skipAuditLog && (config.mcpLogLevel === "info" || isError)) {
709724
const endMsg = isError
710725
? `error(${elapsedSecs}s): tool returned error`
711726
: `ok(${elapsedSecs}s)`;
@@ -718,7 +733,7 @@ class QsvMcpServer {
718733
return result;
719734
} catch (error: unknown) {
720735
// Log error end with elapsed time (always log errors unless fully off)
721-
if (auditLogEnabled) {
736+
if (auditLogEnabled && !skipAuditLog) {
722737
const elapsedSecs = ((Date.now() - startTime) / 1000).toFixed(2);
723738
const errMsg = getErrorMessage(error);
724739
const truncatedErr = errMsg.length > MAX_ARGS_LOG_LEN

.claude/skills/src/mcp-tools.ts

Lines changed: 131 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -52,6 +52,12 @@ const statOrNull = (path: string) =>
5252
*/
5353
const AUTO_INDEX_SIZE_MB = 10;
5454

55+
/**
56+
* Maximum length for qsv_log messages (in characters).
57+
* Messages exceeding this limit are silently truncated.
58+
*/
59+
export const MAX_LOG_MESSAGE_LEN = 4096;
60+
5561
/**
5662
* Commands that always return full CSV data and should use temp files
5763
*/
@@ -3256,3 +3262,128 @@ export async function handleToParquetCall(
32563262
return errorResult(`Error converting CSV to Parquet: ${getErrorMessage(error)}`);
32573263
}
32583264
}
3265+
3266+
// ============================================================================
3267+
// qsv_log — Agent-initiated reproducibility logging
3268+
// ============================================================================
3269+
3270+
/** Valid entry types for qsv_log */
3271+
const LOG_ENTRY_TYPES = new Set([
3272+
"user_prompt",
3273+
"agent_reasoning",
3274+
"agent_action",
3275+
"result_summary",
3276+
"note",
3277+
]);
3278+
3279+
/**
3280+
* Create the qsv_log tool definition.
3281+
*/
3282+
export function createLogTool(): McpToolDefinition {
3283+
return {
3284+
name: "qsv_log",
3285+
description: `Write a structured entry to the qsv audit log (qsvmcp.log) for reproducibility.
3286+
3287+
💡 USE WHEN:
3288+
- Logging the user's original prompt so a third party can reproduce the session
3289+
- Recording key reasoning or decisions that led to a particular tool choice
3290+
- Summarizing results after a workflow completes
3291+
3292+
📋 COMMON PATTERN:
3293+
1. Log "user_prompt" when a new user request arrives
3294+
2. Log "agent_reasoning" before complex decisions (e.g., choosing joinp over join)
3295+
3. Log "result_summary" after completing a workflow
3296+
3297+
📝 ENTRY TYPES:
3298+
- user_prompt — The user's original request (log once per prompt)
3299+
- agent_reasoning — Why you chose a particular approach
3300+
- agent_action — A significant action taken (beyond automatic audit logging)
3301+
- result_summary — Outcome of a completed workflow
3302+
- note — Free-form annotation
3303+
3304+
⚠️ CAUTION: Keep messages concise. Max ${MAX_LOG_MESSAGE_LEN} chars (truncated silently). Newlines are collapsed to spaces. Logging never fails the workflow.`,
3305+
inputSchema: {
3306+
type: "object",
3307+
properties: {
3308+
entry_type: {
3309+
type: "string",
3310+
enum: ["user_prompt", "agent_reasoning", "agent_action", "result_summary", "note"],
3311+
description: "Category of log entry.",
3312+
},
3313+
message: {
3314+
type: "string",
3315+
description: "The log message content.",
3316+
},
3317+
},
3318+
required: ["entry_type", "message"],
3319+
},
3320+
};
3321+
}
3322+
3323+
/**
3324+
* Handle a qsv_log tool invocation.
3325+
*
3326+
* Writes a `u-` prefixed entry to the qsv audit log via `qsv log`.
3327+
* Logging failures are swallowed — this tool should never break a workflow.
3328+
*/
3329+
export async function handleLogCall(
3330+
params: Record<string, unknown>,
3331+
workingDir: string,
3332+
): Promise<{ content: Array<{ type: string; text: string }>; isError?: boolean }> {
3333+
// Validate required params before coercing
3334+
if (params.entry_type == null) {
3335+
return errorResult("entry_type is required.");
3336+
}
3337+
if (params.message == null) {
3338+
return errorResult("message is required.");
3339+
}
3340+
3341+
const entryType = String(params.entry_type);
3342+
const rawMessage = String(params.message);
3343+
3344+
// Validate entry_type
3345+
if (!LOG_ENTRY_TYPES.has(entryType)) {
3346+
return errorResult(
3347+
`Invalid entry_type "${entryType}". Must be one of: ${[...LOG_ENTRY_TYPES].join(", ")}`,
3348+
);
3349+
}
3350+
3351+
// Validate message
3352+
if (rawMessage.trim().length === 0) {
3353+
return errorResult("message must be a non-empty string.");
3354+
}
3355+
3356+
// Trim, strip newlines, and truncate if needed (use Array.from for Unicode-safe truncation)
3357+
const sanitized = rawMessage.trim().replace(/[\r\n]+/g, " ");
3358+
// Fast path: if UTF-16 length is within limit, codepoint count is too
3359+
let message: string;
3360+
if (sanitized.length <= MAX_LOG_MESSAGE_LEN) {
3361+
message = sanitized;
3362+
} else {
3363+
const codepoints = Array.from(sanitized);
3364+
message =
3365+
codepoints.length > MAX_LOG_MESSAGE_LEN
3366+
? codepoints.slice(0, MAX_LOG_MESSAGE_LEN).join("")
3367+
: sanitized;
3368+
}
3369+
3370+
const logId = `u-${randomUUID()}`;
3371+
3372+
try {
3373+
await runQsvSimple(config.qsvBinPath, [
3374+
"log",
3375+
"qsv_log",
3376+
logId,
3377+
`[${entryType}] ${message}`,
3378+
], {
3379+
timeoutMs: 5_000,
3380+
cwd: workingDir,
3381+
});
3382+
} catch (err) {
3383+
const errMsg = getErrorMessage(err);
3384+
console.error(`[qsv_log] write failed: ${errMsg}`);
3385+
return successResult(`Log write failed (non-fatal): ${errMsg.slice(0, 100)}. Workflow continues.`);
3386+
}
3387+
3388+
return successResult(`Logged ${entryType} entry.`);
3389+
}

.claude/skills/tests/deferred-loading.test.ts

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -24,6 +24,7 @@ const CORE_TOOLS = [
2424
"qsv_set_working_dir",
2525
"qsv_get_working_dir",
2626
"qsv_list_files",
27+
"qsv_log",
2728
"qsv_command",
2829
"qsv_to_parquet",
2930
"qsv_index",
@@ -34,8 +35,8 @@ const CORE_TOOLS = [
3435
// Core Tools Count Verification
3536
// ============================================================================
3637

37-
test('CORE_TOOLS has exactly 9 tools', () => {
38-
assert.strictEqual(CORE_TOOLS.length, 9, 'Should have exactly 9 core tools');
38+
test('CORE_TOOLS has exactly 10 tools', () => {
39+
assert.strictEqual(CORE_TOOLS.length, 10, 'Should have exactly 10 core tools');
3940
});
4041

4142
test('CORE_TOOLS includes all required utility tools', () => {
@@ -45,6 +46,7 @@ test('CORE_TOOLS includes all required utility tools', () => {
4546
'qsv_set_working_dir',
4647
'qsv_get_working_dir',
4748
'qsv_list_files',
49+
'qsv_log',
4850
'qsv_command',
4951
'qsv_to_parquet',
5052
'qsv_index',

.claude/skills/tests/mcp-server.test.ts

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -51,14 +51,15 @@ const CORE_TOOLS = [
5151
"qsv_set_working_dir",
5252
"qsv_get_working_dir",
5353
"qsv_list_files",
54+
"qsv_log",
5455
"qsv_command",
5556
"qsv_to_parquet",
5657
"qsv_index",
5758
"qsv_stats",
5859
] as const;
5960

60-
test("CORE_TOOLS has exactly 9 entries", () => {
61-
assert.strictEqual(CORE_TOOLS.length, 9);
61+
test("CORE_TOOLS has exactly 10 entries", () => {
62+
assert.strictEqual(CORE_TOOLS.length, 10);
6263
});
6364

6465
test("CORE_TOOLS all have qsv_ prefix", () => {

0 commit comments

Comments
 (0)