feat(mcp): comprehensive logging (#3555)

jqnatividad · claude · web-flow · commit f0cd3acaa157 · 2026-03-02T05:53:00.000-05:00
* feat(mcp): add qsv_log core tool for agent-initiated reproducibility logging

Enable agents to write structured entries (user_prompt, agent_reasoning, agent_action, result_summary, note) to the qsv audit log (qsvmcp.log) with u- prefixed UUIDs, distinct from automatic s-/e- audit entries. Automatic audit logging is skipped for qsv_log calls to avoid recursion. Messages are truncated at 4096 chars and logging failures never break the workflow. Server instructions updated to guide agents on when/how to log for third-party reproducibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Address review findings (job 619)

All 421 tests pass, 0 failures. The change is correct.

Changes:
- Fix Unicode truncation fast-path to use UTF-16 length as a cheap guard (strings shorter in UTF-16 are guaranteed shorter in codepoints), only performing expensive `Array.from()` codepoint conversion when the string exceeds the limit

Address review findings (job 606)

All 72 tests pass (0 failures), including the 3 new tests for missing params.

Changes:
- Check for `null`/`undefined` params explicitly before string coercion in `handleLogCall`, returning clear "is required" error messages (finding #1)
- Trim and strip newlines from log messages before writing, preventing multi-line log entries and inconsistent whitespace (findings #2, #3)
- Added tests for missing `entry_type`, missing `message`, and entirely empty params (finding #5)

Address review findings (job 607)

All 418 tests pass, including the new one.

Changes:
- Add test for newline-only message (`'\n\n'`) confirming it's rejected as non-empty string

Address review findings (job 609)

All 74 tests pass (0 failures), including all the new and existing `handleLogCall` tests.

Changes:
- Log `catch` block now writes error details to stderr via `console.error` instead of silently swallowing
- Added `--` separator before the message argument in `qsv log` CLI call to prevent messages starting with `-` from being misinterpreted as flags
- Documented newline collapsing behavior in the tool description ("Newlines are collapsed to spaces")
- Added test for non-string type coercion (`{ entry_type: 123, message: true }`) confirming `String()` coercion behavior

Address review findings (job 610)

All 420 tests pass.

Changes:
- Include truncated error message in the success result returned to the agent (not just stderr), so the agent has actionable context when `qsv_log` write fails
- Add test for non-string message coercion with valid `entry_type` to verify `String()` coercion works for the message path

Address review findings (job 611)

All 420 tests pass. The Rust diagnostics are pre-existing and unrelated to this change.

Changes:
- Added `assert.ok(!result.isError)` to the `handleLogCall` non-string message coercion test to explicitly verify the result is not an error, making the test intent clearer

Address review findings (job 613)

All 420 tests pass, 0 failures. The changes are verified.

Changes:
- Added comment on `--` separator in `handleLogCall` args explaining it guards against messages starting with `-` being parsed as flags (addresses medium finding)
- Added `config.qsvValidation.valid` skip guard to `handleLogCall coerces non-string message` test so it properly tests the success path instead of passing accidentally via error swallowing (addresses low finding #4)
- Added assertion that success response doesn't contain "warning" to confirm actual success vs swallowed error

Address review findings (job 615)

No CLAUDE.md changes needed for the `--` removal. All changes are complete and tests pass.

Changes:
- Remove unnecessary `--` end-of-options sentinel from `qsv log` args: `qsv log` uses docopt variadic `[<message>...]` which handles this correctly, and messages always start with `[entry_type]` so they can never be misinterpreted as flags
- Fix Unicode-safe truncation using `Array.from()` instead of `String.slice()` to avoid splitting surrogate pairs in non-ASCII messages
- Add throttling guidance to server instructions ("Avoid excessive logging — for simple interactions, a single user_prompt + result_summary pair is enough")
- Add test for the `handleLogCall` error-swallowing catch path using a non-existent working directory

Address review findings (job 616)

The change looks correct. The length check and truncation now both operate on codepoints consistently.

Changes:
- Fix Unicode truncation length mismatch: use codepoint count (`Array.from(sanitized).length`) for both the gate condition and the truncation, avoiding inconsistency between UTF-16 `.length` and codepoint-aware `Array.from().slice()`

Address review findings (job 618)

All 421 tests pass, 0 failures. All `handleLogCall` tests pass including the updated write-failure test.

Changes:
- Reworded catch-path message from misleading `"Logged ... (warning: write failed: ...)"` to clearer `"Log write failed (non-fatal): ... Workflow continues."` (issue 1)
- Added fast-path optimization for Unicode truncation: only call `Array.from()` when `sanitized.length > MAX_LOG_MESSAGE_LEN`, avoiding unnecessary codepoint conversion on short messages (issue 3)
- Updated test assertions to match the new error message wording

* fix(mcp): address Copilot review findings for qsv_log

- Move skipAuditLog from "Key Constants" to a behavior note in CLAUDE.md (it's a local variable, not a module-level constant)
- Reorder enum and LOG_ENTRY_TYPES Set to match description order (reasoning before action)
- Add unique temp dir + cleanup to coercion test to prevent log file accumulation in OS temp root

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
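
The sanitization and Unicode-safe truncation described in the review notes above can be sketched in isolation as follows. This is a minimal standalone illustration, not the code from the PR: the helper name `sanitizeLogMessage` and its `maxLen` parameter are hypothetical, and the constant is redeclared locally with the 4096 value stated in the commit message.

```typescript
// Local stand-in for the constant named in the commit message (value from the message).
const MAX_LOG_MESSAGE_LEN = 4096;

/**
 * Hypothetical helper: trim the message, collapse runs of CR/LF into a single
 * space, then truncate to at most `maxLen` Unicode code points so that a
 * surrogate pair is never split.
 */
function sanitizeLogMessage(raw: string, maxLen: number = MAX_LOG_MESSAGE_LEN): string {
  const sanitized = raw.trim().replace(/[\r\n]+/g, " ");

  // Fast path: the UTF-16 length is never smaller than the code point count,
  // so a string within the limit in UTF-16 units is within it in code points.
  if (sanitized.length <= maxLen) {
    return sanitized;
  }

  // Slow path: count and slice by code point, not by UTF-16 code unit.
  const codepoints = Array.from(sanitized);
  return codepoints.length > maxLen
    ? codepoints.slice(0, maxLen).join("")
    : sanitized;
}

// Usage: newlines collapse to spaces, and emoji are truncated whole.
const collapsed = sanitizeLogMessage("  line one\nline two  ");
const truncated = sanitizeLogMessage("😀".repeat(5), 3);
console.log(collapsed);                     // "line one line two"
console.log(Array.from(truncated).length);  // 3
```

Checking the UTF-16 length first keeps the common short-message case cheap, while the code point slice avoids the lone surrogate that a plain `String.slice()` could leave at the cut.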
diff --git a/.claude/skills/CLAUDE.md b/.claude/skills/CLAUDE.md
@@ -128,7 +128,7 @@ npm run mcpb:package
 - Auto-enables `--stats-jsonl` for stats command
 - Integrates update checker for background version monitoring
 - **Server instructions**: Provides cross-tool workflow guidance via MCP `initialize` response
-- **Deferred tool loading**: Only 9 core tools loaded initially (~80% token reduction)
+- **Deferred tool loading**: Only 10 core tools loaded initially (~80% token reduction)
 - **Environment-controlled exposure**: Use `QSV_MCP_EXPOSE_ALL_TOOLS=true` for all tools
 - **Roots auto-sync**: `syncWorkingDirFromRoots()` runs on startup and on `RootsListChangedNotification`; manual override via `qsv_set_working_dir`, passing `"auto"` re-enables sync
 
@@ -137,6 +137,8 @@ npm run mcpb:package
 - `SHUTDOWN_TIMEOUT_MS`: 2000 — graceful shutdown timeout (ms)
 - `UPDATE_CHECK_TIMEOUT_MS`: 30_000 — background update check timeout (ms)
 
+**Recursive logging prevention**: The `CallToolRequestSchema` handler sets a local `skipAuditLog` flag for `qsv_log` calls, guarding all three `s-`/`e-` audit log write points to prevent recursive noise.
+
 **Key Functions**:
 ```typescript
 server.setRequestHandler(ListToolsRequestSchema, async () => { ... })
@@ -160,6 +162,7 @@ server.setRequestHandler(ListResourcesRequestSchema, async () => { ... })
 - `COMMAND_GUIDANCE`: `Record<string, CommandGuidance>` — Unified per-command guidance map consolidating when-to-use, common patterns, error prevention, complementary servers, and memory/index/mistake warnings into a single structure
 - `LARGE_FILE_THRESHOLD_BYTES`: 10MB — files larger than this are auto-indexed (replaces `AUTO_INDEX_SIZE_MB`)
 - `MAX_MCP_RESPONSE_SIZE`: 850KB — responses exceeding this are saved to file instead of returned inline
+- `MAX_LOG_MESSAGE_LEN`: 4096 — max characters for `qsv_log` messages (truncated silently)
 
 **Key Exported Functions**:
 - `isBinaryOutputFormat(commandName, params)` - Detect if command output is binary (parquet/arrow/avro)
diff --git a/.claude/skills/src/mcp-server.ts b/.claude/skills/src/mcp-server.ts
@@ -35,9 +35,11 @@ import {
   createListFilesTool,
   createSetWorkingDirTool,
   createGetWorkingDirTool,
+  createLogTool,
   handleConfigTool,
   handleSearchToolsCall,
   handleToParquetCall,
+  handleLogCall,
   initiateShutdown,
   killAllProcesses,
   getActiveProcessCount,
@@ -56,6 +58,7 @@ const CORE_TOOLS = [
   "qsv_set_working_dir",
   "qsv_get_working_dir",
   "qsv_list_files",
+  "qsv_log",
   "qsv_command",
   "qsv_to_parquet",
   "qsv_index",
@@ -90,7 +93,9 @@ CACHE AWARENESS: Before running commands, check for existing caches to save time
 
 MEMORY LIMITS: Commands dedup, sort, reverse, table, transpose, pragmastat load entire files into memory. For files >1GB, prefer extdedup/extsort alternatives via qsv_command. Check column cardinality with qsv_stats before running frequency or pivotp to avoid huge output.
 
-OPERATION TIMEOUT: qsv operations can take significant time, especially on larger files. The MCP server's default operation timeout is 10 minutes (configurable via QSV_MCP_OPERATION_TIMEOUT_MS, max 30 minutes). Do NOT use a shorter client-side timeout — allow operations to run to completion or until the server's configured timeout. Check the current timeout setting with qsv_config. CONCURRENT OPERATIONS: Parallel tool calls are automatically queued. For optimal throughput in Claude Cowork, execute pipeline steps sequentially (index → stats → analysis).`;
+OPERATION TIMEOUT: qsv operations can take significant time, especially on larger files. The MCP server's default operation timeout is 10 minutes (configurable via QSV_MCP_OPERATION_TIMEOUT_MS, max 30 minutes). Do NOT use a shorter client-side timeout — allow operations to run to completion or until the server's configured timeout. Check the current timeout setting with qsv_config. CONCURRENT OPERATIONS: Parallel tool calls are automatically queued. For optimal throughput in Claude Cowork, execute pipeline steps sequentially (index → stats → analysis).
+
+REPRODUCIBILITY LOG: Use qsv_log to create a verifiable audit trail. Log each user prompt (entry_type: "user_prompt") as it arrives, key decisions (entry_type: "agent_reasoning"), actions taken (entry_type: "agent_action"), and outcomes (entry_type: "result_summary"). The log file (qsvmcp.log) records both automatic tool invocations (s-/e- prefixed) and your explicit entries (u- prefixed). Keep entries concise but sufficient for a third party to reproduce your workflow. Avoid excessive logging — for simple interactions, a single user_prompt + result_summary pair is enough. Reserve agent_reasoning for non-obvious decisions.`;
 
 /**
  * Resolved server instructions: uses custom instructions from
@@ -513,6 +518,9 @@ class QsvMcpServer {
         tools.push(createSetWorkingDirTool());
         tools.push(createGetWorkingDirTool());
 
+        // Add logging tool
+        tools.push(createLogTool());
+
         console.error(`[Server] Registered ${tools.length} tools`);
         if (!this.toolsListedOnce) {
           console.error(
@@ -566,7 +574,9 @@ class QsvMcpServer {
       // Start entries only at "info" level; "error" level only logs failures
       // Safe to capture once: config is immutable after initialization
       const auditLogEnabled = config.mcpLogLevel !== "off";
-      if (config.mcpLogLevel === "info") {
+      // Skip automatic audit logging for qsv_log to avoid recursive noise
+      const skipAuditLog = name === "qsv_log";
+      if (config.mcpLogLevel === "info" && !skipAuditLog) {
         runQsvSimple(config.qsvBinPath, ["log", name, `s-${invocationId}`, startMsg], {
           timeoutMs: 5_000,
           cwd: this.filesystemProvider.getWorkingDirectory(),
@@ -581,6 +591,11 @@ class QsvMcpServer {
         // Order: filesystem → generic command → config → search →
         //        to_parquet → skill-based (qsv_*) → unknown tool error.
 
+        // Handle log tool (before filesystem tools so it's fast)
+        if (name === "qsv_log") {
+          return await handleLogCall(toolArgs || {}, this.filesystemProvider.getWorkingDirectory());
+        }
+
         // Handle filesystem tools
         if (name === "qsv_list_files") {
           const directory = typeof toolArgs?.directory === "string" ? toolArgs.directory : undefined;
@@ -705,7 +720,7 @@ class QsvMcpServer {
         // Log end with elapsed time
         const elapsedSecs = ((Date.now() - startTime) / 1000).toFixed(2);
         const isError = "isError" in result && result.isError === true;
-        if (auditLogEnabled && (config.mcpLogLevel === "info" || isError)) {
+        if (auditLogEnabled && !skipAuditLog && (config.mcpLogLevel === "info" || isError)) {
           const endMsg = isError
             ? `error(${elapsedSecs}s): tool returned error`
             : `ok(${elapsedSecs}s)`;
@@ -718,7 +733,7 @@ class QsvMcpServer {
         return result;
       } catch (error: unknown) {
         // Log error end with elapsed time (always log errors unless fully off)
-        if (auditLogEnabled) {
+        if (auditLogEnabled && !skipAuditLog) {
           const elapsedSecs = ((Date.now() - startTime) / 1000).toFixed(2);
           const errMsg = getErrorMessage(error);
           const truncatedErr = errMsg.length > MAX_ARGS_LOG_LEN
diff --git a/.claude/skills/src/mcp-tools.ts b/.claude/skills/src/mcp-tools.ts
@@ -52,6 +52,12 @@ const statOrNull = (path: string) =>
  */
 const AUTO_INDEX_SIZE_MB = 10;
 
+/**
+ * Maximum length for qsv_log messages (in characters).
+ * Messages exceeding this limit are silently truncated.
+ */
+export const MAX_LOG_MESSAGE_LEN = 4096;
+
 /**
  * Commands that always return full CSV data and should use temp files
  */
@@ -3256,3 +3262,128 @@ export async function handleToParquetCall(
     return errorResult(`Error converting CSV to Parquet: ${getErrorMessage(error)}`);
   }
 }
+
+// ============================================================================
+// qsv_log — Agent-initiated reproducibility logging
+// ============================================================================
+
+/** Valid entry types for qsv_log */
+const LOG_ENTRY_TYPES = new Set([
+  "user_prompt",
+  "agent_reasoning",
+  "agent_action",
+  "result_summary",
+  "note",
+]);
+
+/**
+ * Create the qsv_log tool definition.
+ */
+export function createLogTool(): McpToolDefinition {
+  return {
+    name: "qsv_log",
+    description: `Write a structured entry to the qsv audit log (qsvmcp.log) for reproducibility.
+
+💡 USE WHEN:
+- Logging the user's original prompt so a third party can reproduce the session
+- Recording key reasoning or decisions that led to a particular tool choice
+- Summarizing results after a workflow completes
+
+📋 COMMON PATTERN:
+1. Log "user_prompt" when a new user request arrives
+2. Log "agent_reasoning" before complex decisions (e.g., choosing joinp over join)
+3. Log "result_summary" after completing a workflow
+
+📝 ENTRY TYPES:
+- user_prompt — The user's original request (log once per prompt)
+- agent_reasoning — Why you chose a particular approach
+- agent_action — A significant action taken (beyond automatic audit logging)
+- result_summary — Outcome of a completed workflow
+- note — Free-form annotation
+
+⚠️ CAUTION: Keep messages concise. Max ${MAX_LOG_MESSAGE_LEN} chars (truncated silently). Newlines are collapsed to spaces. Logging never fails the workflow.`,
+    inputSchema: {
+      type: "object",
+      properties: {
+        entry_type: {
+          type: "string",
+          enum: ["user_prompt", "agent_reasoning", "agent_action", "result_summary", "note"],
+          description: "Category of log entry.",
+        },
+        message: {
+          type: "string",
+          description: "The log message content.",
+        },
+      },
+      required: ["entry_type", "message"],
+    },
+  };
+}
+
+/**
+ * Handle a qsv_log tool invocation.
+ *
+ * Writes a `u-` prefixed entry to the qsv audit log via `qsv log`.
+ * Logging failures are swallowed — this tool should never break a workflow.
+ */
+export async function handleLogCall(
+  params: Record<string, unknown>,
+  workingDir: string,
+): Promise<{ content: Array<{ type: string; text: string }>; isError?: boolean }> {
+  // Validate required params before coercing
+  if (params.entry_type == null) {
+    return errorResult("entry_type is required.");
+  }
+  if (params.message == null) {
+    return errorResult("message is required.");
+  }
+
+  const entryType = String(params.entry_type);
+  const rawMessage = String(params.message);
+
+  // Validate entry_type
+  if (!LOG_ENTRY_TYPES.has(entryType)) {
+    return errorResult(
+      `Invalid entry_type "${entryType}". Must be one of: ${[...LOG_ENTRY_TYPES].join(", ")}`,
+    );
+  }
+
+  // Validate message
+  if (rawMessage.trim().length === 0) {
+    return errorResult("message must be a non-empty string.");
+  }
+
+  // Trim, strip newlines, and truncate if needed (use Array.from for Unicode-safe truncation)
+  const sanitized = rawMessage.trim().replace(/[\r\n]+/g, " ");
+  // Fast path: if UTF-16 length is within limit, codepoint count is too
+  let message: string;
+  if (sanitized.length <= MAX_LOG_MESSAGE_LEN) {
+    message = sanitized;
+  } else {
+    const codepoints = Array.from(sanitized);
+    message =
+      codepoints.length > MAX_LOG_MESSAGE_LEN
+        ? codepoints.slice(0, MAX_LOG_MESSAGE_LEN).join("")
+        : sanitized;
+  }
+
+  const logId = `u-${randomUUID()}`;
+
+  try {
+    await runQsvSimple(config.qsvBinPath, [
+      "log",
+      "qsv_log",
+      logId,
+      `[${entryType}] ${message}`,
+    ], {
+      timeoutMs: 5_000,
+      cwd: workingDir,
+    });
+  } catch (err) {
+    const errMsg = getErrorMessage(err);
+    console.error(`[qsv_log] write failed: ${errMsg}`);
+    return successResult(`Log write failed (non-fatal): ${errMsg.slice(0, 100)}. Workflow continues.`);
+  }
+
+  return successResult(`Logged ${entryType} entry.`);
+}
diff --git a/.claude/skills/tests/deferred-loading.test.ts b/.claude/skills/tests/deferred-loading.test.ts
@@ -24,6 +24,7 @@ const CORE_TOOLS = [
   "qsv_set_working_dir",
   "qsv_get_working_dir",
   "qsv_list_files",
+  "qsv_log",
   "qsv_command",
   "qsv_to_parquet",
   "qsv_index",
@@ -34,8 +35,8 @@ const CORE_TOOLS = [
 // Core Tools Count Verification
 // ============================================================================
 
-test('CORE_TOOLS has exactly 9 tools', () => {
-  assert.strictEqual(CORE_TOOLS.length, 9, 'Should have exactly 9 core tools');
+test('CORE_TOOLS has exactly 10 tools', () => {
+  assert.strictEqual(CORE_TOOLS.length, 10, 'Should have exactly 10 core tools');
 });
 
 test('CORE_TOOLS includes all required utility tools', () => {
@@ -45,6 +46,7 @@ test('CORE_TOOLS includes all required utility tools', () => {
     'qsv_set_working_dir',
     'qsv_get_working_dir',
     'qsv_list_files',
+    'qsv_log',
     'qsv_command',
     'qsv_to_parquet',
     'qsv_index',
diff --git a/.claude/skills/tests/mcp-server.test.ts b/.claude/skills/tests/mcp-server.test.ts
@@ -51,14 +51,15 @@ const CORE_TOOLS = [
   "qsv_set_working_dir",
   "qsv_get_working_dir",
   "qsv_list_files",
+  "qsv_log",
   "qsv_command",
   "qsv_to_parquet",
   "qsv_index",
   "qsv_stats",
 ] as const;
 
-test("CORE_TOOLS has exactly 9 entries", () => {
-  assert.strictEqual(CORE_TOOLS.length, 9);
+test("CORE_TOOLS has exactly 10 entries", () => {
+  assert.strictEqual(CORE_TOOLS.length, 10);
 });
 
 test("CORE_TOOLS all have qsv_ prefix", () => {
diff --git a/.claude/skills/tests/mcp-tools.test.ts b/.claude/skills/tests/mcp-tools.test.ts