Skip to content

Merge PR for nullcount in stats#1

Merged
jqnatividad merged 4 commits intodathere:masterfrom
scpike:master
Dec 27, 2020
Merged

Merge PR for nullcount in stats#1
jqnatividad merged 4 commits intodathere:masterfrom
scpike:master

Conversation

@jqnatividad
Copy link
Collaborator

@jqnatividad jqnatividad merged commit 1debdbd into dathere:master Dec 27, 2020
jqnatividad added a commit that referenced this pull request Dec 27, 2025
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
jqnatividad added a commit that referenced this pull request Feb 16, 2026
All 24 individual tests pass (all checkmarks). The "FAIL" is a pre-existing vitest configuration issue ("No test suite found") unrelated to my changes — all actual test cases pass.
Changes:
- In `translateSql`, only the first standalone `_t_1` occurrence gets the `AS _t_1` alias; subsequent standalone occurrences get the bare `readExpr` to avoid duplicate alias issues in UNION/subquery contexts
- Added test case for multiple standalone `_t_1` references verifying alias-on-first-only behavior

Address review findings (job 115)

All 274 tests pass. The changes compile cleanly and tests are green.
Changes:
- Include Parquet conversion warning in the error path (not just success), so users see the full chain of failures when both DuckDB and sqlp fail
- Log when Parquet warning cannot be prepended due to unexpected content structure

Address review findings (job 117)

All changes are complete. Build passes and all 274 tests pass.
Changes:
- Fix SQL injection via `compression`/codec parameter: whitelist valid codecs (`zstd`, `snappy`, `gzip`, `lz4`, `uncompressed`) and reject anything else
- Fix SQL injection via `delimiter` parameter: validate exactly 1 character and escape single quotes
- Fix race condition in `ensureParquet()`: add in-memory lock map to deduplicate concurrent conversions for the same file
- Add error checking after each qsv subprocess call in `ensureParquet()`: wrap each step in try/catch with descriptive error messages identifying which step failed
- Verify Parquet output file exists after conversion before returning success

Address review findings (job 118)

No tests reference the codec list. The change is minimal and the build succeeds.
Changes:
- Add `brotli` and `lzo` to `VALID_PARQUET_CODECS` set to match DuckDB's supported compression codecs

Address review findings (job 120)

All changes complete. Build succeeds and all 276 tests pass.
Changes:
- Remove unused `runQsvSimple` import from `duckdb.ts` (finding #1)
- Add trust assumption comment documenting that `sql` parameter comes from trusted MCP agent in `tryDuckDbExecution` (finding #2)
- Clean up partial `.parquet` file on conversion failure in `doParquetConversion` to prevent stale corrupted files (finding #4)
- Detect multi-table SQL queries (`_t_2`, `_t_3`, etc.) and fall back to sqlp preemptively instead of sending untranslated references to DuckDB (finding #5)
- Add tests for multi-table reference detection regex and `_t_2+` non-translation behavior

Address review findings (job 121)

Everything looks clean. Here's a summary:
Changes:
- Fix multi-table sqlp fallback: stop setting `SKIP_INPUT` for multi-table queries so sqlp receives the original input file and can resolve `_t_2+` references natively
- Extract `MULTI_TABLE_PATTERN` regex to a shared constant exported from `duckdb.ts`, eliminating duplication between `mcp-tools.ts` and `duckdb.test.ts`
- Clean up asymmetric brace style in the DuckDB/sqlp interception block into a clean if/else

Address review findings (job 122)

All 276 tests pass. The changes are minimal and focused:
Changes:
- Remove case-insensitive `/i` flag from `MULTI_TABLE_PATTERN` regex since sqlp table references (`_t_N`) are case-sensitive
- Add test cases verifying uppercase `_T_2` and `_T_10` do not match the multi-table pattern

Address review findings (job 124)

All changes applied, build and tests pass (276/276).
Changes:
- Add `.sz` compressed file variants to `isCsvLikeFile()` so Snappy-compressed CSV/TSV/TAB/SSV files are recognized and converted to Parquet (finding 1)
- Move `ensureParquet()` call after `MULTI_TABLE_PATTERN` check to avoid wasted Parquet conversion on multi-table queries that fall through to sqlp (finding 2)
- Restore case-insensitive flag on `MULTI_TABLE_PATTERN` regex for consistency with case-insensitive `_t_1` replacement in `translateSql` (finding 3)
- Update test assertions to expect uppercase `_T_2`/`_T_10` to match the now case-insensitive pattern

Address review findings (job 125)

All 280 tests pass. Here's the summary:
Changes:
- Reverted `MULTI_TABLE_PATTERN` regex to case-sensitive (removed `/i` flag) since sqlp table references (`_t_N`) are case-sensitive and uppercase `_T_2` would cause query errors
- Reverted duckdb tests to assert uppercase `_T_N` does NOT match the pattern
- Exported `isCsvLikeFile` function from `mcp-tools.ts` to enable testing
- Added unit tests for `isCsvLikeFile` covering standard extensions, `.sz` Snappy-compressed variants, case-insensitivity, and non-CSV rejection

Address review findings (job 127)

Changes:
- `getParquetPath` now uses `basename()` for extension matching, preventing false matches when directory paths contain CSV-like extensions (e.g., `/data/CSV_FILES/test.csv`)
- `MULTI_TABLE_PATTERN` made case-insensitive (`/i` flag) to be consistent with the `_t_1` replacement pattern, preventing confusing DuckDB errors when agents send uppercase `_T_2`
- `translateSql` now logs a warning for unknown file extensions (e.g., `.xlsx`) that are treated as CSV, helping diagnose cryptic DuckDB errors
- Secondary SIGKILL timer in `executeDuckDbQuery` is now properly cleared in both `close` and `error` handlers, preventing unnecessary SIGKILL on already-exited processes
- Updated `duckdb.test.ts` to match the case-insensitive `MULTI_TABLE_PATTERN` behavior

Address review findings (job 128)

All changes are complete. TypeScript compiles cleanly and all 286 tests pass.
Changes:
- Fix `MULTI_TABLE_PATTERN` flip-flop: make regex case-sensitive again, add `normalizeTableRefs()` to lowercase `_T_N` → `_t_N` before routing, so sqlp receives consistent lowercase refs
- Extract `CSV_LIKE_EXTENSIONS` shared constant in `duckdb.ts`, reuse in both `isCsvLikeFile()` and `translateSql()` to eliminate duplicated extension lists
- Export `getParquetPath()` and add regression tests for directory-path false-match bug (e.g., `/data/csv_files/test.json`)
- Add tests for `normalizeTableRefs()` and its interaction with `MULTI_TABLE_PATTERN`

Address review findings (job 129)

All 286 tests pass. The diagnostics about `qsv_bin` are pre-existing Rust issues unrelated to these changes.
Changes:
- Added comment explaining `.txt`/`.dat` are warning-only suppressions, not in `CSV_LIKE_EXTENSIONS` (no Parquet conversion)
- Simplified `params.sql` assignment by removing conditional check — always assign normalized SQL

Address review findings (job 131)

All 286 tests pass. Here's the summary:
Changes:
- Remove `params.output` fallback in `tryDuckDbExecution` to use only the pre-validated `outputFile` parameter, closing a potential path validation bypass
- Add clarifying comment to `getParquetPath` explaining why the mixed-case logic (lowercase matching + original-path slicing) is correct
- Add comment to `translateSql` documenting that `_t_1` replacement also applies inside SQL string literals (acceptable limitation for MCP agent usage)

Address review findings (job 133)

All 287 tests pass, 0 failures.
Changes:
- Consolidated `ChildProcess` type import with `spawn` into a single `child_process` import in `duckdb.ts`
- Extracted inline `maxStderrSize` magic number to named constant `MAX_STDERR_SIZE` alongside `DUCKDB_VALIDATION_TIMEOUT_MS`
- Exported `ensureParquet` and added tests for its early-return paths (non-CSV file passthrough)

Address review findings (job 135)

All 287 tests pass.
Changes:
- Made `MULTI_TABLE_PATTERN` case-insensitive (`/i` flag) for defense-in-depth, so uppercase `_T_N` references are caught even without `normalizeTableRefs` being called first
- Fixed TOCTOU race in `ensureParquet` by checking the lock map before any async `stat()` calls
- Added zero-byte file validation after Parquet conversion to catch corrupted/empty output files
- Updated test assertions to match the case-insensitive `MULTI_TABLE_PATTERN` behavior

Address review findings (job 136)

All 287 tests pass. The diagnostics about `qsv_bin` and `lesson_*.rs` are pre-existing issues unrelated to this change.
Changes:
- Restructured zero-byte file validation in `doParquetConversion` to use separate try/catch for `stat()` and a plain `if` for the size check, eliminating brittle string-matching error re-throw logic

Address review findings (job 138)

All changes are complete. Build passes and all 289 tests pass.
Changes:
- Fix `translateSql` to skip `_t_1` replacements inside single-quoted SQL string literals using a quote-aware regex
- Fix `normalizeTableRefs` regex to handle both `_T_N` and `_t_N` variants (was only matching uppercase `_T_`)
- Prefix delimiter validation error with `[DuckDB]` for clearer error attribution
- Document `getParquetPath` behavior for non-CSV files (appends `.parquet` suffix)
- Add tests for string literal protection (`SELECT '_t_1'` preserved) and escaped quotes
- Add test for `normalizeTableRefs` idempotency on already-lowercase refs

Address review findings (job 140)

All 290 tests pass. The changes are minimal and focused on the review findings.
Changes:
- Added comment documenting that multi-table queries (`_t_2+`) don't benefit from Parquet auto-conversion, directing users to manually convert with `qsv_to_parquet`
- Added comment clarifying delimiter validation: actual tab character (`\t`, length 1) works correctly; only the 2-char string literal `\\t` would be rejected
- Added test verifying multi-character delimiter strings are rejected by `translateSql`

Address review findings (job 142)

All 290 tests pass, build succeeds. Here's a summary:
Changes:
- Fix `translateSql` to use `(read_expr) AS _t_1` alias so qualified column refs like `_t_1.column` remain valid (only standalone `_t_1` not followed by `.` is replaced)
- Reuse `translateSql()` in sqlp fallback path instead of duplicating the `_t_1` → `read_parquet()` regex logic inline
- Add early `inputFile` existence check in `doParquetConversion` with clear "Input file not found" error message
- Simplify the stats/schema freshness check by removing unnecessary nested try/catch (input file is now validated upfront)
- Update all test assertions to match the new aliased read expression format
jqnatividad added a commit that referenced this pull request Feb 20, 2026
All 160 frequency tests pass. The diagnostics about `qsv_bin` are unrelated to my changes.
Changes:
- Fixed misleading comment: removed "non-positive" from the safety comment about weight filtering since `debug_assert!(is_finite())` only checks for NaN/Inf, not non-positive values
- Added clarifying comments on descending `partition_point` calls to document that null is placed after entries with equal weight/count (acknowledging the behavior change from the original linear search)

Address review findings (job 259)

All 160 frequency tests pass. Here's a summary of the changes:
Changes:
- Added `#[inline]` back to `weighted_add` to enable inlining into function pointer call sites, mitigating the indirect-call overhead noted in finding #3
- Added comment on `debug_assert!` in `weighted_add` clarifying that upstream validation already filters invalid weights (finding #2)
- Added CHANGELOG entry under `[Unreleased]` documenting the `partition_point` optimization and the **BREAKING** behavioral change in null tie-breaking order for `frequency --null-sorted` (finding #1)

Address review findings (job 257)

All 160 frequency tests pass.
Changes:
- Added `debug_assert` in `weighted_add` to verify weight is finite and positive, making the invariant self-documenting and consistent with the safety comment at line 2244 (the upstream filter at line 2739 already correctly skips non-positive weights before calling `weighted_add`)

Address review findings (job 256)

All 4 tests pass. Build succeeds.
Changes:
- Removed "matching original behavior" from comments on null insertion with ties (it's actually a behavior change from the `position()` to `partition_point()` switch)
- Fixed safety comment to mention non-positive weights are also filtered during accumulation (not just NaN/Inf)
- Removed `#[inline]` from `weighted_add` since it's called through a function pointer and the annotation is misleading

Address review findings (job 252)

All 4 tests pass with exact position assertions.
Changes:
- Replace range assertions with exact position assertions in all 4 tie-breaking tests to match the deterministic `partition_point` tie-breaking policy
- Weighted desc/asc tests now assert null_pos == 2 (exact)
- Unweighted desc test asserts null_pos == 3 (null placed after all tied entries)
- Unweighted asc test asserts null_pos == 1 (null placed before all tied entries), differentiating it from the desc test

Address review findings (job 251)

All 160 frequency tests pass.
Changes:
- Reverted `#[inline(always)]` to `#[inline]` on `weighted_add` since inlining through function pointers isn't guaranteed anyway
- Added 4 tests for null insertion with ties: weighted desc, weighted asc, unweighted desc, unweighted asc — covering the `partition_point` edge case where null has the same count/weight as other entries

Address review findings (job 249)

All 156 frequency tests pass. The diagnostics about `qsv_bin` are pre-existing issues unrelated to this change.
Changes:
- Fix ascending `partition_point` predicate to use `<` instead of `<=` for both weighted and unweighted null insertion, restoring the original tie-breaking behavior where null is placed before entries with equal weight/count

Address review findings (job 247)

All 156 frequency tests pass. The diagnostic warnings about `qsv_bin` are pre-existing and unrelated to our changes.
Changes:
- Add `debug_assert!` to verify sort-order invariant before `partition_point` in weighted null-insertion path
- Add `debug_assert!` to verify sort-order invariant before `partition_point` in unweighted null-insertion path

Address review findings (job 245)

All 156 frequency tests pass.
Changes:
- Add `debug_assert!` checking all weights in `counts_final` are finite, not just the null weight, for `partition_point` correctness (finding #3)
- Add comments noting `field_buffer` borrows are transient and safe to reuse across iterations in both weighted and unweighted ignore-case paths (finding #1)

Address review findings (job 243)

No clippy warnings for frequency.rs. All changes are clean.
Changes:
- Add `debug_assert!(null_weight_val.is_finite())` before weighted `partition_point` calls to guard against NaN float values breaking binary search
- Add safety comment for unweighted `partition_point` noting u64 counts are always finite
- Change `weighted_add` from `#[inline]` to `#[inline(always)]` to ensure inlining in the hot path through function pointers
jqnatividad added a commit that referenced this pull request Feb 24, 2026
All 354 tests pass (1 skipped). Build and tests are clean.
Changes:
- Added comment to `parsePolarsVersion` regex noting that `\d-` pattern is safe given qsv's well-defined version string format, as suggested by review finding #1

Address review findings (job 406)

All 46 tests pass. Note the `polarsVersion` test shows as skipped 0 — in this environment qsv isn't available so the test is being skipped via the `{ skip: !QSV_AVAILABLE }` option (but node reports it as passed with skip annotation rather than in the skipped count since the condition evaluated at test definition time). Let me verify the skip is working correctly.
Actually looking at the output, the test `config.qsvValidation includes polarsVersion when valid` passed in 0.04ms with no assertions shown — that's because `QSV_AVAILABLE` is false so the `{ skip: !QSV_AVAILABLE }` means it's properly skipped. The node test runner shows it as `✔` with skip. All good.
Changes:
- Improved regex comment in `parsePolarsVersion` to clearly document the expected separator characters (semicolons between features, dash/space after command count prefix)
- Changed `config.qsvValidation includes polarsVersion when valid` test to use `{ skip: !QSV_AVAILABLE }` pattern instead of silently passing with an if-guard, making skip visibility consistent with other integration tests

Address review findings (job 409)

All 352 tests pass (1 skipped). The change is minimal and correct.
Changes:
- Renamed misleading test "polars as first feature" to "polars as first feature after version string" to accurately reflect that the regex matches the space before `polars`, not a truly leading `polars` at position 0

Address review findings (job 411)

All 353 tests pass (1 skipped as expected). Build and tests are clean.
Changes:
- Fixed `parsePolarsVersion` regex to use `(?:;|\s)` instead of `(?:;|[-\s])` to prevent false matches on hyphenated prefixes like `non-polars-1.0.0`
- Made end anchor more robust by allowing `(?:;|\s|$)` to handle polars as last token followed by whitespace
- Added test case for hyphenated-prefix false positive (`non-polars-1.0.0` should return null)

Address review findings (job 412)

All 354 tests pass (1 skipped). The build compiles and tests are green. The Rust diagnostics are pre-existing and unrelated to this change.
Changes:
- Added test documenting that `315-polars-0.53.0;self_update` (polars as first feature after dash-separated count) intentionally does not match, with comment explaining this is acceptable since polars is never the first feature alphabetically

Address review findings (job 414)

All 354 tests pass (1 skipped).
Changes:
- Remove redundant `if (validation.polarsVersion)` guard in mcp-server.ts since polarsVersion is always present when valid is true
- Add explicit `valid === true` assertion in integration test before checking polarsVersion

Address review findings (job 415)

Build and all 354 tests pass (1 skipped).
Changes:
- Add nullish coalescing fallback (`?? "not detected"`) to `polarsVersion` log line in `mcp-server.ts` to gracefully handle the case where `polarsVersion` is `undefined`/`null`

Address review findings (job 417)

All 48 tests pass. Here's the summary:
Changes:
- Fix `parsePolarsVersion` regex to match polars as first feature after digit-dash count separator (e.g., `315-polars-0.53.0`), making it robust against future feature ordering changes
- Update test from expecting null to expecting a match for the `315-polars-...` case
- Replace non-null assertion (`polarsVersion!`) with type-safe cast after preceding type check
jqnatividad added a commit that referenced this pull request Feb 24, 2026
Syntax check passed. All changes are complete.
Changes:
- Replace macOS `read -t` fallback with `head -c 65536 | jq` pipeline to fix silent failure when `timeout` is unavailable (finding 1)
- Move version-change note from header preamble to Tool Discovery section where it's more contextually relevant (finding 4)

Address review findings (job 436)

Script passes syntax check. This is a shell script, not a Rust source file, so the standard `cargo build`/`cargo test` commands aren't relevant here. The change is minimal and self-contained.
Changes:
- Added `-n 65536` to `read` builtin in the macOS fallback branch to enforce the same 64KB size limit as the `timeout` branch, addressing the size guard inconsistency

Address review findings (job 435)

Syntax check passed. This is a shell script only — no Rust build or tests needed for these changes.
Changes:
- Use bash `read -t 5` fallback instead of bare `head` on systems without `timeout` (macOS) to prevent indefinite stdin blocking
- Emit diagnostic `additionalContext` message when `CLAUDE_PLUGIN_ROOT` is unset, aiding troubleshooting
- Replace instruction-to-AI phrasing ("Inform the user...") with neutral factual messages in JSON output

Address review findings (job 433)

Script syntax is valid. All three review findings are addressed.
Changes:
- Add `command -v timeout` check with fallback to plain `head` for macOS compatibility (issues #1/#3)
- Guard against empty/unset `CLAUDE_PLUGIN_ROOT` before `cd` to prevent unexpected `$HOME` resolution (issue #2)

Address review findings (job 432)

Script syntax is valid (no output = no errors).
Changes:
- Add 5-second timeout to stdin read to prevent indefinite blocking if no input is provided
- Guard against deploying CLAUDE.md into the plugin's own directory tree
- Replace hardcoded version "v16.1" with version-agnostic wording in cowork template
- Add `stats` (with extended stats) to the memory-intensive commands list in cowork template

Address review findings (job 430)

Script syntax is valid.
Changes:
- Guard `jq` parse of stdin against truncated JSON: redirect stderr and fall back to empty `CWD` on failure, so `set -e` won't abort on malformed input
- Replace `realpath` with POSIX-portable `cd "$CWD" && pwd -P` for symlink resolution, ensuring compatibility on minimal macOS and CI images without GNU coreutils

Address review findings (job 429)

All changes look correct. Since this is a shell script and markdown file (no Rust code changes), there's no build or test to run.
Changes:
- Limit stdin read to 64KB (`head -c 65536`) to prevent hangs on malformed/endless input
- Resolve CWD with `realpath` to prevent path traversal via symlinks
- Add `QSV_NO_COWORK_SETUP=1` env var opt-out mechanism
- Wrap `cp` in error handling to produce a friendly JSON message instead of failing with `set -e`
- Add version note and opt-out instructions to the cowork-CLAUDE.md template header

Address review findings (job 427)

Script syntax is valid (no output means no errors).
Changes:
- Redirect `jq`-missing diagnostic from stderr to stdout so the hook framework can surface it as `additionalContext` to the agent

Address review findings (job 426)

Both files validate correctly. The diagnostic errors about `qsv_bin` are pre-existing and unrelated to these changes.
Changes:
- Use here-string (`<<<`) instead of `echo | jq` to avoid escape sequence mangling in JSON input
- Add `jq` availability check with friendly message instead of cryptic hook error
- Remove misleading `"matcher": "startup"` from SessionStart hook config
- Use `jq -n` to construct output JSON safely, preventing malformed JSON from paths with special characters
- Remove unverified `QSV_MCP_OPERATION_TIMEOUT_MS` / `qsv_config` references from cowork-CLAUDE.md template
jqnatividad added a commit that referenced this pull request Mar 2, 2026
All 421 tests pass, 0 failures. The change is correct.
Changes:
- Fix Unicode truncation fast-path to use UTF-16 length as a cheap guard (strings shorter in UTF-16 are guaranteed shorter in codepoints), only performing expensive `Array.from()` codepoint conversion when the string exceeds the limit

Address review findings (job 606)

All 72 tests pass (0 failures), including the 3 new tests for missing params.
Changes:
- Check for `null`/`undefined` params explicitly before string coercion in `handleLogCall`, returning clear "is required" error messages (finding #1)
- Trim and strip newlines from log messages before writing, preventing multi-line log entries and inconsistent whitespace (findings #2, #3)
- Added tests for missing `entry_type`, missing `message`, and entirely empty params (finding #5)

Address review findings (job 607)

All 418 tests pass, including the new one.
Changes:
- Add test for newline-only message (`'\n\n'`) confirming it's rejected as non-empty string

Address review findings (job 609)

All 74 tests pass (0 failures), including all the new and existing `handleLogCall` tests.
Changes:
- Log `catch` block now writes error details to stderr via `console.error` instead of silently swallowing
- Added `--` separator before the message argument in `qsv log` CLI call to prevent messages starting with `-` from being misinterpreted as flags
- Documented newline collapsing behavior in the tool description ("Newlines are collapsed to spaces")
- Added test for non-string type coercion (`{ entry_type: 123, message: true }`) confirming `String()` coercion behavior

Address review findings (job 610)

All 420 tests pass.
Changes:
- Include truncated error message in the success result returned to the agent (not just stderr), so the agent has actionable context when `qsv_log` write fails
- Add test for non-string message coercion with valid `entry_type` to verify `String()` coercion works for the message path

Address review findings (job 611)

All 420 tests pass. The Rust diagnostics are pre-existing and unrelated to this change.
Changes:
- Added `assert.ok(!result.isError)` to the `handleLogCall` non-string message coercion test to explicitly verify the result is not an error, making the test intent clearer

Address review findings (job 613)

All 420 tests pass, 0 failures. The changes are verified.
Changes:
- Added comment on `--` separator in `handleLogCall` args explaining it guards against messages starting with `-` being parsed as flags (addresses medium finding)
- Added `config.qsvValidation.valid` skip guard to `handleLogCall coerces non-string message` test so it properly tests the success path instead of passing accidentally via error swallowing (addresses low finding #4)
- Added assertion that success response doesn't contain "warning" to confirm actual success vs swallowed error

Address review findings (job 615)

No CLAUDE.md changes needed for the `--` removal. All changes are complete and tests pass.
Changes:
- Remove unnecessary `--` end-of-options sentinel from `qsv log` args — `qsv log` uses docopt variadic `[<message>...]` which handles this correctly, and messages always start with `[entry_type]` so they can never be misinterpreted as flags
- Fix Unicode-safe truncation using `Array.from()` instead of `String.slice()` to avoid splitting surrogate pairs in non-ASCII messages
- Add throttling guidance to server instructions ("Avoid excessive logging — for simple interactions, a single user_prompt + result_summary pair is enough")
- Add test for the `handleLogCall` error-swallowing catch path using a non-existent working directory

Address review findings (job 616)

The change looks correct. The length check and truncation now both operate on codepoints consistently.
Changes:
- Fix Unicode truncation length mismatch: use codepoint count (`Array.from(sanitized).length`) for both the gate condition and the truncation, avoiding inconsistency between UTF-16 `.length` and codepoint-aware `Array.from().slice()`

Address review findings (job 618)

All 421 tests pass, 0 failures. All `handleLogCall` tests pass including the updated write-failure test.
Changes:
- Reworded catch-path message from misleading `"Logged ... (warning: write failed: ...)"` to clearer `"Log write failed (non-fatal): ... Workflow continues."` (issue 1)
- Added fast-path optimization for Unicode truncation: only call `Array.from()` when `sanitized.length > MAX_LOG_MESSAGE_LEN`, avoiding unnecessary codepoint conversion on short messages (issue 3)
- Updated test assertions to match the new error message wording
jqnatividad added a commit that referenced this pull request Mar 2, 2026
* feat(mcp): add qsv_log core tool for agent-initiated reproducibility logging

Enable agents to write structured entries (user_prompt, agent_reasoning,
agent_action, result_summary, note) to the qsv audit log (qsvmcp.log)
with u- prefixed UUIDs, distinct from automatic s-/e- audit entries.
Automatic audit logging is skipped for qsv_log calls to avoid recursion.
Messages are truncated at 4096 chars and logging failures never break
the workflow. Server instructions updated to guide agents on when/how
to log for third-party reproducibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Address review findings (job 619)

All 421 tests pass, 0 failures. The change is correct.
Changes:
- Fix Unicode truncation fast-path to use UTF-16 length as a cheap guard (strings shorter in UTF-16 are guaranteed shorter in codepoints), only performing expensive `Array.from()` codepoint conversion when the string exceeds the limit

Address review findings (job 606)

All 72 tests pass (0 failures), including the 3 new tests for missing params.
Changes:
- Check for `null`/`undefined` params explicitly before string coercion in `handleLogCall`, returning clear "is required" error messages (finding #1)
- Trim and strip newlines from log messages before writing, preventing multi-line log entries and inconsistent whitespace (findings #2, #3)
- Added tests for missing `entry_type`, missing `message`, and entirely empty params (finding #5)

Address review findings (job 607)

All 418 tests pass, including the new one.
Changes:
- Add test for newline-only message (`'\n\n'`) confirming it's rejected as non-empty string

Address review findings (job 609)

All 74 tests pass (0 failures), including all the new and existing `handleLogCall` tests.
Changes:
- Log `catch` block now writes error details to stderr via `console.error` instead of silently swallowing
- Added `--` separator before the message argument in `qsv log` CLI call to prevent messages starting with `-` from being misinterpreted as flags
- Documented newline collapsing behavior in the tool description ("Newlines are collapsed to spaces")
- Added test for non-string type coercion (`{ entry_type: 123, message: true }`) confirming `String()` coercion behavior

Address review findings (job 610)

All 420 tests pass.
Changes:
- Include truncated error message in the success result returned to the agent (not just stderr), so the agent has actionable context when `qsv_log` write fails
- Add test for non-string message coercion with valid `entry_type` to verify `String()` coercion works for the message path

Address review findings (job 611)

All 420 tests pass. The Rust diagnostics are pre-existing and unrelated to this change.
Changes:
- Added `assert.ok(!result.isError)` to the `handleLogCall` non-string message coercion test to explicitly verify the result is not an error, making the test intent clearer

Address review findings (job 613)

All 420 tests pass, 0 failures. The changes are verified.
Changes:
- Added comment on `--` separator in `handleLogCall` args explaining it guards against messages starting with `-` being parsed as flags (addresses medium finding)
- Added `config.qsvValidation.valid` skip guard to `handleLogCall coerces non-string message` test so it properly tests the success path instead of passing accidentally via error swallowing (addresses low finding #4)
- Added assertion that success response doesn't contain "warning" to confirm actual success vs swallowed error

Address review findings (job 615)

No CLAUDE.md changes needed for the `--` removal. All changes are complete and tests pass.
Changes:
- Remove unnecessary `--` end-of-options sentinel from `qsv log` args — `qsv log` uses docopt variadic `[<message>...]` which handles this correctly, and messages always start with `[entry_type]` so they can never be misinterpreted as flags
- Fix Unicode-safe truncation using `Array.from()` instead of `String.slice()` to avoid splitting surrogate pairs in non-ASCII messages
- Add throttling guidance to server instructions ("Avoid excessive logging — for simple interactions, a single user_prompt + result_summary pair is enough")
- Add test for the `handleLogCall` error-swallowing catch path using a non-existent working directory

Address review findings (job 616)

The change looks correct. The length check and truncation now both operate on codepoints consistently.
Changes:
- Fix Unicode truncation length mismatch: use codepoint count (`Array.from(sanitized).length`) for both the gate condition and the truncation, avoiding inconsistency between UTF-16 `.length` and codepoint-aware `Array.from().slice()`

Address review findings (job 618)

All 421 tests pass, 0 failures. All `handleLogCall` tests pass including the updated write-failure test.
Changes:
- Reworded catch-path message from misleading `"Logged ... (warning: write failed: ...)"` to clearer `"Log write failed (non-fatal): ... Workflow continues."` (issue 1)
- Added fast-path optimization for Unicode truncation: only call `Array.from()` when `sanitized.length > MAX_LOG_MESSAGE_LEN`, avoiding unnecessary codepoint conversion on short messages (issue 3)
- Updated test assertions to match the new error message wording

* fix(mcp): address Copilot review findings for qsv_log

- Move skipAuditLog from "Key Constants" to a behavior note in CLAUDE.md
  (it's a local variable, not a module-level constant)
- Reorder enum and LOG_ENTRY_TYPES Set to match description order
  (reasoning before action)
- Add unique temp dir + cleanup to coercion test to prevent log
  file accumulation in OS temp root

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants