Skip to content

fix(scripts): preserve manual NOTES block in incremental report generator#1119

Open
carlos-alm wants to merge 6 commits into
mainfrom
fix/incremental-report-preserve-notes
Open

fix(scripts): preserve manual NOTES block in incremental report generator#1119
carlos-alm wants to merge 6 commits into
mainfrom
fix/incremental-report-preserve-notes

Conversation

@carlos-alm
Copy link
Copy Markdown
Contributor

Summary

  • scripts/update-incremental-report.ts regenerates generated/benchmarks/INCREMENTAL-BENCHMARKS.md from scratch on every release, which silently dropped the manual <!-- NOTES_START --> ... <!-- NOTES_END --> block PR docs: update incremental benchmarks (3.9.5) #1008 added to explain why 3.9.5 has null build metrics. After 3.9.6, the version-history table jumped 3.9.4 → 3.9.6 with no context.
  • Extract any NOTES_START/NOTES_END block from the existing file (mirroring how INCREMENTAL_BENCHMARK_DATA is preserved) and re-emit it just before the data comment in the regenerated markdown.
  • Restore the lost 3.9.5 note that was dropped in PR docs: update incremental benchmarks (3.9.6) #1032.
  • Add a regression test under tests/unit/update-incremental-report.test.ts, gated on a new CODEGRAPH_INCREMENTAL_REPORT_PATH env override so the script can target a temp file from tests.

Closes #1042

Test plan

  • npx vitest run tests/unit/update-incremental-report.test.ts — both cases pass (NOTES preserved on roundtrip, no spurious NOTES when absent)
  • npm run lint — no new errors
  • CI green

…ator

scripts/update-incremental-report.ts regenerates the report from scratch
on every release. PR #1008 added a manual <!-- NOTES_START --> ...
<!-- NOTES_END --> block explaining why 3.9.5 had null build metrics,
but the 3.9.6 release silently dropped it because the script had no
awareness of the sentinel pair. The version-history table now jumps
3.9.4 → 3.9.6 with no explanation.

Extract any NOTES block from the existing file (mirroring how
INCREMENTAL_BENCHMARK_DATA is preserved) and re-emit it just before
the data comment in the regenerated markdown. Restore the lost 3.9.5
note and add a regression test gated on the env-overridable report path.

Closes #1042
@claude
Copy link
Copy Markdown

claude Bot commented May 14, 2026

Claude finished @carlos-alm's task —— View job


PR Review: Incremental Report NOTES Preservation

Root Cause Analysis

Problem Being Solved: The scripts/update-incremental-report.ts script was regenerating INCREMENTAL-BENCHMARKS.md from scratch on every release, silently dropping manual NOTES blocks that explained contextual information (specifically, why version 3.9.5 had null build metrics due to timeout issues). This caused loss of important historical context.

Why This Approach: The fix properly mirrors the existing pattern used for preserving INCREMENTAL_BENCHMARK_DATA - it extracts any existing NOTES_START/NOTES_END block from the file and re-emits it during regeneration. This is a minimal, surgical fix that preserves manual annotations without requiring architectural changes.

Risk Assessment: LOW RISK - This is a targeted fix to a specific data loss issue. The changes are additive and follow existing patterns.

Backlog Compliance

  • Zero-dep: ✓ — No new runtime dependencies added
  • Foundation-aligned: ✓ — Preserves data integrity, improves tool reliability
  • Problem-fit: 3 — Prevents data loss that affects documentation quality, indirectly helps maintain context for future development
  • Breaking: No — Purely additive functionality, no behavior changes to existing features
  • Tier: 1

Code Quality Assessment

Changes Analysis:

scripts/update-incremental-report.ts (lines 36-48, 163):

  • Good: Follows the exact same pattern as existing INCREMENTAL_BENCHMARK_DATA extraction
  • Good: Clean regex pattern matches existing style: /[\s\S]*?/
  • Good: Proper placement logic - NOTES appear before data comment as intended
  • Good: Environment override via CODEGRAPH_INCREMENTAL_REPORT_PATH enables testability

generated/benchmarks/INCREMENTAL-BENCHMARKS.md (lines 91-93):

tests/unit/update-incremental-report.test.ts (new file):

  • Good: Comprehensive test coverage for both positive and negative cases
  • Good: Proper test isolation with temp directories
  • Good: Tests roundtrip preservation and verifies no spurious NOTES creation
  • Good: Uses environment override properly for testing
  • Good: Node.js compatibility flag usage (--experimental-strip-types)

Security & Performance:

  • Security: No injection vulnerabilities - uses safe regex and string operations
  • Performance: Minimal impact - single regex match added to existing file processing
  • Reliability: Graceful failure - if regex doesn't match, simply no notes preserved (fail-safe)

Minor Observations:

  1. Line 48: The regex correctly handles multiline content with [\s\S]*? (non-greedy)
  2. Line 163: Conditional emission if (notesBlock) md += ... properly handles empty case
  3. The test uses a practical approach with execFileSync to test the actual script behavior

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: This is a well-executed fix for a real data loss issue. The implementation follows existing patterns, includes comprehensive tests, and has minimal risk. The approach correctly preserves manual documentation that was being lost during regeneration. Only minor deduction for being somewhat niche functionality, but it's exactly the right solution for the problem at hand.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 14, 2026

Greptile Summary

This PR restores and preserves the manual <!-- NOTES_START --> ... <!-- NOTES_END --> annotation blocks in the incremental benchmark report, which were silently dropped on each regeneration. It also adds a CODEGRAPH_INCREMENTAL_REPORT_PATH env override so unit tests can target a temp file, and ships three regression tests covering single-block, multi-block, and no-block cases.

  • scripts/update-incremental-report.ts: Uses matchAll() (not match()) to collect all NOTES blocks from the existing file and re-emits them, joined with blank lines, just before the embedded JSON comment on each regeneration.
  • generated/benchmarks/INCREMENTAL-BENCHMARKS.md: Restores the 3.9.5 build-timeout annotation that was lost in PR docs: update incremental benchmarks (3.9.6) #1032.
  • tests/unit/update-incremental-report.test.ts: New unit tests validate NOTES preservation (single and multiple blocks) and the no-NOTES case via the new env-var path override.

Confidence Score: 5/5

Safe to merge — the change is narrowly scoped to extracting and re-emitting NOTES blocks, with no effect on the benchmark data or history logic.

The NOTES extraction uses a well-bounded non-greedy regex with the g flag (required for matchAll), joins all matches in document order, and guards the emit with a truthiness check so nothing is added when no blocks are present. The existing benchmark-data path is untouched. Three targeted unit tests exercise the added behaviour end-to-end via the real script subprocess.

No files require special attention.

Important Files Changed

Filename Overview
scripts/update-incremental-report.ts Adds NOTES-block preservation via matchAll() and a CODEGRAPH_INCREMENTAL_REPORT_PATH env override; logic is correct and mirrors the existing benchmark-data extraction pattern.
tests/unit/update-incremental-report.test.ts New test file covers single-block, multi-block, and absent-NOTES cases; uses beforeEach/afterEach correctly for temp-dir cleanup.
generated/benchmarks/INCREMENTAL-BENCHMARKS.md Re-adds the dropped 3.9.5 build-timeout annotation; content is accurate and matches the existing note described in the PR.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Read benchmark JSON\nfrom file arg or stdin] --> B[Resolve reportPath\nenv override or default]
    B --> C{File exists?}
    C -- No --> E[history = empty]
    C -- Yes --> D[Read existing file]
    D --> D1[Extract INCREMENTAL_BENCHMARK_DATA\nJSON via match]
    D --> D2[Extract all NOTES blocks\nvia matchAll]
    D1 --> E
    D2 --> F[notesBlock = blocks joined with blank lines]
    E --> G[Prepend new entry to history]
    F --> G
    G --> H[Build markdown:\nheader + version table\n+ latest summary]
    H --> I{notesBlock non-empty?}
    I -- Yes --> J[Append notesBlock]
    I -- No --> K
    J --> K[Append INCREMENTAL_BENCHMARK_DATA comment]
    K --> L[Write reportPath]
    L --> M[Run regression detection]
Loading

Reviews (7): Last reviewed commit: "Merge branch 'main' into fix/incremental..." | Re-trigger Greptile

Comment thread scripts/update-incremental-report.ts Outdated
Comment on lines +47 to +48
const notesMatch = content.match(/<!--\s*NOTES_START\s*-->[\s\S]*?<!--\s*NOTES_END\s*-->/);
if (notesMatch) notesBlock = notesMatch[0];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Only first NOTES block is silently preserved

.match() returns the first regex match, so if a second <!-- NOTES_START --> ... <!-- NOTES_END --> block is ever added (e.g. to annotate a different anomalous release), it would be silently dropped the next time the script runs. This mirrors a real data-loss scenario: the PR itself was created specifically because a block was silently dropped once before. Consider using .matchAll() and concatenating all matched blocks, or at least documenting the single-block constraint as a comment near this code.

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in d1623f6 — switched to matchAll() and join all matched blocks with blank lines, so any number of NOTES_START/NOTES_END blocks survive regeneration. Added a regression test covering two distinct NOTES blocks (both delimiter pairs preserved, both body texts present after roundtrip).

If multiple NOTES_START/NOTES_END blocks are added to annotate different
anomalous releases, .match() silently dropped all but the first — the same
data-loss class this PR was created to prevent. Switch to .matchAll() and
join with blank lines so every block survives regeneration. Add a test
covering two distinct NOTES blocks.
@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

follow-up: preserve manual NOTES block in incremental benchmark generator

1 participant