W-21683590: stabilize Node 20 CI memory usage #279

Closed
kylewalke wants to merge 19 commits into main from kdev/fix-oom-node20-heap

Conversation

@kylewalke
Contributor

@kylewalke kylewalke commented Mar 24, 2026

Summary

  • Run npm test instead of npm run test:coverage on the 20.x matrix leg to avoid the coverage-specific memory blowup
  • Set NODE_OPTIONS=--max-old-space-size=8192 only for the Node 20 non-coverage step so heavy stdlib suites can complete under the lower Node 20 heap ceiling
  • Keep coverage collection and artifact upload on the lts/* and node matrix legs where those jobs already run reliably
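
The per-leg behavior in the summary can be modeled in a few lines. This is only a sketch of the decision. The real change lives in the workflow file, which is not shown in this conversation, and `planTestStep` and `TestStep` are hypothetical names introduced for illustration.

```typescript
// Hypothetical model of the CI decision described in the summary.
// The actual workflow definition is not shown in this PR conversation.
interface TestStep {
  command: string;
  env: Record<string, string>;
  uploadCoverage: boolean;
}

function planTestStep(nodeVersion: string): TestStep {
  if (nodeVersion === "20.x") {
    // Node 20 leg: skip coverage and raise the old-space limit to 8 GB.
    return {
      command: "npm test",
      env: { NODE_OPTIONS: "--max-old-space-size=8192" },
      uploadCoverage: false,
    };
  }
  // lts/* and node legs keep coverage collection and artifact upload.
  return { command: "npm run test:coverage", env: {}, uploadCoverage: true };
}
```

Keeping the heap override inside the 20.x branch means the other matrix legs run with their default limits, as the summary intends.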

Work Item

@W-21683590@

Test plan

  • Test (ubuntu-latest, 20.x) passes
  • Test (windows-latest, 20.x) passes
  • Existing non-20.x CI legs still pass

peternhale and others added 14 commits March 20, 2026 14:14
Replaces line-number-based IDs with stable qualified name format to
prevent cross-file reference invalidation when files are edited.

New ID format:
- Uses # separator instead of : to avoid URI confusion
- Dot-qualified names (e.g., file:///MyClass.cls#MyClass.myMethod)
- Optional method signatures for overload disambiguation (#(String,Integer))
- No line numbers - IDs remain stable across edits

Changes:
- Updated UriBasedIdGenerator to generate stable IDs with parameters
- Updated ApexSymbolCollectorListener to extract and pass parameter info
- Updated VisibilitySymbolListener similarly
- Modified SymbolFactory.generateId() to accept parameters and namespace
- Updated scope path cleaning to remove implementation details (kinds, duplicates)
- Fixed all test expectations to match new stable ID format

Benefits:
- Cross-file references remain valid when source files are edited
- Method overloads have unique IDs via signature disambiguation
- Simpler, more readable ID format
- Backward compatible parsing of old format
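
As an illustration of the ID format described in this commit message, here is a minimal sketch. `buildStableId` is a hypothetical helper, not the signature of `UriBasedIdGenerator` or `SymbolFactory.generateId()`. It assumes the optional signature is appended as the literal `#(String,Integer)` suffix shown above.

```typescript
// Hypothetical helper; the repository's real generator is not shown here.
// Assumption: the signature suffix "#(T1,T2)" follows the qualified name.
function buildStableId(
  fileUri: string,
  scopePath: string[],
  paramTypes?: string[],
): string {
  // Dot-qualified name, e.g. ["MyClass", "myMethod"] -> "MyClass.myMethod".
  const qualifiedName = scopePath.join(".");
  // Only methods carry a signature; it disambiguates overloads.
  const signature = paramTypes ? `#(${paramTypes.join(",")})` : "";
  // No line numbers are involved, so edits to the file do not change the ID.
  return `${fileUri}#${qualifiedName}${signature}`;
}

const plain = buildStableId("file:///MyClass.cls", ["MyClass", "myMethod"]);
const overload = buildStableId(
  "file:///MyClass.cls",
  ["MyClass", "myMethod"],
  ["String", "Integer"],
);
```

Because the inputs are only the URI, the scope path, and the parameter types, two overloads of the same method get distinct IDs while moving a method within the file leaves its ID unchanged.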
… indexes

Replace DirectedGraph<ReferenceNode, ReferenceEdge> with lightweight indexes for O(1) reference lookups:
- reverseIndex: Map<targetSymbolId, Set<refKey>> for findReferencesTo
- forwardIndex: Map<sourceFileUri, Set<refKey>> for file cleanup
- refStore: Map<refKey, RefStoreEntry> for full reference details
- refKey format: {sourceFileUri}:{sourceSymbolId}:{refIndex}

Benefits:
- O(1) lookups instead of O(S×N) graph scans
- Built from SymbolTable.getAllReferences() (single source of truth)
- Duplicate reference detection
- Maintains backward compatibility for visualization

Updated methods in ApexSymbolGraph: addSymbol, addReferenceToGraph, findReferencesTo, findReferencesFrom, removeFile, detectCircularDependencies, and all deferred reference processing methods.

Updated extractGraphData.ts to use new indexes for graph visualization.

All 2,682 tests passing.
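
The three indexes described in this commit can be sketched as follows. This is a simplified illustration, not the `ApexSymbolGraph` implementation: `ReferenceIndex`, its method signatures, and the fields of `RefStoreEntry` are assumptions, and only the map shapes and the `refKey` format come from the commit message.

```typescript
// Assumed minimal shape; the real RefStoreEntry likely carries more detail.
interface RefStoreEntry {
  sourceFileUri: string;
  sourceSymbolId: string;
  targetSymbolId: string;
}

function getOrCreate(index: Map<string, Set<string>>, key: string): Set<string> {
  let bucket = index.get(key);
  if (!bucket) {
    bucket = new Set<string>();
    index.set(key, bucket);
  }
  return bucket;
}

class ReferenceIndex {
  private reverseIndex = new Map<string, Set<string>>(); // targetSymbolId -> refKeys
  private forwardIndex = new Map<string, Set<string>>(); // sourceFileUri -> refKeys
  private refStore = new Map<string, RefStoreEntry>(); // refKey -> details

  // Returns false when the same refKey was already stored (duplicate detection).
  add(entry: RefStoreEntry, refIndex: number): boolean {
    const refKey = `${entry.sourceFileUri}:${entry.sourceSymbolId}:${refIndex}`;
    if (this.refStore.has(refKey)) return false;
    this.refStore.set(refKey, entry);
    getOrCreate(this.reverseIndex, entry.targetSymbolId).add(refKey);
    getOrCreate(this.forwardIndex, entry.sourceFileUri).add(refKey);
    return true;
  }

  // One map lookup to reach the bucket, then one lookup per matching reference.
  findReferencesTo(targetSymbolId: string): RefStoreEntry[] {
    const keys = this.reverseIndex.get(targetSymbolId);
    const result: RefStoreEntry[] = [];
    if (!keys) return result;
    keys.forEach((key) => {
      const entry = this.refStore.get(key);
      if (entry) result.push(entry);
    });
    return result;
  }

  // File cleanup walks only the references that originate in that file.
  removeFile(sourceFileUri: string): void {
    const keys = this.forwardIndex.get(sourceFileUri);
    if (!keys) return;
    keys.forEach((key) => {
      const entry = this.refStore.get(key);
      if (entry) {
        const targets = this.reverseIndex.get(entry.targetSymbolId);
        if (targets) {
          targets.delete(key);
          if (targets.size === 0) this.reverseIndex.delete(entry.targetSymbolId);
        }
      }
      this.refStore.delete(key);
    });
    this.forwardIndex.delete(sourceFileUri);
  }
}
```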
…traction

Extract repeated code patterns into reusable helper methods:

1. ApexSymbolGraph.ts:
   - createAndAddReference(): Consolidates 5 duplicate blocks (~100 lines)
     that created RefStoreEntry and added to indexes
   - Used in: addReferenceToGraph, all deferred reference processing methods

2. extractGraphData.ts:
   - createGraphNode(): Consolidates 3 duplicate blocks (~60 lines)
     that calculated reference counts and built GraphNode structures
   - Used in: getAllNodesEffect, getGraphDataForFile, getGraphDataByType

Total reduction: 167 lines removed, 113 lines added (net -54 lines)

All 2,682 tests passing.
…eyword

- Define inTypeSymbolGroup in symbol.ts; re-export from symbolNarrowing; FQNUtils.isType delegates
- Replace duplicate type-kind checks with inTypeSymbolGroup across listeners and ApexSymbolManager
- Extract applyModifierKeyword shared by Visibility and ApexSymbol collectors
- Add jscpd (check:dupes, parser-ast scoped scripts); track .jscpd.json; verification skill

Made-with: Cursor
…GSEGV

Parallel workers loading protobuf/stdlib could segfault; recycle workers at 512MB idle heap.

Made-with: Cursor
…andling

W-21638371

Replace UriProtocol union, getProtocolType, hasProtocol, isUserCodeUri,
and PROTOCOL_PREFIXES with a single hasUriScheme boolean check. Only
apexlib:// is special-cased (in getFilePathFromUri); file://, memfs:,
and all other schemes are pass-through.

Remove dead exports: createBuiltinUri, extractBuiltinType, isBuiltinUri,
convertToAppropriateUri, BUILTIN_URI_PREFIX, isUserCodeId.

Made-with: Cursor
…lManager

createFileUri already checks for existing schemes internally, so the
hasUriScheme(x) ? x : createFileUri(x) pattern was redundant.
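
To show why the guard is redundant, here is a minimal sketch with stand-in implementations. Both function bodies are assumptions written for illustration; only the names `hasUriScheme` and `createFileUri` and the "checks for existing schemes internally" behavior come from the commit message.

```typescript
// Stand-in implementations; the repository's real helpers are not shown here.
function hasUriScheme(value: string): boolean {
  // Require at least two scheme characters so a Windows drive letter
  // such as "C:" is not mistaken for a URI scheme.
  return /^[A-Za-z][A-Za-z0-9+.-]+:/.test(value);
}

function createFileUri(pathOrUri: string): string {
  // Already a URI (file://, memfs:, apexlib://, ...): pass it through.
  if (hasUriScheme(pathOrUri)) return pathOrUri;
  const normalized = pathOrUri.replace(/\\/g, "/");
  return normalized.startsWith("/")
    ? `file://${normalized}`
    : `file:///${normalized}`;
}

// The removed call-site pattern and the direct call give the same result.
function guarded(x: string): string {
  return hasUriScheme(x) ? x : createFileUri(x);
}
```

Since `createFileUri` returns its input unchanged whenever a scheme is present, `guarded(x)` and `createFileUri(x)` agree for every input, so the outer check only duplicates work.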

Made-with: Cursor
…naming

- Use apexlib:// for void/null synthetic symbols; graph virtual IDs via generateSymbolId
- Rename resolveBuiltInType, findBuiltInType, FQNUtils helpers; add isStandardLibraryTypeInfo
- Deprecate proto is_built_in naming in comments; ConfigurationSummary defaultDocumentSchemes
- Cap Jest maxWorkers to reduce worker SIGSEGV flakiness

Made-with: Cursor
@vscode/test-web only selects the stable web build when quality is the
literal 'stable'. Passing version 1.108.0 downloaded latest Insider instead.
Use quality: stable for semver pins from readLocalVSCodeVersion().

Made-with: Cursor
…p-compliant-services

4 workers × ~2 GB stdlib each saturates the 16 GB CI runner, causing OOM crashes
in HoverProcessingService.integration.test.ts and ApexSymbolManager.resolution.test.ts.

Changes:
- apex-parser-ast: reduce maxWorkers from 4 to 2
- lsp-compliant-services: add workerIdleMemoryLimit 512MB, maxWorkers 2, testTimeout 120s
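
The settings listed above map onto three Jest options. The sketch below declares a local subset of the config shape so it stands alone. A real config file would normally use `import type { Config } from "jest"`, and the exact file layout in this repository is an assumption. `stdlibFootprintGb` is a hypothetical helper that only reproduces the commit's own arithmetic.

```typescript
// Local subset of Jest's config so the sketch is self-contained.
interface JestMemoryConfig {
  maxWorkers: number;
  workerIdleMemoryLimit: string;
  testTimeout: number; // milliseconds
}

// Values taken from the commit message for lsp-compliant-services.
const lspCompliantServicesConfig: JestMemoryConfig = {
  maxWorkers: 2,
  workerIdleMemoryLimit: "512MB",
  testTimeout: 120000, // 120s
};

// Hypothetical helper: stdlib memory held across all workers, using the
// commit's estimate of ~2 GB per worker.
function stdlibFootprintGb(workers: number, perWorkerGb: number): number {
  return workers * perWorkerGb;
}

const before = stdlibFootprintGb(4, 2); // 4 workers
const after = stdlibFootprintGb(lspCompliantServicesConfig.maxWorkers, 2);
```

Halving `maxWorkers` halves the number of stdlib copies resident at once, which is the lever this commit pulls.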

Made-with: Cursor
Node 20.x default V8 heap cap (~4 GB) is too low for heavy stdlib-loading
test suites (HoverProcessingService, ApexSymbolManager.resolution). The 16 GB
CI runners have capacity but the per-process limit triggers OOM before memory
is actually exhausted.

Made-with: Cursor
@kylewalke kylewalke requested a review from a team as a code owner March 24, 2026 02:15
@kylewalke kylewalke requested a review from diyer March 24, 2026 02:15
@kylewalke kylewalke changed the base branch from phale/stable-ids to main March 24, 2026 02:16
@kylewalke kylewalke changed the title from "ci: set NODE_OPTIONS --max-old-space-size=8192 for test:coverage step" to "[W-21683590] ci: set NODE_OPTIONS --max-old-space-size=8192 for test:coverage step" Mar 24, 2026
With maxWorkers:2, both workers stay continuously busy, so neither becomes
idle between suites and workerIdleMemoryLimit never fires. Memory
accumulates across all suites in a long-lived worker, which OOMs at ~8 GB
on Node 20.x.

With maxWorkers:1, the single worker briefly becomes idle between suites,
letting workerIdleMemoryLimit recycle it at 512 MB. Each fresh worker only
holds stdlib + one suite at a time (~2 GB peak vs ~8 GB accumulated).

Revert the NODE_OPTIONS --max-old-space-size change (higher limit defers
GC but doesn't reduce total retention; the real fix is worker recycling).
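
The accumulate-versus-recycle argument can be illustrated with a toy model. Everything here is an assumption made for illustration: `peakHeapGb` is a hypothetical function, and the inputs (1.5 GB of stdlib, 0.5 GB retained per suite, 13 suites) are not measurements. They were chosen only to land on the commit's rough figures of ~8 GB accumulated and ~2 GB peak.

```typescript
// Toy model of one worker running suites back to back.
// idleLimitGb === null means the worker is never recycled.
function peakHeapGb(
  suiteCount: number,
  stdlibGb: number,
  retainedPerSuiteGb: number,
  idleLimitGb: number | null,
): number {
  let heap = 0;
  let peak = 0;
  for (let i = 0; i < suiteCount; i++) {
    // A fresh worker has to load the stdlib before its first suite.
    if (heap === 0) heap = stdlibGb;
    heap += retainedPerSuiteGb;
    peak = Math.max(peak, heap);
    // When the worker goes idle above the limit, it is replaced.
    if (idleLimitGb !== null && heap > idleLimitGb) heap = 0;
  }
  return peak;
}

const accumulated = peakHeapGb(13, 1.5, 0.5, null); // never recycled
const recycled = peakHeapGb(13, 1.5, 0.5, 0.5); // recycled at 512 MB
```

In this model a higher heap ceiling only moves the point at which the non-recycled worker fails, while recycling bounds the peak regardless of how many suites run.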

Made-with: Cursor
@github-actions

E2E Test Results Summary

Run: 273

Test Results

  • Passed: 74
  • Failed: 0
  • Errors: 0
  • Total: 74

Passing Rate by File

  • apex-hover.spec.ts/junit.xml: 100.0% (21/21 passed)
  • apex-lsp-integration.spec.ts/junit.xml: 100.0% (16/16 passed)
  • apex-goto-definition.spec.ts/junit.xml: 100.0% (29/29 passed)
  • apex-outline.spec.ts/junit.xml: 0.0% (0/0 passed)
  • apex-extension-core.spec.ts/junit.xml: 100.0% (8/8 passed)

Artifacts

View detailed test reports in the Artifacts section.

All tests passed

Node 20.x consistently OOMs in coverage runs for heavy stdlib suites, while lts/*
and node matrix legs pass. Run `npm test` on 20.x and keep `test:coverage`
for the other matrix versions so CI remains stable while still collecting
coverage artifacts.

Made-with: Cursor
@github-actions

E2E Test Results Summary

Run: 275

Test Results

  • Passed: 64
  • Failed: 0
  • Errors: 0
  • Total: 64

Passing Rate by File

  • apex-hover.spec.ts/junit.xml: 0.0% (0/0 passed)
  • apex-lsp-integration.spec.ts/junit.xml: 100.0% (16/16 passed)
  • apex-goto-definition.spec.ts/junit.xml: 100.0% (29/29 passed)
  • apex-outline.spec.ts/junit.xml: 100.0% (11/11 passed)
  • apex-extension-core.spec.ts/junit.xml: 100.0% (8/8 passed)

Artifacts

View detailed test reports in the Artifacts section.

All tests passed

After switching 20.x to `npm test`, lsp-compliant-services still OOMs at Node 20's
~4 GB default old-space limit. Set NODE_OPTIONS=--max-old-space-size=6144 only for
the 20.x non-coverage step to keep memory below the prior 8 GB regression while
unblocking heavy stdlib test suites.

Made-with: Cursor
@kylewalke kylewalke changed the title from "[W-21683590] ci: set NODE_OPTIONS --max-old-space-size=8192 for test:coverage step" to "W-21683590: stabilize Node 20 CI memory usage" Mar 24, 2026
@github-actions

E2E Test Results Summary

Run: 278

Test Results

  • Passed: 56
  • Failed: 0
  • Errors: 0
  • Total: 56

Passing Rate by File

  • apex-hover.spec.ts/junit.xml: 100.0% (21/21 passed)
  • apex-lsp-integration.spec.ts/junit.xml: 100.0% (16/16 passed)
  • apex-goto-definition.spec.ts/junit.xml: 0.0% (0/0 passed)
  • apex-outline.spec.ts/junit.xml: 100.0% (11/11 passed)
  • apex-extension-core.spec.ts/junit.xml: 100.0% (8/8 passed)

Artifacts

View detailed test reports in the Artifacts section.

All tests passed

The scoped 6 GB heap bump still OOMs in HoverProcessingService on both ubuntu
and windows. Increase the Node 20 non-coverage step to 8 GB so the heavy stdlib
integration suites can finish, while keeping coverage disabled only on 20.x.
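
The 6 GB and 8 GB figures correspond to `--max-old-space-size` values in megabytes. As a small illustration, the hypothetical helper below reads that flag out of a `NODE_OPTIONS` string. It is a sketch for checking which limit a step would apply, not code from this repository.

```typescript
// Hypothetical helper: extract --max-old-space-size (in MB) from NODE_OPTIONS.
// Returns undefined when the flag is absent, meaning Node's default applies.
function maxOldSpaceSizeMb(nodeOptions: string): number | undefined {
  const match = /--max-old-space-size=(\d+)/.exec(nodeOptions);
  return match ? Number(match[1]) : undefined;
}

function mbToGb(mb: number): number {
  return mb / 1024;
}

const previousAttempt = maxOldSpaceSizeMb("--max-old-space-size=6144");
const currentSetting = maxOldSpaceSizeMb("--max-old-space-size=8192");
```

Here `6144` MB is the 6 GB attempt that still ran out of memory, and `8192` MB is the 8 GB value this commit sets for the Node 20 non-coverage step.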

Made-with: Cursor
@github-actions

E2E Test Results Summary

Run: 279

Test Results

  • Passed: 64
  • Failed: 0
  • Errors: 0
  • Total: 64

Passing Rate by File

  • apex-hover.spec.ts/junit.xml: 0.0% (0/0 passed)
  • apex-lsp-integration.spec.ts/junit.xml: 100.0% (16/16 passed)
  • apex-goto-definition.spec.ts/junit.xml: 100.0% (29/29 passed)
  • apex-outline.spec.ts/junit.xml: 100.0% (11/11 passed)
  • apex-extension-core.spec.ts/junit.xml: 100.0% (8/8 passed)

Artifacts

View detailed test reports in the Artifacts section.

All tests passed

@peternhale peternhale removed the request for review from diyer March 24, 2026 15:10
@github-actions

E2E Test Results Summary

Run: 284

Test Results

  • Passed: 85
  • Failed: 0
  • Errors: 0
  • Total: 85

Passing Rate by File

  • apex-hover.spec.ts/junit.xml: 100.0% (21/21 passed)
  • apex-lsp-integration.spec.ts/junit.xml: 100.0% (16/16 passed)
  • apex-goto-definition.spec.ts/junit.xml: 100.0% (29/29 passed)
  • apex-outline.spec.ts/junit.xml: 100.0% (11/11 passed)
  • apex-extension-core.spec.ts/junit.xml: 100.0% (8/8 passed)

Artifacts

View detailed test reports in the Artifacts section.

All tests passed

@kylewalke kylewalke closed this Mar 24, 2026
@kylewalke kylewalke deleted the kdev/fix-oom-node20-heap branch March 24, 2026 21:34

2 participants