fix: prevent infinite recursion in embedSingle() for CJK text #215

Closed
RooikeCAO wants to merge 1 commit into CortexReach:master from RooikeCAO:fix/embedding-infinite-recursion-cjk

Conversation

@RooikeCAO
Contributor

Summary

  • Fix infinite recursion in embedSingle() when smartChunk() returns a single chunk equal to input size
  • Add CJK-aware chunk sizing to handle token/character ratio mismatch
  • Add recursion depth limit, global timeout, and auto-recall query truncation

Closes #214

Problem

When a large CJK text file (14KB+ Chinese .md) is processed by auto-recall, embedSingle() enters an infinite recursion loop producing ~50,000 embedding errors in 12 minutes, blocking the Node.js event loop and making all agents unresponsive.

Root cause: smartChunk() treats the token limit (8192) as a character limit, but CJK characters use 2-3 tokens each. A chunk of 5740 Chinese characters ≈ 14,000 tokens, still exceeding the 8192 token limit. Since the chunk is already ≤ maxChunkSize in characters, the chunker returns 1 chunk identical to the input → embedSingle() recurses infinitely.
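The character/token mismatch is easy to see with a rough estimate. The multipliers below are illustrative approximations for demonstration, not nomic-embed-text's actual tokenizer:

```typescript
// Illustrative token estimate: assume a CJK character costs ~2.5 tokens
// and ASCII averages ~4 characters per token. These multipliers are
// approximations, not the model's real tokenizer.
const CJK_RE = /[\u4e00-\u9fff\u3040-\u30ff\uac00-\ud7af]/;

function estimateTokens(text: string): number {
  let tokens = 0;
  for (const ch of text) tokens += CJK_RE.test(ch) ? 2.5 : 0.25;
  return Math.ceil(tokens);
}

const maxChunkChars = 5740; // 70% of the 8192-token limit, measured in *chars*
const chineseDoc = "记".repeat(5740);

console.log(chineseDoc.length <= maxChunkChars); // true  → chunker keeps it whole
console.log(estimateTokens(chineseDoc));         // 14350 → far over 8192 tokens
```

The chunk passes the character check, so the chunker returns it unchanged, yet it is nearly double the model's token budget.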

Changes (3 files, 125 insertions)

src/embedder.ts

  • Recursion depth limit: Add depth parameter to embedSingle(), max depth 3, then force-truncate
  • Single-chunk detection: When smartChunk() returns 1 chunk ≈ same size as input, truncate instead of recursing
  • Global timeout: 10s Promise.race wrapper on embedPassage()/embedQuery()
  • getSafeCharLimit(): Returns safe character count per model (2300 for nomic-embed-text considering CJK)
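A minimal sketch of how the depth limit and single-chunk detection interact. `doEmbed` and `smartChunk` below are toy stand-ins (the real implementations live in src/embedder.ts and src/chunker.ts), and `SAFE_CHAR_LIMIT` mirrors the 2300-char value the PR describes for getSafeCharLimit():

```typescript
const MAX_EMBED_DEPTH = 3;
const SAFE_CHAR_LIMIT = 2300; // stand-in for getSafeCharLimit()

// Toy embedder: rejects anything over the safe budget.
async function doEmbed(text: string): Promise<number[]> {
  if (text.length > SAFE_CHAR_LIMIT) throw new Error("context exceeded");
  return [text.length]; // placeholder "vector"
}

// Toy chunker reproducing the bug: returns the input unchanged.
function smartChunk(text: string): string[] {
  return [text];
}

async function embedSingle(text: string, depth = 0): Promise<number[]> {
  if (depth >= MAX_EMBED_DEPTH) {
    text = text.slice(0, SAFE_CHAR_LIMIT); // last resort: force-truncate
  }
  try {
    return await doEmbed(text);
  } catch {
    const chunks = smartChunk(text);
    // Single-chunk detection: the chunker handed back (nearly) the same
    // text, so recursing with it would never terminate; shrink instead.
    if (chunks.length === 1 && chunks[0].length >= text.length * 0.9) {
      return embedSingle(text.slice(0, SAFE_CHAR_LIMIT), depth + 1);
    }
    return embedSingle(chunks[0], depth + 1); // simplified: first chunk only
  }
}

embedSingle("语".repeat(5740)).then(v => console.log(v[0])); // prints 2300
```

Without the single-chunk check, the `catch` branch would call `embedSingle` with the identical input forever, which is exactly the reported loop.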

src/chunker.ts

  • CJK-aware sizing: Detect CJK character ratio; when >30%, divide maxChunkSize by 2.5
  • For nomic-embed-text: 5734 chars → 2293 chars (≈5732 tokens, safely under 8192 limit)
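The sizing rule above can be sketched as follows (`cjkRatio` and `effectiveChunkSize` are illustrative names, not necessarily the identifiers used in src/chunker.ts):

```typescript
// Fraction of characters in the CJK Unified Ideographs, Kana, and
// Hangul ranges.
function cjkRatio(text: string): number {
  const cjk = text.match(/[\u4e00-\u9fff\u3040-\u30ff\uac00-\ud7af]/g);
  return (cjk?.length ?? 0) / Math.max(text.length, 1);
}

// When more than 30% of the text is CJK, assume ~2.5 tokens per
// character and shrink the character budget accordingly.
function effectiveChunkSize(maxChunkSize: number, text: string): number {
  return cjkRatio(text) > 0.3 ? Math.floor(maxChunkSize / 2.5) : maxChunkSize;
}

console.log(effectiveChunkSize(5734, "纯中文文本"));  // 2293
console.log(effectiveChunkSize(5734, "plain ASCII")); // 5734
```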

index.ts

  • Auto-recall truncation: Truncate event.prompt to 1000 chars before embedding (only intent needed for memory search, not full attachment text)
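The truncation itself is a one-liner; a sketch, with `QUERY_CHAR_LIMIT` and `recallQuery` as illustrative names rather than the actual identifiers in index.ts:

```typescript
// Only the leading portion of the prompt carries the search intent;
// attachment bodies past this limit are not embedded.
const QUERY_CHAR_LIMIT = 1000;

function recallQuery(prompt: string): string {
  return prompt.length > QUERY_CHAR_LIMIT
    ? prompt.slice(0, QUERY_CHAR_LIMIT)
    : prompt;
}

console.log(recallQuery("x".repeat(5000)).length); // 1000
```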

Test plan

  • 14KB Chinese .md file embedding completes in <10s (was: infinite loop)
  • No "Document exceeded context limit" log flooding after fix
  • Gateway remains responsive during large text embedding
  • Non-CJK text behavior unchanged (chunk sizes identical to before)
  • Auto-recall with truncated query still returns relevant memories
  • Existing test embedder-error-hints.test.mjs passes

🤖 Generated with Claude Code

When a large CJK text (14KB+ Chinese .md file) is processed by
auto-recall, embedSingle() enters an infinite recursion loop because:

1. smartChunk() treats token limits as character limits, but CJK
   characters use 2-3x more tokens than ASCII characters
2. Chunks of 5740 chars (70% of 8192 token limit) still exceed
   the model's token context for CJK text
3. smartChunk() returns 1 chunk identical to input → embedSingle()
   recurses with the same text → infinite loop

This produced ~50,000 embedding errors in 12 minutes, blocking
the entire Node.js event loop and making all agents unresponsive.

Fixes:
- Add recursion depth limit (max 3) to embedSingle() with forced
  truncation as fallback
- Detect single-chunk output (same size as input) and truncate
  instead of recursing
- Add CJK-aware chunk sizing in smartChunk() (divide char limit
  by 2.5 when CJK ratio > 30%)
- Truncate auto-recall query to 1000 chars before embedding
- Add 10s global timeout on embedPassage()/embedQuery()

Closes CortexReach#214

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@rwmjhb
Collaborator

rwmjhb commented Mar 16, 2026

Thanks for chasing this. I agree the bug is real, and the overall direction makes sense. I was able to reproduce the current failure mode on master: with a mocked embedding provider, a ~4000-char Chinese input can trigger smartChunk() returning a single chunk equal to the original input, and embedSingle() then recurses indefinitely. In my local repro the loop issued >1000 embedding requests in ~300ms, so this is definitely worth fixing.

That said, I don’t think we should merge this as-is yet. I see two blocking issues:

  1. The new timeout wrapper in src/embedder.ts is not safe in its current form. Promise.race() + setTimeout() changes the surfaced result, but it does not cancel the underlying embedding request / recursion, and it does not clear the timer. In practice this means the work can continue in the background after the caller sees a timeout, and even successful calls may still be held open by the pending timer. If we want a timeout here, it should be implemented with actual cancellation (AbortController / AbortSignal.timeout() or equivalent cleanup).

  2. The recursion fallback is still not guaranteed to make progress for all supported models. This PR fixes the reported nomic-embed-text + CJK case, but the current depth / safeLimit / chunk-size heuristics still leave failure paths for small-context models. For example, with a 512-token model like all-MiniLM-L6-v2, smartChunk() can still produce a single chunk equal to the input, and the forced truncate path is not guaranteed to shrink enough to break the recursion condition. We need a monotonic fallback: each retry must strictly reduce input length, or fail fast instead of recursing again.
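For reference, one shape both blocking points could take; this is a hedged sketch with hypothetical names (`embedWithTimeout`, `embedSingleSafe`, `embedOnce`), not the PR's actual code. The timeout cancels the underlying request via AbortController and clears its timer, and the retry loop strictly halves the input so termination holds regardless of model context size:

```typescript
// Timeout with real cancellation: aborting the signal cancels the
// underlying request (assuming the provider honors AbortSignal),
// unlike a bare Promise.race() where the loser keeps running.
async function embedWithTimeout(
  embedOnce: (text: string, signal: AbortSignal) => Promise<number[]>,
  text: string,
  ms = 10_000,
): Promise<number[]> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await embedOnce(text, controller.signal);
  } finally {
    clearTimeout(timer); // successful calls are not held open by the timer
  }
}

// Monotonic fallback: every retry strictly shrinks the input, so the
// loop must terminate (or fail fast) for any model context size.
async function embedSingleSafe(
  embed: (text: string, signal: AbortSignal) => Promise<number[]>,
  text: string,
): Promise<number[]> {
  let attempt = text;
  while (attempt.length > 0) {
    try {
      return await embedWithTimeout(embed, attempt);
    } catch {
      const reduced = attempt.slice(0, Math.floor(attempt.length / 2));
      if (reduced.length >= attempt.length) break; // no progress → fail fast
      attempt = reduced; // strictly shorter on every retry
    }
  }
  throw new Error("embedding failed even after strict reduction");
}
```

Node 17.3+ also offers `AbortSignal.timeout(ms)`, which would fold the controller and timer into one call.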

Non-blocking note:

  • I understand the motivation for truncating auto-recall queries in index.ts, but slice(0, 1000) is fairly blunt. If we keep that change, I’d like to see coverage showing it doesn’t noticeably hurt recall quality for long prompts / attachment-heavy inputs.

Before merge, I’d want regression coverage for:

  • nomic-embed-text + long CJK input no longer loops
  • a small-context model (for example all-MiniLM-L6-v2) also terminates correctly
  • successful embeddings are not delayed by the timeout path

So my view is: good bug to fix, good direction, but this needs another revision before merge.

Hi-Jiajun added a commit to Hi-Jiajun/memory-lancedb-pro that referenced this pull request Mar 16, 2026
…e (PR CortexReach#215 follow-up)

This commit addresses the two blocking issues raised in PR CortexReach#215:

1. Timeout now uses AbortController for TRUE request cancellation
   - Timer is properly cleaned up in .finally()
   - AbortSignal is passed through to embedWithRetry

2. Recursion now guarantees monotonic convergence
   - Introduced STRICT_REDUCTION_FACTOR = 0.5
   - Each recursion level must reduce input by 50%
   - Works regardless of model context size

Modified by AI assistant (not human code) based on PR CortexReach#215.
Thanks to the original author and maintainers.

Co-authored-by: Hi-Jiajun <Hi-Jiajun@users.noreply.github.com>
Collaborator

@AliceLJY left a comment


Thanks for the fix — the defense-in-depth approach (CJK-aware chunking + single-chunk detection + depth limit) is well-designed.

However, since PR #238 includes this PR's full content as its first commit plus additional improvements (AbortController timeout, convergence guarantee), we'll review and merge #238 as the replacement. This PR will be closed when #238 lands.

No action needed on your side for this PR.

Hi-Jiajun pushed a commit to Hi-Jiajun/memory-lancedb-pro that referenced this pull request Mar 17, 2026
…ortexReach#238)

- Test single-chunk detection (force-reduce when chunk >= 90% of original)
- Test depth limit termination (depth >= MAX_EMBED_DEPTH throws)
- Test CJK-aware chunk sizing (>30% CJK -> smaller chunks)
- Test strict reduction factor (50% per recursion level)
- Test batch embedding works correctly
@rwmjhb
Collaborator

rwmjhb commented Mar 17, 2026

Closing — the underlying issue is addressed by #226 (ACL/scope filter separation) and #238 (follow-up fixes). Thanks for the contribution!

@rwmjhb rwmjhb closed this Mar 17, 2026
rwmjhb added a commit that referenced this pull request Mar 18, 2026
fix: prevent infinite recursion in embedSingle() for CJK text (replaces PR #215)
AliceLJY pushed a commit to AliceLJY/memory-lancedb-pro that referenced this pull request Mar 19, 2026
…e (PR CortexReach#215 follow-up)

AliceLJY pushed a commit to AliceLJY/memory-lancedb-pro that referenced this pull request Mar 19, 2026
…ortexReach#238)

Papyrus0 pushed a commit to Papyrus0/memory-lancedb-pro-fork that referenced this pull request Mar 20, 2026
…e (PR CortexReach#215 follow-up)

Papyrus0 pushed a commit to Papyrus0/memory-lancedb-pro-fork that referenced this pull request Mar 20, 2026
…ortexReach#238)

Papyrus0 pushed a commit to Papyrus0/memory-lancedb-pro-fork that referenced this pull request Mar 20, 2026
fix: prevent infinite recursion in embedSingle() for CJK text (replaces PR CortexReach#215)


Development

Successfully merging this pull request may close these issues.

[Bug] Infinite recursion in embedSingle() when smartChunk produces single chunk equal to input

3 participants