Plugin Version
v1.1.0-beta.8
OpenClaw Version
v2026.3.8
Bug Description
When a Discord message contains a large text file attachment (14KB+ Chinese .md file), embedSingle() enters an infinite recursion loop, producing ~50,000 embedding error calls in 12 minutes and completely blocking the Node.js event loop. All agents on the gateway become unresponsive.
Root Cause
The recursion occurs because of a token/character mismatch in smartChunk():
- EMBEDDING_CONTEXT_LIMITS["nomic-embed-text"] = 8192 (tokens)
- smartChunk() uses this value as a character limit: maxChunkSize = 8192 * 0.7 ≈ 5734 chars
- For CJK text, each character ≈ 2-3 tokens, so 5734 Chinese chars ≈ 14,000 tokens, still exceeding the 8192-token limit
- The chunk is already ≤ 5734 chars, so chunkDocument() cannot split it further
- smartChunk() returns 1 chunk identical to the input (log: Split document into 1 chunks)
- embedSingle() recursively calls itself with the same text → same failure → infinite loop
Three defects:
- Token/character mismatch: smartChunk() treats token limits as character limits; CJK characters use 2-3x more tokens than ASCII.
- No recursion depth guard: embedSingle() has no depth parameter to detect and break recursion.
- No single-chunk detection: when smartChunk() returns 1 chunk of roughly the same size as the input, embedSingle() should treat this as "chunking failed" and truncate instead of recursing.
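The mismatch can be sketched in a few lines of TypeScript. This is illustrative only: estimateTokens, the 2.5 tokens-per-CJK-character ratio, and the 30% threshold are assumptions that mirror the numbers in this report, not the plugin's actual tokenizer.

```typescript
// Sketch only: estimateTokens and the ratios below are assumed values,
// not the plugin's real tokenizer.
const TOKEN_LIMIT = 8192; // nomic-embed-text context, in tokens
const charLimit = Math.floor(TOKEN_LIMIT * 0.7); // 5734 chars, as smartChunk() computes

// Rough estimate: ASCII ≈ 1 token per 4 chars, CJK ≈ 2.5 tokens per char.
function estimateTokens(text: string): number {
  let cjk = 0;
  for (const ch of text) {
    if (/\p{Script=Han}/u.test(ch)) cjk++;
  }
  const other = [...text].length - cjk;
  return Math.ceil(other / 4 + cjk * 2.5);
}

// A CJK-aware character limit (the "divide by 2.5 when CJK ratio > 30%" idea):
function cjkAwareCharLimit(text: string): number {
  const chars = [...text];
  const cjkCount = chars.filter((c) => /\p{Script=Han}/u.test(c)).length;
  return cjkCount / chars.length > 0.3 ? Math.floor(charLimit / 2.5) : charLimit;
}

const chinese = "的".repeat(charLimit); // a chunk exactly at the char limit
console.log(estimateTokens(chinese));    // 14335 tokens: far over the 8192 limit
console.log(cjkAwareCharLimit(chinese)); // 2293 chars ≈ 5733 tokens: fits
```

Under these assumptions, a chunk that passes the character check still overshoots the token budget by roughly 1.75x, which is exactly the failure mode in the logs.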
Expected Behavior
- Large text inputs should be chunked correctly (considering token/char ratio for CJK)
- If chunking cannot reduce the text, embedSingle() should truncate and embed (not recurse infinitely)
- A single embedding operation should not block the event loop for more than 10 seconds
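The 10-second bound could be enforced with a generic timeout wrapper along these lines; withTimeout is a hypothetical helper sketched here, not an existing plugin function.

```typescript
// Hypothetical helper: races the embedding call against a timer so one
// slow/stuck operation cannot hold the event loop's callers indefinitely.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms,
    );
  });
  // Whichever settles first wins; the timer is always cleaned up.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// usage (assumed call site): await withTimeout(embedPassage(text), 10_000, "embedPassage");
```

Note this rejects the caller's await but does not cancel the underlying HTTP request; an AbortController would be needed for true cancellation.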
Steps to Reproduce
- Configure the plugin with Ollama nomic-embed-text (8192-token context)
- Enable autoRecall: true
- Send a message to the agent containing a 14KB+ Chinese text file attachment
- Observe gateway.err.log flooding with Document exceeded context limit errors
- All agents become unresponsive; only a gateway restart recovers
Error Logs
2026-03-15T02:36:54+08:00 Document exceeded context limit (400 the input length exceeds the context length), attempting chunking...
2026-03-15T02:36:54+08:00 Split document into 1 chunks for embedding
2026-03-15T02:36:54+08:00 Document exceeded context limit (400 the input length exceeds the context length), attempting chunking...
2026-03-15T02:36:54+08:00 Split document into 1 chunks for embedding
... (49,979 times in 12 minutes, ~70/second)
Embedding Provider
- Provider: Ollama (local)
- Model: nomic-embed-text (768 dimensions, 8192-token context)
- URL: http://localhost:11434/v1
OS/Platform
- macOS Darwin 25.1.0 (Apple Silicon)
- Node.js 22.21.0
Suggested Fix
We've implemented a local fix with 5 changes across 3 files:
- src/embedder.ts: Add a depth parameter to embedSingle() (max depth = 3, then truncate)
- src/embedder.ts: Detect single-chunk output (chunks.length === 1 && same size) and truncate instead of recursing
- src/embedder.ts: Add a 10s timeout wrapper around embedPassage()/embedQuery()
- src/chunker.ts: CJK-aware chunk sizing in smartChunk(): divide the character limit by 2.5 when the CJK ratio exceeds 30%
- index.ts: Truncate the auto-recall query to 1000 chars before embedding
Will submit a PR with these changes.
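A minimal runnable sketch of the first two changes (depth guard and single-chunk detection). callEmbeddingApi, the character budgets, and the simplified smartChunk are stand-ins chosen to make the control flow executable; only the guard logic mirrors the proposed patch.

```typescript
const MAX_DEPTH = 3;
const API_CHAR_BUDGET = 2000;  // sketch-only stand-in for the real 8192-token limit
const CHUNK_CHAR_LIMIT = 5734; // ≈ 8192 * 0.7, from the report

// Stand-in for the Ollama call: rejects oversized input, returns a fake 1-dim vector.
async function callEmbeddingApi(text: string): Promise<number[]> {
  if (text.length > API_CHAR_BUDGET) throw new Error("input exceeds context length");
  return [text.length];
}

// Faithful to the bug: text under CHUNK_CHAR_LIMIT comes back unsplit,
// even though it can still exceed the API's actual budget.
function smartChunk(text: string): string[] {
  if (text.length <= CHUNK_CHAR_LIMIT) return [text];
  const out: string[] = [];
  for (let i = 0; i < text.length; i += CHUNK_CHAR_LIMIT) out.push(text.slice(i, i + CHUNK_CHAR_LIMIT));
  return out;
}

async function embedSingle(text: string, depth = 0): Promise<number[][]> {
  try {
    return [await callEmbeddingApi(text)];
  } catch {
    // Depth guard (fix 1): after MAX_DEPTH failed attempts, truncate and embed.
    if (depth >= MAX_DEPTH) {
      return [await callEmbeddingApi(text.slice(0, API_CHAR_BUDGET))];
    }
    const chunks = smartChunk(text);
    // Single-chunk detection (fix 2): chunking made no progress, so truncate
    // instead of recursing on identical input.
    if (chunks.length === 1 && chunks[0].length >= text.length) {
      return [await callEmbeddingApi(text.slice(0, API_CHAR_BUDGET))];
    }
    const vectors: number[][] = [];
    for (const c of chunks) vectors.push(...(await embedSingle(c, depth + 1)));
    return vectors;
  }
}
```

Without either guard, the single-chunk path recurses on the same input forever, which is the ~70-calls-per-second loop in the logs above.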