[Bug] Infinite recursion in embedSingle() when smartChunk produces single chunk equal to input #214

@RooikeCAO

Description

Plugin Version

v1.1.0-beta.8

OpenClaw Version

v2026.3.8

Bug Description

When a Discord message contains a large text file attachment (a 14 KB+ Chinese .md file), embedSingle() enters an infinite recursion loop, producing ~50,000 failed embedding calls in 12 minutes and completely blocking the Node.js event loop. All agents on the gateway become unresponsive.

Root Cause

The recursion occurs because of a token/character mismatch in smartChunk():

  1. EMBEDDING_CONTEXT_LIMITS["nomic-embed-text"] = 8192 (tokens)
  2. smartChunk() uses this as a character limit: maxChunkSize = 8192 * 0.7 ≈ 5734 chars
  3. For CJK text, each character ≈ 2-3 tokens → 5734 Chinese chars ≈ 14,000 tokens → still exceeds the 8192-token limit
  4. The chunk is already ≤ 5734 chars, so chunkDocument() cannot split it further
  5. smartChunk() returns one chunk identical to the input (log: Split document into 1 chunks)
  6. embedSingle() recursively calls itself with the same text → same failure → infinite loop
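The mismatch in steps 2-3 can be demonstrated with a toy token estimator (the ~2.5 tokens/CJK char and ~0.25 tokens/ASCII char ratios are assumptions for illustration; real tokenizer ratios vary):

```typescript
// Hypothetical token estimator illustrating the char/token mismatch.
// The per-character ratios are assumptions, not the real tokenizer's values.
function estimateTokens(text: string): number {
  let tokens = 0;
  for (const ch of text) {
    // Rough CJK check: the CJK Unified Ideographs block only
    tokens += /[\u4e00-\u9fff]/.test(ch) ? 2.5 : 0.25;
  }
  return Math.ceil(tokens);
}

// A chunk at the "safe" character limit still blows past the token limit:
const maxChunkSize = Math.floor(8192 * 0.7); // ≈ 5734 chars
const cjkChunk = "中".repeat(maxChunkSize);
console.log(estimateTokens(cjkChunk)); // → 14335, well over 8192
```

Under these ratios the same 5734-char chunk of ASCII would be only ~1,434 tokens, which is why the bug never surfaced with English input.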

Three defects:

  1. Token/Character mismatch: smartChunk() treats token limits as character limits. CJK characters use 2-3x more tokens than ASCII.
  2. No recursion depth guard: embedSingle() has no depth parameter to detect and break recursion.
  3. No single-chunk detection: When smartChunk() returns 1 chunk ≈ same size as input, embedSingle() should detect this as "chunking failed" and truncate instead of recursing.
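Defects 2 and 3 can be closed together with a depth guard plus single-chunk detection. A minimal self-contained sketch — all names here (rawEmbed, smartChunk, MAX_DEPTH, HARD_CHAR_LIMIT) are illustrative stand-ins, not the plugin's actual API, and the stubs deliberately reproduce the buggy behavior:

```typescript
const MAX_DEPTH = 3;
const HARD_CHAR_LIMIT = 2000;  // fallback truncation size (assumption)
const FAKE_TOKEN_LIMIT = 8192; // stand-in for the provider's context limit

// Stub embedder: rejects over-limit input, as the real provider does.
async function rawEmbed(text: string): Promise<number[]> {
  if (text.length * 2.5 > FAKE_TOKEN_LIMIT) {
    throw new Error("400 the input length exceeds the context length");
  }
  return [text.length]; // dummy "vector"
}

// Stub chunker reproducing the bug: returns the input as a single chunk.
function smartChunk(text: string): string[] {
  return [text];
}

async function embedSingle(text: string, depth = 0): Promise<number[]> {
  try {
    return await rawEmbed(text);
  } catch (err) {
    if (!(err instanceof Error && err.message.includes("context length"))) {
      throw err; // only retry on context-limit errors
    }
    const chunks = smartChunk(text);
    // Guard: too deep, or chunking returned the input unchanged →
    // truncate and embed rather than recursing forever.
    if (
      depth >= MAX_DEPTH ||
      (chunks.length === 1 && chunks[0].length >= text.length)
    ) {
      return rawEmbed(text.slice(0, HARD_CHAR_LIMIT));
    }
    const vectors = await Promise.all(
      chunks.map((c) => embedSingle(c, depth + 1)),
    );
    return vectors[0]; // real code would average/merge the chunk vectors
  }
}
```

With this guard, an input that smartChunk() cannot split falls through to the truncation path on the first retry instead of looping.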

Expected Behavior

  • Large text inputs should be chunked correctly (considering token/char ratio for CJK)
  • If chunking cannot reduce the text, embedSingle() should truncate and embed (not recurse infinitely)
  • A single embedding operation should not block the event loop for more than 10 seconds
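The third expectation could be enforced with a generic timeout wrapper around the embedding call; a minimal sketch (withTimeout is a hypothetical helper, not the plugin's API):

```typescript
// Rejects if the wrapped promise does not settle within `ms` milliseconds.
// Default of 10s mirrors the expected behavior above.
function withTimeout<T>(promise: Promise<T>, ms = 10_000): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`embedding timed out after ${ms}ms`)),
      ms,
    );
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

Note that this only bounds asynchronous waits (e.g. the HTTP call to Ollama); it cannot interrupt synchronous work that has already blocked the event loop, so it complements rather than replaces the recursion fix.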

Steps to Reproduce

  1. Configure plugin with Ollama nomic-embed-text (8192 token context)
  2. Enable autoRecall: true
  3. Send a message to the agent containing a 14KB+ Chinese text file attachment
  4. Observe gateway.err.log flooding with Document exceeded context limit errors
  5. All agents become unresponsive; only a gateway restart recovers

Error Logs

2026-03-15T02:36:54+08:00 Document exceeded context limit (400 the input length exceeds the context length), attempting chunking...
2026-03-15T02:36:54+08:00 Split document into 1 chunks for embedding
2026-03-15T02:36:54+08:00 Document exceeded context limit (400 the input length exceeds the context length), attempting chunking...
2026-03-15T02:36:54+08:00 Split document into 1 chunks for embedding
... (49,979 times in 12 minutes, ~70/second)

Embedding Provider

  • Provider: Ollama (local)
  • Model: nomic-embed-text (768 dimensions, 8192 token context)
  • URL: http://localhost:11434/v1

OS/Platform

  • macOS Darwin 25.1.0 (Apple Silicon)
  • Node.js 22.21.0

Suggested Fix

We've implemented a local fix with 5 changes across 3 files:

  1. src/embedder.ts: Add depth parameter to embedSingle() (max depth=3, then truncate)
  2. src/embedder.ts: Detect single-chunk output (chunks.length===1 && same size) → truncate instead of recurse
  3. src/embedder.ts: Add 10s timeout wrapper on embedPassage()/embedQuery()
  4. src/chunker.ts: CJK-aware chunk sizing in smartChunk() — divide char limit by 2.5 when CJK ratio > 30%
  5. index.ts: Truncate auto-recall query to 1000 chars before embedding
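The CJK-aware sizing in change 4 could look roughly like this; the 2.5 divisor and 30% threshold come from the fix list above, while cjkRatio() and its Unicode range are illustrative:

```typescript
// Fraction of characters in CJK-adjacent Unicode ranges (rough coverage:
// CJK punctuation through Unified Ideographs, plus compatibility ideographs).
function cjkRatio(text: string): number {
  if (text.length === 0) return 0;
  let cjk = 0;
  for (const ch of text) {
    if (/[\u3000-\u9fff\uf900-\ufaff]/.test(ch)) cjk++;
  }
  return cjk / text.length;
}

// Shrink the character budget when the text is CJK-heavy.
function maxChunkChars(text: string, tokenLimit = 8192): number {
  const base = Math.floor(tokenLimit * 0.7); // existing 70% safety margin
  return cjkRatio(text) > 0.3 ? Math.floor(base / 2.5) : base;
}
```

For pure Chinese text this yields a budget of ~2,293 chars (≈ 5,700 tokens at 2.5 tokens/char), comfortably under the 8192-token limit, while ASCII-dominant text keeps the full 5,734-char budget.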

Will submit a PR with these changes.
