Plugin Version
1.1.0-beta.9
OpenClaw Version
3.13
Bug Description
Summary
When embedding very long CJK-heavy text, memory-lancedb-pro can enter an infinite chunking/retry loop after the provider returns:
400 Invalid 'input': maximum context length is 8192 tokens.
This can quickly cause a request storm.
Environment
- Repo: CortexReach/memory-lancedb-pro
- Version tested: 1.1.0-beta.9
- Branch/commit tested: master / 2ebba8e6b7b65bf38336199384d5ec8690701f6e
- Embedding model: text-embedding-3-small
- Provider: OpenAI-compatible API
- Config:
```json
{
  "embedding": {
    "model": "text-embedding-3-small",
    "chunking": true
  }
}
```
What happens
For some CJK-heavy inputs, the first embedding request correctly fails with a context-length error, and the plugin starts chunking.
However, one of the generated chunks can still exceed the provider's token limit, while smartChunk() returns exactly 1 chunk of the same size when asked to split that chunk again.
That leads to unbounded recursion:
- embed the oversized chunk
- receive 400 maximum context length
- chunk again
- get back exactly 1 chunk of the same size
- recurse again
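The loop above can be simulated with a minimal sketch. All names here are hypothetical (not the plugin's real API), and smartChunk is mocked to mirror the observed fixed-point behavior: it returns a single chunk of the same size as its input.

```typescript
// Minimal simulation of the retry loop (hypothetical names, not the
// plugin's real API). The mock chunker mirrors the observed fixed point:
// one chunk, same length as the input.
const TOKEN_LIMIT = 8192;

function mockSmartChunk(text: string): string[] {
  // Observed fixed point: a single chunk identical in size to the input.
  return [text];
}

function mockEmbed(text: string): void {
  // Rough stand-in for the provider: assume ~1.5 tokens per character for
  // CJK-heavy text (an illustrative ratio, not a real tokenizer).
  const estimatedTokens = Math.ceil(text.length * 1.5);
  if (estimatedTokens > TOKEN_LIMIT) {
    throw new Error("400 Invalid 'input': maximum context length is 8192 tokens.");
  }
}

// Count how many embed/chunk cycles run before hitting a safety cap.
function countRetries(text: string, cap: number): number {
  let attempts = 0;
  let chunk = text;
  while (attempts < cap) {
    attempts++;
    try {
      mockEmbed(chunk);
      return attempts; // success: the loop terminates
    } catch {
      const chunks = mockSmartChunk(chunk);
      chunk = chunks[0]; // same size => identical failure next iteration
    }
  }
  return attempts; // hit the cap: without a guard, this would run forever
}

console.log(countRetries("漢".repeat(5734), 50)); // 50: the cap is reached
```

With the observed 5734-char fixed point, the loop only stops because of the artificial cap, which matches the request storm seen in testing.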
Expected behavior
If chunking does not actually reduce the failing chunk, the plugin should stop and return an error instead of recursively retrying.
Suspected root cause
src/embedder.ts retries chunking on context errors, but there is no guard for:
- chunkResult.chunks.length === 1
- and the chunk is effectively unchanged from the original failing input
This seems especially easy to trigger with CJK-heavy text because character-count heuristics are not a safe proxy for token count.
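Back-of-the-envelope arithmetic makes this concrete. The ratios below are assumptions for illustration only (actual cl100k_base ratios vary by text): English prose is commonly around 4 characters per token, while CJK text often costs 1 to 2 tokens per character.

```typescript
// Illustrative token estimates (assumed ratios, not a real tokenizer):
// ~4 chars/token for English prose, ~1.5 tokens/char for CJK-heavy text.
const contextLimit = 8192;
const chunkChars = 5734; // the observed "irreducible" chunk size

const englishTokens = Math.ceil(chunkChars / 4);  // comfortably under the limit
const cjkTokens = Math.ceil(chunkChars * 1.5);    // over the limit

console.log(englishTokens <= contextLimit); // true
console.log(cjkTokens > contextLimit);      // true
```

So the same 5734-character chunk that would be safe as English prose can plausibly exceed 8192 tokens when CJK-heavy, which is why a character-count chunker can produce a chunk it believes is small enough but the provider rejects.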
Suggested fix
Before recursively retrying chunk embeddings, add a guard like:
- if chunk count is 1
- and the chunk length is unchanged (or not meaningfully reduced)
then abort and throw the original context-length error instead of retrying again.
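A minimal sketch of such a guard follows. All names (`embedWithChunking`, the `embed`/`smartChunk` parameters, `ContextLengthError`) are hypothetical; the plugin's real embedder may be structured differently.

```typescript
// Hypothetical guard before recursing on a failed chunk. `embed` and
// `smartChunk` stand in for the plugin's internals.
class ContextLengthError extends Error {}

async function embedWithChunking(
  text: string,
  embed: (t: string) => Promise<number[]>,
  smartChunk: (t: string) => string[],
): Promise<number[][]> {
  try {
    return [await embed(text)];
  } catch (err) {
    if (!(err instanceof ContextLengthError)) throw err;

    const chunks = smartChunk(text);
    // Guard: if chunking made no progress (single chunk, not meaningfully
    // smaller than the input), abort instead of recursing forever.
    if (chunks.length === 1 && chunks[0].length >= text.length) {
      throw err; // surface the original context-length error
    }

    const results: number[][] = [];
    for (const chunk of chunks) {
      results.push(...(await embedWithChunking(chunk, embed, smartChunk)));
    }
    return results;
  }
}
```

The key property is that recursion only continues when the chunker provably reduced the input, so a fixed-point chunk terminates with the provider's original 400 instead of a request storm.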
Additional note
This is not just a "quota exceeded" symptom. In our testing, the initial trigger is a real provider 400 maximum context length, and the request storm is the secondary effect.
Expected Behavior
If chunking does not actually reduce the failing chunk, the plugin should stop and return an error instead of recursively retrying.
Steps to Reproduce
Reproduction notes
Input that reproduced reliably:
Observed chunking behavior from smartChunk(text, "text-embedding-3-small"):
```json
{"len":12000,"chunkCount":3,"chunkLengths":[5734,5734,1350]}
{"len":5734,"chunkCount":1,"chunkLengths":[5734]}
```
This appears to be the core issue:
- 5734 chars may still exceed the provider's token limit for CJK-heavy text
- but chunking 5734 chars returns the same single 5734-char chunk
- so the embedder retries recursively forever
Error Logs / Screenshots
## Minimal observed behavior
Real provider logs:
```
Document exceeded context limit (400 Invalid 'input': maximum context length is 8192 tokens.), attempting chunking...
Split document into 3 chunks for embedding
Document exceeded context limit (400 Invalid 'input': maximum context length is 8192 tokens.), attempting chunking...
Split document into 1 chunks for embedding
Document exceeded context limit (400 Invalid 'input': maximum context length is 8192 tokens.), attempting chunking...
Split document into 1 chunks for embedding
```
In a 5-second controlled test, this produced:
- 75 embedding requests
- repeated request lengths only among:
- 12000
- 5734
- 1350
Embedding Provider
OpenAI
OS / Platform
Ubuntu