Commit 5d27110

fix: correct buffer handling for base64 embedding decoding
- Fixed unsafe buffer handling that could cause dimension truncation
- Use DataView with proper byte order handling for Float32Array conversion
- This prevents reading beyond buffer boundaries and data corruption
- Affects all models using base64 encoding, not just Gemini

The previous implementation used buffer.buffer directly which could:
1. Read from wrong memory locations if buffer was a view
2. Cause dimension truncation for large embeddings (like 3072-dim)
3. Result in incorrect embedding values

Fixes #7348
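
The commit message's concern about a buffer that is a view into a larger ArrayBuffer can be illustrated with a small sketch. It is not taken from the commit and does not claim to reproduce issue #7348. It shows one well-defined failure mode: a typed-array view requires a byte offset that is a multiple of the element size, while a DataView can read at any offset. The names `unalignedBytes`, `typedViewThrows`, and `readWithDataView` are hypothetical, and the unaligned view is built by hand with a plain Uint8Array rather than obtained from `Buffer.from`.

```typescript
// Backing store: 1 padding byte followed by two little-endian float32 values.
const backingStore = new ArrayBuffer(9)
const writer = new DataView(backingStore)
writer.setFloat32(1, 1.5, true)
writer.setFloat32(5, -2.25, true)

// A byte view that starts at offset 1, i.e. not 4-byte aligned.
const unalignedBytes = new Uint8Array(backingStore, 1, 8)

// Creating a Float32Array view at an offset that is not a multiple of 4
// throws a RangeError, so this returns true for the view above.
function typedViewThrows(bytes: Uint8Array): boolean {
	try {
		new Float32Array(bytes.buffer, bytes.byteOffset, bytes.byteLength / 4)
		return false
	} catch (err) {
		return err instanceof RangeError
	}
}

// DataView has no alignment requirement and takes an explicit byte order.
function readWithDataView(bytes: Uint8Array): number[] {
	const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength)
	const values: number[] = []
	for (let i = 0; i + 4 <= bytes.byteLength; i += 4) {
		values.push(view.getFloat32(i, true)) // true = little-endian
	}
	return values
}

const viewFailed = typedViewThrows(unalignedBytes)
const floats = readWithDataView(unalignedBytes)
```

Both 1.5 and -2.25 are exactly representable as float32, so the DataView read returns them unchanged even though the typed-array view cannot be constructed.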
1 parent 32fc3d6 commit 5d27110

1 file changed: +9 -2 lines changed

src/services/code-index/embedders/openai-compatible.ts

Lines changed: 9 additions & 2 deletions
@@ -278,8 +278,15 @@ export class OpenAICompatibleEmbedder implements IEmbedder {
 	if (typeof item.embedding === "string") {
 		const buffer = Buffer.from(item.embedding, "base64")
 
-		// Create Float32Array view over the buffer
-		const float32Array = new Float32Array(buffer.buffer, buffer.byteOffset, buffer.byteLength / 4)
+		// Safe approach: Create Float32Array from a properly aligned copy
+		// This avoids issues with Node.js Buffers that may be views into larger ArrayBuffers
+		const float32Array = new Float32Array(buffer.length / 4)
+		const dataView = new DataView(buffer.buffer, buffer.byteOffset, buffer.byteLength)
+
+		// Read floats with proper byte order handling (little-endian)
+		for (let i = 0; i < float32Array.length; i++) {
+			float32Array[i] = dataView.getFloat32(i * 4, true)
+		}
 
 		return {
 			...item,
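
The decoding loop added in this hunk could be factored into a standalone helper, sketched below under a few assumptions. `decodeFloat32LE` and `encodeAtOffset` are hypothetical names that do not appear in the repository. The multiple-of-4 length check is an addition the commit does not make. The helper accepts any Uint8Array so that it can be exercised without Node.js types.

```typescript
// Hypothetical helper: copies little-endian float32 values out of a byte view
// into a freshly allocated (and therefore aligned) Float32Array.
function decodeFloat32LE(bytes: Uint8Array): Float32Array {
	// Assumption: a payload that is not a whole number of floats is an error.
	if (bytes.byteLength % 4 !== 0) {
		throw new RangeError(`Expected a multiple of 4 bytes, got ${bytes.byteLength}`)
	}
	const out = new Float32Array(bytes.byteLength / 4)
	// Respect byteOffset and byteLength so only this view's bytes are read.
	const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength)
	for (let i = 0; i < out.length; i++) {
		out[i] = view.getFloat32(i * 4, true) // true = little-endian
	}
	return out
}

// Test fixture: writes values as little-endian float32 starting at an
// arbitrary byte offset inside a larger backing ArrayBuffer.
function encodeAtOffset(values: number[], offset: number): Uint8Array {
	const backing = new ArrayBuffer(offset + values.length * 4)
	const writer = new DataView(backing)
	values.forEach((v, i) => writer.setFloat32(offset + i * 4, v, true))
	return new Uint8Array(backing, offset, values.length * 4)
}

const decoded = decodeFloat32LE(encodeAtOffset([0.5, -1, 3.25], 3))
```

In Node.js a Buffer is a Uint8Array subclass, so a call such as `decodeFloat32LE(Buffer.from(item.embedding, "base64"))` should fit the code path shown in the diff. Reading through DataView with an explicit little-endian flag also keeps the result independent of the host's byte order, which a direct Float32Array view would not guarantee.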
