Description
I'm working on a webserver that lets users upload large numbers of documents, and I've run into a bottleneck. The JS server creates async requests to the Qdrant docker container; most of them work, but Qdrant seems to get overwhelmed when the client keeps sending file upload requests. I've also noticed that this seems to lock up other system resources like MongoDB connections. Our server is able to process and store the files just fine; the source of the error is strictly the Qdrant .upsert(...) call:
TypeError: fetch failed
at node:internal/deps/undici/undici:12500:13
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async fetchJson (/home/torrin/Repos/Equator/ai-app/node_modules/@qdrant/openapi-typescript-fetch/dist/cjs/fetcher.js:135:22)
at async /home/torrin/Repos/Equator/ai-app/node_modules/@qdrant/js-client-rest/dist/cjs/api-client.js:46:26
at async handler (/home/torrin/Repos/Equator/ai-app/node_modules/@qdrant/openapi-typescript-fetch/dist/cjs/fetcher.js:156:16)
at async /home/torrin/Repos/Equator/ai-app/node_modules/@qdrant/js-client-rest/dist/cjs/api-client.js:32:24
at async handler (/home/torrin/Repos/Equator/ai-app/node_modules/@qdrant/openapi-typescript-fetch/dist/cjs/fetcher.js:156:16)
at async fetchUrl (/home/torrin/Repos/Equator/ai-app/node_modules/@qdrant/openapi-typescript-fetch/dist/cjs/fetcher.js:162:22)
at async Object.fun [as upsertPoints] (/home/torrin/Repos/Equator/ai-app/node_modules/@qdrant/openapi-typescript-fetch/dist/cjs/fetcher.js:168:20)
at async QdrantClient.upsert (/home/torrin/Repos/Equator/ai-app/node_modules/@qdrant/js-client-rest/dist/cjs/qdrant-client.js:553:26)
at (backend/module/ai/Vector.js:132:22)
at (backend/module/ai/Vector.js:178:12)
at (backend/module/util/Upload.js:44:21)
at (backend/module/account/Login.js:75:13) {
[cause]: ConnectTimeoutError: Connect Timeout Error
at onConnectTimeout (/home/torrin/Repos/Equator/ai-app/node_modules/undici/lib/core/connect.js:186:24)
at /home/torrin/Repos/Equator/ai-app/node_modules/undici/lib/core/connect.js:133:46
at Immediate._onImmediate (/home/torrin/Repos/Equator/ai-app/node_modules/undici/lib/core/connect.js:174:9)
at process.processImmediate (node:internal/timers:478:21) {
code: 'UND_ERR_CONNECT_TIMEOUT'
}
}

The code I'm using looks something like this:
const create = async ({ text, fileID, user }) => {
  text = text.replace(/\s+/g, ' ').trim();
  if (!text) throw new Error('File does not contain valid text.');

  const { embeddings, chunks } = await getTextEmbeddings({ text, user });
  const points = embeddings.map((embedding, i) => ({
    id: uuidv4(),
    vector: embedding,
    payload: {
      file_id: fileID,
      chunk_id: i,
      text: chunks[i],
      creator: user
    }
  }));

  const batchPoints = (points, maxSize) => {
    const batches = [];
    let currentBatch = [];
    let currentSize = 0;
    for (const point of points) {
      const pointSize = Buffer.byteLength(JSON.stringify(point), 'utf8'); // Estimate size
      if (currentSize + pointSize > maxSize && currentBatch.length > 0) {
        batches.push(currentBatch);
        currentBatch = [];
        currentSize = 0;
      }
      currentBatch.push(point);
      currentSize += pointSize;
    }
    if (currentBatch.length > 0) batches.push(currentBatch);
    return batches;
  };

  const batches = batchPoints(points, MAX_PAYLOAD_SIZE);
  try {
    for (const batch of batches) {
      await qdrantClient.upsert(COLLECTION_NAME, { points: batch });
    }
  } catch (e) {
    console.error(e);
  }
};

Everything up until the upsert call works as expected, chunking and embedding the incoming text. The error happens once a large number of requests to Qdrant are in flight, which seems odd, since it's not a crazy number of requests imo: 38 files with a combined size of 300 MB.
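For reference, the pieces not shown in that snippet (client, collection name, batch size limit) are set up elsewhere in the module, roughly like this; the exact names and values below are approximations, with MAX_PAYLOAD_SIZE chosen to stay under the 32 MB service.max_request_size_mb in the config further down:

```js
const { QdrantClient } = require('@qdrant/js-client-rest');
const { v4: uuidv4 } = require('uuid');

// Approximate values - each batch is kept below Qdrant's
// service.max_request_size_mb limit (32 MB in the config below).
const COLLECTION_NAME = 'documents';
const MAX_PAYLOAD_SIZE = 30 * 1024 * 1024;

// Single shared client pointing at the docker container's REST port.
const qdrantClient = new QdrantClient({ url: 'http://localhost:6333' });
```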
If this is a fundamental limit with Qdrant then I'm a bit concerned and might have to resort to a queue system; my hope is that there is something very wrong with my setup instead.
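To be concrete about what I mean by a queue: the simplest version would be an in-process pool that caps how many upsert batches are in flight at once, roughly like this (sketch only; the helper name and concurrency value are made up):

```js
// Run batches through a small pool of workers so only a few upserts
// hit Qdrant at any one time, instead of one request per batch at once.
const upsertWithLimit = async (batches, concurrency = 4) => {
  let next = 0;
  const worker = async () => {
    while (next < batches.length) {
      const batch = batches[next++]; // synchronous pick, so no race between workers
      await qdrantClient.upsert(COLLECTION_NAME, { points: batch });
    }
  };
  await Promise.all(Array.from({ length: concurrency }, worker));
};
```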
Here is the config file; I copied it mostly from the docs:
log_level: INFO

storage:
  # Path to store all the data
  storage_path: /qdrant/storage

  # Where to store snapshots
  snapshots_path: /qdrant/storage/snapshots

  snapshots_config:
    # "local" or "s3" - where to store snapshots
    snapshots_storage: local

  # Where to store temporary files
  # If null, temporary snapshot are stored in: storage/snapshots_temp/
  temp_path: null

  # If true - point's payload will not be stored in memory.
  # It will be read from the disk every time it is requested.
  on_disk_payload: true

  # Maximum number of concurrent updates to shard replicas
  update_concurrency: null

  # Write-ahead-log related configuration
  wal:
    wal_capacity_mb: 32
    wal_segments_ahead: 0

  node_type: "Normal"

  performance:
    max_search_threads: 0
    max_optimization_threads: 0
    optimizer_cpu_budget: 0
    update_rate_limit: null

  optimizers:
    deleted_threshold: 0.2
    vacuum_min_vector_number: 1000
    default_segment_number: 0
    max_segment_size_kb: null
    memmap_threshold_kb: 1000
    indexing_threshold_kb: 20000
    flush_interval_sec: 5
    max_optimization_threads: null

  hnsw_index:
    m: 16
    ef_construct: 100
    full_scan_threshold_kb: 10000
    max_indexing_threads: 0
    on_disk: true

service:
  max_request_size_mb: 32
  max_workers: 0
  host: 0.0.0.0
  http_port: 6333
  grpc_port: 6334
  enable_cors: true
  enable_tls: false

telemetry_disabled: false
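For completeness, the container itself is started the standard way from the docs, more or less docker run -p 6333:6333 -p 6334:6334 -v $(pwd)/config.yaml:/qdrant/config/production.yaml qdrant/qdrant (paths here are placeholders, not my exact setup).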