Work around JSON or gRPC limit on upsert with large multivector #7213
-
Hi everyone,

I'm hitting the JSON/gRPC message size limit when upserting points with large multivectors (I also tried upload_points()). Would it be better to move away from multivectors and store the vectors independently, with my external document ID in the payload and the Qdrant ID auto-generated? A sketch of what I mean is below.

The final scale should be two- to three-digit millions of documents with dozens to thousands of vectors per external document, so potentially a couple of billion vectors.

Thanks
Grumpy
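Concretely, the layout I have in mind would be something like this (just a sketch; the collection name `docs` and the payload field `document_id` are placeholders):

```python
import uuid

from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")

# One point per vector instead of one huge multivector per document.
# The external document id goes into the payload; the Qdrant point id
# is an auto-generated UUID.
def iter_points(doc_id, vectors):
    for vec in vectors:
        yield models.PointStruct(
            id=str(uuid.uuid4()),
            vector=vec,
            payload={"document_id": doc_id},
        )

vectors = [[0.1] * 768 for _ in range(500)]  # placeholder vectors

# upload_points() streams points in batches, so no single request has
# to carry all vectors of a document at once.
client.upload_points(
    collection_name="docs",
    points=iter_points("doc-42", vectors),
    batch_size=256,
)
```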
-
Yes, that's what we normally recommend. Multivectors also have limited scalability in terms of indexing performance. Such large multivectors will definitely become a problem, which is why we currently have a hard cap of 1 MB.

Qdrant is built with this kind of scale in mind. Please create an appropriate payload index so searches stay performant. Vector data is large, so when you're talking about billions of vectors you'll likely need large machines and a number of nodes. It is hard for me to give a proper estimate here; I'd recommend doing a rough calculation and running your own benchmark to see whether you meet your requirements. At this scale you're likely looking at binary quantization, assuming it achieves good enough accuracy on your dataset.
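To make that concrete, here is a minimal sketch of such a setup with the Python client; the names (`docs`, `document_id`) and the vector size are assumptions for illustration, not prescriptions:

```python
from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")

# Plain dense vectors, one per point, with binary quantization to keep
# memory usage manageable at billion-vector scale.
client.create_collection(
    collection_name="docs",
    vectors_config=models.VectorParams(
        size=768,  # placeholder dimensionality
        distance=models.Distance.COSINE,
    ),
    quantization_config=models.BinaryQuantization(
        binary=models.BinaryQuantizationConfig(always_ram=True),
    ),
)

# Payload index on the external document id so filtered searches
# stay performant even with billions of points.
client.create_payload_index(
    collection_name="docs",
    field_name="document_id",
    field_schema=models.PayloadSchemaType.KEYWORD,
)
```

Whether binary quantization retains enough accuracy depends on your embeddings, so benchmark it against an unquantized setup on your own data first.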