When using the new .add method, unable to provide collection config #595

@bnkc

Here is my example:

from qdrant_client.models import Distance, VectorParams
from qdrant_client import QdrantClient


# Initialize the client
client = QdrantClient(":memory:")  


client.create_collection(
    collection_name="test_collection",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)


# Prepare your documents, metadata, and IDs
docs = [
    "Qdrant has Langchain integrations",
    "Qdrant also has Llama Index integrations",
] * 5
metadata = [
    {"source": "Langchain-docs"},
    {"source": "Linkedin-docs"},
] * 5


client.add(collection_name="test_collection", documents=docs, metadata=metadata)

If I call create_collection before client.add, I get the following traceback:

Traceback (most recent call last):
  File "/Users/lev/Developer/poetry-demo/poetry_demo/qdrant.py", line 40, in <module>
    client.add(collection_name="test_collection", documents=docs, metadata=metadata)
  File "/Users/lev/.pyenv/versions/3.11.1/envs/testing-venv/lib/python3.11/site-packages/qdrant_client/qdrant_fastembed.py", line 496, in add
    self._validate_collection_info(collection_info)
  File "/Users/lev/.pyenv/versions/3.11.1/envs/testing-venv/lib/python3.11/site-packages/qdrant_client/qdrant_fastembed.py", line 348, in _validate_collection_info
    assert isinstance(
AssertionError: Collection have incompatible vector params: size=384 distance=<Distance.COSINE: 'Cosine'> hnsw_config=None quantization_config=None on_disk=None

This is a problem because I want to use the search_batch method later on rather than query_batch, and I can't because of the mismatched config. I would rather use search_batch because I have already generated embeddings for a large volume of text, so it makes no sense to regenerate them at query time. Instead, I want to read back the original embedding and pass it as a parameter to one of the search methods:

search_result = client.search(
    collection_name="demo_collection", query_vector=[0.1, 0.2, 0.3, 0.4]
)
