Apart from Pinecone, which vector store DBs allow filtering docs on Metadata before doing Vector search #9714
Replies: 3 comments
-
🤖 Hello,

Yes, there are other vector databases that allow filtering documents based on metadata before conducting a vector search. Specifically, the ElasticsearchStore in LangChain provides this feature. Here is a brief example of how you might use it:

```python
from langchain.vectorstores import ElasticsearchStore  # import path varies by LangChain version

# Initialize ElasticsearchStore (placeholder URL; `embeddings` is any LangChain embeddings object)
store = ElasticsearchStore(index_name="my_index", embedding=embeddings, es_url="http://localhost:9200")
# Conduct a similarity search with a filter
results = store.similarity_search(query="my query", filter=[{"term": {"metadata.my_field": "my_value"}}])
```

In this example, the filter restricts the search to documents whose `my_field` metadata equals `my_value`.

Regarding your second question, the ElasticsearchStore in LangChain does support assigning different IDs to various sets of PDF files when saving them in the VectorDB: the "_id" field is set to the corresponding id from the ids list while indexing the documents. Here is a brief example of how you might use this feature:

```python
# Index a document with a specific ID (reusing `store` from the example above)
store.add_texts(texts=["my text"], metadatas=[my_metadata], ids=["my_id"])
```

In this example, the text is stored under the ID `my_id`, and its embedding is computed by the store's embeddings object.

I hope this helps! If you have any further questions, please don't hesitate to ask.
-
I believe Chroma supports this. I am currently filtering documents by metadata using chromadb.
-
Is it better to search by ID first on the Postgres rows and then apply the metadata filters? How did you achieve this? Basically, I don't need to go over every single row in Postgres if we already know the IDs, no? cc @KylePancamo @mohit19906
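For anyone weighing the same question, here is a rough sketch of how an ID restriction and a metadata filter could be combined in a single statement. It assumes the pgvector extension and a hypothetical `documents` table with `id`, `content`, a JSONB `metadata` column and a vector `embedding` column; none of these names come from this thread. The helper only builds the query and its parameters in psycopg2 style and does not connect to a database.

```python
def build_scoped_query(ids, group_id, query_embedding, k=5):
    """Return (sql, params) for a vector search limited to known ids and one metadata value."""
    sql = (
        "SELECT id, content "
        "FROM documents "                      # hypothetical table name
        "WHERE id = ANY(%s) "                  # restrict to the ids we already know
        "AND metadata ->> 'group_id' = %s "    # JSONB metadata filter (hypothetical key)
        "ORDER BY embedding <-> %s::vector "   # pgvector L2 distance operator
        "LIMIT %s"
    )
    # pgvector accepts a vector written as a text literal such as '[0.1,0.2,0.3]'
    vector_literal = "[" + ",".join(str(x) for x in query_embedding) + "]"
    return sql, (list(ids), group_id, vector_literal, k)


sql, params = build_scoped_query([3, 7, 9], "finance", [0.1, 0.2, 0.3], k=2)
print(sql)
print(params)
```

Whether Postgres actually narrows by ID before computing distances depends on the available indexes and the plan it picks, so running `EXPLAIN` on the generated query against your own data is the way to check.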
-
I have numerous documents that I wish to utilize with Langchain. My specific need is to avoid vector search on all the documents for every question. Instead, when saving the PDF files in the VectorDB, I'd like to assign different IDs to various sets of PDF files, which I think is feasible using Metadata.
For retrieval, I would prefer that the vector search only be conducted on a specific group of PDF files linked with a certain ID. I'm aware that Pinecone provides a feature to filter docs using metadata prior to conducting a vector search. Aside from Pinecone, are there other Vector DBs offering this feature?
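To make the requested behaviour concrete, here is a minimal, library-free sketch of metadata pre-filtering followed by a similarity search. The `Chunk` class, the `group_id` key and the toy two-dimensional embeddings are all illustrative assumptions; a real vector store would do the same thing with an index rather than a linear scan.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    embedding: list
    metadata: dict = field(default_factory=dict)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filtered_search(chunks, query_embedding, where, k=2):
    """Keep only chunks whose metadata matches `where`, then rank those by similarity."""
    candidates = [
        c for c in chunks
        if all(c.metadata.get(key) == value for key, value in where.items())
    ]
    candidates.sort(key=lambda c: cosine(c.embedding, query_embedding), reverse=True)
    return candidates[:k]


chunks = [
    Chunk("Invoice terms", [1.0, 0.0], {"group_id": "finance"}),
    Chunk("Refund policy", [0.8, 0.6], {"group_id": "finance"}),
    Chunk("Onboarding guide", [1.0, 0.1], {"group_id": "hr"}),
]

hits = filtered_search(chunks, [1.0, 0.0], {"group_id": "finance"}, k=2)
print([c.text for c in hits])  # ['Invoice terms', 'Refund policy']
```

The "Onboarding guide" chunk is closer to the query than "Refund policy", but it never enters the ranking because it belongs to a different group, which is the effect the question is after.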