Conversation
|
@olirice what do you think? |
|
nice feature... getting vectors back would be great and matches with the behavior of most other vector db libs. right now I have to re-query the point ids again to get the vectors back |
olirice
left a comment
There was a problem hiding this comment.
Could you give me an example of where its important to get the actual vector back?
src/vecs/collection.py
Outdated
| filters: Optional[Dict] = None, | ||
| measure: Union[IndexMeasure, str] = IndexMeasure.cosine_distance, | ||
| include_value: bool = False, | ||
| include_vector: bool = False, |
There was a problem hiding this comment.
new arguments have to go at the end of the args list but before kwargs. Otherwise, the meaning of
bar.query(
data=query_vec,
limit=top_k,
filters=None,
measure="cosine_distance",
True,
True
)
would change
There was a problem hiding this comment.
use case - say we have the financial report of a company, and we vectorize the doc into N chunks. From here, one simple way to construct a vector representation of the company is to take the average of these N chunks, for which we need the query to return the raw vectors themselves.
|
to get the pre-commit hooks to pass you can run then commit the changes it makes and push that back up |
|
Oops, that's my bad. Linted and pushed. |
|
Just bumping this -- I'm pretty sure it's ready to merge :) |
|
Could you please update the corresponding docs for the query function? |
What kind of change does this PR introduce?
Feature: adds a
include_vectorargument to thecollections.query()method. Defaults false, but it'll return the actual vector along with theids.What is the current behavior?
Current behavior is you can't get back the vectors along with the
idsin the response. See the issue I made.What is the new behavior?
You can get back the vector itself.
Additional context
Ideally I'd actually re-order some of the
include_*on the query method. I think the best ordering is:include_vector,include_metadata,include_score.Because that way if you want to get everything back, you still get the (id, vec, metadata) in the same order as the original record, just with
scoreappended. However, I think that'd technically be a breaking change because of re-ordering returns.Idk, I don't feel that strongly about it.