fix: support non-float vectors in struct array#3268

Merged
sre-ci-robot merged 1 commit into milvus-io:master from SpadeA-Tang:support-other-types
Feb 9, 2026

@SpadeA-Tang
Contributor

issue: #3265

@codecov

codecov bot commented Feb 9, 2026

Codecov Report

❌ Patch coverage is 96.55172% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 76.25%. Comparing base (4297335) to head (60d6dd8).
⚠️ Report is 2 commits behind head on master.

Files with missing lines | Patch % | Lines
pymilvus/client/entity_helper.py | 96.55% | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3268      +/-   ##
==========================================
+ Coverage   76.21%   76.25%   +0.04%     
==========================================
  Files          63       63              
  Lines       13292    13321      +29     
==========================================
+ Hits        10130    10158      +28     
- Misses       3162     3163       +1     


Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
@mergify mergify bot added the ci-passed label Feb 9, 2026
@SpadeA-Tang
Contributor Author

@XuanYang-cn PTAL

@XuanYang-cn XuanYang-cn added the "PR | need to cherry-pick to 2.x" label Feb 9, 2026
Contributor

@XuanYang-cn XuanYang-cn left a comment


/lgtm

@sre-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: SpadeA-Tang, XuanYang-cn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sre-ci-robot sre-ci-robot merged commit eb7868f into milvus-io:master Feb 9, 2026
13 checks passed
@zhuwenxing
Contributor

Search path issue for non-float vectors in struct array

Insert path works great with this PR! However, the search path still has issues for non-float vectors in struct arrays. Here's the analysis:

1. element_filter (single vector) search — Already supported ✅

The existing _prepare_placeholder_str in prepare.py already handles non-float vectors correctly for element_filter search:

  • Float16/BFloat16/Int8: data=[np.array(dtype)] → isinstance(data[0], np.ndarray) → correct PlaceholderType
  • Binary: data=[bytes] → isinstance(data[0], bytes) → PlaceholderType.BinaryVector

Server-side also works — verified with float16 ndarray + element_filter + L2 metric.

2. EmbeddingList (multi-vector) search — Two fixes needed ❌

Issue A: EmbeddingList.add() doesn't handle bytes input

In embedding_list.py, lines 125-129:

def add(self, embedding):
    embedding = np.asarray(embedding)  # bytes → 0-D ndarray (shape=())
    if embedding.ndim != 1:            # 0 != 1 → ValueError
        raise ValueError(f"Embedding must be 1D, got shape {embedding.shape}")

For non-float vectors (Float16, BFloat16, Int8, Binary), the entity helper stores vectors as bytes. When constructing an EmbeddingList with these vectors, np.asarray(bytes) produces a 0-D ndarray, causing the dimension check to fail.

Fix: EmbeddingList.add() should handle bytes input by converting to numpy array with the appropriate dtype (e.g., np.frombuffer(embedding, dtype=self._dtype)).
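
A minimal sketch of the failure and the suggested fix, using a stand-in class rather than the real pymilvus EmbeddingList. The class name SketchEmbeddingList, its constructor, and the _rows list are assumptions made for illustration; only np.asarray and np.frombuffer are real NumPy calls, and the real class may track its dtype differently.

```python
import numpy as np


class SketchEmbeddingList:
    """Hypothetical stand-in for EmbeddingList; not the pymilvus class."""

    def __init__(self, dtype=np.float16):
        # Assumed attribute, mirroring the self._dtype mentioned in the fix above.
        self._dtype = np.dtype(dtype)
        self._rows = []

    def add(self, embedding):
        if isinstance(embedding, (bytes, bytearray)):
            # Reinterpret the raw buffer as a 1-D array of the configured dtype.
            embedding = np.frombuffer(embedding, dtype=self._dtype)
        else:
            embedding = np.asarray(embedding, dtype=self._dtype)
        if embedding.ndim != 1:
            raise ValueError(f"Embedding must be 1D, got shape {embedding.shape}")
        self._rows.append(embedding)
        return self


# A float16 vector serialized to raw bytes, as described above.
raw = np.array([0.5, 1.0, 2.0, 4.0], dtype=np.float16).tobytes()

# The reported behaviour: np.asarray on bytes yields a 0-D array.
assert np.asarray(raw).ndim == 0

# With the bytes branch, the same input is accepted as a 1-D vector.
emb_list = SketchEmbeddingList(dtype=np.float16).add(raw)
```

For binary vectors the same branch would presumably be used with dtype=np.uint8, so that each byte becomes one array element.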

Issue B: _prepare_placeholder_str missing binary EmbeddingList branch

In prepare.py, lines 1340-1342:

elif dtype == "byte":
    pl_type = PlaceholderType.BinaryVector  # Missing is_embedding_list check!
    pl_values = data

And lines 1348-1350:

elif isinstance(data[0], bytes):
    pl_type = PlaceholderType.BinaryVector  # Missing is_embedding_list check!
    pl_values = data

Neither branch checks is_embedding_list, so PlaceholderType.EmbListBinaryVector (already defined as 300 in types.py) is never used.
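
One way the missing check could look, sketched outside pymilvus. The enum PlaceholderTypeSketch and the helper pick_binary_placeholder are hypothetical names; the value 300 comes from the comment above, while 100 for BinaryVector is an assumed value. The actual fix in _prepare_placeholder_str may be structured differently.

```python
from enum import IntEnum


class PlaceholderTypeSketch(IntEnum):
    """Hypothetical stand-in with only the two members discussed above."""

    BinaryVector = 100         # assumed value
    EmbListBinaryVector = 300  # value quoted in the comment above


def pick_binary_placeholder(data, is_embedding_list=False):
    """Choose the placeholder type for bytes input (illustrative helper)."""
    if not data or not isinstance(data[0], bytes):
        raise TypeError("expected a non-empty list of bytes")
    if is_embedding_list:
        # The branch that the quoted code never reaches.
        return PlaceholderTypeSketch.EmbListBinaryVector, data
    return PlaceholderTypeSketch.BinaryVector, data


single = pick_binary_placeholder([b"\x0f\xf0"])
multi = pick_binary_placeholder([b"\x0f\xf0", b"\xaa\x55"], is_embedding_list=True)
```

The same flag check would apply to both the dtype == "byte" branch and the isinstance(data[0], bytes) branch.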

3. Server-side — OK ✅

Verified that Milvus server correctly handles non-float vector search when data is properly serialized:

Scenario | Result
element_filter + float16 ndarray + L2 metric | ✅
MAX_SIM + float16 EmbeddingList (manually constructed) + MAX_SIM_L2 | ✅
element_filter + MAX_SIM metric (brute force path) | ❌ Expected (known design limitation)

Test Impact

Eight search test cases in the struct array element search tests are currently marked as xfail because of these pymilvus issues.
