Description
Is there an existing issue for this?
- I have searched the existing issues
Environment
- Milvus version: v2.6.10
- Deployment mode (standalone or cluster): Standalone
- MQ type (rocksmq, pulsar or kafka): rocksmq
- SDK version (e.g. pymilvus v2.0.0rc2): PyMilvus v2.6.10
- OS (Ubuntu or CentOS): Windows (Client) / Linux (Server)
- CPU/Memory: N/A
- GPU: N/A
- Others: Docker Compose deployment
Current Behavior
Milvus v2.6.10 accepts filter expressions with descending ranges (e.g., `field in [10, 5]`), which are semantically incorrect. The syntax is valid, but such an expression should be rejected or normalized so that behavior is consistent.
Expected Behavior
Filter expressions should be validated for semantic correctness:
- Descending ranges: Should be rejected or automatically normalized to ascending order
- Empty ranges: Should be rejected with a clear error message
- Single value in IN: Should warn or suggest using equality operator
- Invalid LIKE usage: Should fail when used on non-string fields
- Invalid JSON paths: Should fail or handle gracefully with clear error messages
Example expected error for descending range:
<ParamError: (code=1100, message=Invalid filter expression: range values must be in ascending order, got [10, 5])>
Example expected error for empty range:
<ParamError: (code=1100, message=Invalid filter expression: IN expression cannot have empty range)>
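The checks requested above can be sketched as a small client-side validator. This is only an illustration of the proposed rules, not Milvus or PyMilvus code: `FilterValidationError` and `check_in_expression` are hypothetical names, and the sketch only understands a bare `<field> in [<integers>]` expression.

```python
import re


class FilterValidationError(ValueError):
    """Hypothetical error type for this sketch; not part of PyMilvus."""


# Matches only a simple "<field> in [<comma-separated integers>]" expression.
_IN_EXPR = re.compile(r"^\s*(\w+)\s+in\s+\[(.*)\]\s*$", re.IGNORECASE)


def check_in_expression(expr):
    """Return a list of warnings; raise FilterValidationError if rejected."""
    match = _IN_EXPR.match(expr)
    if match is None:
        return []  # not a simple IN expression; this sketch does not inspect it
    field, body = match.group(1), match.group(2).strip()
    if not body:
        raise FilterValidationError(
            "Invalid filter expression: IN expression cannot have empty range"
        )
    values = [int(tok) for tok in body.split(",")]  # integer literals only
    if any(a > b for a, b in zip(values, values[1:])):
        raise FilterValidationError(
            "Invalid filter expression: range values must be in ascending "
            f"order, got {values}"
        )
    if len(values) == 1:
        return [f"IN with a single value; consider '{field} == {values[0]}'"]
    return []
```

Calling `check_in_expression("age in [10, 5]")` raises with the descending-range message proposed above, while `check_in_expression("age in [10]")` returns a single warning instead of failing.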
Steps To Reproduce
1. Create a collection with a scalar field
2. Insert test data into the collection
3. Search with a descending range filter (e.g., `age in [10, 5]`)
4. Observe that the search succeeds (it should fail or be normalized)
**Descending range in IN expression:**
from pymilvus import connections, Collection, FieldSchema, CollectionSchema, DataType, utility
connections.connect(alias="default", host="localhost", port="19530")
collection_name = "test_filter_descending"
# Create collection with scalar field
if utility.has_collection(collection_name):
    utility.drop_collection(collection_name)
fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=False),
    FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=128),
    FieldSchema(name="age", dtype=DataType.INT64)
]
schema = CollectionSchema(fields=fields, description="test")
collection = Collection(name=collection_name, schema=schema)
# Insert test data with scalar field
data = [
    [i for i in range(10)],
    [[0.1]*128 for _ in range(10)],
    [i * 10 for i in range(10)]  # age: 0, 10, 20, 30, ..., 90
]
collection.insert(data)
collection.flush()
# Create index and load collection
index_params = {"metric_type": "L2", "index_type": "IVF_FLAT", "params": {"nlist": 100}}
collection.create_index(field_name="vector", index_params=index_params)
collection.load()
# Try to search with descending range (semantically incorrect)
res = collection.search(
    data=[[0.1]*128],
    anns_field="vector",
    param={"metric_type": "L2", "params": {"nprobe": 10}},
    limit=10,
    expr="age in [10, 5]"
)
# Result: Search succeeds, returns 10 results (should fail or normalize)
# ❌ BUG CONFIRMED: Descending range is accepted
**Empty range in IN expression:**
res = collection.search(
    data=[[0.1]*128],
    anns_field="vector",
    param={"metric_type": "L2", "params": {"nprobe": 10}},
    limit=10,
    expr="age in []"
)
# Result: Search succeeds, returns 10 results (should fail)
# ❌ BUG CONFIRMED: Empty range is accepted
**Single value in IN expression (should use == instead):**
res = collection.search(
    data=[[0.1]*128],
    anns_field="vector",
    param={"metric_type": "L2", "params": {"nprobe": 10}},
    limit=10,
    expr="age in [10]"
)
# Result: Search succeeds, returns 1 result (should warn or suggest using ==)
# ⚠️ ACCEPTED: Single-value IN is accepted without warning
**Invalid LIKE usage with non-string field:**
res = collection.search(
    data=[[0.1]*128],
    anns_field="vector",
    param={"metric_type": "L2", "params": {"nprobe": 10}},
    limit=10,
    expr="age like '10'"
)
# Result: Search fails with error (correctly rejected)
# ✅ FIXED: LIKE on non-string field is rejected
# Error: <MilvusException: (code=1100, message=failed to create query plan: cannot parse expression: age like '10', error: like operation on non-string or no-json field is unsupported: invalid parameter)>
**Valid filter expression (control test):**
res = collection.search(
    data=[[0.1]*128],
    anns_field="vector",
    param={"metric_type": "L2", "params": {"nprobe": 10}},
    limit=10,
    expr="age in [10, 20, 30]"
)
# Result: Search succeeds, returns 3 results (correct behavior)
# ✅ WORKING: Valid filter expression works correctly
Milvus Log
N/A - Issue is reproducible via SDK behavior observation.
Anything else?
Contract Violation
- Filter Expression Validation Contract: According to Milvus Boolean Expression Documentation, filter expressions should be validated for semantic correctness.
- Documentation References:
  - Milvus Boolean Expression Documentation - Official documentation for filter expressions
  - Milvus supports filtering on scalar fields (INT64, VARCHAR, JSON)
  - Filter expressions use operators like `IN`, `LIKE`, `==`, etc.
  - Multiple technical articles and community resources confirm filter expression functionality
Impact
- User Confusion: Users may create semantically incorrect filters without realizing it
- Unexpected Results: Descending ranges may produce unexpected or inconsistent results
- Poor User Experience: No validation or warnings for common mistakes
Severity
P2 (Medium) - This is a usability issue that affects users creating filter expressions. While it doesn't cause crashes, it allows creation of potentially incorrect filters that may lead to unexpected results.
Suggested Fix
- Enhance Filter Expression Validation:
  - Validate range values are in ascending order
  - Reject empty ranges with clear error messages
  - Warn for single-value IN expressions
  - Validate operator usage matches field type
  - Validate JSON paths exist
- Improve Error Messages:
  - Identify specific issue with filter expression
  - Provide clear explanation of what's wrong
  - Suggest correct usage when possible
- Normalization (Optional):
  - Automatically normalize descending ranges to ascending order
  - Log normalization warnings for debugging
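The optional normalization step could look roughly like the sketch below. It is an assumption about one possible approach, not how Milvus parses filters: `normalize_in_lists` and the logger name are hypothetical, only integer lists are reordered, and a regex-based rewrite like this would also touch `in [...]` text that appears inside a string literal.

```python
import logging
import re

logger = logging.getLogger("filter_normalizer")  # hypothetical logger name

# Captures the "in [" prefix, the list body, and the closing bracket.
_IN_LIST = re.compile(r"(\bin\s*\[)([^\[\]]*)(\])", re.IGNORECASE)


def normalize_in_lists(expr):
    """Sort the integer literals of every `in [...]` list in ascending order."""

    def _sort(match):
        body = match.group(2).strip()
        if not body:
            return match.group(0)  # empty list: left for validation to reject
        try:
            values = [int(tok) for tok in body.split(",")]
        except ValueError:
            return match.group(0)  # non-integer values: leave untouched
        ordered = sorted(values)
        if ordered != values:
            # Log the rewrite so that it can be traced when debugging.
            logger.warning("normalized IN list %s to %s", values, ordered)
        joined = ", ".join(str(v) for v in ordered)
        return match.group(1) + joined + match.group(3)

    return _IN_LIST.sub(_sort, expr)
```

With this sketch, `normalize_in_lists("age in [10, 5]")` yields `age in [5, 10]` and logs a warning, while an already ascending list passes through unchanged.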
Related Issues
- Part of broader filter expression validation issues in Milvus v2.6.x
Verification Results
Verification Date: 2026-02-11
Milvus Version: v2.6.10
| Test Case | Expected Behavior | Actual Behavior | Status |
|---|---|---|---|
| Descending range (`age in [10, 5]`) | Should be rejected or normalized | Accepted, returns 10 results | ❌ Bug exists |
| Empty range (`age in []`) | Should be rejected | Accepted, returns 10 results | ❌ Bug exists |
| Single value IN (`age in [10]`) | Should warn or suggest using `==` | Accepted without warning | ⚠️ Accepted |
| LIKE on non-string field | Should be rejected | Rejected with error | ✅ Fixed |
| Valid filter (`age in [10, 20, 30]`) | Should succeed | Succeeds, returns 3 results | ✅ Working |
Summary: 2 bugs confirmed (descending range, empty range), 1 usability issue (single-value IN), 1 issue fixed (LIKE validation)
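To re-run the verification on another Milvus version, the accepted/rejected column could be collected with a small harness like the one below. This is a sketch: `classify_filters` is a hypothetical helper, and `run_search` stands for any callable that executes one search with the given filter and returns the hits for a single query vector.

```python
def classify_filters(run_search, expressions):
    """Run each filter through `run_search` and record accepted/rejected."""
    rows = []
    for expr in expressions:
        try:
            hits = run_search(expr)
        except Exception as exc:  # e.g. the MilvusException shown for LIKE
            rows.append((expr, "rejected", str(exc)))
        else:
            rows.append((expr, "accepted", f"{len(hits)} results"))
    return rows


# Possible wiring against the collection from the reproduction steps
# (requires a running Milvus server, so it is left commented out):
#
# def run_search(expr):
#     res = collection.search(
#         data=[[0.1] * 128],
#         anns_field="vector",
#         param={"metric_type": "L2", "params": {"nprobe": 10}},
#         limit=10,
#         expr=expr,
#     )
#     return res[0]  # hits for the single query vector
#
# for row in classify_filters(run_search, ["age in [10, 5]", "age in []"]):
#     print(row)
```

Passing the search as a callable keeps the harness independent of the SDK, so it can also be exercised without a server.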