feat: enhance search functionality with ID support#499

Merged
shanghaikid merged 5 commits into main from
feat/search-by-ids
Jan 19, 2026
@shanghaikid
Contributor

  • Updated SearchReq and SearchSimpleReq interfaces to include optional ids field for searching by primary keys.
  • Modified buildSearchRequest function to handle ID-based searches, including validation for primary key types.
  • Added tests to validate search functionality using IDs, ensuring correct error handling for type mismatches and missing primary keys.
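
As a rough illustration of the first bullet, a request carrying primary keys instead of vectors might be shaped like this. It is a minimal sketch, not the SDK's actual type definitions: `SearchByIdsReqSketch`, `isIdSearch`, and every field other than the optional `ids` are assumptions made for the example.

```typescript
// Hypothetical, simplified request shape. The real SearchReq and
// SearchSimpleReq interfaces in the SDK carry many more fields.
interface SearchByIdsReqSketch {
  collection_name: string;     // assumed field name
  data?: number[][];           // vector input (assumed shape)
  ids?: (number | string)[];   // primary keys to search by, per the PR description
  limit?: number;              // assumed field name
}

// True when the request carries a non-empty list of primary keys.
function isIdSearch(req: SearchByIdsReqSketch): boolean {
  return Array.isArray(req.ids) && req.ids.length > 0;
}

const byIds: SearchByIdsReqSketch = { collection_name: 'demo', ids: [1, 2, 3] };
const byVector: SearchByIdsReqSketch = { collection_name: 'demo', data: [[0.1, 0.2]] };
```

Keeping `ids` optional means existing vector-based callers do not need to change.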


Signed-off-by: ryjiang <jiangruiyi@gmail.com>
@sre-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: shanghaikid

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gemini-code-assist

Summary of Changes

Hello @shanghaikid, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request significantly enhances the search capabilities by introducing the ability to perform searches directly using primary keys. This allows users to retrieve specific data points by their unique identifiers, complementing the existing vector-based search methods. The changes ensure data integrity through robust validation of ID types against the collection's primary key schema, making the search API more versatile and user-friendly.

Highlights

  • Enhanced Search Functionality: The search functionality has been extended to allow searching by primary keys (IDs) in addition to vector-based searches. This provides more flexible query options for users.
  • Interface Updates: The SearchReq and SearchSimpleReq interfaces now include an optional ids field, which accepts an array of numbers or strings for primary key-based searches. Correspondingly, the vectors and data fields have been made optional to support ID-only searches.
  • Robust ID Validation: The buildSearchRequest function has been updated to handle ID-based searches, including comprehensive validation for primary key types (Int64 and VarChar). It ensures that the provided IDs match the primary key's data type and throws errors for type mismatches, unsupported primary key types, or missing primary keys when IDs are used.
  • Comprehensive Testing: New test cases have been added to validate the ID-based search functionality, covering successful searches, error handling for invalid ID types, unsupported primary key types, and scenarios where neither IDs nor vector data are provided.
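
The validation rules listed in the highlights can be sketched roughly as follows. This is an assumption-laden illustration rather than the code merged in this PR: `FieldSketch`, `PkType`, `validateIdSearch`, and all error messages are hypothetical, and only the rules themselves (Int64 or VarChar primary keys, type matching, a required primary key, and a requirement for either IDs or vector data) come from the summary.

```typescript
// Hypothetical stand-ins for the SDK's schema types.
type PkType = 'Int64' | 'VarChar' | 'Other';

interface FieldSketch {
  name: string;
  is_primary_key?: boolean;
  type: PkType;
}

// Throws when the request cannot be served; returns normally otherwise.
function validateIdSearch(
  fields: FieldSketch[],
  ids?: (number | string)[],
  data?: number[][]
): void {
  const idList = ids ?? [];
  const vectors = data ?? [];

  if (idList.length === 0 && vectors.length === 0) {
    throw new Error('either ids or vector data must be provided');
  }
  if (idList.length === 0) {
    return; // plain vector search, nothing ID-specific to check
  }

  const pk = fields.find(f => f.is_primary_key);
  if (!pk) {
    throw new Error('collection has no primary key field');
  }

  if (pk.type === 'Int64') {
    // Integers, or strings that look like integers, are accepted here.
    const ok = idList.every(id =>
      typeof id === 'number' ? Number.isInteger(id) : /^-?\d+$/.test(id)
    );
    if (!ok) throw new Error('ids must be integers for an Int64 primary key');
  } else if (pk.type === 'VarChar') {
    const ok = idList.every(id => typeof id === 'string');
    if (!ok) throw new Error('ids must be strings for a VarChar primary key');
  } else {
    throw new Error('unsupported primary key type for ID search');
  }
}
```

Failing fast on the client side keeps a mistyped ID from turning into a less readable server-side error.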

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a valuable enhancement by enabling search functionality using primary key IDs. The implementation is mostly solid, with good test coverage for the new logic. However, I've identified a critical bug concerning the handling of Int64 IDs that could lead to runtime errors. Additionally, I've included several suggestions to improve code quality and the developer experience by simplifying logic, reducing redundancy, and refining the API design to make anns_field optional for ID-based searches. Addressing these points will make the new feature more robust and easier to use.

    if (ids && ids.length > 0) {
      const pkDataType = pkField!.dataType || DataTypeMap[pkField!.data_type];
      if (pkDataType === DataType.Int64) {
        request.ids = { int_id: { data: ids as number[] } };

critical

There's a critical type mismatch here. The validation for Int64 primary keys allows ids to be an array of strings representing numbers. However, this line casts with `ids as number[]`, which only changes the compile-time type and does not convert the string values to numbers. If ids contains strings, this will cause a runtime error or send incorrect data to Milvus. You must map the values to numbers before assigning them.

Suggested change
-       request.ids = { int_id: { data: ids as number[] } };
+       request.ids = { int_id: { data: (ids as (string | number)[]).map(Number) } };
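
One caveat with a plain `.map(Number)` is that a JavaScript `number` represents integers exactly only up to `Number.MAX_SAFE_INTEGER` (2^53 - 1), so larger Int64 keys would be silently rounded. The sketch below adds a guard for that case. `toNumericIds` is a hypothetical helper, not part of the SDK, and how the SDK itself treats very large Int64 values is not covered in this thread.

```typescript
// Hypothetical helper: convert mixed string/number IDs to numbers,
// rejecting anything that would not survive the conversion intact.
function toNumericIds(ids: (string | number)[]): number[] {
  return ids.map(id => {
    // Number('') and Number('  ') are 0, so reject blank strings explicitly.
    if (typeof id === 'string' && id.trim() === '') {
      throw new Error('empty string is not a valid id');
    }
    const n = Number(id);
    // Rejects NaN, fractions, and integers beyond 2^53 - 1.
    if (!Number.isSafeInteger(n)) {
      throw new Error(`id ${String(id)} cannot be represented exactly as a number`);
    }
    return n;
  });
}

const converted = toNumericIds(['1', 2, '3']);
```

Callers that need the full Int64 range would have to keep such keys in another representation, for example as strings.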

Comment on lines 264 to 271
    let pkField: FieldSchema | undefined;
    for (let i = 0; i < collectionInfo.schema.fields.length; i++) {
      const f = collectionInfo.schema.fields[i];
      if (f.is_primary_key) {
        pkField = f;
        break;
      }
    }

medium

This for loop to find the primary key field can be made more concise and idiomatic by using the Array.prototype.find() method.

Suggested change
-   let pkField: FieldSchema | undefined;
-   for (let i = 0; i < collectionInfo.schema.fields.length; i++) {
-     const f = collectionInfo.schema.fields[i];
-     if (f.is_primary_key) {
-       pkField = f;
-       break;
-     }
-   }
+   const pkField = collectionInfo.schema.fields.find(f => f.is_primary_key);
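
To show that the two forms behave the same, including when no primary key exists, here is a small self-contained comparison. `FieldLike` is a hypothetical stand-in for the SDK's `FieldSchema`, reduced to the one property the lookup needs.

```typescript
// Hypothetical stand-in for FieldSchema.
interface FieldLike {
  name: string;
  is_primary_key?: boolean;
}

// Loop form, as in the original code.
function findPkWithLoop(fields: FieldLike[]): FieldLike | undefined {
  let pk: FieldLike | undefined;
  for (let i = 0; i < fields.length; i++) {
    const f = fields[i];
    if (f && f.is_primary_key) {
      pk = f;
      break;
    }
  }
  return pk;
}

// find() form, as in the suggestion. It also stops at the first match
// and yields undefined when nothing matches.
function findPkWithFind(fields: FieldLike[]): FieldLike | undefined {
  return fields.find(f => f.is_primary_key);
}

const sampleFields: FieldLike[] = [
  { name: 'vector' },
  { name: 'id', is_primary_key: true },
];
```

Because `find()` can return `undefined`, any later check for a missing primary key still applies.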

    };

    if (ids && ids.length > 0) {
      const pkDataType = pkField!.dataType || DataTypeMap[pkField!.data_type];

medium

The primary key data type (pkDataType) is calculated here and also earlier in the validation logic (line 294). This code duplication can be avoided by calculating it once and reusing the variable. Consider refactoring to define pkDataType a single time.
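
One way to apply this is to resolve the type once and reuse it for both validation and request building. The sketch below assumes simplified stand-ins (`DataTypeSketch`, `DataTypeMapSketch`, `PkFieldSketch`, `buildIdsPayload`), none of which are the SDK's own definitions. The `str_id` branch is likewise an assumption, since only `int_id` appears in the reviewed code.

```typescript
// Hypothetical stand-ins for the SDK's DataType enum and DataTypeMap.
enum DataTypeSketch {
  Int64 = 5,
  VarChar = 21,
}

const DataTypeMapSketch: Record<string, DataTypeSketch> = {
  Int64: DataTypeSketch.Int64,
  VarChar: DataTypeSketch.VarChar,
};

interface PkFieldSketch {
  dataType?: DataTypeSketch; // numeric form, when present
  data_type: string;         // string form, used as a fallback
}

type IdsPayload =
  | { int_id: { data: number[] } }
  | { str_id: { data: string[] } };

function buildIdsPayload(pk: PkFieldSketch, ids: (string | number)[]): IdsPayload {
  // Resolved a single time, then shared by validation and building.
  const pkDataType = pk.dataType || DataTypeMapSketch[pk.data_type];

  if (pkDataType !== DataTypeSketch.Int64 && pkDataType !== DataTypeSketch.VarChar) {
    throw new Error('unsupported primary key type for ID search');
  }

  return pkDataType === DataTypeSketch.Int64
    ? { int_id: { data: ids.map(Number) } }
    : { str_id: { data: ids.map(String) } };
}
```

With a single `pkDataType` variable, a later change to how the type is resolved has to be made in only one place.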

Signed-off-by: ryjiang <jiangruiyi@gmail.com>
@codecov
codecov bot commented Jan 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.29%. Comparing base (07cb4f0) to head (d4dee9a).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #499      +/-   ##
==========================================
+ Coverage   97.19%   97.29%   +0.10%     
==========================================
  Files          52       52              
  Lines        3674     3700      +26     
  Branches      978      996      +18     
==========================================
+ Hits         3571     3600      +29     
+ Misses         98       95       -3     
  Partials        5        5              

Signed-off-by: ryjiang <jiangruiyi@gmail.com>
Signed-off-by: ryjiang <jiangruiyi@gmail.com>
@shanghaikid shanghaikid merged commit 9f5bca0 into main Jan 19, 2026
4 of 5 checks passed
@shanghaikid shanghaikid deleted the feat/search-by-ids branch January 19, 2026 07:44