Skip to content

Conversation

@chengjoey
Copy link
Contributor

No description provided.

@codecov
Copy link

codecov bot commented Nov 25, 2025

Codecov Report

❌ Patch coverage is 85.56213% with 122 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.63076%. Comparing base (86472bd) to head (994cf61).
⚠️ Report is 7 commits behind head on main.

Files with missing lines Patch % Lines
knowledge/vectorstore/milvus/milvus.go 83.28025% 58 Missing and 47 partials ⚠️
...nowledge/vectorstore/milvus/condition_converter.go 86.11111% 9 Missing and 6 partials ⚠️
storage/milvus/milvus.go 96.00000% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@                 Coverage Diff                 @@
##                main        #744         +/-   ##
===================================================
- Coverage   87.65179%   87.63076%   -0.02104%     
===================================================
  Files            281         285          +4     
  Lines          36564       37472        +908     
===================================================
+ Hits           32049       32837        +788     
- Misses          2934        3001         +67     
- Partials        1581        1634         +53     
Flag Coverage Δ
unittests 87.63076% <85.56213%> (-0.02104%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.


func (vs *VectorStore) queryMetadataBatch(ctx context.Context, limit, offset int, ids []string, filter map[string]any) (map[string]vectorstore.DocumentMetadata, error) {
filterExpr := ""
if len(ids) > 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这两个条件是不是用 and 的关系也可以,现在是互斥的?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, 可以用and,改成用searchfilter重新构建语句了

annReqs = append(annReqs, annReq)
}

hybridOption := client.NewHybridSearchOption(vs.option.collectionName, vs.getMaxResults(query.Limit), annReqs...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

milvus 默认使用的RRFReranker ,这里返回的score会归一化吗

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

看了下milvus请求的构建和文档,如果没有指定的话,应该没有使用RRFreranker
我另外加了reranker option,可以自定义配置

@chengjoey chengjoey force-pushed the feat/storage-milvus branch 2 times, most recently from 5e08ae2 to 196c8b0 Compare November 26, 2025 15:38
return "", nil
}

switch condition.Operator {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里的and 和 or条件构建函数是不是可以合并成一个函数, 以及 下面的 ==、!=、 >、>= 、< 、<= 这些也可以合并成一个函数。 这里的代码重复率太高了。可以参考:tcvector

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// milvusFilterConverter converts searchfilter conditions to Milvus expressions
type milvusFilterConverter struct{}

func (c *milvusFilterConverter) Convert(condition *searchfilter.UniversalFilterCondition) (string, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里最好加一个recover 处理一下可能的panic。可以参考 tcvector的实现

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return c.convertInCondition(condition)
case searchfilter.OperatorNotIn:
return c.convertNotInCondition(condition)
case searchfilter.OperatorLike:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

是不是少了一个not like 的实现

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

好像也还少了一个 searchfilter.OperatorBetween

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done,还有between

Copy link
Contributor

@nomand-zc nomand-zc Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

需要实现一下docBuilder, 方便支持用户自定义存储结构的转换处理。参考

Copy link
Contributor

@nomand-zc nomand-zc Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里其实已经实现了,不需要docBuilder,在创建存储collection和查询的时候都已经支持自定义存储结构

这里是指用户不是通过框架来创建的collection, 自己提前创建非标准的collection (这里主要是业务可能有自己独立的离线知识库处理流程,只用knowledge的retriever能力)。 所以需要一个WithDocBuilder, 让用户把非标的document存储结构转换成标准结构
可以看下docBuilder使用例子

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

理解了,加上docBuilder了,单元测试里也加上了

)

// Client is the interface for the milvus client.
type Client interface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里看着都是按照v2的返回值定义的interface。 会有和v1有兼容性问题吗? 或者将来出v3是不是有可能接口兼容不了呢? 这里是不是应该定义和版本无关的接口。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里没有啥v1,之前的milvus-go-sdk已经废弃了,以后都会在milvus仓库维护了
这里主要是为了实现mock client,和milvus server本身没有啥兼容性问题

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里用过滤器模版是不是会好点

conditions = append(conditions, fmt.Sprintf("%s in [%s]", vs.option.idField, strings.Join(idFilters, ", ")))
}

if len(filter.Metadata) > 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filter.Metadata 以及 filter.IDs的条件构建是不是可以统一转换到filter.FilterCondition中,交给Converter统一来处理

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return maxResults
}

func (vs *VectorStore) buildMetadataFilter(filter map[string]any) (string, error) {
Copy link
Contributor

@nomand-zc nomand-zc Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里的格式转换建议统一放到Converter中来处理。filter.FilterCondition中可能也会存在metadata.xx的条件

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

已经没有buildMetadataFilter了,全部交由Converter来处理


func (c *milvusFilterConverter) convertCondition(condition *searchfilter.UniversalFilterCondition) (string, error) {
if condition == nil {
return "", nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

考虑与其他db的对齐一下,return error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


func (c *milvusFilterConverter) convertAndCondition(condition *searchfilter.UniversalFilterCondition) (string, error) {
if condition.Value == nil {
return "", nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

不符合预期的条件或者value,建议都返回error。尽量别把error隐藏了

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
}()

if condition == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个条件判断貌似可以去掉, convertCondition方法中已经做了统一判断

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里是不是搞反了,应该保留convertCondition中的判断, 去掉Convert中的判断。 因为AND OR逻辑条件时会递归调用convertCondition,所以这里统一在convertCondition做判断处理会完整一点

// - String values must be enclosed in quotes
// - Numeric values (int, float) should not be quoted
// - Boolean values should not be quoted
func (c *milvusFilterConverter) formatValue(value interface{}) string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

time.Time类型需要做特殊处理吗?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

time.Time类型现在当int64来处理了

// - String values must be enclosed in quotes
// - Numeric values (int, float) should not be quoted
// - Boolean values should not be quoted
func (c *milvusFilterConverter) formatValue(value interface{}) string {
Copy link
Contributor

@nomand-zc nomand-zc Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

milvus文档 这里使用过滤器模版是不是在性能和value兼容性上会好点?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

对的,用过滤器模版会好点,这个我改下

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete函数不支持filter_params,所以delete还是用原始语句

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete函数不支持filter_params,所以delete还是用原始语句

Clipboard_Screenshot_1764316369 看文档是可以支持的? 版本的问题?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, 删除操作目前只支持根据id来删, 还好。

return nil, fmt.Errorf("create milvus client from instance failed: %w", err)
}
} else {
if option.address == "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

建议这里只构建builderOpts,client的创建逻辑统一由ClientBuilder来处理。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

// Update document using upsert
upsertOption := client.NewColumnBasedInsertOption(vs.option.collectionName).WithVarcharColumn(vs.option.idField, []string{doc.ID}).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

覆盖模式下的更新需要带上全部字段吧,貌似只有开启了nullable的字段才可以省略?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image 是的,将createdat设置为nullable了

for key, val := range cr.params {
queryOption.WithTemplateParam(key, val)
}
queryOption.WithOutputFields(vs.option.allFields...)
Copy link
Contributor

@nomand-zc nomand-zc Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个WithOutputFields是必须要设置的吗?有没有默认输出所有字段的设置,不然这里可能会有一个问题:如果是业务自定义的collection,非标字段是不是都拿不到了?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, WithOutputFields([]string{""}...) “”代表所有字段

Copy link
Contributor Author

@chengjoey chengjoey Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

* 代表所有字段

}
queryOption.WithOutputFields([]string{
vs.option.idField,
vs.option.metadataField,
Copy link
Contributor

@nomand-zc nomand-zc Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里也是一样,自定义collection不一定有metadata字段。 有可能是需要在自定义docBuilder中构建的metadata

if vs.option.docBuilder != nil {
for i := 0; i < resultLen; i++ {
row := make([]column.Column, len(vs.option.allFields))
for j, field := range vs.option.allFields {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里也需要改一下,按照vs.option.allFields来取字段不合适了

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, 直接以milvus server返回的字段为准

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants