Milvus Indexer.Store forces a Flush after every inserted batch, severely limiting performance #579

Currently, `Indexer.Store` in components/indexer/milvus/indexer.go forces a call to Flush after every batch it inserts, which directly hurts overall throughput in bulk-write scenarios:

```go
// flush collection to make sure the data is visible
if err := i.config.Client.Flush(ctx, i.config.Collection, false); err != nil {
    return nil, fmt.Errorf("[Indexer.Store] failed to flush collection: %w", err)
}
```

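#### Impact
- **Synchronous bottleneck**: Milvus Flush is a heavyweight synchronous operation that forces in-memory data out to storage and produces new segments. A bulk import only needs this once, yet the current code flushes the whole collection after every batch, so concurrent requests end up serialized behind Flush and throughput is very poor.
- **Poor fit for common use cases**: Most scenarios (for example bulk vector import or writing training data) do not need each batch to be queryable immediately after insertion; a single Flush at the end is enough.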

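#### Expected behavior
- Support batched and/or concurrent insertion with one Flush at the end. Where the consistency trade-off is acceptable, data becomes visible with a delay (no Flush per batch).
- The Store method should gain batch-size and concurrency options: each batch only writes and skips Flush, and a single Flush runs once the whole task has finished.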
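
A minimal sketch of what such options could look like. `StoreOptions`, `withDefaults`, `chunk`, and the default values are all hypothetical names and numbers chosen for illustration; none of them exist in the current indexer config.

```go
package main

import "fmt"

// StoreOptions is a hypothetical config struct (not part of the existing indexer).
type StoreOptions struct {
	BatchSize      int // documents per insert call
	MaxConcurrency int // upper bound on batches in flight at once
}

// withDefaults replaces non-positive values with assumed defaults.
func (o StoreOptions) withDefaults() StoreOptions {
	if o.BatchSize <= 0 {
		o.BatchSize = 100
	}
	if o.MaxConcurrency <= 0 {
		o.MaxConcurrency = 1
	}
	return o
}

// chunk splits items into consecutive slices of at most size elements.
func chunk(items []string, size int) [][]string {
	if size <= 0 {
		size = 1
	}
	var out [][]string
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		out = append(out, items[start:end])
	}
	return out
}

func main() {
	opts := StoreOptions{BatchSize: 2}.withDefaults()
	batches := chunk([]string{"a", "b", "c", "d", "e"}, opts.BatchSize)
	fmt.Println(len(batches), opts.MaxConcurrency)
}
```

Defaulting `MaxConcurrency` to 1 would keep inserts sequential unless a caller opts in to concurrency; that default is a design assumption, not something the issue specifies.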

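#### Suggested optimization
Consider the following approach:
- Provide `BatchSize` and `MaxConcurrency` config options
- Split the input into batches and run Embed, Convert, and InsertRows per batch, using goroutines under a concurrency limit
- Flush once after all batches have been inserted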

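```go
// Sketch: insertBatch is hypothetical (embed, convert, InsertRows for one batch); errgroup is golang.org/x/sync/errgroup.
g, gctx := errgroup.WithContext(ctx)
g.SetLimit(maxConcurrency)
for _, batch := range batches {
    batch := batch // per-iteration copy, needed before Go 1.22
    g.Go(func() error { return insertBatch(gctx, batch) })
}
if err := g.Wait(); err != nil {
    return nil, err
}
if err := i.config.Client.Flush(ctx, i.config.Collection, false); err != nil { // single Flush at the end
    return nil, fmt.Errorf("[Indexer.Store] failed to flush collection: %w", err)
}
```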

This should substantially improve bulk-write throughput and keep Flush from blocking the concurrent insert path.
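
To make the "insert concurrently, flush once" flow concrete without a running Milvus server, here is a self-contained sketch using only the standard library. The `rowWriter` interface, `fakeWriter`, `storeBatched`, and `demo` are hypothetical stand-ins and are not the Milvus SDK or the existing indexer code; a real implementation would call the client's insert and Flush methods where the fake counts calls.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
)

// rowWriter is a hypothetical stand-in for the client calls Store makes.
type rowWriter interface {
	InsertBatch(ctx context.Context, docs []string) error
	Flush(ctx context.Context) error
}

// fakeWriter counts calls so the flush behaviour can be observed locally.
type fakeWriter struct {
	inserts int64
	flushes int64
}

func (w *fakeWriter) InsertBatch(ctx context.Context, docs []string) error {
	atomic.AddInt64(&w.inserts, 1)
	return nil
}

func (w *fakeWriter) Flush(ctx context.Context) error {
	atomic.AddInt64(&w.flushes, 1)
	return nil
}

// storeBatched inserts docs in batches with bounded concurrency,
// then flushes exactly once if every batch succeeded.
func storeBatched(ctx context.Context, w rowWriter, docs []string, batchSize, maxConcurrency int) error {
	if batchSize < 1 {
		batchSize = 1
	}
	if maxConcurrency < 1 {
		maxConcurrency = 1
	}
	sem := make(chan struct{}, maxConcurrency) // limits batches in flight
	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
	)
	for start := 0; start < len(docs); start += batchSize {
		end := start + batchSize
		if end > len(docs) {
			end = len(docs)
		}
		batch := docs[start:end]
		sem <- struct{}{} // blocks while maxConcurrency batches are running
		wg.Add(1)
		go func() {
			defer wg.Done()
			defer func() { <-sem }()
			if err := w.InsertBatch(ctx, batch); err != nil {
				once.Do(func() { firstErr = err }) // keep only the first error
			}
		}()
	}
	wg.Wait()
	if firstErr != nil {
		return fmt.Errorf("insert batch: %w", firstErr)
	}
	return w.Flush(ctx) // single flush after all batches
}

// demo stores 10 documents in batches of 3 with at most 2 batches in flight.
func demo() (inserts, flushes int64, err error) {
	w := &fakeWriter{}
	err = storeBatched(context.Background(), w, make([]string, 10), 3, 2)
	return w.inserts, w.flushes, err
}

func main() {
	inserts, flushes, err := demo()
	fmt.Println(inserts, flushes, err)
}
```

With 10 documents and a batch size of 3, the fake records 4 insert calls and 1 flush, whereas flushing per batch would issue 4 flushes. This sketch skips the final Flush when any batch fails; whether to flush the partially written data in that case is a design decision left open here.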

#### Related discussion and PRs
PR #233 adds partition support, but it does not address this performance bottleneck.

