Skip to content

enhance: unify stats file path model with caller-provided basePath#48548

Open
congqixia wants to merge 5 commits intomilvus-io:masterfrom
congqixia:enhance/loon/support_stats_basepath
Open

enhance: unify stats file path model with caller-provided basePath#48548
congqixia wants to merge 5 commits intomilvus-io:masterfrom
congqixia:enhance/loon/support_stats_basepath

Conversation

@congqixia
Copy link
Copy Markdown
Contributor

Move stats file (text_index, json_key_stats) path computation from C++ to Go. Go computes basePath based on storage version (V2: traditional top-level directories, V3: segment basePath/_stats/) and passes it to C++ via proto. C++ uses the basePath directly with no V2/V3 branching.

Key changes:

  • Add stats_base_path to BuildIndexInfo proto (write path)
  • Add base_path to LoadTextIndexInfo/LoadJsonKeyIndexInfo proto (read path)
  • Add base_path to segcorepb TextIndexStats/JsonKeyStats (SegmentLoadInfo path)
  • TextMatchIndex::Upload() returns relative paths (consistent with JsonKeyStats)
  • JsonKeyStats::Load()/TextMatchIndex::Load() require basePath (Assert-checked)
  • Extract BuildTextIndexPrefix()/BuildJsonKeyStatsPrefix() to metautil to eliminate 5 duplicated format strings
  • ConvertToSegcoreSegmentLoadInfo() resolves V3 manifest stats with basePaths

issue: #48547

@sre-ci-robot sre-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines. label Mar 26, 2026
@sre-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: congqixia

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mergify mergify bot added dco-passed DCO check passed. kind/enhancement Issues or changes related to enhancement labels Mar 26, 2026
@sre-ci-robot
Copy link
Copy Markdown
Contributor

[ci-v2-notice]
Notice: New ci-v2 system is enabled for this PR.

To rerun ci-v2 checks, comment with:

  • /ci-rerun-code-check // for ci-v2/code-check
  • /ci-rerun-build // for ci-v2/build
  • /ci-rerun-build-all // for ci-v2/build-all (multi-arch builds)
  • /ci-rerun-buildenv // for ci-v2/build-env (build milvus-env builder images)
  • /ci-rerun-ut-integration // for ci-v2/ut-integration, will rerun ci-v2/build
  • /ci-rerun-ut-go // for ci-v2/ut-go, will rerun ci-v2/build
  • /ci-rerun-ut-cpp // for ci-v2/ut-cpp
  • /ci-rerun-ut // for all ci-v2/ut-integration, ci-v2/ut-go, ci-v2/ut-cpp, will rerun ci-v2/build
  • /ci-rerun-e2e-arm // for ci-v2/e2e-arm
  • /ci-rerun-e2e-default // for ci-v2/e2e-default
  • /ci-rerun-ciloop // for ci-v2/ciloop (build + unit tests in one pipeline)
  • /ci-rerun-gosdk // for ci-v2/go-sdk (Go SDK E2E tests, AMD)
  • /ci-rerun-gosdk-arm // for ci-v2/go-sdk-arm (Go SDK E2E tests, ARM)

If you have any questions or requests, please contact @zhikunyao.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 26, 2026

Codecov Report

❌ Patch coverage is 34.95441% with 214 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.59%. Comparing base (859bf05) to head (452ae0d).
⚠️ Report is 5 commits behind head on master.

Files with missing lines Patch % Lines
internal/core/src/indexbuilder/index_c.cpp 2.85% 68 Missing ⚠️
internal/datanode/index/task_stats.go 2.22% 44 Missing ⚠️
internal/querynodev2/segments/segment_loader.go 28.57% 31 Missing and 4 partials ⚠️
internal/storagev2/packed/stats_resolver.go 51.11% 22 Missing ⚠️
internal/core/src/index/TextMatchIndex.cpp 0.00% 16 Missing ⚠️
internal/core/src/storage/FileManager.h 0.00% 6 Missing ⚠️
internal/util/segcore/segment.go 84.21% 4 Missing and 2 partials ⚠️
pkg/util/metautil/binlog.go 57.14% 6 Missing ⚠️
...rnal/core/src/segcore/ChunkedSegmentSealedImpl.cpp 0.00% 2 Missing ⚠️
internal/core/src/segcore/segment_c.cpp 0.00% 2 Missing ⚠️
... and 5 more
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #48548      +/-   ##
==========================================
- Coverage   77.61%   77.59%   -0.02%     
==========================================
  Files        2112     2112              
  Lines      351804   352049     +245     
==========================================
+ Hits       273038   273180     +142     
- Misses      70415    70506      +91     
- Partials     8351     8363      +12     
Components Coverage Δ
Client 79.25% <ø> (ø)
Core 83.92% <27.81%> (-0.08%) ⬇️
Go 75.79% <39.79%> (+<0.01%) ⬆️
Files with missing lines Coverage Δ
...re/src/exec/expression/JsonContainsByStatsTest.cpp 100.00% <100.00%> (ø)
...nternal/core/src/index/json_stats/JsonKeyStats.cpp 88.43% <100.00%> (+0.31%) ⬆️
internal/core/src/index/json_stats/JsonKeyStats.h 64.17% <ø> (ø)
...ternal/core/src/index/json_stats/bson_inverted.cpp 86.08% <100.00%> (-0.91%) ⬇️
...toragev1translator/BsonInvertedIndexTranslator.cpp 95.23% <ø> (ø)
internal/core/src/storage/DiskFileManagerImpl.h 75.00% <ø> (ø)
internal/core/src/storage/MemFileManagerImpl.cpp 44.31% <100.00%> (+0.21%) ⬆️
...e/unittest/test_json_stats/test_json_key_stats.cpp 99.28% <100.00%> (+0.01%) ⬆️
internal/core/src/segcore/SegmentLoadInfo.cpp 68.01% <50.00%> (-0.11%) ⬇️
internal/core/src/storage/DiskFileManagerImpl.cpp 71.53% <80.00%> (-0.16%) ⬇️
... and 13 more

... and 37 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@sre-ci-robot sre-ci-robot added the low-code-coverage add test-label from zhikun, diff coverage > 80% label Mar 26, 2026
@congqixia congqixia force-pushed the enhance/loon/support_stats_basepath branch from 8a32fdc to 559ef4a Compare March 27, 2026 04:31
Move stats file (text_index, json_key_stats) path computation from C++
to Go. Go computes basePath based on storage version (V2: traditional
top-level directories, V3: segment basePath/_stats/) and passes it to
C++ via proto. C++ uses the basePath directly with no V2/V3 branching.

Key changes:
- Add stats_base_path to BuildIndexInfo proto (write path)
- Add base_path to LoadTextIndexInfo/LoadJsonKeyIndexInfo proto (read path)
- Add base_path to segcorepb TextIndexStats/JsonKeyStats (SegmentLoadInfo path)
- TextMatchIndex::Upload() returns relative paths (consistent with JsonKeyStats)
- JsonKeyStats::Load()/TextMatchIndex::Load() require basePath (Assert-checked)
- Extract BuildTextIndexPrefix()/BuildJsonKeyStatsPrefix() to metautil
  to eliminate 5 duplicated format strings
- ConvertToSegcoreSegmentLoadInfo() resolves V3 manifest stats with basePaths

issue: milvus-io#48547

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
@congqixia congqixia force-pushed the enhance/loon/support_stats_basepath branch from e8e38dd to 12d9ae8 Compare March 27, 2026 11:03
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved dco-passed DCO check passed. kind/enhancement Issues or changes related to enhancement low-code-coverage add test-label from zhikun, diff coverage > 80% size/XXL Denotes a PR that changes 1000+ lines.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants