Skip to content

Conversation

@onkar717
Copy link
Contributor

@onkar717 onkar717 commented Jan 18, 2026

Add Optional Time Range Fields in manifest.json for Secondary Index

This PR adds optional min_timestamp and max_timestamp fields to the manifest.json file for secondary index parts. These fields are used to improve part selection during queries when time range filtering is required.

Key Changes:

  • Added MinTimestamp and MaxTimestamp optional fields to partMetadata (manifest.json)

  • Refactored WriteRequest to explicitly include Timestamp field as suggested

  • Updated merge logic to compute and persist timestamp ranges after part merging

  • Enhanced part selection algorithm to use timestamp ranges for query optimization

  • Tests(including UT, IT, E2E) are added to verify the new feature.

  • If this pull request closes/resolves/fixes an existing issue, replace the issue number. Fixes apache/skywalking-banyandb#13620.

  • Update the CHANGES log.

  • Issue - [Feature] Add Optional Time Range in manifest.json for Secondary Index skywalking#13620

@wu-sheng wu-sheng requested review from Copilot and hanahmily January 18, 2026 22:50
@wu-sheng wu-sheng added the enhancement New feature or request label Jan 18, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds optional timestamp range tracking (min_timestamp and max_timestamp) to the manifest.json file for secondary index parts, enabling query optimization through time-based part selection. The implementation maintains backward compatibility by treating timestamps as optional fields.

Changes:

  • Added MinTimestamp and MaxTimestamp optional fields to partMetadata with JSON serialization support using omitempty
  • Extended WriteRequest to include an explicit Timestamp field for tracking time ranges during writes
  • Updated merge logic to recompute timestamp ranges from source parts after merging
  • Enhanced part selection algorithms in both Query and ScanQuery to filter parts based on time range overlap

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
banyand/internal/sidx/metadata.go Added optional MinTimestamp and MaxTimestamp fields to partMetadata with proper reset handling
banyand/internal/sidx/interfaces.go Added Timestamp field to WriteRequest and MinTimestamp/MaxTimestamp to QueryRequest and ScanQueryRequest
banyand/internal/sidx/element.go Added timestamps slice to elements structure with proper initialization and swap operations
banyand/internal/sidx/block_writer.go Implemented timestamp tracking during block writes with min/max computation
banyand/internal/sidx/part.go Updated part initialization to handle timestamps with backward compatibility fallback
banyand/internal/sidx/part_wrapper.go Added overlapsTimeRange method for efficient time-based filtering
banyand/internal/sidx/merge.go Added recomputeTimestampRanges function to preserve timestamp metadata during merges
banyand/internal/sidx/query.go Updated selectPartsForQuery to filter parts by both key and time ranges
banyand/internal/sidx/scan_query.go Added time range filtering before scanning parts
banyand/internal/sidx/sidx.go Updated ConvertToMemPart to include timestamp parameter
banyand/trace/write_standalone.go Implemented timestamp extraction from trace tags for secondary index writes
banyand/internal/sidx/timestamp_feature_test.go Comprehensive test suite for timestamp feature including JSON serialization
banyand/internal/sidx/part_wrapper_test.go Extensive tests for overlapsTimeRange with various edge cases
banyand/internal/sidx/metadata_test.go Tests for timestamp serialization and deserialization
banyand/internal/sidx/merge_test.go Tests for timestamp range recomputation during merges
banyand/internal/sidx/tag_test.go Updated WriteRequest test data to include Timestamp field
banyand/internal/sidx/snapshot_test.go Updated WriteRequest test data to include Timestamp field
banyand/internal/sidx/element_test.go Updated test data to initialize timestamps
banyand/internal/sidx/block_reader_test.go Updated test element creation to include timestamps
banyand/internal/sidx/part_test.go Updated test element creation to include timestamps

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 333 to 355
// Update timestamp ranges
if len(timestamps) > 0 {
for _, ts := range timestamps {
if ts != 0 {
if !bw.hasTimestamp {
bw.totalMinTimestamp = ts
bw.totalMaxTimestamp = ts
bw.hasTimestamp = true
} else {
if ts < bw.totalMinTimestamp {
bw.totalMinTimestamp = ts
}
if ts > bw.totalMaxTimestamp {
bw.totalMaxTimestamp = ts
}
}
}
}
}
Copy link

Copilot AI Jan 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timestamp tracking logic skips timestamps with value 0 (line 336). While this is likely intentional to distinguish "not set" from actual timestamps, consider documenting this behavior in the code or interface documentation. Unix epoch (timestamp 0) is technically a valid timestamp but would be ignored by this implementation. For a time-series database, this is probably acceptable as data from 1970 is unlikely, but it should be documented for clarity.

Copilot uses AI. Check for mistakes.
Tags []Tag
SeriesID common.SeriesID
Key int64
Timestamp int64 // Unix nanoseconds timestamp for time range filtering
Copy link

Copilot AI Jan 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Timestamp field in WriteRequest should have a comment explaining that a value of 0 is treated as "not set" and will be ignored during timestamp range computation. This would help API users understand the expected behavior.

Suggested change
Timestamp int64 // Unix nanoseconds timestamp for time range filtering
Timestamp int64 // Unix nanoseconds timestamp for time range filtering; 0 means "not set" and is ignored during timestamp range computation

Copilot uses AI. Check for mistakes.
@hanahmily
Copy link
Contributor

@onkar717 Would you please to use "make lint" to check and fix your lint issues.

Comment on lines 92 to 93
MinTimestamp *int64 // Unix nanoseconds - for part-level selection only
MaxTimestamp *int64 // Unix nanoseconds - for part-level selection only
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are not used by anyone.

Comment on lines 114 to 115
MinTimestamp *int64 // Unix nanoseconds - for part-level selection only
MaxTimestamp *int64 // Unix nanoseconds - for part-level selection only
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are not referenced.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

if !pw.overlapsKeyRange(minKey, maxKey) {
continue
}
if !pw.overlapsTimeRange(nil, nil) {
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The time range filtering is hardcoded to pass nil, nil which means no timestamp filtering will be applied during scan queries. This defeats the purpose of adding timestamp range support. The ScanQueryRequest should include MinTimestamp and MaxTimestamp fields that can be passed here to enable time-based filtering optimization.

Suggested change
if !pw.overlapsTimeRange(nil, nil) {
if !pw.overlapsTimeRange(req.MinTimestamp, req.MaxTimestamp) {

Copilot uses AI. Check for mistakes.
asc := extractOrdering(req)

parts := selectPartsForQuery(snap, minKey, maxKey)
parts := selectPartsForQuery(snap, minKey, maxKey, nil, nil)
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Time range filtering is hardcoded to nil, nil which means timestamp-based part selection optimization is not being utilized. To enable the time-based query optimization that this PR introduces, QueryRequest needs MinTimestamp and MaxTimestamp fields that can be passed to selectPartsForQuery.

Suggested change
parts := selectPartsForQuery(snap, minKey, maxKey, nil, nil)
parts := selectPartsForQuery(snap, minKey, maxKey, req.MinTimestamp, req.MaxTimestamp)

Copilot uses AI. Check for mistakes.
Comment on lines 759 to 760
if len(es.timestamps) >= i {
seriesTimestamps = es.timestamps[blockStart:i]
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition len(es.timestamps) >= i is incorrect. The variable i is used as an exclusive end index for slicing (line 760), so the correct condition should be len(es.timestamps) > blockStart or len(es.timestamps) >= i should be changed to check against blockStart. As written, when timestamps are present, the fallback path will be incorrectly taken in many cases. For example, if there are 10 timestamps (indices 0-9) and i=10, the condition fails even though timestamps[0:10] would be valid.

Suggested change
if len(es.timestamps) >= i {
seriesTimestamps = es.timestamps[blockStart:i]
if len(es.timestamps) > blockStart {
// Use available timestamps starting from blockStart, up to either i or the end of timestamps.
upper := i
if upper > len(es.timestamps) {
upper = len(es.timestamps)
}
seriesTimestamps = es.timestamps[blockStart:upper]
// If we don't have enough timestamps to cover the whole block, pad with zeros
if len(seriesTimestamps) < i-blockStart {
padded := make([]int64, i-blockStart)
copy(padded, seriesTimestamps)
seriesTimestamps = padded
}

Copilot uses AI. Check for mistakes.
Copy link
Contributor

@hanahmily hanahmily left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MaxTimestamp and MinTimestamp are essential and must be utilized by the trace query process. Simply removing them is not the correct way.

Use them at

req := sidx.QueryRequest{

@codecov-commenter
Copy link

codecov-commenter commented Jan 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 36.86%. Comparing base (3530dd9) to head (2d5329a).
⚠️ Report is 124 commits behind head on main.

❗ There is a different number of reports uploaded between BASE (3530dd9) and HEAD (2d5329a). Click for more details.

HEAD has 1 upload less than BASE
Flag BASE (3530dd9) HEAD (2d5329a)
1 0
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #948      +/-   ##
==========================================
- Coverage   45.97%   36.86%   -9.12%     
==========================================
  Files         328      133     -195     
  Lines       55505    14468   -41037     
==========================================
- Hits        25520     5334   -20186     
+ Misses      27909     8720   -19189     
+ Partials     2076      414    -1662     
Flag Coverage Δ
fodc 79.10% <ø> (?)
pkg 29.36% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@onkar717 onkar717 force-pushed the feature/add-optional-time-range-manifest branch from 6a6a996 to 2d5329a Compare January 29, 2026 18:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants