-
Notifications
You must be signed in to change notification settings - Fork 105
Feature/add optional time range manifest #948
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Feature/add optional time range manifest #948
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds optional timestamp range tracking (min_timestamp and max_timestamp) to the manifest.json file for secondary index parts, enabling query optimization through time-based part selection. The implementation maintains backward compatibility by treating timestamps as optional fields.
Changes:
- Added
MinTimestampandMaxTimestampoptional fields topartMetadatawith JSON serialization support usingomitempty - Extended
WriteRequestto include an explicitTimestampfield for tracking time ranges during writes - Updated merge logic to recompute timestamp ranges from source parts after merging
- Enhanced part selection algorithms in both
QueryandScanQueryto filter parts based on time range overlap
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| banyand/internal/sidx/metadata.go | Added optional MinTimestamp and MaxTimestamp fields to partMetadata with proper reset handling |
| banyand/internal/sidx/interfaces.go | Added Timestamp field to WriteRequest and MinTimestamp/MaxTimestamp to QueryRequest and ScanQueryRequest |
| banyand/internal/sidx/element.go | Added timestamps slice to elements structure with proper initialization and swap operations |
| banyand/internal/sidx/block_writer.go | Implemented timestamp tracking during block writes with min/max computation |
| banyand/internal/sidx/part.go | Updated part initialization to handle timestamps with backward compatibility fallback |
| banyand/internal/sidx/part_wrapper.go | Added overlapsTimeRange method for efficient time-based filtering |
| banyand/internal/sidx/merge.go | Added recomputeTimestampRanges function to preserve timestamp metadata during merges |
| banyand/internal/sidx/query.go | Updated selectPartsForQuery to filter parts by both key and time ranges |
| banyand/internal/sidx/scan_query.go | Added time range filtering before scanning parts |
| banyand/internal/sidx/sidx.go | Updated ConvertToMemPart to include timestamp parameter |
| banyand/trace/write_standalone.go | Implemented timestamp extraction from trace tags for secondary index writes |
| banyand/internal/sidx/timestamp_feature_test.go | Comprehensive test suite for timestamp feature including JSON serialization |
| banyand/internal/sidx/part_wrapper_test.go | Extensive tests for overlapsTimeRange with various edge cases |
| banyand/internal/sidx/metadata_test.go | Tests for timestamp serialization and deserialization |
| banyand/internal/sidx/merge_test.go | Tests for timestamp range recomputation during merges |
| banyand/internal/sidx/tag_test.go | Updated WriteRequest test data to include Timestamp field |
| banyand/internal/sidx/snapshot_test.go | Updated WriteRequest test data to include Timestamp field |
| banyand/internal/sidx/element_test.go | Updated test data to initialize timestamps |
| banyand/internal/sidx/block_reader_test.go | Updated test element creation to include timestamps |
| banyand/internal/sidx/part_test.go | Updated test element creation to include timestamps |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Update timestamp ranges | ||
| if len(timestamps) > 0 { | ||
| for _, ts := range timestamps { | ||
| if ts != 0 { | ||
| if !bw.hasTimestamp { | ||
| bw.totalMinTimestamp = ts | ||
| bw.totalMaxTimestamp = ts | ||
| bw.hasTimestamp = true | ||
| } else { | ||
| if ts < bw.totalMinTimestamp { | ||
| bw.totalMinTimestamp = ts | ||
| } | ||
| if ts > bw.totalMaxTimestamp { | ||
| bw.totalMaxTimestamp = ts | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } |
Copilot
AI
Jan 18, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The timestamp tracking logic skips timestamps with value 0 (line 336). While this is likely intentional to distinguish "not set" from actual timestamps, consider documenting this behavior in the code or interface documentation. Unix epoch (timestamp 0) is technically a valid timestamp but would be ignored by this implementation. For a time-series database, this is probably acceptable as data from 1970 is unlikely, but it should be documented for clarity.
banyand/internal/sidx/interfaces.go
Outdated
| Tags []Tag | ||
| SeriesID common.SeriesID | ||
| Key int64 | ||
| Timestamp int64 // Unix nanoseconds timestamp for time range filtering |
Copilot
AI
Jan 18, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Timestamp field in WriteRequest should have a comment explaining that a value of 0 is treated as "not set" and will be ignored during timestamp range computation. This would help API users understand the expected behavior.
| Timestamp int64 // Unix nanoseconds timestamp for time range filtering | |
| Timestamp int64 // Unix nanoseconds timestamp for time range filtering; 0 means "not set" and is ignored during timestamp range computation |
|
@onkar717 Would you please to use "make lint" to check and fix your lint issues. |
banyand/internal/sidx/interfaces.go
Outdated
| MinTimestamp *int64 // Unix nanoseconds - for part-level selection only | ||
| MaxTimestamp *int64 // Unix nanoseconds - for part-level selection only |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are not used by anyone.
banyand/internal/sidx/interfaces.go
Outdated
| MinTimestamp *int64 // Unix nanoseconds - for part-level selection only | ||
| MaxTimestamp *int64 // Unix nanoseconds - for part-level selection only |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are not referenced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
banyand/internal/sidx/scan_query.go
Outdated
| if !pw.overlapsKeyRange(minKey, maxKey) { | ||
| continue | ||
| } | ||
| if !pw.overlapsTimeRange(nil, nil) { |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The time range filtering is hardcoded to pass nil, nil which means no timestamp filtering will be applied during scan queries. This defeats the purpose of adding timestamp range support. The ScanQueryRequest should include MinTimestamp and MaxTimestamp fields that can be passed here to enable time-based filtering optimization.
| if !pw.overlapsTimeRange(nil, nil) { | |
| if !pw.overlapsTimeRange(req.MinTimestamp, req.MaxTimestamp) { |
banyand/internal/sidx/query.go
Outdated
| asc := extractOrdering(req) | ||
|
|
||
| parts := selectPartsForQuery(snap, minKey, maxKey) | ||
| parts := selectPartsForQuery(snap, minKey, maxKey, nil, nil) |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Time range filtering is hardcoded to nil, nil which means timestamp-based part selection optimization is not being utilized. To enable the time-based query optimization that this PR introduces, QueryRequest needs MinTimestamp and MaxTimestamp fields that can be passed to selectPartsForQuery.
| parts := selectPartsForQuery(snap, minKey, maxKey, nil, nil) | |
| parts := selectPartsForQuery(snap, minKey, maxKey, req.MinTimestamp, req.MaxTimestamp) |
banyand/internal/sidx/part.go
Outdated
| if len(es.timestamps) >= i { | ||
| seriesTimestamps = es.timestamps[blockStart:i] |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition len(es.timestamps) >= i is incorrect. The variable i is used as an exclusive end index for slicing (line 760), so the correct condition should be len(es.timestamps) > blockStart or len(es.timestamps) >= i should be changed to check against blockStart. As written, when timestamps are present, the fallback path will be incorrectly taken in many cases. For example, if there are 10 timestamps (indices 0-9) and i=10, the condition fails even though timestamps[0:10] would be valid.
| if len(es.timestamps) >= i { | |
| seriesTimestamps = es.timestamps[blockStart:i] | |
| if len(es.timestamps) > blockStart { | |
| // Use available timestamps starting from blockStart, up to either i or the end of timestamps. | |
| upper := i | |
| if upper > len(es.timestamps) { | |
| upper = len(es.timestamps) | |
| } | |
| seriesTimestamps = es.timestamps[blockStart:upper] | |
| // If we don't have enough timestamps to cover the whole block, pad with zeros | |
| if len(seriesTimestamps) < i-blockStart { | |
| padded := make([]int64, i-blockStart) | |
| copy(padded, seriesTimestamps) | |
| seriesTimestamps = padded | |
| } |
hanahmily
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The MaxTimestamp and MinTimestamp are essential and must be utilized by the trace query process. Simply removing them is not the correct way.
Use them at
skywalking-banyandb/banyand/trace/query.go
Line 239 in fd52592
| req := sidx.QueryRequest{ |
Codecov Report✅ All modified and coverable lines are covered by tests.
Additional details and impacted files@@ Coverage Diff @@
## main #948 +/- ##
==========================================
- Coverage 45.97% 36.86% -9.12%
==========================================
Files 328 133 -195
Lines 55505 14468 -41037
==========================================
- Hits 25520 5334 -20186
+ Misses 27909 8720 -19189
+ Partials 2076 414 -1662
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
6a6a996 to
2d5329a
Compare
Add Optional Time Range Fields in manifest.json for Secondary Index
This PR adds optional
min_timestampandmax_timestampfields to the manifest.json file for secondary index parts. These fields are used to improve part selection during queries when time range filtering is required.Key Changes:
Added
MinTimestampandMaxTimestampoptional fields topartMetadata(manifest.json)Refactored
WriteRequestto explicitly includeTimestampfield as suggestedUpdated merge logic to compute and persist timestamp ranges after part merging
Enhanced part selection algorithm to use timestamp ranges for query optimization
Tests(including UT, IT, E2E) are added to verify the new feature.
If this pull request closes/resolves/fixes an existing issue, replace the issue number. Fixes apache/skywalking-banyandb#13620.
Update the
CHANGESlog.Issue - [Feature] Add Optional Time Range in
manifest.jsonfor Secondary Index skywalking#13620