fix(source): validate descending ranges and guard timestamp counters#66
fix(source): validate descending ranges and guard timestamp counters#66
Conversation
📝 WalkthroughWalkthroughBoth Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
No issues found across 2 files
Confidence score: 5/5
- Automated review surfaced no issues in the provided summaries.
- No files require special attention.
Architecture diagram
sequenceDiagram
participant R as Runner
participant TS as TimestampSource
participant RS as RangeSource
participant PB as ProgressBar
participant O as Output/Stats
Note over R,TS: Timestamp Initialization
R->>TS: from_dates(start, end, micros)
alt NEW: Date Validation
TS->>TS: Check if dates < 1970-01-01
TS->>TS: Check if end < start
TS-->>R: Result::Err (Bail)
else Valid Range
TS-->>R: Result::Ok(TimestampSource)
end
Note over R,RS: Processing Flow (Range or Timestamp)
R->>RS: process(transforms, deriver, matcher, output)
RS->>RS: NEW: Guard descending range (end < start)
RS->>RS: NEW: Calculate count via checked_sub().checked_add(1)
alt Overflow detected
RS-->>R: Result::Err (Overflow)
end
opt CHANGED: If TimestampSource in Microseconds mode
TS->>TS: total = count * 1001 (Sec + MS variants)
end
RS->>PB: new(total)
loop For each batch (BATCH_SIZE)
RS->>RS: Generate u64 inputs
RS->>O: Push processed results
RS->>PB: inc(batch_size)
end
RS-->>R: ProcessStats
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/source/timestamps.rs`:
- Around line 85-91: The microseconds overflow guard is incomplete: even though
you use checked_mul(1001) for total, individual timestamp math (TimestampSource
with microseconds=true) can still overflow when computing ts * 1000 + ms; update
the code to either reject unsafe bounds up front (e.g. when start or end are too
large for millisecond conversion) or compute ts_ms using checked arithmetic (use
checked_mul(1000) and checked_add(ms) and propagate an error) before any further
math. Apply the same checked-math or upfront-bound-rejection fix around the ts,
ms computation referenced in the block around the current ts * 1000 + ms (lines
~107-109) so both total and per-timestamp calculations are safe.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 90a0f852-aba4-43b8-8bc2-c9c5a765cb33
📒 Files selected for processing (2)
src/source/range.rssrc/source/timestamps.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: cubic · AI code reviewer
- GitHub Check: benchmarks
🧰 Additional context used
📓 Path-based instructions (6)
src/{transform,analyze,source,output,storage}/**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
All core trait implementations must satisfy
Send + Syncbounds
Files:
src/source/timestamps.rssrc/source/range.rs
src/{analyze,source}/**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
Use
AtomicBoolfor early termination flags across multi-threaded contexts
Files:
src/source/timestamps.rssrc/source/range.rs
src/**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.rs: Prefer?operator over.unwrap()for error handling in Rust code
Avoidpanic!()macro; use Result types for error handling instead
Never use type suppression or type-unsafe casting patterns (equivalent toas anyor@ts-ignore)
Files:
src/source/timestamps.rssrc/source/range.rs
src/source/**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
Implement the Source trait for new input sources in dedicated
src/source/{name}.rsfiles
Files:
src/source/timestamps.rssrc/source/range.rs
src/source/*.rs
📄 CodeRabbit inference engine (src/source/AGENTS.md)
Create new source in
src/source/{name}.rsfile
Files:
src/source/timestamps.rssrc/source/range.rs
src/source/!(mod).rs
📄 CodeRabbit inference engine (src/source/AGENTS.md)
src/source/!(mod).rs: ImplementSourcetrait withprocess()method acceptingtransforms,deriver,matcher, andoutputparameters
Use Rayonpar_chunks()for batch processing and parallelism in source implementations
Report progress via optionalProgressBarusingindicatif::ProgressBarinprocess()method
All sources must implementSend + Synctraits for thread safety
Files:
src/source/timestamps.rssrc/source/range.rs
🧠 Learnings (10)
📚 Learning: 2026-03-05T12:48:44.245Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/source/AGENTS.md:0-0
Timestamp: 2026-03-05T12:48:44.245Z
Learning: Applies to src/source/!(mod).rs : Implement `Source` trait with `process()` method accepting `transforms`, `deriver`, `matcher`, and `output` parameters
Applied to files:
src/source/timestamps.rssrc/source/range.rs
📚 Learning: 2026-03-02T14:18:44.316Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/transform/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:44.316Z
Learning: Applies to src/transform/*.rs : Process inputs as batch operations using `&[Input]` as input and `&mut Vec<(String, Key)>` as output, where the first tuple element is a human-readable source description
Applied to files:
src/source/timestamps.rssrc/source/range.rs
📚 Learning: 2026-03-05T12:48:44.245Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/source/AGENTS.md:0-0
Timestamp: 2026-03-05T12:48:44.245Z
Learning: Applies to src/source/src/main.rs : Update `create_source()` function in `src/main.rs` to handle new source variants
Applied to files:
src/source/range.rs
📚 Learning: 2026-03-05T12:48:44.245Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/source/AGENTS.md:0-0
Timestamp: 2026-03-05T12:48:44.245Z
Learning: Applies to src/source/src/main.rs : Update `SourceCommand` enum in `src/main.rs` when adding a new source type
Applied to files:
src/source/range.rs
📚 Learning: 2026-03-05T12:48:44.245Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/source/AGENTS.md:0-0
Timestamp: 2026-03-05T12:48:44.245Z
Learning: Applies to src/source/!(mod).rs : Report progress via optional `ProgressBar` using `indicatif::ProgressBar` in `process()` method
Applied to files:
src/source/range.rs
📚 Learning: 2026-03-02T14:18:00.377Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:00.377Z
Learning: Applies to src/{transform,analyze}/**/*.rs : Always process inputs and outputs as batches: `&[Input]` → `&mut Vec<(String, Key)>`
Applied to files:
src/source/range.rs
📚 Learning: 2026-03-02T14:18:44.316Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/transform/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:44.316Z
Learning: Applies to src/transform/*.rs : Every transform must implement the `Transform` trait with `name()` and `apply_batch()` methods, optionally implementing `supports_gpu()` and `apply_batch_gpu()` for GPU acceleration
Applied to files:
src/source/range.rs
📚 Learning: 2026-03-02T14:18:44.316Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/transform/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:44.316Z
Learning: Applies to src/transform/*.rs : The Input struct provides multiple representations (string, u64, big-endian bytes, little-endian bytes) - use the appropriate representation for each transform algorithm
Applied to files:
src/source/range.rs
📚 Learning: 2026-03-02T14:18:00.377Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:00.377Z
Learning: Applies to src/main.rs : GPU implementations for generate/scan commands require Source trait redesign to support batch GPU processing
Applied to files:
src/source/range.rs
📚 Learning: 2026-03-05T12:40:29.177Z
Learnt from: oritwoen
Repo: oritwoen/vuke PR: 64
File: src/source/mod.rs:26-32
Timestamp: 2026-03-05T12:40:29.177Z
Learning: In `oritwoen/vuke`, `AGENTS.md` and `src/source/AGENTS.md` describe an **aspirational/future** `Source::process` trait signature (generic `T: Transform`, `O: Output`, with `no_gpu: bool` and `progress: Option<&ProgressBar>` parameters). The **current** implementation in `src/source/mod.rs` uses dynamic dispatch: `fn process(&self, transforms: &[Box<dyn Transform>], deriver: &KeyDeriver, matcher: Option<&Matcher>, output: &dyn Output) -> Result<ProcessStats>`. Migrating to the generic form is blocked by object-safety requirements (`Box<dyn Source>` is used in `src/main.rs`) and is a tracked future refactor. Do not flag the current dynamic-dispatch signature as non-conformant with AGENTS.md.
Applied to files:
src/source/range.rs
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/source/timestamps.rs (1)
23-31: Consider replacing.unwrap()with explicit error handling.While
and_hms_opt(0, 0, 0)andand_hms_opt(23, 59, 59)will never returnNonefor these literal values, using.unwrap()violates the coding guideline preferring?operator for error handling.Suggested improvement
let start = NaiveDate::parse_from_str(start_date, "%Y-%m-%d")? .and_hms_opt(0, 0, 0) - .unwrap() + .ok_or_else(|| anyhow::anyhow!("Invalid start time"))? .and_utc() .timestamp(); let end = NaiveDate::parse_from_str(end_date, "%Y-%m-%d")? .and_hms_opt(23, 59, 59) - .unwrap() + .ok_or_else(|| anyhow::anyhow!("Invalid end time"))? .and_utc() .timestamp();As per coding guidelines: "Prefer
?operator over.unwrap()for error handling in Rust code".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/source/timestamps.rs` around lines 23 - 31, Replace the .unwrap() calls on the Option returned by and_hms_opt for the start and end timestamps with explicit error propagation using .ok_or_else(...)? so failures map to the function's Result error type; specifically change the start expression using NaiveDate::parse_from_str(...).and_hms_opt(0,0,0).unwrap() to .and_hms_opt(0,0,0).ok_or_else(|| /* descriptive error like format!("invalid time for start_date {}", start_date) */ )? and do the same for the end expression that calls .and_hms_opt(23,59,59); keep the surrounding parse_from_str? propagation as-is so the function (timestamps creation code) uses ? instead of panicking.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/source/timestamps.rs`:
- Around line 23-31: Replace the .unwrap() calls on the Option returned by
and_hms_opt for the start and end timestamps with explicit error propagation
using .ok_or_else(...)? so failures map to the function's Result error type;
specifically change the start expression using
NaiveDate::parse_from_str(...).and_hms_opt(0,0,0).unwrap() to
.and_hms_opt(0,0,0).ok_or_else(|| /* descriptive error like format!("invalid
time for start_date {}", start_date) */ )? and do the same for the end
expression that calls .and_hms_opt(23,59,59); keep the surrounding
parse_from_str? propagation as-is so the function (timestamps creation code)
uses ? instead of panicking.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 16437501-5d4e-423b-898e-2c80644037dd
📒 Files selected for processing (1)
src/source/timestamps.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: benchmarks
🧰 Additional context used
📓 Path-based instructions (6)
src/{transform,analyze,source,output,storage}/**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
All core trait implementations must satisfy
Send + Syncbounds
Files:
src/source/timestamps.rs
src/{analyze,source}/**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
Use
AtomicBoolfor early termination flags across multi-threaded contexts
Files:
src/source/timestamps.rs
src/**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.rs: Prefer?operator over.unwrap()for error handling in Rust code
Avoidpanic!()macro; use Result types for error handling instead
Never use type suppression or type-unsafe casting patterns (equivalent toas anyor@ts-ignore)
Files:
src/source/timestamps.rs
src/source/**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
Implement the Source trait for new input sources in dedicated
src/source/{name}.rsfiles
Files:
src/source/timestamps.rs
src/source/*.rs
📄 CodeRabbit inference engine (src/source/AGENTS.md)
Create new source in
src/source/{name}.rsfile
Files:
src/source/timestamps.rs
src/source/!(mod).rs
📄 CodeRabbit inference engine (src/source/AGENTS.md)
src/source/!(mod).rs: ImplementSourcetrait withprocess()method acceptingtransforms,deriver,matcher, andoutputparameters
Use Rayonpar_chunks()for batch processing and parallelism in source implementations
Report progress via optionalProgressBarusingindicatif::ProgressBarinprocess()method
All sources must implementSend + Synctraits for thread safety
Files:
src/source/timestamps.rs
🧠 Learnings (2)
📚 Learning: 2026-03-05T12:48:44.245Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/source/AGENTS.md:0-0
Timestamp: 2026-03-05T12:48:44.245Z
Learning: Applies to src/source/!(mod).rs : Implement `Source` trait with `process()` method accepting `transforms`, `deriver`, `matcher`, and `output` parameters
Applied to files:
src/source/timestamps.rs
📚 Learning: 2026-03-02T14:18:44.316Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/transform/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:44.316Z
Learning: Applies to src/transform/*.rs : Process inputs as batch operations using `&[Input]` as input and `&mut Vec<(String, Key)>` as output, where the first tuple element is a human-readable source description
Applied to files:
src/source/timestamps.rs
🔇 Additional comments (4)
src/source/timestamps.rs (4)
35-48: LGTM!The validation logic correctly rejects pre-epoch dates and descending ranges with clear, user-friendly error messages. The use of
u64::try_fromprovides defense-in-depth after the negativity check.
74-79: Overflow guard is now complete.The bound check
self.end > (u64::MAX - 999) / 1000ensures that for alltsin the range, the computationts * 1000 + msat line 116 cannot overflow. This addresses the previous review concern about incomplete microseconds overflow protection.
109-125: LGTM!The parallel iteration correctly processes 1001 items per timestamp in microseconds mode (1 base + 1000 variants), with progress bar increments matching the workload calculation. The use of Rayon's
into_par_iter()satisfies the coding guideline for parallelism.
137-202: LGTM! Comprehensive test coverage.The tests effectively cover:
- Descending date range rejection via
from_dates- Pre-epoch date rejection
- Microseconds mode workload accounting (1001 inputs per timestamp)
- Descending range rejection in
process()for direct struct construction- Overflow detection for extreme ranges
Summary
+1 second+1000 ms variantsper second)Why
RangeSourceandTimestampSourceused inclusive-count arithmetic that could wrap on edge inputs (end - start + 1), and timestamp microseconds mode reported an incorrect total. This change makes source processing fail fast on invalid input and keeps progress/statistics consistent with real work.Verification
lsp_diagnosticsclean for modified filescargo testcargo build