fix(source): validate descending ranges and guard timestamp counters by oritwoen · Pull Request #66 · oritwoen/vuke

oritwoen · 2026-03-08T18:21:25Z

Summary

reject descending and overflowing source ranges before processing to avoid underflow/overflow-driven incorrect workloads
fix timestamp microseconds workload accounting to match actual processing (+1 second + 1000 ms variants per second)
add focused unit tests for descending range/date rejection, overflow guards, pre-epoch date rejection, and microseconds-mode counting

Why

RangeSource and TimestampSource used inclusive-count arithmetic that could wrap on edge inputs (end - start + 1), and timestamp microseconds mode reported an incorrect total. This change makes source processing fail fast on invalid input and keeps progress/statistics consistent with real work.

Verification

lsp_diagnostics clean for modified files
cargo test
cargo build

coderabbitai · 2026-03-08T18:21:58Z

📝 Walkthrough

Walkthrough

Both RangeSource and timestamp handling now validate ranges (reject descending and pre-1970 dates) and use checked arithmetic to prevent size/overflow errors; new tests cover descending ranges, pre-epoch dates, and overflow scenarios.

Changes

Cohort / File(s)	Summary
Range validation `src/source/range.rs`	Added validation to reject descending ranges and used `checked_sub`/`checked_add` for overflow-safe count calculation; returns descriptive errors on overflow. Added tests for descending and overflowed ranges.
Timestamp validation `src/source/timestamps.rs`	Normalized and validated dates in `from_dates` (reject pre-1970, enforce end ≥ start). Strengthened `process` with range ordering checks, checked arithmetic for counts, and overflow detection for microseconds-mode multiplication. Added tests for pre-epoch, descending, and overflow cases.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 I bounded through ranges, careful and spry,
Counting each hop beneath a safe sky.
No descending leaps, no overflowed cheer,
Checked math keeps the path crystal clear.
Hop on, dear code — the bounds are near! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 63.64% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main changes: validation of descending ranges and guarding of timestamp counters for overflow protection.
Description check	✅ Passed	The description clearly explains the changes, their rationale, and verification steps, directly relating to the modifications in range.rs and timestamps.rs.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch fix/source-range-validation

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

cubic-dev-ai

No issues found across 2 files

Confidence score: 5/5

Automated review surfaced no issues in the provided summaries.
No files require special attention.

Architecture diagram

sequenceDiagram
    participant R as Runner
    participant TS as TimestampSource
    participant RS as RangeSource
    participant PB as ProgressBar
    participant O as Output/Stats

    Note over R,TS: Timestamp Initialization
    R->>TS: from_dates(start, end, micros)
    alt NEW: Date Validation
        TS->>TS: Check if dates < 1970-01-01
        TS->>TS: Check if end < start
        TS-->>R: Result::Err (Bail)
    else Valid Range
        TS-->>R: Result::Ok(TimestampSource)
    end

    Note over R,RS: Processing Flow (Range or Timestamp)
    R->>RS: process(transforms, deriver, matcher, output)
    
    RS->>RS: NEW: Guard descending range (end < start)
    
    RS->>RS: NEW: Calculate count via checked_sub().checked_add(1)
    alt Overflow detected
        RS-->>R: Result::Err (Overflow)
    end

    opt CHANGED: If TimestampSource in Microseconds mode
        TS->>TS: total = count * 1001 (Sec + MS variants)
    end

    RS->>PB: new(total)
    
    loop For each batch (BATCH_SIZE)
        RS->>RS: Generate u64 inputs
        RS->>O: Push processed results
        RS->>PB: inc(batch_size)
    end

    RS-->>R: ProcessStats

codspeed-hq · 2026-03-08T18:24:55Z

Merging this PR will not alter performance

✅ 7 untouched benchmarks

_{Comparing fix/source-range-validation (4a151fc) with main (dde3dd4)}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/source/timestamps.rs`:
- Around line 85-91: The microseconds overflow guard is incomplete: even though
you use checked_mul(1001) for total, individual timestamp math (TimestampSource
with microseconds=true) can still overflow when computing ts * 1000 + ms; update
the code to either reject unsafe bounds up front (e.g. when start or end are too
large for millisecond conversion) or compute ts_ms using checked arithmetic (use
checked_mul(1000) and checked_add(ms) and propagate an error) before any further
math. Apply the same checked-math or upfront-bound-rejection fix around the ts,
ms computation referenced in the block around the current ts * 1000 + ms (lines
~107-109) so both total and per-timestamp calculations are safe.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 90a0f852-aba4-43b8-8bc2-c9c5a765cb33

📥 Commits

Reviewing files that changed from the base of the PR and between dde3dd4 and a87d224.

📒 Files selected for processing (2)

src/source/range.rs
src/source/timestamps.rs

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: cubic · AI code reviewer
GitHub Check: benchmarks

🧰 Additional context used

📓 Path-based instructions (6)

src/{transform,analyze,source,output,storage}/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

All core trait implementations must satisfy Send + Sync bounds

Files:

src/source/timestamps.rs
src/source/range.rs

src/{analyze,source}/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

Use AtomicBool for early termination flags across multi-threaded contexts

Files:

src/source/timestamps.rs
src/source/range.rs

src/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.rs: Prefer ? operator over .unwrap() for error handling in Rust code
Avoid panic!() macro; use Result types for error handling instead
Never use type suppression or type-unsafe casting patterns (equivalent to as any or @ts-ignore)

Files:

src/source/timestamps.rs
src/source/range.rs

src/source/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

Implement the Source trait for new input sources in dedicated src/source/{name}.rs files

Files:

src/source/timestamps.rs
src/source/range.rs

src/source/*.rs

📄 CodeRabbit inference engine (src/source/AGENTS.md)

Create new source in src/source/{name}.rs file

Files:

src/source/timestamps.rs
src/source/range.rs

src/source/!(mod).rs

📄 CodeRabbit inference engine (src/source/AGENTS.md)

src/source/!(mod).rs: Implement Source trait with process() method accepting transforms, deriver, matcher, and output parameters
Use Rayon par_chunks() for batch processing and parallelism in source implementations
Report progress via optional ProgressBar using indicatif::ProgressBar in process() method
All sources must implement Send + Sync traits for thread safety

Files:

src/source/timestamps.rs
src/source/range.rs

🧠 Learnings (10)

📚 Learning: 2026-03-05T12:48:44.245Z

Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/source/AGENTS.md:0-0
Timestamp: 2026-03-05T12:48:44.245Z
Learning: Applies to src/source/!(mod).rs : Implement `Source` trait with `process()` method accepting `transforms`, `deriver`, `matcher`, and `output` parameters

Applied to files:

src/source/timestamps.rs
src/source/range.rs

📚 Learning: 2026-03-02T14:18:44.316Z

Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/transform/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:44.316Z
Learning: Applies to src/transform/*.rs : Process inputs as batch operations using `&[Input]` as input and `&mut Vec<(String, Key)>` as output, where the first tuple element is a human-readable source description

Applied to files:

src/source/timestamps.rs
src/source/range.rs

📚 Learning: 2026-03-05T12:48:44.245Z

Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/source/AGENTS.md:0-0
Timestamp: 2026-03-05T12:48:44.245Z
Learning: Applies to src/source/src/main.rs : Update `create_source()` function in `src/main.rs` to handle new source variants

Applied to files:

src/source/range.rs

📚 Learning: 2026-03-05T12:48:44.245Z

Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/source/AGENTS.md:0-0
Timestamp: 2026-03-05T12:48:44.245Z
Learning: Applies to src/source/src/main.rs : Update `SourceCommand` enum in `src/main.rs` when adding a new source type

Applied to files:

src/source/range.rs

📚 Learning: 2026-03-05T12:48:44.245Z

Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/source/AGENTS.md:0-0
Timestamp: 2026-03-05T12:48:44.245Z
Learning: Applies to src/source/!(mod).rs : Report progress via optional `ProgressBar` using `indicatif::ProgressBar` in `process()` method

Applied to files:

src/source/range.rs

📚 Learning: 2026-03-02T14:18:00.377Z

Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:00.377Z
Learning: Applies to src/{transform,analyze}/**/*.rs : Always process inputs and outputs as batches: `&[Input]` → `&mut Vec<(String, Key)>`

Applied to files:

src/source/range.rs

📚 Learning: 2026-03-02T14:18:44.316Z

Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/transform/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:44.316Z
Learning: Applies to src/transform/*.rs : Every transform must implement the `Transform` trait with `name()` and `apply_batch()` methods, optionally implementing `supports_gpu()` and `apply_batch_gpu()` for GPU acceleration

Applied to files:

src/source/range.rs

📚 Learning: 2026-03-02T14:18:44.316Z

Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/transform/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:44.316Z
Learning: Applies to src/transform/*.rs : The Input struct provides multiple representations (string, u64, big-endian bytes, little-endian bytes) - use the appropriate representation for each transform algorithm

Applied to files:

src/source/range.rs

📚 Learning: 2026-03-02T14:18:00.377Z

Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:00.377Z
Learning: Applies to src/main.rs : GPU implementations for generate/scan commands require Source trait redesign to support batch GPU processing

Applied to files:

src/source/range.rs

📚 Learning: 2026-03-05T12:40:29.177Z

Learnt from: oritwoen
Repo: oritwoen/vuke PR: 64
File: src/source/mod.rs:26-32
Timestamp: 2026-03-05T12:40:29.177Z
Learning: In `oritwoen/vuke`, `AGENTS.md` and `src/source/AGENTS.md` describe an **aspirational/future** `Source::process` trait signature (generic `T: Transform`, `O: Output`, with `no_gpu: bool` and `progress: Option<&ProgressBar>` parameters). The **current** implementation in `src/source/mod.rs` uses dynamic dispatch: `fn process(&self, transforms: &[Box<dyn Transform>], deriver: &KeyDeriver, matcher: Option<&Matcher>, output: &dyn Output) -> Result<ProcessStats>`. Migrating to the generic form is blocked by object-safety requirements (`Box<dyn Source>` is used in `src/main.rs`) and is a tracked future refactor. Do not flag the current dynamic-dispatch signature as non-conformant with AGENTS.md.

Applied to files:

src/source/range.rs

src/source/timestamps.rs

coderabbitai

🧹 Nitpick comments (1)

src/source/timestamps.rs (1)

23-31: Consider replacing .unwrap() with explicit error handling.

While and_hms_opt(0, 0, 0) and and_hms_opt(23, 59, 59) will never return None for these literal values, using .unwrap() violates the coding guideline preferring ? operator for error handling.

Suggested improvement

         let start = NaiveDate::parse_from_str(start_date, "%Y-%m-%d")?
             .and_hms_opt(0, 0, 0)
-            .unwrap()
+            .ok_or_else(|| anyhow::anyhow!("Invalid start time"))?
             .and_utc()
             .timestamp();

         let end = NaiveDate::parse_from_str(end_date, "%Y-%m-%d")?
             .and_hms_opt(23, 59, 59)
-            .unwrap()
+            .ok_or_else(|| anyhow::anyhow!("Invalid end time"))?
             .and_utc()
             .timestamp();

As per coding guidelines: "Prefer ? operator over .unwrap() for error handling in Rust code".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/source/timestamps.rs` around lines 23 - 31, Replace the .unwrap() calls
on the Option returned by and_hms_opt for the start and end timestamps with
explicit error propagation using .ok_or_else(...)? so failures map to the
function's Result error type; specifically change the start expression using
NaiveDate::parse_from_str(...).and_hms_opt(0,0,0).unwrap() to
.and_hms_opt(0,0,0).ok_or_else(|| /* descriptive error like format!("invalid
time for start_date {}", start_date) */ )? and do the same for the end
expression that calls .and_hms_opt(23,59,59); keep the surrounding
parse_from_str? propagation as-is so the function (timestamps creation code)
uses ? instead of panicking.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/source/timestamps.rs`:
- Around line 23-31: Replace the .unwrap() calls on the Option returned by
and_hms_opt for the start and end timestamps with explicit error propagation
using .ok_or_else(...)? so failures map to the function's Result error type;
specifically change the start expression using
NaiveDate::parse_from_str(...).and_hms_opt(0,0,0).unwrap() to
.and_hms_opt(0,0,0).ok_or_else(|| /* descriptive error like format!("invalid
time for start_date {}", start_date) */ )? and do the same for the end
expression that calls .and_hms_opt(23,59,59); keep the surrounding
parse_from_str? propagation as-is so the function (timestamps creation code)
uses ? instead of panicking.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 16437501-5d4e-423b-898e-2c80644037dd

📥 Commits

Reviewing files that changed from the base of the PR and between a87d224 and 4a151fc.

📒 Files selected for processing (1)

src/source/timestamps.rs

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: benchmarks

🧰 Additional context used

📓 Path-based instructions (6)

src/{transform,analyze,source,output,storage}/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

All core trait implementations must satisfy Send + Sync bounds

Files:

src/source/timestamps.rs

src/{analyze,source}/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

Use AtomicBool for early termination flags across multi-threaded contexts

Files:

src/source/timestamps.rs

src/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.rs: Prefer ? operator over .unwrap() for error handling in Rust code
Avoid panic!() macro; use Result types for error handling instead
Never use type suppression or type-unsafe casting patterns (equivalent to as any or @ts-ignore)

Files:

src/source/timestamps.rs

src/source/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

Implement the Source trait for new input sources in dedicated src/source/{name}.rs files

Files:

src/source/timestamps.rs

src/source/*.rs

📄 CodeRabbit inference engine (src/source/AGENTS.md)

Create new source in src/source/{name}.rs file

Files:

src/source/timestamps.rs

src/source/!(mod).rs

📄 CodeRabbit inference engine (src/source/AGENTS.md)

src/source/!(mod).rs: Implement Source trait with process() method accepting transforms, deriver, matcher, and output parameters
Use Rayon par_chunks() for batch processing and parallelism in source implementations
Report progress via optional ProgressBar using indicatif::ProgressBar in process() method
All sources must implement Send + Sync traits for thread safety

Files:

src/source/timestamps.rs

🧠 Learnings (2)

📚 Learning: 2026-03-05T12:48:44.245Z

Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/source/AGENTS.md:0-0
Timestamp: 2026-03-05T12:48:44.245Z
Learning: Applies to src/source/!(mod).rs : Implement `Source` trait with `process()` method accepting `transforms`, `deriver`, `matcher`, and `output` parameters

Applied to files:

src/source/timestamps.rs

📚 Learning: 2026-03-02T14:18:44.316Z

Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/transform/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:44.316Z
Learning: Applies to src/transform/*.rs : Process inputs as batch operations using `&[Input]` as input and `&mut Vec<(String, Key)>` as output, where the first tuple element is a human-readable source description

Applied to files:

src/source/timestamps.rs

🔇 Additional comments (4)

src/source/timestamps.rs (4)

35-48: LGTM!

The validation logic correctly rejects pre-epoch dates and descending ranges with clear, user-friendly error messages. The use of u64::try_from provides defense-in-depth after the negativity check.

74-79: Overflow guard is now complete.

The bound check self.end > (u64::MAX - 999) / 1000 ensures that for all ts in the range, the computation ts * 1000 + ms at line 116 cannot overflow. This addresses the previous review concern about incomplete microseconds overflow protection.

109-125: LGTM!

The parallel iteration correctly processes 1001 items per timestamp in microseconds mode (1 base + 1000 variants), with progress bar increments matching the workload calculation. The use of Rayon's into_par_iter() satisfies the coding guideline for parallelism.

137-202: LGTM! Comprehensive test coverage.

The tests effectively cover:

Descending date range rejection via from_dates

Pre-epoch date rejection

Microseconds mode workload accounting (1001 inputs per timestamp)

Descending range rejection in process() for direct struct construction

Overflow detection for extreme ranges

fix(source): validate descending ranges and guard timestamp counters

a87d224

cubic-dev-ai bot approved these changes Mar 8, 2026

View reviewed changes

coderabbitai bot requested changes Mar 8, 2026

View reviewed changes

src/source/timestamps.rs Show resolved Hide resolved

fix(source): guard per-timestamp overflow in microseconds mode

4a151fc

coderabbitai bot reviewed Mar 8, 2026

View reviewed changes

coderabbitai bot approved these changes Mar 8, 2026

View reviewed changes

oritwoen self-assigned this Mar 8, 2026

oritwoen merged commit 2ab2a71 into main Mar 8, 2026
4 checks passed

oritwoen deleted the fix/source-range-validation branch March 8, 2026 19:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(source): validate descending ranges and guard timestamp counters#66

fix(source): validate descending ranges and guard timestamp counters#66
oritwoen merged 2 commits intomainfrom
fix/source-range-validation

oritwoen commented Mar 8, 2026

Uh oh!

coderabbitai bot commented Mar 8, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

cubic-dev-ai bot left a comment

Uh oh!

codspeed-hq bot commented Mar 8, 2026 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

oritwoen commented Mar 8, 2026

Summary

Why

Verification

Uh oh!

coderabbitai bot commented Mar 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

codspeed-hq bot commented Mar 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merging this PR will not alter performance

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

coderabbitai bot commented Mar 8, 2026 •

edited

Loading

codspeed-hq bot commented Mar 8, 2026 •

edited

Loading