fix: prevent integer overflow and truncation in MergedPileup weighted mean by nh13 · Pull Request #56 · fulcrumgenomics/fgsv

nh13 · 2026-02-20T21:48:13Z

Summary

Fix integer overflow in MergedPileup weighted mean calculation where p.total * p.left_pos (Int*Int) can overflow for large genomic positions
Fix per-term truncation where integer division inside map() truncates each term before summing, producing inaccurate mean positions
Use Long arithmetic for the intermediate product and perform a single division at the end

Test plan

Existing tests pass (./mill tools.test)
Verify with large genomic positions (e.g. chr1 at ~200M) that mean is calculated correctly

coderabbitai · 2026-02-21T01:56:47Z

Warning

Rate limit exceeded

@github-actions[bot] has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 6 minutes and 53 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📝 Walkthrough

Walkthrough

This pull request makes two changes: the Tools index documentation version identifier is updated, and the FilterAndMerge calculation logic is modified to perform intermediate multiplication using 64-bit integers before division and casting to Int. The arithmetic change affects both left and right position calculations, replacing direct integer division with a two-step approach that computes the sum as a Long before converting the final result to Int.

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title directly and precisely describes the main change: fixing integer overflow and truncation in MergedPileup weighted mean calculation.
Description check	✅ Passed	The description clearly explains the two bugs being fixed, the solution approach, and provides a test plan aligned with the changeset.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch fix/merged-pileup-integer-overflow

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

nh13 · 2026-02-21T05:54:11Z

@coderabbitai review

coderabbitai · 2026-02-21T05:54:20Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/main/scala/com/fulcrumgenomics/sv/tools/FilterAndMerge.scala (1)
116-116: ⚠️ Potential issue | 🟡 Minor

Math.round(_.toFloat) re-introduces precision loss for large genomic positions.

leftMean / rightMean are already Int after the .toInt at lines 108–109. Converting them to Float (24-bit mantissa, exact only up to 2²³ ≈ 8.4 M) before rounding silently corrupts positions on any chromosome beyond ~8 Mbp — including the chr1 ~200 M case motivating this PR.
🐛 Proposed fix
-      left_mean      = Math.round(leftMean.toFloat),
+      left_mean      = leftMean,
...
-      right_mean     = Math.round(rightMean.toFloat),
+      right_mean     = rightMean,
Also applies to: 121-121
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/scala/com/fulcrumgenomics/sv/tools/FilterAndMerge.scala` at line
116, The code is converting already-truncated Ints back to Float and rounding
(via Math.round(leftMean.toFloat / rightMean.toFloat)), which re-introduces
precision loss for large genomic positions; update the assignments that set
left_mean and right_mean in FilterAndMerge.scala to use the existing Int values
(leftMean and rightMean) directly (or cast to Long/Int as required by the
surrounding record type) instead of converting to Float and calling Math.round;
apply the same change to the symmetric occurrence for right_mean so the stored
positions remain exact.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/main/scala/com/fulcrumgenomics/sv/tools/FilterAndMerge.scala`:
- Line 116: The code is converting already-truncated Ints back to Float and
rounding (via Math.round(leftMean.toFloat / rightMean.toFloat)), which
re-introduces precision loss for large genomic positions; update the assignments
that set left_mean and right_mean in FilterAndMerge.scala to use the existing
Int values (leftMean and rightMean) directly (or cast to Long/Int as required by
the surrounding record type) instead of converting to Float and calling
Math.round; apply the same change to the symmetric occurrence for right_mean so
the stored positions remain exact.

… mean The weighted mean calculation `p.total * p.left_pos / total` had two bugs: 1. Integer overflow: `p.total * p.left_pos` is Int*Int which overflows for large genomic positions (up to ~250M) with moderate counts. 2. Per-term truncation: integer division inside map() truncates each term before summing, producing inaccurate results. E.g. three pileups with total=1 at pos=100 with overall total=3: each term becomes 1*100/3=33 (truncated), sum=99 instead of 100. Fix uses Long arithmetic for the product and performs a single division at the end.

nh13 requested review from clintval and tfenne as code owners February 20, 2026 21:48

nh13 force-pushed the fix/merged-pileup-integer-overflow branch from 23b6837 to 6f80ac3 Compare February 21, 2026 01:54

coderabbitai bot reviewed Feb 21, 2026

View reviewed changes

coderabbitai bot approved these changes Feb 21, 2026

View reviewed changes

nh13 force-pushed the fix/merged-pileup-integer-overflow branch from 94ed9fd to 346bfc9 Compare February 21, 2026 05:57

nh13 assigned clintval Feb 21, 2026

nh13 force-pushed the fix/merged-pileup-integer-overflow branch 3 times, most recently from 415c088 to fda903e Compare February 21, 2026 06:04

nh13 force-pushed the fix/merged-pileup-integer-overflow branch from 1eca25e to 21111db Compare February 21, 2026 06:20

Generate docs files

9a4690c

clintval approved these changes Feb 26, 2026

View reviewed changes

clintval assigned nh13 and unassigned clintval Feb 26, 2026

nh13 merged commit ca7b619 into main Mar 2, 2026
1 check passed

nh13 deleted the fix/merged-pileup-integer-overflow branch March 2, 2026 00:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: prevent integer overflow and truncation in MergedPileup weighted mean#56

fix: prevent integer overflow and truncation in MergedPileup weighted mean#56
nh13 merged 2 commits intomainfrom
fix/merged-pileup-integer-overflow

nh13 commented Feb 20, 2026

Uh oh!

coderabbitai bot commented Feb 21, 2026 •

edited

Loading

Rate limit exceeded

Walkthrough

Uh oh!

nh13 commented Feb 21, 2026

Uh oh!

coderabbitai bot commented Feb 21, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

nh13 commented Feb 20, 2026

Summary

Test plan

Uh oh!

coderabbitai bot commented Feb 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Uh oh!

nh13 commented Feb 21, 2026

Uh oh!

coderabbitai bot commented Feb 21, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

coderabbitai bot commented Feb 21, 2026 •

edited

Loading