Optimize/x ref phase2 by gaow · Pull Request #133 · StatFunGen/colocboost

gaow · 2026-03-29T13:27:43Z

No description provided.

… O(P^2) computation

In summary statistics mode, XtX %*% beta was computed 3 separate times per iteration per outcome: in residual update, profile loglikelihood, and correlation update (get_correlation). This commit caches the product once after the beta update and reuses it, reducing the dominant O(P^2) cost by 3x.

Also precomputes per-outcome constants (scaling_factor, beta_scaling) during model initialization to avoid repeated conditional evaluation per iteration.

Benchmark results (micro-benchmark on XtX %*% beta):
- P=1000, L=2, M=100: 0.43s -> 0.14s (3x speedup)
- P=2000, L=5, M=200: 9.75s -> 3.25s (3x speedup)
- P=5000, L=10, M=500: 324s -> 108s (3x speedup, saves 216s)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
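The caching idea in this commit can be illustrated with a minimal R sketch. It is not the ColocBoost implementation: the variable names and the three downstream quantities (residual, quadratic, fitted) are hypothetical stand-ins for the residual update, profile loglikelihood, and correlation update, chosen only to show that one cached product gives the same results as three separate ones.

```r
# Illustrative only: names and the three "consumers" below are hypothetical
# stand-ins, not ColocBoost internals.
set.seed(1)
N <- 500
P <- 200
X <- scale(matrix(rnorm(N * P), N, P))   # standardized toy design matrix
XtX <- crossprod(X) / (N - 1)            # P x P matrix
beta <- rnorm(P, sd = 0.01)
Xty <- rnorm(P)

# Uncached: each step pays for its own O(P^2) matrix-vector product.
uncached <- list(
  residual  = as.vector(Xty - XtX %*% beta),
  quadratic = sum(beta * (XtX %*% beta)),
  fitted    = as.vector(XtX %*% beta)
)

# Cached: one product computed after the beta update, reused by all steps.
xtx_beta <- as.vector(XtX %*% beta)
cached <- list(
  residual  = Xty - xtx_beta,
  quadratic = sum(beta * xtx_beta),
  fitted    = xtx_beta
)
```

Both lists hold the same values, so the saving comes purely from doing one matrix-vector product per iteration instead of three.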

…f LD

When the number of variants P is large, the P×P LD matrix may be too large to fit in memory. Users can now pass X_ref (N_ref × P reference panel) directly instead of precomputing LD = cor(X_ref). When N_ref < P, ColocBoost computes LD products on the fly via t(X_ref) %*% (X_ref %*% v) / (N_ref - 1), avoiding the P×P memory cost. When N_ref >= P, LD is precomputed internally.

Key design:
- get_genotype_matrix(): returns $X or $X_ref from a data entry (mutually exclusive per entry: individual-level has $X, summary stats has $XtX or $X_ref)
- compute_xtx_product(v, XtX, X_ref): unified XtX %*% v from either source
- LD lookup functions (get_LD_jk, get_LD_jk1_jk2, get_LD_jk_each) reuse existing X path via get_genotype_matrix; no X_ref-specific code needed
- dict_sumstatLD works for both LD and X_ref mapping

Also:
- Replace all Rfast:: calls with proper @importFrom (correls, standardise, upper_tri, med) and remove redundant local aliases
- Add vignette section 3.4 with X_ref usage example
- 23 new tests covering numerical equivalence, edge cases, error handling

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
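The on-the-fly LD product described in this commit can be sketched in a few lines of R. This is a hedged illustration rather than the package code: ld_product_sketch is a hypothetical helper (the PR's own function is compute_xtx_product, whose body is not shown here), and the sketch assumes the columns of X_ref have already been centered and scaled so that crossprod(X_ref) / (N_ref - 1) equals cor(X_ref).

```r
# Hypothetical helper, not the package's compute_xtx_product().
# Assumes X_ref columns are standardized (mean 0, sd 1).
ld_product_sketch <- function(v, LD = NULL, X_ref = NULL) {
  if (!is.null(LD)) {
    return(as.vector(LD %*% v))              # explicit P x P path
  }
  if (is.null(X_ref)) stop("Provide either LD or X_ref")
  # t(X_ref) %*% (X_ref %*% v) / (N_ref - 1), never forming the P x P matrix
  as.vector(crossprod(X_ref, X_ref %*% v)) / (nrow(X_ref) - 1)
}

set.seed(42)
N_ref <- 50
P <- 300                                      # N_ref < P: on-the-fly case
X_raw <- matrix(rnorm(N_ref * P), N_ref, P)   # toy values standing in for genotypes
X_std <- scale(X_raw)                         # center and scale each column
v <- rnorm(P)

via_ld   <- ld_product_sketch(v, LD = cor(X_raw))
via_xref <- ld_product_sketch(v, X_ref = X_std)
```

The two results agree up to floating-point error. The X_ref path costs two N_ref × P matrix-vector products and stores only N_ref × P values, which is the memory saving the commit message describes for N_ref < P.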

codecov · 2026-03-29T13:36:48Z

Codecov Report

❌ Patch coverage is 86.47059% with 23 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.72%. Comparing base (2b5e1e1) to head (3595e60).
⚠️ Report is 9 commits behind head on main.

Files with missing lines	Patch %	Lines
R/colocboost.R	78.43%	11 Missing ⚠️
R/colocboost_init.R	85.10%	7 Missing ⚠️
R/colocboost_check_update_jk.R	55.55%	4 Missing ⚠️
R/colocboost_inference.R	95.83%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #133      +/-   ##
==========================================
+ Coverage   84.05%   84.72%   +0.66%     
==========================================
  Files          14       14              
  Lines        4828     4889      +61     
==========================================
+ Hits         4058     4142      +84     
+ Misses        770      747      -23

gaow and others added 2 commits March 29, 2026 07:00

gaow closed this Apr 1, 2026

gaow wants to merge 2 commits into main from optimize/x-ref-phase2
