-
Notifications
You must be signed in to change notification settings - Fork 2
fix(spectronaut): Fix spectronaut converter to account for anomaly detection updates #42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughAdds a PR template and a GitHub Actions dry-run build for PRs targeting devel; updates DESCRIPTION RoxygenNote; changes package-level roxygen to internal with Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Dev as Developer
participant GH as GitHub
participant WF as "GA: Dry runs for PRs"
participant Runner as "ubuntu-latest"
participant REnv as "R + Bioconductor (devel)"
participant Build as "Bioconductor Build/Check"
Dev->>GH: Open/Update PR (target: devel)
GH-->>WF: trigger: pull_request
WF->>Runner: run job (build)
Runner->>REnv: setup R & Bioconductor (devel)
Runner->>Runner: install dependencies
Runner->>Build: build, install, check package
Build-->>WF: report results (success / failure)
WF-->>GH: post status on PR
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (13)
.github/workflows/dry-run-build.yml (2)
18-18: Add trailing newline to satisfy yamllint.The linter error flags missing newline at EOF.
Apply this minimal fix:
- uses: grimbough/bioc-actions/build-install-check@v1 + uses: grimbough/bioc-actions/build-install-check@v1 +
1-7: Tighten workflow permissions and cancel superseded runs.Good security and CI hygiene; optional but recommended.
name: Dry runs for PRs on: pull_request: branches: [devel] +permissions: + contents: read +concurrency: + group: ${{ github.workflow }}-${{ github.ref }} + cancel-in-progress: true jobs: build: runs-on: ubuntu-latestman/MSstatsLiP.Rd (2)
7-17: Fix typos in package title/description (source roxygen).E.g., “cellur” → “cellular”, “abundace” → “abundance”. Please correct in R/MSstatsLiP.R so Rd regenerates cleanly.
Proposed roxygen edits (in R/MSstatsLiP.R):
-#' structures caused by compound binding in cellur lysates. +#' structures caused by compound binding in cellular lysates. ... -#' adjust the LiP peptide fold change for changes in overall protein abundace. +#' adjust the LiP peptide fold change for changes in overall protein abundance.
34-45: Deduplicate PCAPlotLiP entry in function list.The item appears twice in docs. Remove the duplicate in roxygen to avoid confusion.
R/MSstatsLiP.R:
-#' \item \code{\link{PCAPlotLiP}} : Runs PCA on the summarized data. Can -#' visualize the PCA analysis in three different plots..github/pull_request_template.md (1)
16-19: Point to this repo’s contributing guide and expand checklist.The link targets MSstatsConvert; consider this repo or an org-wide doc. Also add tests/NEWS/CI items.
- - [ ] I have read the MSstats [contributing guidelines](https://github.com/Vitek-Lab/MSstatsConvert/blob/master/.github/CONTRIBUTING.md) + - [ ] I have read the MSstatsLiP [contributing guidelines](https://github.com/Vitek-Lab/MSstatsLiP/blob/devel/CONTRIBUTING.md) - [ ] My changes generate no new warnings - [ ] Any dependent changes have been merged and published in downstream modules - [ ] I have run the devtools::document() command after my changes and committed the added files + - [ ] I added/updated unit tests and examples + - [ ] I updated NEWS.md (or equivalent changelog) + - [ ] CI (R CMD check/BiocCheck) passes with no new NOTES/WARNINGS/ERRORSman/SpectronauttoMSstatsLiPFormat.Rd (2)
98-99: Fix small description typos in source roxygen.“as as” → “as”, and use consistent “TrP”.
R/SpectronauttoMSstatsLiPFormat.R:
-#' Takes as as input both raw LiP and Trp outputs from Spectronaut. +#' Takes as input both raw LiP and TrP outputs from Spectronaut.
69-70: Boolean wording: use TRUE/FALSE.“True is default” → “TRUE is default” to match R conventions (fix roxygen).
R/MSstatsLiP.R (2)
1-12: Correct typos in roxygen header.Keeps user-facing docs clean.
-#' structures caused by compound binding in cellur lysates. +#' structures caused by compound binding in cellular lysates. ... -#' adjust the LiP peptide fold change for changes in overall protein abundace. +#' adjust the LiP peptide fold change for changes in overall protein abundance.
27-36: Remove duplicate PCAPlotLiP bullet.Docs list PCA twice.
-#' \item \code{\link{PCAPlotLiP}} : Runs PCA on the summarized data. Can -#' visualize the PCA analysis in three different plots.R/SpectronauttoMSstatsLiPFormat.R (4)
1-6: Docstring: minor typos + PR intent vs. changes.
- Fix “as as” → “as”; “Trp” → “TrP” (see separate diff).
- The PR title mentions “anomaly detection updates,” but no logic for anomaly flags/columns is evident. Please outline the behavioral change and add tests covering those cases.
I can help add unit tests once you describe the expected handling for anomaly-detected features/rows.
146-160: Good move to named arguments for SpectronauttoMSstatsFormat.Improves robustness against upstream signature drift. Consider passing
as.data.frame(LiP.data)for symmetry with the TrP call (optional).- df.lip <- SpectronauttoMSstatsFormat( - LiP.data, + df.lip <- SpectronauttoMSstatsFormat( + as.data.frame(LiP.data),
206-213: Verify peptide length filter: strict vs inclusive.If minimum allowed length is 6, use
>= 6instead of> 6. Current code excludes length-6 peptides.- which(nchar(df.fasta.lip$PeptideSequence) > min_len_peptide & str_count( + which(nchar(df.fasta.lip$PeptideSequence) >= min_len_peptide & str_count(
111-113: Consistency: data.frame vs data.table handoff (optional).You convert both inputs to data.table, then pass LiP as DT and TrP as DF. Upstream accepts DF; passing DF for both avoids accidental method dispatch/copies.
- LiP.data <- as.data.table(LiP.data) + LiP.data <- as.data.table(LiP.data) # or keep as DF and pass DF to both calls ... - Trp.data <- as.data.table(Trp.data) + Trp.data <- as.data.table(Trp.data) # ensure consistent treatment
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
.github/pull_request_template.md(1 hunks).github/workflows/dry-run-build.yml(1 hunks)DESCRIPTION(1 hunks)R/MSstatsLiP.R(1 hunks)R/SpectronauttoMSstatsLiPFormat.R(2 hunks)man/MSstatsLiP.Rd(2 hunks)man/SpectronauttoMSstatsLiPFormat.Rd(2 hunks)
🧰 Additional context used
🪛 YAMLlint (1.37.1)
.github/workflows/dry-run-build.yml
[error] 18-18: no new line character at the end of file
(new-line-at-end-of-file)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (5)
DESCRIPTION (1)
34-34: LGTM on roxygen bump; ensure docs are regenerated with the same roxygen2 version.Confirm CI/dev uses roxygen2 7.3.2 and that
devtools::document()was run so Rd files aren’t out of sync.man/MSstatsLiP.Rd (1)
58-71: Maintainer/authors update looks good.man/SpectronauttoMSstatsLiPFormat.Rd (1)
5-6: Title typo fix LGTM.R/MSstatsLiP.R (1)
43-45: Switch to _PACKAGE marker and internal keyword looks good.R/SpectronauttoMSstatsLiPFormat.R (1)
116-141: Error condition when either LiP or TrP has no matching conditions.The
|logic intentionally errors if either dataset has zero overlap withwhich.Conditions. Confirm this is desired UX; otherwise, consider warning and proceeding with the dataset that has overlap.If you prefer proceed-on-partial-overlap:
- if((length(setdiff(LiP_conditions, which.Conditions) - ) == length(LiP_conditions)) | - (length(setdiff(TrP_conditions, which.Conditions) - ) == length(TrP_conditions))){ + if((length(setdiff(LiP_conditions, which.Conditions)) == length(LiP_conditions)) && + (length(setdiff(TrP_conditions, which.Conditions)) == length(TrP_conditions))){
| verbose = verbose, | ||
| log_file_path = path | ||
| ) | ||
| df.lip <- as.data.table(as.matrix(df.lip)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical: as.matrix coerces all columns to character; breaks downstream numeric ops.
as.data.table(as.matrix(df.lip)) converts numerics (e.g., Intensity) to character, making <= 1 comparisons and merges unreliable.
Apply:
- df.lip <- as.data.table(as.matrix(df.lip))
+ df.lip <- as.data.table(df.lip)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| df.lip <- as.data.table(as.matrix(df.lip)) | |
| df.lip <- as.data.table(df.lip) |
🤖 Prompt for AI Agents
In R/SpectronauttoMSstatsLiPFormat.R around line 161, the call
as.data.table(as.matrix(df.lip)) coerces all columns to character (breaking
numeric comparisons and merges); replace it by converting the data.frame
directly to data.table (as.data.table(df.lip)) or, if df.lip is a list/mixed
type, build the data.table and explicitly convert numeric columns back using
something like dt[, (numCols) := lapply(.SD, as.numeric), .SDcols = numCols]
where numCols includes Intensity and any other numeric fields required
downstream. Ensure the resulting data.table preserves numeric types before any
<= or numeric merges.
| annotation = annotation, | ||
| intensity = intensity, | ||
| filter_with_Qvalue = filter_with_Qvalue, | ||
| qvalue_cutoff = qvalue_cutoff, | ||
| useUniquePeptide = useUniquePeptide, | ||
| removeFewMeasurements = removeFewMeasurements, | ||
| removeProtein_with1Feature = removeProtein_with1Feature, | ||
| summaryforMultipleRows = summaryforMultipleRows, | ||
| use_log_file = use_log_file, | ||
| append = append, | ||
| verbose = verbose, | ||
| log_file_path = path | ||
| ) | ||
| df.trp <- as.data.table(as.matrix(df.trp)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Repeat the same fix for TrP path.
Avoid character coercion for TrP as well.
- df.trp <- as.data.table(as.matrix(df.trp))
+ df.trp <- as.data.table(df.trp)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| as.data.frame(Trp.data), | |
| annotation = annotation, | |
| intensity = intensity, | |
| filter_with_Qvalue = filter_with_Qvalue, | |
| qvalue_cutoff = qvalue_cutoff, | |
| useUniquePeptide = useUniquePeptide, | |
| removeFewMeasurements = removeFewMeasurements, | |
| removeProtein_with1Feature = removeProtein_with1Feature, | |
| summaryforMultipleRows = summaryforMultipleRows, | |
| use_log_file = use_log_file, | |
| append = append, | |
| verbose = verbose, | |
| log_file_path = path | |
| ) | |
| df.trp <- as.data.table(df.trp) |
🤖 Prompt for AI Agents
In R/SpectronauttoMSstatsLiPFormat.R around lines 166 to 179, repeat the same
fix applied to the Trp path for the TrP path: stop coercing TrP into characters
by removing the direct as.data.frame(Trp.data) call passed into the function and
instead ensure TrP is a proper data.frame before the call (e.g., if not already
a data.frame, convert it with as.data.frame(TrP.data, stringsAsFactors =
FALSE)), then pass the TrP data.frame variable as-is to avoid character
coercion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
R/dataProcessPlotsLiP.R (1)
39-41: Roxygen docs out of sync with actual behavior and units.Docs still say “pdf” and list 10/10 defaults; they don’t reflect Plotly HTML output or the new device‑aware defaults. This will be overwritten on the next roxygen run and confuse users.
Update parameter docs to state units per backend and correct defaults:
-#' @param width width of the saved pdf file. Default is 10. -#' @param height height of the saved pdf file. Default is 10. +#' @param width Width of the saved output. +#' If isPlotly = FALSE (ggplot2/PDF): inches (default 10). +#' If isPlotly = TRUE (Plotly/HTML): pixels (default 800). +#' @param height Height of the saved output. +#' If isPlotly = FALSE (ggplot2/PDF): inches (default 12). +#' If isPlotly = TRUE (Plotly/HTML): pixels (default 600).
🧹 Nitpick comments (1)
R/dataProcessPlotsLiP.R (1)
33-34: Doc default for text.angle mismatches code default (0 vs 90).Function signature sets
text.angle = 90, but docs say default is 0.-#' @param text.angle angle of labels represented each condition at the top of -#' Profile plot and QC plot. Default is 0. +#' @param text.angle angle of labels represented each condition at the top of +#' Profile plot and QC plot. Default is 90.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
R/dataProcessPlotsLiP.R(1 hunks)man/dataProcessPlotsLiP.Rd(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- man/dataProcessPlotsLiP.Rd
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
R/dataProcessPlotsLiP.R
Outdated
| width = 800, | ||
| height = 600, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Critical: width/height units regression (inches → pixels) will generate gigantic PDFs when isPlotly = FALSE.
Changing defaults from 10/12 (inches) to 800/600 (pixels) is unsafe because downstream dataProcessPlotsPTM treats ggplot2/PDF sizes as inches. This risks 800×600 inch PDFs, timeouts, and Bioconductor check failures. Prefer device‑aware defaults.
Apply this minimal, backwards‑compatible change to make defaults device‑aware:
- width = 800,
- height = 600,
+ width = NULL,
+ height = NULL,Then, initialize sensible defaults inside the function (add near the top of the body, before use):
# Set device-aware defaults
if (isTRUE(isPlotly)) {
if (is.null(width)) width <- 800 # px
if (is.null(height)) height <- 600 # px
} else {
if (is.null(width)) width <- 10 # inches
if (is.null(height)) height <- 12 # inches
}Verification script to find call sites relying on implicit defaults:
Also, this plotting-default change seems unrelated to the Spectronaut converter fix in this PR. Was this intentional? If not, please split to a separate PR to reduce risk.
🏁 Script executed:
#!/bin/bash
# Find usages of dataProcessPlotsLiP without explicit width/height to assess impact
rg -nP --type=r -C2 'dataProcessPlotsLiP\s*\(' | sed -n 'p'Length of output: 3234
Critical — width/height units regression (inches → pixels) will produce gigantic PDFs when isPlotly = FALSE.
Found call sites that rely on implicit defaults: vignettes/MSstatsLiP_Workflow.Rmd (calls at ~lines 163, 173) and inst/tinytest/test_dataProcessPlotsLiP.R (tests call with defaults). Change is unsafe; apply a device‑aware, backwards‑compatible fix:
- width = 800,
- height = 600,
+ width = NULL,
+ height = NULL,Initialize device-aware defaults near the top of the function body:
# Set device-aware defaults
if (isTRUE(isPlotly)) {
if (is.null(width)) width <- 800 # px
if (is.null(height)) height <- 600 # px
} else {
if (is.null(width)) width <- 10 # inches
if (is.null(height)) height <- 12 # inches
}This plotting-default change appears unrelated to the Spectronaut converter fix — split into a separate PR or thoroughly test/justify here.
🤖 Prompt for AI Agents
In R/dataProcessPlotsLiP.R around lines 86-87, the recent change hardcodes
width/height in pixels causing huge PDFs when isPlotly = FALSE; add
device-aware, backwards-compatible defaults near the top of the function body
that only set width/height when NULL: if isPlotly use pixel defaults (e.g.
800x600), otherwise use inch defaults (e.g. 10x12); implement this conditional
assignment before any plotting-device-specific calls so existing vignettes and
tests that rely on implicit defaults continue to work.
This reverts commit ea2f7d2.
Motivation and Context
Build on bioconductor is failing due to Vitek-Lab/MSstatsConvert#111
Changes
Testing
Dry run build should succeed.
Checklist Before Requesting a Review
Summary by CodeRabbit
Documentation
Chores
Refactor
Style
Bug Fix / Behavior