DIA example with more than 150 samples added #574

Merged
ypriverol merged 3 commits into main from dev
Jan 21, 2026

@ypriverol ypriverol commented Jan 21, 2026

Pull Request

Description

Brief description of the changes made in this PR.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Test addition/update
  • Updates to the dependencies have been made.

Summary by CodeRabbit

  • New Features

    • Added continuous integration testing support for large-scale DIA datasets, with automated results reporting and artifact uploads.
  • Documentation

    • Updated documentation to include large-scale dataset reports and added a new Big quantms DIA example with associated dataset information and download links.

@ypriverol ypriverol changed the title from "Dev" to "DIA example with more than 150 samples added" Jan 21, 2026

coderabbitai bot commented Jan 21, 2026

Walkthrough

This PR adds support for testing and documenting a large-scale DIA dataset (PXD062383) by introducing a new CI job that downloads and processes the dataset with MultiQC, along with corresponding documentation and configuration entries.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **CI/Workflow Configuration**<br>`.github/workflows/python-app.yml` | Adds new `test_big_dia` job that depends on `setup`, runs on `ubuntu-latest`, downloads the PXD062383 DIA dataset, executes MultiQC with quantms plugin and configuration, and uploads results as artifact `results_big_dia`. Job mirrors structure of existing `test_fragpipe` job. |
| **Documentation**<br>`docs/README.md` | Adds "Large-Scale Dataset Reports" section with table containing Example Type, Description, Link, and Dataset Download columns; new row documents "Big quantms DIA" dataset (added in two locations within the docs). |
| **Dataset Configuration**<br>`docs/config.json` | Adds two project entries (`PXD062383` and `PXD062383_disable_hoverinfo`) with corresponding file types (`["dia",""]` and `["dia","disable_hoverinfo"]`), URLs, and paths matching the structure of surrounding `PXD062399` entries. |
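
For readers who want to reproduce the job outside CI, the download, unpack, and report steps summarised in the table can be sketched in Python. This is only an illustration under assumptions: the helper names `extract_dataset` and `build_multiqc_command` are hypothetical, the workflow itself uses shell steps, and the flag that activates the quantms plugin is deliberately left to `extra_args` because it is not shown in this conversation. Only MultiQC's `-o` and `--config` options are used directly.

```python
import zipfile
from pathlib import Path


def extract_dataset(archive, target_dir):
    """Unpack a downloaded dataset archive (e.g. PXD062383.zip) into target_dir."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target_dir)
    return Path(target_dir)


def build_multiqc_command(data_dir, out_dir, config=None, extra_args=()):
    """Build the argument list for a MultiQC run.

    extra_args is a placeholder for plugin-specific options, which are
    an assumption here and not taken from the workflow file.
    """
    cmd = ["multiqc", str(data_dir), "-o", str(out_dir)]
    if config is not None:
        cmd += ["--config", str(config)]
    cmd += list(extra_args)
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` so that a failing MultiQC run also fails the calling script, which matches how a CI step behaves.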

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

  • major changes and bug fixing.  #130 — Modifies CI workflow DIA-processing steps in .github/workflows/python-app.yml and updates dataset/config paths
  • Added the examples for DIANN #173 — Adds dataset-specific CI jobs to .github/workflows/python-app.yml and corresponding entries in docs/config.json and README
  • Update CI/CD #118 — Modifies CI workflow and docs/config.json to add/update dataset downloads and MultiQC job configurations

Suggested labels

Review effort 3/5

Suggested reviewers

  • daichengxin

Poem

🐰 A big DIA dataset hops into the CI,
With workflows and configs stacked nice and high,
Documentation blooms in the README's care,
Large-scale results uploaded through the air! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding a DIA example with more than 150 samples. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@ypriverol ypriverol merged commit 57a67b1 into main Jan 21, 2026
22 of 23 checks passed

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@docs/README.md`:
- Around line 252-255: The Markdown under the heading "### 🔍 Large-Scale
Dataset Reports" violates MD058 because the table immediately follows the
heading without a blank line; fix by inserting a single blank line between the
"### 🔍 Large-Scale Dataset Reports" heading and the table start (the line
beginning with "| Example Type | Description | Link | Dataset Download |") so
the table is separated from the heading.
🧹 Nitpick comments (1)
.github/workflows/python-app.yml (1)

299-302: Consider adding retries/timeouts for the large dataset download.

Given the size of PXD062383, adding retries/timeouts reduces flaky CI failures due to transient network issues.

✅ Optional hardening
-          wget -nv https://ftp.pride.ebi.ac.uk/pub/databases/pride/resources/proteomes/pmultiqc/example-projects/PXD062383.zip
+          wget -nv --retry-connrefused --waitretry=1 --read-timeout=20 --timeout=20 --tries=5 \
+            https://ftp.pride.ebi.ac.uk/pub/databases/pride/resources/proteomes/pmultiqc/example-projects/PXD062383.zip
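
The same retry-and-timeout idea can be sketched in Python for anyone running the download outside the workflow. This is a minimal illustration and not part of the PR: the function name `download_with_retries` and its parameters are hypothetical, and the fixed pause between attempts only approximates what the `wget` flags above do. The `opener` and `sleep` parameters exist so the retry logic can be exercised without touching the network.

```python
import shutil
import time
import urllib.error
import urllib.request


def download_with_retries(url, dest, tries=5, timeout=20.0, wait=1.0,
                          opener=urllib.request.urlopen, sleep=time.sleep):
    """Download url to dest, retrying on transient network errors.

    Returns the number of the attempt that succeeded (1-based).
    """
    last_error = None
    for attempt in range(1, tries + 1):
        try:
            # timeout bounds the connection and each blocking read
            with opener(url, timeout=timeout) as response, open(dest, "wb") as out:
                shutil.copyfileobj(response, out)
            return attempt
        except (urllib.error.URLError, TimeoutError, ConnectionError) as err:
            last_error = err
            if attempt < tries:
                sleep(wait)  # fixed pause before the next attempt
    raise RuntimeError(f"download failed after {tries} attempts") from last_error
```

A call such as `download_with_retries(url, "PXD062383.zip")` with the dataset URL from the diff would then tolerate up to four transient failures before giving up.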

Comment on lines +252 to +255
### 🔍 Large-Scale Dataset Reports
| Example Type | Description | Link | Dataset Download |
|---|---|---|---|
| Big quantms DIA | Data-independent acquisition | [Big quantms DIA - 165 samples](https://pmultiqc.quantms.org/PXD062383/multiqc_report.html) ([disable_hoverinfo](https://pmultiqc.quantms.org/PXD062383_disable_hoverinfo/multiqc_report.html)) | [PXD062383.zip](https://ftp.pride.ebi.ac.uk/pub/databases/pride/resources/proteomes/pmultiqc/example-projects/PXD062383.zip) |

⚠️ Potential issue | 🟡 Minor

Add a blank line before the table (MD058).

This section is missing a blank line before the table, which triggers markdownlint MD058.

🩹 Suggested fix
-### 🔍 Large-Scale Dataset Reports
-| Example Type | Description | Link | Dataset Download |
+### 🔍 Large-Scale Dataset Reports
+
+| Example Type | Description | Link | Dataset Download |
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-### 🔍 Large-Scale Dataset Reports
-| Example Type | Description | Link | Dataset Download |
-|---|---|---|---|
-| Big quantms DIA | Data-independent acquisition | [Big quantms DIA - 165 samples](https://pmultiqc.quantms.org/PXD062383/multiqc_report.html) ([disable_hoverinfo](https://pmultiqc.quantms.org/PXD062383_disable_hoverinfo/multiqc_report.html)) | [PXD062383.zip](https://ftp.pride.ebi.ac.uk/pub/databases/pride/resources/proteomes/pmultiqc/example-projects/PXD062383.zip) |
+### 🔍 Large-Scale Dataset Reports
+
+| Example Type | Description | Link | Dataset Download |
+|---|---|---|---|
+| Big quantms DIA | Data-independent acquisition | [Big quantms DIA - 165 samples](https://pmultiqc.quantms.org/PXD062383/multiqc_report.html) ([disable_hoverinfo](https://pmultiqc.quantms.org/PXD062383_disable_hoverinfo/multiqc_report.html)) | [PXD062383.zip](https://ftp.pride.ebi.ac.uk/pub/databases/pride/resources/proteomes/pmultiqc/example-projects/PXD062383.zip) |
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

253-253: Tables should be surrounded by blank lines

(MD058, blanks-around-tables)
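
The fix flagged here is mechanical, so it can be sketched as a small Python helper. This is an illustrative approximation rather than markdownlint's own implementation: the names `table_starts` and `add_blank_line_before_tables` are hypothetical, tables are detected only by a header row followed by a delimiter row with at least two columns, and only the blank line before a table is handled (MD058 also expects one after).

```python
import re

# A delimiter row such as |---|:---:|---| with at least two columns.
DELIMITER_ROW = re.compile(r"^\s*\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)+\|?\s*$")


def table_starts(lines):
    """Return 0-based indices of table header rows."""
    return [i for i in range(len(lines) - 1)
            if "|" in lines[i] and DELIMITER_ROW.match(lines[i + 1])]


def add_blank_line_before_tables(text):
    """Insert a blank line before any table that directly follows other text."""
    lines = text.split("\n")
    starts = set(table_starts(lines))
    out = []
    for i, line in enumerate(lines):
        # Only add a separator when the previous emitted line is non-blank.
        if i in starts and out and out[-1].strip():
            out.append("")
        out.append(line)
    return "\n".join(out)
```

Running the helper on text that already has the blank line leaves it unchanged, so it is safe to apply repeatedly.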

@coderabbitai coderabbitai bot mentioned this pull request Jan 22, 2026
8 tasks