@Rima-Waleed
Collaborator

@Rima-Waleed Rima-Waleed commented Dec 1, 2025

| Study_ID | Testing Instance Link | Sample Count |
|---|---|---|
| ucs_msk_2024 | https://triage.cbioportal.mskcc.org/study/summary?id=ucs_msk_2024 | 69 Samples |
| ilc_msk_2023 | https://private.cbioportal.mskcc.org/study/summary?id=ilc_msk_2023 | 25 Samples |
| depmap_broad_2025 | https://private.cbioportal.mskcc.org/study/summary?id=depmap_broad_2025 | 1,981 Samples |
| ccrcc_sjuh_2023 | https://private.cbioportal.mskcc.org/study/summary?id=ccrcc_sjuh_2023 | 943 Samples |
| aml_stjude_2024 | https://private.cbioportal.mskcc.org/study/summary?id=aml_stjude_2024 | 887 Samples |
| lgg_ctf_synodos_2025 | https://private.cbioportal.mskcc.org/study/summary?id=lgg_ctf_synodos_2025 | 31 Samples |
| es_dsrct_msk_2023 | https://private.cbioportal.mskcc.org/study/summary?id=es_dsrct_msk_2023 | 290 Samples |
| angs_painter_2025 | https://www.cbioportal.org/study/summary?id=angs_painter_2025 | 328 Samples |
| blca_iatlas_imvigor210_2017 | https://www.cbioportal.org/study/summary?id=blca_iatlas_imvigor210_2017 | 347 Samples |
| brca_iatlas_anders_2022 | https://www.cbioportal.org/study/summary?id=brca_iatlas_anders_2022 | 31 Samples |
| rcc_iatlas_immotion150_2018 | https://www.cbioportal.org/study/summary?id=rcc_iatlas_immotion150_2018 | 263 Samples |
| mel_iatlas_liu_2019 | https://www.cbioportal.org/study/summary?id=mel_iatlas_liu_2019 | 122 Samples |
| mel_iatlas_hugo_ucla_2016 | https://www.cbioportal.org/study/summary?id=mel_iatlas_hugo_ucla_2016 | 27 Samples |
| crc_hta8_htan_2024 | https://private.cbioportal.mskcc.org/study/summary?id=crc_hta8_htan_2024 | 83 Samples |
| Total Samples | | 5,428 Samples |
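
The Total Samples row can be re-derived from the per-study counts with a quick check like the sketch below. The counts are copied from the table; nothing else is assumed. As listed, the fourteen counts add up to 5,427, one fewer than the stated total, so one of the counts or the total may be worth a second look.

```python
# Per-study sample counts, copied from the table in this comment.
sample_counts = {
    "ucs_msk_2024": 69,
    "ilc_msk_2023": 25,
    "depmap_broad_2025": 1981,
    "ccrcc_sjuh_2023": 943,
    "aml_stjude_2024": 887,
    "lgg_ctf_synodos_2025": 31,
    "es_dsrct_msk_2023": 290,
    "angs_painter_2025": 328,
    "blca_iatlas_imvigor210_2017": 347,
    "brca_iatlas_anders_2022": 31,
    "rcc_iatlas_immotion150_2018": 263,
    "mel_iatlas_liu_2019": 122,
    "mel_iatlas_hugo_ucla_2016": 27,
    "crc_hta8_htan_2024": 83,
}

# Sum the listed counts so the result can be compared with the Total row.
total = sum(sample_counts.values())
print(f"{len(sample_counts)} studies, {total:,} samples")
```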

Notes:

| Study_ID | Missing Data | Notes |
|---|---|---|
| lgg_ctf_synodos_2025 | Missing clinical data, sample size, additional profiles | Author clarification here |
| ucs_msk_2024 | Clinical data + treatment data | Reached out to author -> data cannot be provided for privacy concerns |
| ilc_msk_2023 | Mutational signature data + clinical data | Reached out to author, preparing data |
| depmap_broad_2025 | | Renamed for review purposes as ccle_broad_2025 (previous release) already exists in portal |
| ccrcc_sjuh_2023 | Clinical discrepancy (supplementary files vs. tables in the paper), OS_MONTHS, and sequencing platform groups | Reached out to author, response pending |
| aml_stjude_2024 | Arm-level CNA | Reached out to author, response pending |

@rmadupuri
Collaborator

rmadupuri commented Dec 18, 2025

Thanks, @Rima-Waleed! The studies look good overall, and I have a few questions and suggestions for improvement.

ucs_msk_2024:

  • The samples were sequenced using IMPACT, so we should use DMP IDs for patient and sample identifiers, with EC IDs as display names, to maintain consistency and prevent duplicate samples with different IDs.
    *Author and collaborators prefer not to use DMP IDs. The author mentioned that they did extensive manual curation and that some patients may have a diagnosis that doesn't match the MSK-IMPACT study samples.
  • Can we set MSI Status to MSI-H when Score >= 10, as per the paper?
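
For the MSI suggestion above, a minimal sketch of the rule, assuming the score is available as a number per sample. Only the `>= 10` cutoff comes from this thread; the function name is hypothetical, and labels for scores below the cutoff are deliberately left to whatever the study already uses.

```python
from typing import Optional

MSI_H_CUTOFF = 10.0  # cutoff quoted in the review comment ("Score >= 10")


def apply_msi_rule(score: Optional[float], current_status: str) -> str:
    """Return 'MSI-H' when the score meets the cutoff.

    Any other case keeps the existing status unchanged, because the
    thread does not say how lower scores should be labelled.
    """
    if score is not None and score >= MSI_H_CUTOFF:
        return "MSI-H"
    return current_status


# Made-up example values, not taken from the study data.
examples = [(12.4, "NA"), (10.0, "Stable"), (3.1, "Stable"), (None, "NA")]
updated = [apply_msi_rule(score, status) for score, status in examples]
print(updated)
```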

ilc_msk_2023:

  • Update the study name to Breast Invasive Lobular Carcinoma (MSK, Nature 2024).
  • Are the two samples currently listed under Invasive Breast Carcinoma also lobular carcinomas?
    *Author emailed for confirmation, pending response

depmap_broad_2025:

  • Segment data doesn't look right. FGA is close to 1 for all samples, and CNA shows a lot of amplifications?
    *Author confirmed seg mean values are linear and not log-transformed.
Screenshot 2025-12-18 at 11 15 06
  • There are 94 samples without any genomic data? Are the case lists reflecting the right numbers?
    *Samples were cross-checked in the DepMap portal to confirm they don't have genomic data.
  • Do we need to keep the 94 samples above? Were they sequenced at all? If sequenced, case lists need updating.
    *Confirming with the author whether they were sequenced at all; if not, the samples will be removed.
  • The latest update includes mutation data for fewer samples than the previous release? 1854 vs 1970 counts.
    *Samples were cross-checked in the DepMap portal to confirm they don't have genomic data. Pending the author's confirmation.
  • Missing gene panel matrix
  • Missing TMB, Somatic Status attributes
    *Added TMB, confirming somatic status with authors.
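
On the segment data point above: if the seg mean values are linear, a threshold meant for log2 ratios would flag nearly every segment, which is consistent with FGA being close to 1. The sketch below illustrates this under two assumptions that should be confirmed with the author: that the linear values are copy ratios where 1.0 is neutral, and that FGA counts segments with |log2 ratio| >= 0.2. The function names and the toy segments are hypothetical.

```python
import math

FGA_CUTOFF = 0.2  # assumed |log2 ratio| cutoff for calling a segment altered


def linear_to_log2(seg_mean: float, floor: float = 1e-3) -> float:
    """Convert a linear copy ratio (1.0 = neutral, assumed) to log2.

    Values are floored to avoid log2(0) for deep deletions.
    """
    return math.log2(max(seg_mean, floor))


def fraction_genome_altered(segments, cutoff: float = FGA_CUTOFF) -> float:
    """segments: iterable of (start, end, seg_mean) tuples."""
    total = altered = 0
    for start, end, seg_mean in segments:
        length = end - start
        total += length
        if abs(seg_mean) >= cutoff:
            altered += length
    return altered / total if total else float("nan")


# Toy segments with linear seg means (made-up values).
linear_segments = [(0, 1000, 1.0), (1000, 2000, 1.05),
                   (2000, 3000, 2.0), (3000, 4000, 0.98)]

# Treating linear values as if they were log2 flags every segment.
fga_raw = fraction_genome_altered(linear_segments)

# After conversion only the true gain (ratio 2.0) passes the cutoff.
log2_segments = [(s, e, linear_to_log2(m)) for s, e, m in linear_segments]
fga_log2 = fraction_genome_altered(log2_segments)

print(fga_raw, fga_log2)
```

If the author's values turn out to be absolute copy numbers rather than ratios, the conversion would need a different baseline (for example, dividing by 2 before taking log2).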

ccrcc_sjuh_2023:

  • What does 0/X in Metastasis (TNM) Stage mean? Recode as NA/Unknown?
  • Rename RCC Specific (months) to RCC Specific Survival (Months)
  • Chart not loading
Screenshot 2025-12-18 at 11 33 19
  • Which sequencing platform was actually used? The description says WES/Targeted and the clinical attribute shows WGS. Can we add a gene panel matrix?
    *Authors used WGS, WES and targeted sequencing. Author emailed for clarification (cohort C2 samples are not distinguished by sequencing platform). Sequencing platforms used by cohort group:
    C1 cohort: WGS
    C2 cohort: WES + targeted (42 genes not provided)
    C3: targeted (12 genes provided)
  • For the above, it appears that only the Targeted MAF from the 12 gene panel is available, based on the data mutations file and the paper's supplementary files. Please add a note to the README that only this data is shown and update the gene panel matrix from the MAF file.
  • Additionally, we can remove references to other sequencing platforms to avoid confusion.
Screenshot 2025-12-18 at 11 37 43
  • Can the mutations for WGS and the Targeted 42 gene panel be shared by the authors? The paper seems to refer to their previous studies.
    *Authors preparing the data, pending their response.
  • Recalculate TMB based on the 12-gene panel. It is currently computed on a WES/WGS assumption, so the numbers are too low.
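
To illustrate why the denominator matters for the TMB point above, here is a minimal sketch of mutations per megabase. Both sizes below are placeholders: `EXOME_BP` is a rough exome-scale figure and `PANEL_BP` is a hypothetical coding length for the 12-gene panel, which should be replaced with the real covered length once the panel is defined.

```python
def tmb_per_mb(mutation_count: int, covered_bp: int) -> float:
    """Mutations per megabase of sequence covered by the assay."""
    if covered_bp <= 0:
        raise ValueError("covered_bp must be positive")
    return mutation_count * 1_000_000 / covered_bp


EXOME_BP = 30_000_000  # assumed, rough exome-scale denominator
PANEL_BP = 40_000      # hypothetical 12-gene panel size; replace with real value

# The same two mutations under each denominator (made-up count).
tmb_exome = tmb_per_mb(2, EXOME_BP)
tmb_panel = tmb_per_mb(2, PANEL_BP)
print(f"exome-scale: {tmb_exome:.3f}/Mb, panel-scale: {tmb_panel:.1f}/Mb")
```

With an exome-sized denominator the same mutation count yields a much smaller value, which matches the "too low" numbers noted above.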

aml_stjude_2024:

  • Normalize Male in Sex attribute
  • Add TMB, Matched Normal status
  • Can we add a gene panel matrix?
  • I see all samples went through RNA-seq (from which only somatic mutations in 87 genes were reported, Fig 1), and samples also went through WXS/WGS. How are you linking the targeted panel to the samples?
  • Can we update the study name to 'Pediatric Acute Myeloid Leukemia..'?
  • Missing RNA case list
  • RNA data is not loaded into the portal instance.
  • Can we add the tumor allele counts? I see them in Supp 7.
  • Was Supp Table 8 also used in SV file? It lists duplications.
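
For the Sex normalization item above, a small sketch of one way to collapse spelling variants. The variants in `SEX_MAP` are hypothetical, since the thread does not list which values appear in the clinical file; unknown values are passed through unchanged so nothing is silently recoded.

```python
# Hypothetical variants; adjust to the values actually present in the file.
SEX_MAP = {
    "male": "Male",
    "m": "Male",
    "female": "Female",
    "f": "Female",
}


def normalize_sex(value: str) -> str:
    """Map known variants to a canonical label; leave anything else as-is."""
    cleaned = value.strip()
    return SEX_MAP.get(cleaned.lower(), cleaned)


raw_values = ["Male", "male", " MALE ", "Female", "NA"]
normalized = [normalize_sex(v) for v in raw_values]
print(normalized)
```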

lgg_ctf_synodos_2025:

  • Can we make NF Open Science Initiative a hyperlink instead in the description?
  • Indicate that the expression profiles are TPM based?
  • Did all samples undergo WGS? The sample counts in the gene panel matrix and the mutation file don't match.
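
One way to pin down the mismatch in the last item is to diff the sample IDs in the two files. The sketch below assumes tab-delimited inputs with a `SAMPLE_ID` column in the gene panel matrix and a `Tumor_Sample_Barcode` column in the mutation file; the inline data, the sample IDs, and the `WGS` panel ID are made up for illustration.

```python
import csv
import io


def column_values(handle, column: str) -> set:
    """Collect non-empty values of one column, skipping '#' comment lines."""
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    return {row[column] for row in reader if row[column]}


def compare_samples(panel_handle, maf_handle) -> dict:
    """Report sample IDs that appear in only one of the two files."""
    panel = column_values(panel_handle, "SAMPLE_ID")
    maf = column_values(maf_handle, "Tumor_Sample_Barcode")
    return {
        "panel_only": sorted(panel - maf),
        "maf_only": sorted(maf - panel),
    }


# Made-up stand-ins for the two files.
panel_text = "SAMPLE_ID\tmutations\nS1\tWGS\nS2\tWGS\nS3\tWGS\n"
maf_text = ("#version 2.4\n"
            "Hugo_Symbol\tTumor_Sample_Barcode\n"
            "NF1\tS1\nTP53\tS1\nNF1\tS4\n")

result = compare_samples(io.StringIO(panel_text), io.StringIO(maf_text))
print(result)
```

A sample that was profiled but has no called mutations will show up under `panel_only`, so that side alone is not necessarily an error; IDs under `maf_only` are the ones missing from the matrix.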

es_dsrct_msk_2023:

  • Can we add “to detect EWSR1 chromoplexy-associated structural rearrangements” to the description, since this is the main focus of the study?
  • What are the SV events in the data file that list only gene symbols without genomic coordinates or other details? Can these be removed as they appear to be duplicates of fully annotated events?
Screenshot 2026-01-09 at 11 25 45
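
If the symbol-only SV rows do turn out to be duplicates, a filter along these lines could drop them while keeping any symbol-only row that has no fully annotated counterpart for manual review. This is a sketch: it assumes rows are read into dicts keyed by SV file column names such as `Sample_Id`, `Site1_Hugo_Symbol`, and `Site1_Position`, and that a duplicate means the same sample and gene pair. The example rows and positions are made up.

```python
COORD_FIELDS = ("Site1_Chromosome", "Site1_Position",
                "Site2_Chromosome", "Site2_Position")


def has_coordinates(row: dict) -> bool:
    """True when at least one coordinate field carries a real value."""
    return any(row.get(f, "").strip() not in ("", "NA") for f in COORD_FIELDS)


def event_key(row: dict) -> tuple:
    """Assumed duplicate key: same sample and same gene pair."""
    return (row["Sample_Id"], row["Site1_Hugo_Symbol"], row["Site2_Hugo_Symbol"])


def drop_redundant_symbol_only(rows: list) -> list:
    """Remove symbol-only rows that repeat a fully annotated event."""
    annotated = {event_key(r) for r in rows if has_coordinates(r)}
    return [r for r in rows
            if has_coordinates(r) or event_key(r) not in annotated]


# Made-up rows: one annotated event, its symbol-only twin, and an orphan.
rows = [
    {"Sample_Id": "S1", "Site1_Hugo_Symbol": "EWSR1", "Site2_Hugo_Symbol": "FLI1",
     "Site1_Chromosome": "22", "Site1_Position": "100",
     "Site2_Chromosome": "11", "Site2_Position": "200"},
    {"Sample_Id": "S1", "Site1_Hugo_Symbol": "EWSR1", "Site2_Hugo_Symbol": "FLI1",
     "Site1_Chromosome": "", "Site1_Position": "",
     "Site2_Chromosome": "", "Site2_Position": ""},
    {"Sample_Id": "S2", "Site1_Hugo_Symbol": "EWSR1", "Site2_Hugo_Symbol": "WT1",
     "Site1_Chromosome": "NA", "Site1_Position": "NA",
     "Site2_Chromosome": "NA", "Site2_Position": "NA"},
]

kept = drop_redundant_symbol_only(rows)
print([(r["Sample_Id"], r["Site2_Hugo_Symbol"]) for r in kept])
```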

@ritikakundra
Collaborator

angs_painter_2025:

  • The description numbers and the portal numbers don't match.
  • Is the genomic frequency breakdown correct? 39% mutated?
  • Why are the event types in upper case?
  • The links in the description are not working.
