I noticed that the Oncotree code assigned to the same sample differs across studies. For example, for sample ID TCGA-2E-A9G8-01:
| Study ID | Oncotree code |
|---|---|
| ucec_tcga_pan_can_atlas_2018 | UEC |
| ucec_tcga | UCEC |
| ucec_tcga_gdc | UCEC |
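
To make the comparison reproducible, here is a minimal sketch of how such discrepancies could be listed from the studies' clinical sample files. It assumes the files are tab-delimited, start with metadata lines prefixed by `#`, and contain columns named `SAMPLE_ID` and `ONCOTREE_CODE`; those column names and the helper names `read_oncotree_codes` and `find_discrepancies` are assumptions for illustration, not part of any cBioPortal tooling. The inline rows only restate the table above and are not copied from the actual files.

```python
import csv
from collections import defaultdict


def read_oncotree_codes(lines, sample_col="SAMPLE_ID", code_col="ONCOTREE_CODE"):
    """Return {sample_id: oncotree_code} from the lines of a clinical sample file.

    Assumes a tab-delimited layout in which metadata lines start with '#'
    and the first remaining line is the column header.
    """
    data_lines = (line for line in lines if not line.startswith("#"))
    reader = csv.DictReader(data_lines, delimiter="\t")
    return {row[sample_col]: row[code_col] for row in reader}


def find_discrepancies(codes_by_study):
    """Return {sample_id: {study_id: code}} for samples with more than one distinct code."""
    per_sample = defaultdict(dict)
    for study_id, codes in codes_by_study.items():
        for sample_id, code in codes.items():
            per_sample[sample_id][study_id] = code
    return {
        sample_id: by_study
        for sample_id, by_study in per_sample.items()
        if len(set(by_study.values())) > 1
    }


# Illustrative rows that restate the table above (not real file content).
header = ["#Sample Identifier\tOncotree Code\n", "SAMPLE_ID\tONCOTREE_CODE\n"]
studies = {
    "ucec_tcga_pan_can_atlas_2018": header + ["TCGA-2E-A9G8-01\tUEC\n"],
    "ucec_tcga": header + ["TCGA-2E-A9G8-01\tUCEC\n"],
    "ucec_tcga_gdc": header + ["TCGA-2E-A9G8-01\tUCEC\n"],
}
codes_by_study = {sid: read_oncotree_codes(rows) for sid, rows in studies.items()}
for sample_id, by_study in find_discrepancies(codes_by_study).items():
    print(sample_id, by_study)
```

With real data, each list of lines would be replaced by an open file handle, for example `open(path, encoding="utf-8")` for each study's clinical file.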
I can think of the following possible causes for the difference:
1. GDC may have updated clinical annotations.
2. Different versions of the Oncotree code may have been applied.
3. The mapping strategy could have changed.
Update:
I checked the history of public/ucec_tcga_pan_can_atlas_2018/data_clinical.txt and public/ucec_tcga/data_bcr_clinical_data_sample.txt, and this discrepancy has persisted since the beginning of the file history in 2018.
I found several other issues raising related questions (e.g. #1405) but could not find an answer.
Could you please point me to documentation or release notes on the following:
- Which version of the Oncotree code is used for each dataset? (I assume 2018-09-01.)
- How is the Oncotree code derived from the original clinical annotations? What information is taken into consideration when assigning the code?
- The issue "Release ICDO to Oncotree mappings" (#1793) mentions an ICD-O to Oncotree code mapping. Does it refer to the file below in the ontology mapping tool? If not, can you point me to the mapping file?
https://github.com/cBioPortal/oncotree/blob/master/scripts/ontology_to_ontology_mapping_tool/ontology_mappings.txt
Thank you for your help.