README.md (1 addition, 10 deletions)
@@ -9,7 +9,6 @@ Molecular Oncology Almanac is a clinical interpretation algorithm for cancer gen
9
9
- Identify overlap between somatic variants observed from both DNA and RNA, or any other source of validation sequencing.
10
10
- Identify somatic and germline variants that may be related to microsatellite stability.
11
11
- Calculate coding mutational burden and compare your patient to TCGA.
12
-
- Calculate contribution of known [COSMIC mutational signatures](https://cancer.sanger.ac.uk/signatures/signatures_v2/) with [deconstructsigs](https://github.com/raerose01/deconstructSigs).
13
12
- Identify genomic features that may be related to one another.
@@ -19,7 +18,7 @@ You can view additional documentation, including [descriptions of inputs](docs/d
19
18
The codebase is available for download through this GitHub repository, [Dockerhub](https://hub.docker.com/r/vanallenlab/moalmanac/), and [Terra](https://portal.firecloud.org/#methods/vanallenlab/moalmanac/2). The method can also be run through [our portal](https://portal.moalmanac.org/), which runs it on Terra without requiring you to use Terra directly. **Accessing Molecular Oncology Almanac through GitHub requires building some of the [datasources](moalmanac/datasources/), but they are already included in the Docker container**.
20
19
21
20
### Installation
22
-
Molecular Oncology Almanac is a Python application using Python 3.11 but also utilizes R to run [deconstructSigs](https://github.com/raerose01/deconstructSigs) as a subprocess. This application, datasources, and all dependencies are packaged on Docker and can be downloaded with the command
21
+
Molecular Oncology Almanac is a Python application using Python 3.11. This application, datasources, and all dependencies are packaged on Docker and can be downloaded with the command
23
22
```bash
24
23
docker pull vanallenlab/moalmanac
25
24
```
@@ -36,14 +35,6 @@ source activate moalmanac
36
35
pip install -r requirements.txt
37
36
```
38
37
39
-
You can install [deconstructSigs](https://github.com/raerose01/deconstructSigs) after [installing R](https://www.r-project.org/) with the following commands
docs/description-of-inputs.md (19 additions, 1 deletion)
@@ -16,6 +16,7 @@ Example inputs can be found in the [`example_data/`](/example_data/) folder, fou
16
16
-[Germline variants](#germline-variants)
17
17
-[Somatic variants from validation sequencing](#somatic-variants-from-validation-sequencing)
18
18
-[Microsatellite status](#microsatellite-status)
19
+
-[Mutational signatures](#mutational-signatures)
19
20
-[Purity](#purity)
20
21
-[Ploidy](#ploidy)
21
22
-[Whole genome doubling](#whole-genome-doubling)
@@ -124,7 +125,7 @@ This input is looking for an integer value.
124
125
125
126
The rows associated with _TP53_, _CDKN2A_, and _EGFR_ will be interpreted and scored by Molecular Oncology Almanac while _BRAF_ will be filtered.
126
127
127
-
### Required files
128
+
### Required fields
128
129
Required fields can be changed from their default expectations by editing the appropriate section of [colnames.ini](https://github.com/vanallenlab/moalmanac/blob/main/moalmanac/colnames.ini). Column names are **not** case-sensitive.
129
130
-`gene`, gene symbol associated with the copy number alteration
130
131
-`call`, copy number event of the gene. `Amplification` and `Deletion` are accepted and all other values will be filtered.
@@ -238,6 +239,23 @@ At least one of the following also must be included:
238
239
239
240
Microsatellite status is reported in the clinical actionability report.
240
241
242
+
## Mutational signatures
243
+
`--mutational_signatures` anticipates a tab delimited file which contains contributions to Single Base Substitution (SBS) Mutational Signatures from [COSMIC version 3.4](https://cancer.sanger.ac.uk/signatures/sbs/). The file should only contain signature contributions for the tumor sample being studied. We recommend generating SBS mutational signatures with [SigProfilerAssignment](https://github.com/AlexandrovLab/SigProfilerAssignment), and have prepared [a wrapper GitHub repository](https://github.com/vanallenlab/SigProfilerAssignment-wrapper) to run SigProfilerAssignment and format signature contributions as expected.
244
+
245
+
### Example
246
+
| signature | contribution |
|-----------|--------------|
| SBS1 | 0.03846154 |
| SBS2 | 0 |
| SBS3 | 0.8525641 |
| ... | ... |
| SBS95 | 0 |
253
+
254
+
### Required fields
255
+
The required fields for this file can be changed from their default expectations by editing the appropriate section of `colnames.ini`. Column names are **not** case sensitive.
256
+
-`signature`, labels for each of the 79 SBS mutational signatures included in COSMIC mutational signatures [version 3.4](https://cancer.sanger.ac.uk/signatures/sbs/)
257
+
-`contribution`, a float value between 0 and 1 for the row's associated signature weight. This column's values should sum to 1.
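The file format described for `--mutational_signatures` can be checked before it is passed to the tool. The following is a minimal sketch, not part of Molecular Oncology Almanac: the function names `read_signature_contributions` and `contributions_sum_to_one`, the tolerance, and the example values are all hypothetical. It assumes the default column names `signature` and `contribution`, matches them case-insensitively, and checks that each weight lies between 0 and 1 and that the column sums to 1.

```python
import csv
import io
import math


def read_signature_contributions(handle, signature_col="signature",
                                 contribution_col="contribution"):
    """Read a tab-delimited signatures table into a {signature: weight} dict.

    Hypothetical helper for illustration; not a moalmanac function.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    # Map lower-cased header names to the names used in the file, so that
    # column matching is not case-sensitive.
    lookup = {name.lower(): name for name in (reader.fieldnames or [])}
    missing = [c for c in (signature_col, contribution_col) if c not in lookup]
    if missing:
        raise ValueError(f"missing required columns: {missing}")

    weights = {}
    for row in reader:
        label = row[lookup[signature_col]]
        value = float(row[lookup[contribution_col]])
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{label}: contribution {value} is outside [0, 1]")
        weights[label] = value
    return weights


def contributions_sum_to_one(weights, tolerance=1e-6):
    """Return True if the weights sum to 1 within an assumed tolerance."""
    return math.isclose(sum(weights.values()), 1.0, abs_tol=tolerance)


# Made-up example values (not taken from a real sample).
example = "Signature\tContribution\nSBS1\t0.25\nSBS2\t0\nSBS3\t0.75\n"
weights = read_signature_contributions(io.StringIO(example))
print(weights, contributions_sum_to_one(weights))
```

To check a file on disk, open it with `open(path, newline="")` and pass the handle in place of the `io.StringIO` object.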
258
+
241
259
## Purity
242
260
`--purity` anticipates a float value between 0.0 and 1.0 for the reported tumor purity. This is just used for reporting in the clinical actionability report.
* Rearrangements: gene name, Molecular Oncology Almanac will process each partner in the fusion separately
64
59
* Microsatellite stability: microsatellite stability status (MSI-High or MSI-Low)
65
60
* Mutational burden: High Mutational Burden, if the mutational burden is deemed to be high
66
-
* Mutational signatures: the specific COSMIC (v2) mutational signature, formatted as "COSMIC Signature (number)"
61
+
* Mutational signatures: the specific COSMIC (v3.4) mutational signature, formatted as "COSMIC Signature (number)"
67
62
* Aneuploidy: Whole-genome doubling, this will only be populated if the `--wgd` value is passed to Molecular Oncology Almanac.
68
63
*`alteration_type` is a descriptor to provide more granular detail on the molecular event.
69
64
* Somatic variants: variant classification of the variant (Missense, Nonsense, etc.)
@@ -319,31 +314,8 @@ Molecular Oncology Almanac designates high mutational burden under two circumsta
319
314
- Mutations per Mb > 10
320
315
- A mutational burden at or above the 80th percentile of the matched TCGA tumor type, or of TCGA overall if no tumor type is matched.
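As an illustration of the two criteria, the sketch below evaluates each one separately. It is not Molecular Oncology Almanac's implementation: the function names, the percentile definition (share of cohort values at or below the patient's value), and the cohort values are assumptions made for this example, and how the tool combines the two criteria is not shown in this excerpt, so the sketch does not combine them.

```python
def percentile_of(value, cohort):
    """Percent of cohort values that are less than or equal to `value`.

    This percentile definition is an assumption for illustration.
    """
    if not cohort:
        raise ValueError("cohort must not be empty")
    return 100.0 * sum(1 for v in cohort if v <= value) / len(cohort)


def burden_criteria(mutations_per_mb, cohort,
                    mb_threshold=10.0, percentile_threshold=80.0):
    """Evaluate each high-burden criterion on its own (hypothetical helper)."""
    percentile = percentile_of(mutations_per_mb, cohort)
    return {
        "exceeds_mutations_per_mb": mutations_per_mb > mb_threshold,
        "meets_cohort_percentile": percentile >= percentile_threshold,
        "percentile": percentile,
    }


# Hypothetical comparison cohort of mutations per Mb; not real TCGA data.
cohort = [0.5, 1.0, 2.0, 4.0, 8.0, 12.0, 20.0, 3.0, 1.5, 6.0]
result = burden_criteria(12.0, cohort)
print(result)
```

In practice the comparison cohort would be the matched TCGA tumor type when one is available and all of TCGA otherwise.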
321
316
322
-
## Mutational signatures
323
-
Molecular Oncology Almanac runs [deconstructSigs](https://github.com/raerose01/deconstructSigs) as a subprocess based on the MAF file passed with the input argument `--snv_handle`, performing NMF against the 30 COSMIC v2 signatures.
324
-
325
-
### Trinucleotide context counts
326
-
Filename suffix: `.sigs.context.txt`
327
-
328
-
Trinucleotide context counts of observed somatic variants for all 96 bins are listed in this tab delimited file.
329
-
330
-
### COSMIC signature (v2) weights
331
-
Filename suffix: `.sigs.cosmic.txt`
332
-
333
-
Weights for the 30 COSMIC (v2) mutational signatures are listed in this tab delimited file. Thresholds for a signature to be considered present or not present by Molecular Oncology Almanac are specified in [config.ini](/moalmanac/config.ini) under the `[signatures]` heading.
334
-
335
-
### Trinucleotide context counts image
336
-
Filename suffix: `.sigs.tricontext.counts.png`
337
-
338
-
Trinucleotide context raw counts of observed somatic variants for all 96 bins are visualized in this png file.
Trinucleotide context normalized counts of observed somatic variants for all 96 bins are visualized in this png file.
344
-
345
317
## Preclinical efficacy
346
-
Filename suffix: `.preclinical.efficacy.txt`
318
+
Filename suffix: `.preclinical_efficacy.txt`
347
319
348
320
Therapies listed in [actionable](#actionable) that have been evaluated on cancer cell lines through the Sanger Institute's GDSC are evaluated for efficacy in the presence and absence of the associated molecular feature. This is performed for relationships associated with therapeutic sensitivity. Columns include:
349
321
-`patient_id` (str) - the string associated with the given molecular profile (`--patient_id`)
@@ -396,7 +368,7 @@ Additional equivalent within a provided ontology or stronger matches from anothe
396
368
397
369
For molecular features associated with therapeutic sensitivity that have a therapy evaluated on cancer cell lines, a button `[Preclinical evidence]` will appear below the therapy and rationale which will open a modal to compare the sensitivity to the therapy of interest between mutant and wild type cell lines.
398
370
399
-
Molecular features which are biologically relevant are listed without clinical association. Molecular features will appear here if the associated gene is catalogued in the Molecular Oncology Almanac but under a different feature type, variants are associated with microsatellite stability, and all present COSMIC version 2 mutational signatures not associated with a clinical assertion are reported.
371
+
Molecular features which are biologically relevant are listed without clinical association. Molecular features will appear here if the associated gene is catalogued in the Molecular Oncology Almanac but under a different feature type, variants are associated with microsatellite stability, and all present COSMIC v3.4 mutational signatures not associated with a clinical assertion are reported.
400
372
401
373
The last section of the report, comparison of molecular profile to cancer cell lines, displays results from Molecular Oncology Almanac's patient-to-cell line matchmaking module. **This will not appear in the report if `--disable_matchmaking` is passed as an argument**. The 5 cancer cell lines most similar to the provided profile are listed, each with its cell line name, sensitive therapies from GDSC, and clinically relevant features present. Users can click `[More details]` under a cell line's name to view its aliases, sensitive therapies, clinically relevant molecular features, all somatic variants, copy number alterations, and fusions occurring in Cancer Gene Census genes, and the 10 therapies to which the cell line is most sensitive.