README.md: 39 additions & 25 deletions
@@ -55,32 +55,46 @@ See below for a detailed list of available parameters
4. The partitioned vcf files are placed under [project_path]/output/clustered/ and [project_path]/output/nonClustered/. You can visualize the results by looking at the IMD plots available under [project_path]/output/plots/.

+
**AVAILABLE PARAMETERS**

-Required:
-project: [string] Unique name for the given project.
-genome: [string] Reference genome to use. Must be installed using SigProfilerMatrixGenerator.
-contexts: [string] Must be one of {"96", "ID"}.
-simContext: [list of strings] Mutation context that was used for generating the background model (e.g. ["6144"] or ["96"]).
-input_path: [string] Path to the given project. Please add a forward slash (/) at the end of the input path. For example: "path/to/the/input_file/".
-
-Optional:
-analysis: [string] Desired analysis pipeline. By default analysis='all'. Other options include "subClassify" and "hotspot".
-sortSims: [boolean] Sort the simulated files. Set to False only if they have already been sorted; the files must be sorted for accurate results (default=True).
-interdistance: [string] The mutation types to calculate IMDs between. Use only when performing analysis of indels (default='ID').
-calculateIMD: [boolean] Calculate the IMDs. Set to False to save time if you only need to rerun the subclassification step (default=True).
-max_cpu: [integer] Number of allocated CPUs. By default all CPUs are used.
-subClassify: [boolean] Subclassify the clustered mutations. Requires that VAF scores are available in TCGA or Sanger format. By default subClassify=False. See VAF Format below for more details.
-plotIMDfigure: [boolean] Generate IMD and mutational spectra plots for each sample (default=True).
-plotRainfall: [boolean] Generate rainfall plots for each sample using the subclassification of clustered events (default=True).
-
-The following parameters are used if the subClassify argument is True:
-includedVAFs: [boolean] Informs the tool that VAFs are included in the dataset (default=True).
-includedCCFs: [boolean] Informs the tool that CCFs are included in the dataset (default=True). If CCFs are used, set includedVAFs=False.
-variant_caller: [string] Format in which the VAF scores are provided (default='standard').
-windowSize: [integer] Window size for calculating mutation density in the rainfall plots. By default windowSize=10000000.
-correction: [boolean] Perform a genome-wide mutational density correction (default=False).
-probability: [boolean] Calculate the probability of observing each clustered event within the localized region of the genome. These values are saved into the [project_path]/output/clustered/ directories. See OSF wiki page for more details.
+|`probability`| Boolean | Calculate the probability of observing each clustered event in its local region. Output saved in `[project_path]/output/clustered/`. Default: `False`. |
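
A few of the constraints stated in the parameter list can be checked before starting a run. The following is a minimal sketch and not part of SigProfilerClusters: `validate_params` is a hypothetical helper, and it only encodes rules spelled out in the text (the allowed `contexts` values, the trailing slash on `input_path`, and a `variant_caller` value from the formats named in this change when subclassifying with VAFs).

```python
# Hypothetical pre-flight check; the names below are illustrative only.
ALLOWED_CONTEXTS = {"96", "ID"}
SUPPORTED_VARIANT_CALLERS = {"caveman", "standard", "mutect2"}


def validate_params(contexts, input_path, subClassify=False,
                    includedVAFs=True, variant_caller="standard"):
    """Return a list of problems found; an empty list means no issue was detected."""
    problems = []
    if contexts not in ALLOWED_CONTEXTS:
        problems.append(f"contexts must be one of {sorted(ALLOWED_CONTEXTS)}, got {contexts!r}")
    if not input_path.endswith("/"):
        problems.append("input_path should end with a forward slash (/)")
    # variant_caller only matters when VAFs are read during subclassification.
    if subClassify and includedVAFs and variant_caller not in SUPPORTED_VARIANT_CALLERS:
        problems.append(f"unsupported variant_caller: {variant_caller!r}")
    return problems


if __name__ == "__main__":
    for problem in validate_params("96", "path/to/the/input_file", subClassify=True):
        print(problem)
```

Returning a list rather than raising on the first problem lets a caller report every issue at once.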
**VAF Format**
@@ -93,7 +107,7 @@ If your VAF is recorded in the 11th column of your VCF as the last number of the
If your VAF is recorded in the 8th or 10th column of your VCF as VAF=xx or AF=xx, set variant_caller="standard".

-If your VAF is recorded in the 11th column of your VCF as AF=xx, set variant_caller="mutect2".
+If your VAF is recorded in the 10th or 11th column of your VCF as AF=xx, set variant_caller="mutect2".

If your VCFs have no recorded VAFs, set includedVAFs=False. SigProfilerClusters will then subclassify clusters based on the calculated IMD alone (provided that you set subClassify=True).
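
To make the three layouts concrete, here is a minimal sketch of reading a VAF from one tab-separated VCF record according to the descriptions above. It is not the tool's own parser: `extract_vaf` is a hypothetical helper, the regular expressions assume a plain decimal value after `VAF=` or `AF=`, and checking the lower-numbered column first is an arbitrary choice made for this example.

```python
import re

# Assumed value syntax: a plain decimal such as 0.25 (illustrative only).
_VAF_OR_AF = re.compile(r"\bV?AF=([0-9]*\.?[0-9]+)")
_AF_ONLY = re.compile(r"\bAF=([0-9]*\.?[0-9]+)")


def extract_vaf(vcf_line, variant_caller):
    """Return the VAF as a float, or None if no VAF is found in the expected columns."""
    cols = vcf_line.rstrip("\n").split("\t")
    if variant_caller == "caveman":
        # 11th column, last of the colon-delimited values.
        return float(cols[10].split(":")[-1])
    if variant_caller == "standard":
        column_indices, pattern = (7, 9), _VAF_OR_AF   # 8th or 10th column
    elif variant_caller == "mutect2":
        column_indices, pattern = (9, 10), _AF_ONLY    # 10th or 11th column
    else:
        raise ValueError(f"unsupported variant_caller: {variant_caller!r}")
    for idx in column_indices:
        if idx < len(cols):
            match = pattern.search(cols[idx])
            if match:
                return float(match.group(1))
    return None


if __name__ == "__main__":
    record = "chr1\t100\t.\tA\tT\t.\tPASS\tDP=30;VAF=0.25\tGT\t0/1"
    print(extract_vaf(record, "standard"))
```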
SigProfilerClusters/SigProfilerClusters.py: 5 additions & 4 deletions
@@ -671,10 +671,11 @@ def analysis(
max_cpu -> optional parameter to specify the maximum number of CPUs to use for parallelizing the code (integer; default=None: uses all available CPUs)
subClassify -> optional parameter to subclassify the clustered mutations into refined classes including DBSs, extended MBSs, kataegis, etc. (boolean; default=False)
variant_caller -> optional parameter that informs the tool of the format in which the VAF scores are provided (string; default=None). This is required when subClassify=True. The supported formats are:
--> sanger: If your VAF is recorded in the 11th column of your VCF as the last number of the colon delimited values, set variant_caller="sanger".
--> TCGA: If your VAF is recorded in the 8th column of your VCF as VAF=xx, set variant_caller="TCGA".
--> standardVC: If your VAF is recorded in the 10th column of your VCF as AF=xx, set variant_caller="standardVC".
--> mutect2: If your VAF is recorded in the 11th column of your VCF as AF=xx, set variant_caller="mutect2".
+-> caveman: If your VAF is recorded in the 11th column of your VCF as the last number of the colon delimited values, set variant_caller="caveman".
+-> standard: If your VAF is recorded in the 8th or 10th column of your VCF as VAF=xx or AF=xx, set variant_caller="standard".
+-> mutect2: If your VAF is recorded in the 10th or 11th column of your VCF as AF=xx, set variant_caller="mutect2".
+
+
includedVAFs -> optional parameter that informs the tool of the inclusion of VAFs in the dataset (boolean; default=True)
includedCCFs -> optional parameter that informs the tool of the inclusion of cancer cell fractions in the dataset (boolean; default=True)
windowSize -> the size of the window used for correcting the IMDs based upon mutational density within a given genomic range (integer; default=10000000)
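
Comparing the removed and added docstring lines, `sanger` and `caveman` carry the same description, and the old `TCGA` (8th column, VAF=xx) and `standardVC` (10th column, AF=xx) descriptions are both covered by `standard`. A script written against the old names could be updated with a small mapping such as the one below. This is a sketch inferred from the docstring text only; `LEGACY_VARIANT_CALLERS` and `normalize_variant_caller` are hypothetical names and are not provided by the package.

```python
# Inferred from the docstring diff; not an official compatibility table.
LEGACY_VARIANT_CALLERS = {
    "sanger": "caveman",
    "TCGA": "standard",
    "standardVC": "standard",
}
CURRENT_VARIANT_CALLERS = {"caveman", "standard", "mutect2"}


def normalize_variant_caller(name):
    """Map an old variant_caller value to its current equivalent."""
    if name in CURRENT_VARIANT_CALLERS:
        return name
    if name in LEGACY_VARIANT_CALLERS:
        return LEGACY_VARIANT_CALLERS[name]
    raise ValueError(f"unknown variant_caller: {name!r}")


if __name__ == "__main__":
    print(normalize_variant_caller("sanger"))
```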
SigProfilerClusters/SigProfilerHotSpots.py: 1 addition & 1 deletion
@@ -510,7 +510,7 @@ def analysis(
variant_caller -> optional parameter that informs the tool of the format in which the VAF scores are provided (string; default=None). This is required when subClassify=True. The supported formats are:
-> caveman: If your VAF is recorded in the 11th column of your VCF as the last number of the colon delimited values, set variant_caller="caveman".
-> standard: If your VAF is recorded in the 8th or 10th column of your VCF as VAF=xx or AF=xx, set variant_caller="standard".
--> mutect2: If your VAF is recorded in the 11th column of your VCF as AF=xx, set variant_caller="mutect2".
+-> mutect2: If your VAF is recorded in the 10th or 11th column of your VCF as AF=xx, set variant_caller="mutect2".
includedVAFs -> optional parameter that informs the tool of the inclusion of VAFs in the dataset (boolean; default=True)
windowSize -> the size of the window used for correcting the IMDs based upon mutational density within a given genomic range (integer; default=10000000)
plotIMDfigure -> optional parameter that generates IMD and mutational spectra plots for each sample (boolean; default=True).

variant_caller -> optional parameter that informs the tool of the format in which the VAF scores are provided (string; default=None). This is required when subClassify=True. The supported formats are:
-> caveman: If your VAF is recorded in the 11th column of your VCF as the last number of the colon delimited values, set variant_caller="caveman".
-> standard: If your VAF is recorded in the 8th or 10th column of your VCF as VAF=xx or AF=xx, set variant_caller="standard".
--> mutect2: If your VAF is recorded in the 11th column of your VCF as AF=xx, set variant_caller="mutect2".
+-> mutect2: If your VAF is recorded in the 10th or 11th column of your VCF as AF=xx, set variant_caller="mutect2".
correction -> optional parameter to perform a genome-wide mutational density correction (boolean; default=False)
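
The `windowSize` docstring refers to mutational density within a genomic range. As a rough illustration of that idea, the sketch below counts mutations in fixed, non-overlapping windows along one chromosome. The docstrings do not say how the tool defines its windows, so the fixed-bin scheme and the name `mutations_per_window` are assumptions made for this example.

```python
from collections import Counter


def mutations_per_window(positions, window_size=10_000_000):
    """Count mutations per fixed window; keys are window indices (position // window_size)."""
    if window_size <= 0:
        raise ValueError("window_size must be a positive integer")
    return Counter(pos // window_size for pos in positions)


if __name__ == "__main__":
    # Positions on a single chromosome (made-up values for illustration).
    counts = mutations_per_window([5, 9_999_999, 10_000_000, 25_000_000])
    for window_index in sorted(counts):
        print(window_index, counts[window_index])
```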