Commit 568e9bd

Render toc-less
1 parent 3b3c1cb commit 568e9bd

37 files changed: +1023 -447 lines changed
docs/no_toc/01-intro.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ This course has been developed recently (Summer 2023). We welcome any feedback a
 ## Motivation

 Identification of neoantigens is a critical step in predicting response to checkpoint blockade therapy and design of personalized cancer vaccines.
-This is a cross-disciplinary challenge, involving genomics, proteomics, immunology, and computational approaches. We have built a computational
+This is a cross-disciplinary challenge, which involves genomics, proteomics, immunology, and computational approaches. We have built a computational
 framework called pVACtools that, when paired with a well-established genomics pipeline, produces an end-to-end solution for neoantigen characterization.
 pVACtools supports identification of altered peptides from different mechanisms, including point mutations, in-frame and frameshift insertions and deletions,
 and gene fusions. Prediction of peptide:MHC binding is accomplished by supporting an ensemble of MHC Class I and II binding algorithms within a framework

docs/no_toc/02-prerequisites.md

Lines changed: 9 additions & 9 deletions
@@ -70,36 +70,36 @@ For this course, we have put together a set of input data generated from the bre
 cancer cell line HCC1395 and a matched normal lymphoblastoid cell line HCC1395BL.
 Data from this cell line is commonly used as test data in bioinformatics applications.
 For more information on these lines and the generation of test data, please refer to
-the data section of our precision medicine bioinformatics course:
-[here](https://pmbio.org/module-02-inputs/0002/05/01/Data/).
+the [data section of our precision medicine bioinformatics course](https://pmbio.org/module-02-inputs/0002/05/01/Data/).

 The input data consists of the following files:

 For pVACseq:

 - `annotated.expression.vcf.gz`: A somatic (tumor-normal) VCF and its tbi index file. The VCF has been
 annotated with VEP and has coverage and expression information added. It has also been annotated with
-custom VEP plugins that provide wild type and mutant version of the full length protein sequences
+custom VEP plugins that provide wild type and mutant versions of the full length protein sequences
 predicted to arise from each transcript annotated with each variant.
 - `phased.vcf.gz`: A phased tumor-germline VCF and its tbi index file to provide information about
 in-phase proximal variants that might alter the predicted peptide sequence around a somatic
-mutation of interest
-- `optitype_normal_result.tsv`: A OptiType file with HLA allele typing predictions
+mutation of interest.
+- `optitype_normal_result.tsv`: A OptiType file with HLA allele typing predictions.

 For more detailed information on how the variant input file is created, please refer to the
 [input file preparation](https://pvactools.readthedocs.io/en/latest/pvacseq/input_file_prep.html)
-section of the pVACtools docs
+section of the pVACtools docs.

 For pVACfuse:

-- `agfusion_results`: A AGFusion output directory with annotated fusion calls
+- `agfusion_results`: An AGFusion output directory with annotated fusion
+calls.
 - `star-fusion.fusion_predictions.tsv`: A STARFusion prediction file with fusion read support
-and expression information
+and expression information.

 General:

 - `Homo_sapiens.GRCh38.pep.all.fa.gz`: A reference proteome peptide FASTA to use
-for determining whether there are any reference matches of neoantigen candidates
+for determining whether there are any reference matches of neoantigen candidates.

 To download this data, please run the following commands:

docs/no_toc/04-outputs.md

Lines changed: 11 additions & 11 deletions
@@ -93,9 +93,9 @@ patient's RNA.

 For pVACseq, this generally relies on your VCF being annotated with coverage
 and expression data. In our example, the VCF has already been annotated with
-this data. For more information about how to add coverage and expression data
-to your own VCFs, please see [here](https://pvactools.readthedocs.io/en/latest/pvacseq/input_file_prep/readcounts.html)
-and [here](https://pvactools.readthedocs.io/en/latest/pvacseq/input_file_prep/expression.html).
+this data. For more information about how to add [coverage](https://pvactools.readthedocs.io/en/latest/pvacseq/input_file_prep/readcounts.html)
+and [expression data](https://pvactools.readthedocs.io/en/latest/pvacseq/input_file_prep/expression.html)
+to your own VCFs, please see our docs.
 Additionally, filtering on the normal DNA depth and variant allele frequency
 (VAF) requires your VCF to be a tumor-normal sample VCF and the normal sample
 to be identifies in your pVACseq run using the `--normal-sample-name`
@@ -128,7 +128,7 @@ The following thresholds are applied in pVACfuse by this filter:

 ### Transcript Support Level Filter

-The Transcript Support Level (TSL) Filter, removes neoantigen candidates for
+The Transcript Support Level (TSL) Filter removes neoantigen candidates for
 transcripts with a high TSL, as defined [by Ensembl](https://grch37.ensembl.org/info/genome/genebuild/transcript_quality_tags.html#tsl).
 The cutoff for this filter is set by the `--maximum-transcript-support-level`
 parameter. Transcripts with a TSL of NA will always be filtered out.
@@ -145,16 +145,16 @@ The Top Score Filter will attempt to determine the best neoantigen candidate
 for each variants.

 For pVACseq it works as follows. Given a set of neoantigen candidates for a
-variant we first group the transcripts into set where all transcripts in a set
+variant we first group the transcripts into sets where all transcripts in a set
 code for the same set of neoantigen candidates. For each transcript set we then
 determine the best neoantigen candidate as follows:

 - Pick all neoantigens with a variant transcript that have a protein_coding Biotype
 - Of the remaining candidates, pick the ones with a variant transcript having a
 TSL less then the `--maximum-transcript-support-level`.
-- Of the remaining candidates, pick the entries with no Problematic Positions
+- Of the remaining candidates, pick the entries with no Problematic Positions.
 - Of the remaining candidates, pick the ones passing the Anchor Criteria (explained in
-more detail further below)
+more detail further below).
 - Of the remaining candidates, pick the one with the lowest MT IC50 Score (Median or Best
 depending on the `--top-score-metric`), lowest TSL, and longest transcript.

@@ -183,10 +183,10 @@ are included in creating this report.

 In pVACseq, for each variant, all neoantigen candidates meeting the `--aggregate-inclusion-threshold` are evaluated as follows:

-- Pick all entries with a variant transcript that have a protein_coding Biotype
-- Of the remaining entries, pick the ones with a variant transcript having a Transcript Support Level <= `--maximum-transcript-support-level`
-- Of the remaining entries, pick the entries with no Problematic Positions
-- Of the remaining entries, pick the ones passing the Anchor Criteria (see Criteria Details section below)
+- Pick all entries with a variant transcript that have a protein_coding Biotype.
+- Of the remaining entries, pick the ones with a variant transcript having a Transcript Support Level <= `--maximum-transcript-support-level`.
+- Of the remaining entries, pick the entries with no Problematic Positions.
+- Of the remaining entries, pick the ones passing the Anchor Criteria (see Criteria Details section below).
 - Of the remaining entries, pick the one with the lowest MT IC50 score( Median or Best
 depending on the `--top-score-metric`), lowest Transcript Support Level, and longest transcript.
