@@ -12,7 +12,6 @@ This chapter will cover:
12
12
- Starting an interactive Docker session
13
13
- Running pVACseq
14
14
- Running pVACfuse
15
- - Understanding pVACtools outputs
16
15
17
16
## Starting Docker
18
17
@@ -40,10 +39,16 @@ to it once you exit the Docker image.
40
39
41
40
## Running pVACseq
42
41
43
- The pVACseq pipeline is run using the ` pvacseq run ` command.
42
+ pVACseq is used to identify neoantigens from missense, inframe indel, and
43
+ frameshift mutations. The pipeline uses a somatic VCF file as an input, which
44
+ represents variants called in the tumor sample. The VEP annotations in the VCF file
45
+ indicate each variant's type and its consequence on the gene transcripts
46
+ overlapping the genomic coordinates of the variant. The amino acid change of
47
+ the predicted consequence is used by pVACseq to calculate the mutated peptide sequence.
44
48
49
+ The pVACseq pipeline is run using the ` pvacseq run ` command.
45
50
46
- ### Required Parameters
51
+ ### Required Parameters for pVACseq
47
52
48
53
The ` pvacseq run ` command takes a number of required parameters in the
49
54
following order:
@@ -65,7 +70,7 @@ following order:
65
70
run all available prediction algorithms.
66
71
- ` output_dir ` : The directory for writing all result files.
67
72
68
- ### Optional Parameters
73
+ ### Optional Parameters for pVACseq
69
74
70
75
The ` pvacseq run ` command offers quite a few optional arguments to fine-tune
71
76
your run. Here is a list of parameters we generally recommend:
@@ -122,7 +127,7 @@ on your specific analysis needs:
122
127
123
128
- ` --class-i-epitope-length ` and ` --class-ii-epitope-length ` : By default 8,
124
129
9, 10, 11 and 12, 13, 14, 15, 16, 17, 18 are set for these parameters,
125
- respecitively but different lengths might be desired.
130
+ respectively, but different lengths might be desired.
126
131
- ` --tumor-purity ` : This parameter is used to bin variants into clonal and
127
132
sub-clonal. This parameter might need to be adjusted based on the tumor
128
133
purity of your data.
@@ -140,15 +145,18 @@ on your specific analysis needs:
140
145
expensive. This parameter limits how many amino acids of the downstream
141
146
sequence are included in the prediction.
142
147
148
+ There are additional parameters in pVACseq that we won't discuss at this point
149
+ because the defaults are usually sufficient. To see all available parameters, you can
150
+ run ` pvacseq run -h ` .
151
+
143
152
### pVACseq Command
144
153
145
154
Given the considerations outlined above, let's run pVACseq on our sample data.
146
155
147
- From the
148
- ` optitype_normal_result.tsv ` we know that the patient's class I alleles are HLA-A\* 29:02, HLA-B\* 45:01,
149
- HLA-B\* 82:02, and HLA-C\* 06:02. We also have clinical typing information that confirms
150
- these class I alleles as well as identified DQA1\* 03:03, DQB1\* 03:02, and DRB1\* 04:05 as the
151
- patient's class II alleles.
156
+ From the ` optitype_normal_result.tsv ` we know that the patient's class I alleles are
157
+ HLA-A\* 29:02, HLA-B\* 45:01, HLA-B\* 82:02, and HLA-C\* 06:02. We also have clinical typing
158
+ information that confirms these class I alleles as well as identified DQA1\* 03:03,
159
+ DQB1\* 03:02, and DRB1\* 04:05 as the patient's class II alleles.
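
The allele argument is a single comma-separated string, so it can help to assemble it from the individual allele names rather than typing it by hand. The sketch below is one way to do that in a POSIX shell; `join_alleles` is a hypothetical helper written for this example and is not part of pVACtools. The allele names are single-quoted so the shell does not treat `*` as a wildcard.

```shell
# join_alleles is a hypothetical helper (not part of pVACtools):
# it joins its arguments with commas and prints the result.
join_alleles() {
    # Setting IFS inside a subshell keeps the change local to this call.
    (IFS=','; printf '%s\n' "$*")
}

# Class I and class II alleles reported for this patient in the text above.
alleles=$(join_alleles \
    'HLA-A*29:02' 'HLA-B*45:01' 'HLA-B*82:02' 'HLA-C*06:02' \
    'DQA1*03:03' 'DQB1*03:02' 'DRB1*04:05')

echo "$alleles"
```

The resulting `$alleles` value can then be passed, quoted, as the allele argument of `pvacseq run`.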
152
160
153
161
To identify the tumor and normal sample names we will grep the VCF file for
154
162
the CHROM header:
@@ -161,7 +169,7 @@ This shows that the tumor sample is named `HCC1395_TUMOR_DNA` and the normal sam
161
169
162
170
For our test run, please execute the ` pvacseq run ` command below. The
163
171
prediction run might take a while but pVACseq will output progress messages as
164
- it processeses through the pipeline.
172
+ it runs through the pipeline.
165
173
166
174
``` {r, engine = 'bash', eval = FALSE}
167
175
pvacseq run \
@@ -187,8 +195,117 @@ all \
187
195
188
196
## Running pVACfuse
189
197
190
- ## Understanding pVACtools outputs
198
+ pVACfuse predicts neoantigens from fusion events. The
199
+ pipeline uses annotated fusion calls from either AGFusion or Arriba for this
200
+ purpose. These annotators already include the fusion peptide sequence in their
201
+ outputs, which pVACfuse uses to extract neoantigens around the fusion position.
191
202
192
- This section will review pVACtools outputs and explain how to correctly interpret them.
203
+ The pVACfuse pipeline is run using the ` pvacfuse run ` command.
193
204
205
+ ### Required Parameters for pVACfuse
206
+
207
+ The ` pvacfuse run ` command takes a number of required parameters in the
208
+ following order:
194
209
210
+ - ` input_file ` : An AGFusion output directory or Arriba fusion.tsv output file.
211
+ For the purpose of this course, we will be running pVACfuse with AGFusion
212
+ output.
213
+ - ` sample_name ` : The name of the tumor sample being processed.
214
+ - ` allele(s) ` : The name of the HLA allele to use for epitope prediction. Multiple
215
+ alleles can be specified using a comma-separated list. These should be the
216
+ HLA alleles of your patient. You might have clinical typing information for
217
+ your patient. If not, you will need to computationally predict the patient's
218
+ HLA type using software such as OptiType.
219
+ - ` prediction_algorithms ` : The epitope prediction algorithms to use. Multiple
220
+ prediction algorithms can be specified, separated by spaces. Use ` all ` to
221
+ run all available prediction algorithms.
222
+ - ` output_dir ` : The directory for writing all result files.
223
+
224
+ ### Optional Parameters for pVACfuse
225
+
226
+ In addition to the required parameters, the ` pvacfuse run ` command also offers
227
+ optional arguments to fine-tune your run. You will find a lot of overlap
228
+ between pVACfuse and pVACseq parameters and the same general considerations
229
+ usually apply. Here is a list of parameters we generally recommend:
230
+
231
+ - ` --starfusion-file ` : Path to a ` star-fusion.fusion_predictions.tsv ` or
232
+ ` star-fusion.fusion_predictions.abridged.tsv ` . This file is used to extract
233
+ read support and expression information.
234
+ - ` --iedb-install-directory ` : For speed and reliability, we generally recommend
235
+ that users use a standalone installation of the IEDB software. The pVACtools
236
+ Docker containers already come with this software pre-installed in the
237
+ ` /opt/iedb ` directory.
238
+ - ` --allele-specific-binding-thresholds ` : When filtering and tiering
239
+ neoantigen candidates, one main criterion is the predicted peptide-MHC
240
+ binding affinity. By default, pVACfuse uses an IC50 cutoff of <500 nM.
241
+ However, for some HLA alleles, other cutoffs are more appropriate depending
242
+ on the distribution of binding affinities across peptides. Setting
243
+ this flag enables allele-specific binding cutoffs as recommended by
244
+ [ IEDB] ( https://help.iedb.org/hc/en-us/articles/114094152371-What-thresholds-cut-offs-should-I-use-for-MHC-class-I-and-II-binding-predictions ) .
245
+ - ` --run-reference-proteome-similarity ` : One consideration when selecting
246
+ neoantigen candidates is that the neoantigen should not occur natively in
247
+ the patient's proteome. When this flag is set, pVACfuse will search for each
248
+ neoantigen candidate in the reference proteome and report any hits found.
249
+ By default this is done using BLASTp but we recommend using a proteome FASTA
250
+ file via the ` --peptide-fasta ` parameter to speed up this step.
251
+ - ` --percentile-threshold ` : When considering the peptide-MHC binding affinity
252
+ for filtering and prioritizing neoantigen candidates, by default only the
253
+ IC50 value is used. Setting this parameter will additionally filter
254
+ on the predicted percentile. We recommend a value of 0.01 (1%) for this
255
+ threshold.
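
To make the combined IC50 and percentile filter concrete, here is a minimal sketch that applies both cutoffs to a made-up table. Everything in it is an assumption for illustration: the three-column layout, the `filter_binders` helper, and the peptide rows are invented, and real pVACfuse result files use different columns. The sketch also assumes a candidate must pass both cutoffs and does not model allele-specific thresholds.

```shell
# filter_binders is a hypothetical helper (not part of pVACtools).
# $1: IC50 cutoff in nM, $2: percentile cutoff.
# stdin: tab-separated table with a header row and the assumed columns
#        peptide, ic50_nm, percentile.
filter_binders() {
    awk -F '\t' -v ic50="$1" -v pct="$2" \
        'NR == 1 || ($2 < ic50 && $3 < pct)'   # always keep the header row
}

# Invented example rows: only the first one passes both cutoffs.
printf 'peptide\tic50_nm\tpercentile\nAAAAAAAAA\t120\t0.005\nCCCCCCCCC\t120\t0.2\nDDDDDDDDD\t900\t0.005\n' |
    filter_binders 500 0.01
```

With the cutoffs 500 and 0.01, the sketch prints the header and the `AAAAAAAAA` row: the second peptide fails the percentile cutoff and the third fails the IC50 cutoff.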
256
+
257
+ Additionally there are a number of parameters that might be useful depending
258
+ on your specific analysis needs:
259
+
260
+ - ` --class-i-epitope-length ` and ` --class-ii-epitope-length ` : By default 8,
261
+ 9, 10, 11 and 12, 13, 14, 15, 16, 17, 18 are set for these parameters,
262
+ respectively, but different lengths might be desired.
263
+ - ` --problematic-amino-acids ` : Some vaccine manufacturers will consider certain amino
264
+ acids in the neoantigen candidates difficult to manufacture. For example, a
265
+ Cysteine is commonly considered problematic as it makes the peptide
266
+ unstable. This parameter allows users to set their own rules as to which
267
+ peptides are considered problematic and peptides meeting those rules will be marked in the
268
+ pVACfuse results and deprioritized.
269
+ - ` --n-threads ` : This argument will allow pVACfuse to run in multi-processing
270
+ mode.
271
+ - ` --keep-tmp-files ` : Setting this flag will save intermediate files created by pVACfuse.
272
+ - ` --downstream-sequence-length ` : For frameshift fusions, the downstream
273
+ sequence can potentially be very long, which can be computationally
274
+ expensive. This parameter limits how many amino acids of the downstream
275
+ sequence are included in the prediction.
276
+
277
+ ### pVACfuse Command
278
+
279
+ Given the considerations outlined above, let's run pVACfuse on our sample data.
280
+
281
+ As with pVACseq, we can use the ` optitype_normal_result.tsv ` file to identify the patient's
282
+ class I HLA alleles. These are HLA-A\* 29:02, HLA-B\* 45:01, HLA-B\* 82:02, and HLA-C\* 06:02.
283
+ We also have clinical typing information that confirms these class I alleles as well as
284
+ identified DQA1\* 03:03, DQB1\* 03:02, and DRB1\* 04:05 as the patient's class II alleles.
285
+
286
+ For pVACfuse the sample name is not used for any parsing so it doesn't need to
287
+ match any specific information in the AGFusion results. It is only used for
288
+ naming result files. For consistency we will use the same ` HCC1395_TUMOR_DNA `
289
+ sample name we used in pVACseq.
290
+
291
+ For our test run, please execute the ` pvacfuse run ` command below. The
292
+ prediction run might take a while but pVACfuse will output progress messages as
293
+ it runs through the pipeline.
294
+
295
+ ``` {r, engine = 'bash', eval = FALSE}
296
+ pvacfuse run \
297
+ /HCC1395_inputs/agfusion_results \
298
+ HCC1395_TUMOR_DNA \
299
+ HLA-A*29:02,HLA-B*45:01,HLA-B*82:02,HLA-C*06:02,DQA1*03:03,DQB1*03:02,DRB1*04:05 \
300
+ all \
301
+ /pVACtools_outputs/pvacfuse_predictions \
302
+ --iedb-install-directory /opt/iedb \
303
+ --allele-specific-binding-thresholds \
304
+ --percentile-threshold 0.01 \
305
+ --run-reference-proteome-similarity \
306
+ --peptide-fasta /HCC1395_inputs/Homo_sapiens.GRCh38.pep.all.fa.gz \
307
+ --problematic-amino-acids C \
308
+ --downstream-sequence-length 100 \
309
+ --n-threads 8 \
310
+ --keep-tmp-files
311
+ ```
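
Because the prediction run can take a while, it may be worth confirming that the input paths exist before starting it. The following is a minimal sketch of such a pre-flight check; `check_inputs` is a hypothetical helper written for this example, not a pVACtools feature, and the paths are the ones used in the command above.

```shell
# check_inputs is a hypothetical helper (not part of pVACtools):
# it reports every path that does not exist and returns 1 if any is missing.
check_inputs() {
    missing=0
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            echo "missing: $path" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Paths taken from the pvacfuse command above.
if check_inputs \
    /HCC1395_inputs/agfusion_results \
    /opt/iedb \
    /HCC1395_inputs/Homo_sapiens.GRCh38.pep.all.fa.gz; then
    echo "all inputs found"
else
    echo "fix the paths listed above before running pvacfuse"
fi
```

The check only tests that each path exists inside the Docker session; it does not validate the contents of the AGFusion directory or the FASTA file.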