Reduced version on Google Colab:
- Clone the repository renatopuga/lmabrasil-hg38
- Install bcftools +split-vep
- Install udocker
- Filter the VCF with filter_vep:
  -filter "(MAX_AF <= 0.01 or not MAX_AF) and \
  (FILTER = PASS or not FILTER matches strand_bias,weak_evidence) and \
  (SOMATIC matches 1 or (not SOMATIC and CLIN_SIG matches pathogenic)) and \
  (not CLIN_SIG matches benign) and \
  (not IMPACT matches LOW) and \
  (Symbol in hpo/$HPO)"
- Filter on total read depth and variant allele frequency with bcftools +split-vep:
  DP>=20 AND AF>=0.1
- Result: *.vep.filter.tsv
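
The two filtering steps above can be sketched in plain Python to make the logic easier to follow. This is only an illustration under stated assumptions, not how filter_vep or bcftools are implemented: `matches` is modelled as a case-insensitive regular-expression search, a field holding `.` or an empty string is treated as unset, and `HYPOTHETICAL_PANEL` is a placeholder gene set standing in for the contents of the `hpo/$HPO` file. The two records reuse the values of the WP017 variants listed in the results table of this README.

```python
import re

# Values that are treated as "field not set" (assumption for this sketch).
MISSING = {"", ".", None}

# Placeholder gene set; the real workflow reads the genes from hpo/$HPO.
HYPOTHETICAL_PANEL = {"CALR", "JAK2", "MPL", "TP53"}


def is_set(rec, field):
    """True when the field carries a value."""
    return rec.get(field) not in MISSING


def matches(rec, field, pattern):
    """Case-insensitive regex search; an unset field never matches."""
    if not is_set(rec, field):
        return False
    return re.search(pattern, str(rec[field]), re.IGNORECASE) is not None


def passes_annotation_filter(rec, gene_panel):
    """Mirror the -filter expression, one clause per line."""
    rare = (not is_set(rec, "MAX_AF")) or float(rec["MAX_AF"]) <= 0.01
    clean_call = rec.get("FILTER") == "PASS" or not matches(
        rec, "FILTER", "strand_bias,weak_evidence")
    somatic_or_pathogenic = matches(rec, "SOMATIC", "1") or (
        not is_set(rec, "SOMATIC") and matches(rec, "CLIN_SIG", "pathogenic"))
    not_benign = not matches(rec, "CLIN_SIG", "benign")
    not_low_impact = not matches(rec, "IMPACT", "LOW")
    in_panel = rec.get("SYMBOL") in gene_panel
    return all([rare, clean_call, somatic_or_pathogenic,
                not_benign, not_low_impact, in_panel])


def passes_depth_af(rec, min_dp=20, min_af=0.1):
    """Second step: DP>=20 AND AF>=0.1."""
    return int(rec["DP"]) >= min_dp and float(rec["AF"]) >= min_af


# Field values copied from the WP017 results table.
tp53 = {
    "SYMBOL": "TP53", "MAX_AF": "0.000496", "IMPACT": "MODERATE",
    "CLIN_SIG": "uncertain_significance", "SOMATIC": "0&1",
    "FILTER": "base_qual;haplotype;normal_artifact;strand_bias",
    "DP": "119", "AF": "0.112",
}
calr = {
    "SYMBOL": "CALR", "MAX_AF": "0.000066", "IMPACT": "HIGH",
    "CLIN_SIG": "pathogenic", "SOMATIC": ".", "FILTER": "PASS",
    "DP": "102", "AF": "0.416",
}

for rec in (tp53, calr):
    kept = passes_annotation_filter(rec, HYPOTHETICAL_PANEL) and passes_depth_af(rec)
    print(rec["SYMBOL"], "kept" if kept else "dropped")
```

Under these assumptions both records are kept, which agrees with the two variants reported after filtering.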
- Myelofibrosis: https://hpo.jax.org/app/browse/term/HP:0011974
- Abnormal mast cell morphology: https://hpo.jax.org/app/browse/term/HP:0100494
- Run the full VEP script (VEP annotation + filter_vep):
sh vep.sh WP017 Myelofibrosis.txt
- Run the VEP script on Google Colab (filter_vep only):
sh vep-gc.sh WP017 Myelofibrosis.txt
About sample WP017:
- Class: Myelofibrosis
- Information: JAK2-
- Total variants in the VCF: 7144
- Total variants after filtering: 2
| CHROM | POS | REF | ALT | Location | SYMBOL | Consequence | Feature | MANE_SELECT | BIOTYPE | HGVSc | HGVSp | EXON | INTRON | VARIANT_CLASS | SIFT | PolyPhen | gnomADg_AF | MAX_AF | IMPACT | CLIN_SIG | SOMATIC | Existing_variation | FILTER | TumorID | GT | DP | AD | AF | NormalID | NGT | NDP | NAD | NAF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| chr17 | 7669662 | T | G | chr17:7669662 | TP53 | missense_variant | NM_000546.6 | ENST00000269305.9 | protein_coding | NM_000546.6:c.1129A>C | NP_000537.3:p.Thr377Pro | 11/11 | . | SNV | tolerated(0.42) | . | 0.000053 | 0.000496 | MODERATE | uncertain_significance | 0&1 | rs774269719&COSV52716766 | base_qual;haplotype;normal_artifact;strand_bias | WP017 | 0\|1 | 119 | 101,18 | 0.112 | WP018 | 0\|0 | 60 | 55,5 | 0.049 |
| chr19 | 12943750 | AGCAGAGGCTTAAGGAGGAGGAAGAAGACAAGAAACGCAAAGAGGA... | A | chr19:12943751-12943802 | CALR | frameshift_variant | NM_004343.4 | ENST00000316448.10 | protein_coding | NM_004343.4:c.1099_1150del | NP_004334.1:p.Leu367ThrfsTer46 | 9/9 | . | deletion | . | . | 0.000020 | 0.000066 | HIGH | pathogenic | . | rs1555760738 | PASS | WP017 | 0/1 | 102 | 62,40 | 0.416 | WP018 | 0/0 | 50 | 50,0 | 0.022 |
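
Several columns in the table pack more than one value into a single cell: VEP joins multiple values with `&` (for example `Existing_variation` and `SOMATIC`), `FILTER` lists its flags separated by `;`, and `AD` holds comma-separated read counts. The helper names below are hypothetical and the snippet is only a minimal sketch of how such cells could be unpacked when post-processing the TSV, using the TP53 row as input.

```python
def split_values(value, sep):
    """Split a compound cell; '.' or an empty string means no value."""
    if value in ("", "."):
        return []
    return value.split(sep)


def parse_allele_depths(ad):
    """Turn an AD string such as '101,18' into (ref_reads, alt_reads)."""
    ref_reads, alt_reads = (int(x) for x in ad.split(","))
    return ref_reads, alt_reads


# Cells copied from the TP53 row of the table above.
tp53_row = {
    "Existing_variation": "rs774269719&COSV52716766",
    "SOMATIC": "0&1",
    "FILTER": "base_qual;haplotype;normal_artifact;strand_bias",
    "AD": "101,18",
    "DP": "119",
}

variant_ids = split_values(tp53_row["Existing_variation"], "&")
somatic_flags = split_values(tp53_row["SOMATIC"], "&")
filter_flags = split_values(tp53_row["FILTER"], ";")
ref_reads, alt_reads = parse_allele_depths(tp53_row["AD"])

print(variant_ids)
print(somatic_flags)
print(filter_flags)
print(ref_reads, alt_reads, ref_reads + alt_reads == int(tp53_row["DP"]))
```

For both rows in the table, the two AD counts add up to the reported DP (101 + 18 = 119 and 62 + 40 = 102).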