
PREFIX.pergeno.aa_mutations.csv

Xiaolong Cao edited this page Dec 21, 2020 · 5 revisions

This file contains amino acid change annotations for each protein.

The file is in CSV format, with a tab (\t) as the separator. Shown transposed, the table looks like this:

protein_id ENSP00000437362 ENSP00000446015 ENSP00000450505 ENSP00000451203 ENSP00000452431
protein_id_fasta ENSP00000437362.1 ENSP00000446015.1 ENSP00000450505.1 ENSP00000451203.1 ENSP00000452431.1
seqname 14 14 14 14 14
strand + + + + +
frameChange
stopGain
AA_stopGain
stopLoss
stopLoss_pos
n_variant_AA 1.0 2.0 2.0 1.0 1.0
n_deletion_AA
n_insertion_AA
variant_AA F70S(14-21888371-T-C) P50Q(14-21924450-C-A);Q76E(14-21924527-C-G) E77K(14-21979008-G-A);S78G(14-21979011-A-G) S103L(14-22086905-C-T) T25P(14-22124030-A-C)
insertion_AA
deletion_AA
len_ref_AA 113 116 113 121 109
len_alt_AA 113.0 116.0 113.0 121.0 109.0
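
Because the separator is a tab rather than a comma, a reader has to be told so explicitly. The following is a minimal sketch using Python's standard `csv` module. It assumes the file starts with a header row of column names and holds one protein per row (the table above is the transposed view); the helper name `read_aa_mutations` and the in-memory sample are illustrative only and cover a subset of the columns.

```python
import csv
import io


def read_aa_mutations(handle):
    """Read a tab-separated aa_mutations table into a list of row dicts.

    Assumes the first line is a header row with the column names.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [dict(row) for row in reader]


# Small in-memory sample built from two proteins of the example table
# (hypothetical layout: header row first, one protein per row).
sample = (
    "protein_id\tseqname\tstrand\tn_variant_AA\tvariant_AA\tlen_ref_AA\tlen_alt_AA\n"
    "ENSP00000437362\t14\t+\t1.0\tF70S(14-21888371-T-C)\t113\t113.0\n"
    "ENSP00000446015\t14\t+\t2.0\t"
    "P50Q(14-21924450-C-A);Q76E(14-21924527-C-G)\t116\t116.0\n"
)

rows = read_aa_mutations(io.StringIO(sample))
for row in rows:
    print(row["protein_id"], row["variant_AA"])
```

For a real file, the same function could be called on `open(path, newline="")`, where `path` points to a PREFIX.pergeno.aa_mutations.csv file.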

Columns:

  • protein_id: protein ID used in the perGeno analysis
  • protein_id_fasta: protein ID as stored in the FASTA file
  • seqname: chromosome name
  • strand: strand of the protein on the chromosome
  • frameChange: whether there is a frame-change mutation
  • stopGain: whether there is a stop-gain mutation
  • AA_stopGain: amino acid (AA) that is mutated to a stop codon
  • stopLoss: whether there is a stop-loss mutation
  • stopLoss_pos: position of the stop loss in the protein sequence
  • nonStandardStopCodon: for proteins with mutations, whether translation stops at a position that is not a stop codon
  • n_variant_AA: number of AA substitutions
  • n_deletion_AA: number of AA deletions
  • n_insertion_AA: number of AA insertions
  • variant_AA: substituted AAs
  • insertion_AA: inserted AAs
  • deletion_AA: deleted AAs
  • len_ref_AA: length of the provided (reference) protein
  • len_alt_AA: length of the changed (altered) protein

Note: some of these columns may be empty.
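
The variant_AA values and the counts can be unpacked with a few lines of Python. This sketch is based only on the example entries above and is not a format specification: it assumes each substitution looks like `F70S(14-21888371-T-C)`, read here as reference AA, protein position, alternate AA, followed by chromosome, genomic position, reference allele and alternate allele, with multiple entries joined by `;`. The function names and dictionary keys are hypothetical. Empty columns are returned as an empty list or `None`.

```python
import re

# Pattern inferred from the example entries, e.g. "F70S(14-21888371-T-C)".
# This is an assumption about the format, not a documented specification.
SUBSTITUTION = re.compile(
    r"^(?P<ref_aa>[A-Z*])(?P<aa_pos>\d+)(?P<alt_aa>[A-Z*])"
    r"\((?P<chrom>[^-]+)-(?P<pos>\d+)-(?P<ref>[^-]+)-(?P<alt>[^-]+)\)$"
)


def parse_variant_aa(field):
    """Split a variant_AA value on ';' and parse each substitution."""
    if not field:  # the column may be empty
        return []
    parsed = []
    for entry in field.split(";"):
        match = SUBSTITUTION.match(entry.strip())
        if match is None:
            raise ValueError(f"unrecognised variant_AA entry: {entry!r}")
        item = match.groupdict()
        item["aa_pos"] = int(item["aa_pos"])  # position in the protein
        item["pos"] = int(item["pos"])        # genomic position
        parsed.append(item)
    return parsed


def parse_count(field):
    """Convert a count written as '2.0' to int; None for an empty column."""
    return int(float(field)) if field else None


variants = parse_variant_aa("P50Q(14-21924450-C-A);Q76E(14-21924527-C-G)")
for v in variants:
    print(v["ref_aa"], v["aa_pos"], v["alt_aa"], v["chrom"], v["pos"])
```

Raising on an unrecognised entry is a deliberate choice in this sketch: if the real files contain forms not covered by the examples, the error shows up immediately rather than being dropped silently.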
