
CanSNPer2 source file format

Andreas Sjödin edited this page Mar 10, 2020 · 1 revision

Files needed to modify database

  1. Tree file (for updating the database)

The CanSNPer2 tree file must contain a header row and the two columns child and parent. The example below also includes a rank column.

header   row
child    B.6
parent   B.2
rank     species
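
The child/parent layout above can be read into a simple lookup table. The following is a minimal sketch, not CanSNPer2's own loader: it assumes the file is tab-separated (the page does not state the delimiter), and `read_tree` and `lineage` are hypothetical helper names.

```python
import csv
import io

# Example content built from the single row shown above.
# ASSUMPTION: columns are tab-separated; the page does not say.
TREE_TEXT = "child\tparent\trank\nB.6\tB.2\tspecies\n"


def read_tree(handle):
    """Return a dict mapping each child node to its parent node."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = {"child", "parent"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"tree file lacks columns: {sorted(missing)}")
    return {row["child"]: row["parent"] for row in reader}


def lineage(parents, node):
    """Walk from node towards the root and return the visited node names."""
    path = [node]
    seen = {node}
    while path[-1] in parents:
        parent = parents[path[-1]]
        if parent in seen:
            raise ValueError(f"cycle detected at {parent}")
        path.append(parent)
        seen.add(parent)
    return path


parents = read_tree(io.StringIO(TREE_TEXT))
print(lineage(parents, "B.6"))  # ['B.6', 'B.2']
```

For a real file, pass an open file handle to `read_tree` instead of the `io.StringIO` wrapper.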

  2. SNP annotation file

The CanSNPer2 SNP file contains the four columns that define a SNP, plus two annotation columns: the published paper (reference) and the genome_id specified in the genome file. Unlike CanSNPer1, the species column is not required, because different species are expected to be placed in different databases (or specified with the --organism parameter, which might be deprecated).

header          row
snp_id          B.2
strain          francisella
reference       Birdsell2014
genome          FSC200
position        <pos>
ancestral_base  A
derived_base    T
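
A quick sanity check of a SNP file before loading it might look like the sketch below. It is an illustration only and makes several assumptions that the page does not confirm: the file is tab-separated, the four SNP-defining columns are snp_id, position, ancestral_base and derived_base, and bases are limited to A, C, G and T. The position value `1` is a made-up placeholder standing in for `<pos>`, and `read_snps` is a hypothetical helper name.

```python
import csv
import io

# ASSUMPTION: these six columns are treated as required in this sketch.
REQUIRED = ["snp_id", "reference", "genome",
            "position", "ancestral_base", "derived_base"]
HEADER = ["snp_id", "strain", "reference", "genome",
          "position", "ancestral_base", "derived_base"]

# Example row from the table above; position "1" is a placeholder, not real data.
SNP_TEXT = "\t".join(HEADER) + "\n" + "\t".join(
    ["B.2", "francisella", "Birdsell2014", "FSC200", "1", "A", "T"]) + "\n"


def read_snps(handle):
    """Parse a tab-separated SNP file and validate each row."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"SNP file lacks columns: {missing}")
    snps = []
    # Data rows start on line 2 because line 1 holds the headers.
    for line_no, row in enumerate(reader, start=2):
        for col in ("ancestral_base", "derived_base"):
            if (row[col] or "").upper() not in {"A", "C", "G", "T"}:
                raise ValueError(f"line {line_no}: bad {col} {row[col]!r}")
        try:
            position = int(row["position"])
        except (TypeError, ValueError):
            raise ValueError(f"line {line_no}: position is not an integer")
        snps.append({**row, "position": position})
    return snps


snps = read_snps(io.StringIO(SNP_TEXT))
print(snps[0]["snp_id"], snps[0]["ancestral_base"], snps[0]["derived_base"])
```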

Files needed to create database

  1. Genome file

The CanSNPer2 genomes file contains information about the reference sequences. The file has 7 columns, but only the first column is necessary for running CanSNPer2. If the other columns are omitted, the automatic download script won't work and the correct reference sequences have to be supplied manually. Note that the first row must contain headers.

header            row
genome_id         OSU18
strain            OSU18
refseq_id         GCF_000014605.1
genbank_id        GCA_000014605.1
assembly_name     ASM1460v1
refseq_sequence   NC_008369.1
genbank_sequence  CP000437.1
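
Since only the first column is required, it can help to check which of the optional columns a genomes file actually provides before relying on the automatic download. The sketch below only inspects the header row. `check_genome_header` is a hypothetical helper, not part of CanSNPer2, and it assumes the column names are exactly those listed above.

```python
# Column names as listed in the table above; only the first is required.
GENOME_COLUMNS = [
    "genome_id", "strain", "refseq_id", "genbank_id",
    "assembly_name", "refseq_sequence", "genbank_sequence",
]


def check_genome_header(header):
    """Validate the header row and return the optional columns it lacks."""
    if not header or header[0] != GENOME_COLUMNS[0]:
        raise ValueError("first column of the header row must be genome_id")
    return [col for col in GENOME_COLUMNS[1:] if col not in header]


# A full header lacks nothing; a minimal header lacks all optional columns,
# in which case the reference sequences would have to be supplied manually.
print(check_genome_header(GENOME_COLUMNS))
print(check_genome_header(["genome_id"]))
```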

  2. Genome reference file

header         row
genome         OSU18
strain         OSU18
genbank_id     GCA_000014605.1
refseq_id      GCF_000014605.1
assembly_name  ASM1460v1
