CanSNPer2 source file format
- Tree file (for updating the database)
The CanSNPer2 tree file must contain two columns, child and parent. The example below also lists a rank column.
| header | row |
|---|---|
| child | B.6 |
| parent | B.2 |
| rank | species |
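As a rough illustration of how the child/parent pairs describe a tree, the sketch below reads such a file and walks from a node up to the root. It is not taken from CanSNPer2 itself: the tab delimiter, the presence of a header row, the node `B.1`, and the helper names `read_parents` and `lineage` are all assumptions made for the example.

```python
import csv
import io

# Hypothetical tree file content. The tab delimiter, the header row and
# the node B.1 are assumptions made for this example.
TREE_TEXT = (
    "child\tparent\trank\n"
    "B.2\tB.1\tspecies\n"
    "B.6\tB.2\tspecies\n"
)


def read_parents(handle):
    """Return a dict mapping each child node to its parent node."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["child"]: row["parent"] for row in reader}


def lineage(node, parents):
    """Follow child -> parent links up to the root, guarding against cycles."""
    path = [node]
    seen = {node}
    while path[-1] in parents:
        parent = parents[path[-1]]
        if parent in seen:
            raise ValueError(f"cycle detected at {parent}")
        path.append(parent)
        seen.add(parent)
    return path


parents = read_parents(io.StringIO(TREE_TEXT))
print(lineage("B.6", parents))
```

Walking the links this way is also a cheap sanity check before updating a database, since a misspelled parent shows up as a lineage that stops early.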
- SNP annotation file
The CanSNPer2 SNP file contains the four columns that define a SNP, plus two annotation columns: the published paper (reference) and the genome_id specified in the genome file. Unlike CanSNPer1, the species column is not required, because different species are expected to be placed in different databases (or specified with the --organism parameter, which might be deprecated).
| header | row |
|---|---|
| snp_id | B.2 |
| strain | francisella |
| reference | Birdsell2014 |
| genome | FSC200 |
| position | <pos> |
| ancestral_base | A |
| derived_base | T |
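A SNP file can be checked before import with a small script. The following is a minimal sketch, assuming a tab-delimited file with a header row that uses the column names from the table above. The position value `12345` is made up for the example, and `validate_snp_row` is a hypothetical helper, not part of CanSNPer2.

```python
import csv
import io

# Column names taken from the table above.
SNP_COLUMNS = [
    "snp_id", "strain", "reference", "genome",
    "position", "ancestral_base", "derived_base",
]
BASES = {"A", "C", "G", "T"}

# Hypothetical file content: the tab delimiter and the position value
# 12345 are assumptions made for this example.
SNP_TEXT = (
    "\t".join(SNP_COLUMNS) + "\n"
    "B.2\tfrancisella\tBirdsell2014\tFSC200\t12345\tA\tT\n"
)


def validate_snp_row(row):
    """Return a list of problems found in one SNP row (empty if none)."""
    problems = [f"missing {col}" for col in SNP_COLUMNS if not row.get(col)]
    position = row.get("position") or ""
    if position and not position.isdigit():
        problems.append(f"position is not a positive integer: {position}")
    for col in ("ancestral_base", "derived_base"):
        base = row.get(col) or ""
        if base and base.upper() not in BASES:
            problems.append(f"{col} is not one of A, C, G, T: {base}")
    ancestral = row.get("ancestral_base")
    if ancestral and ancestral == row.get("derived_base"):
        problems.append("ancestral_base and derived_base are identical")
    return problems


for row in csv.DictReader(io.StringIO(SNP_TEXT), delimiter="\t"):
    print(row["snp_id"], validate_snp_row(row) or "ok")
```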
- Genome file
The CanSNPer2 genomes file contains information about the reference sequences. The file has 7 columns, but only the first column is necessary for running CanSNPer2. If the other columns are left out, the automatic download script won't work and the correct reference sequences have to be supplied manually. Note that the first row must contain headers.
| header | row |
|---|---|
| genome_id | OSU18 |
| strain | OSU18 |
| refseq_id | GCF_000014605.1 |
| genbank_id | GCA_000014605.1 |
| assembly_name | ASM1460v1 |
| refseq_sequence | NC_008369.1 |
| genbank_sequence | CP000437.1 |
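To see which genomes would need their reference sequence supplied manually, the genomes file can be scanned for rows that fill in only some of the columns. This is a sketch under assumptions: the tab delimiter and the helper `read_genomes` are hypothetical, and the second row (`FSC200` with only a genome_id) is invented to show the incomplete case. Which columns the download script actually needs is not stated here, so the sketch simply reports every empty column.

```python
import csv
import io

# Column names taken from the table above.
GENOME_COLUMNS = [
    "genome_id", "strain", "refseq_id", "genbank_id",
    "assembly_name", "refseq_sequence", "genbank_sequence",
]

# Hypothetical file content; the tab delimiter is an assumption.
# The FSC200 row fills in only the first column.
GENOMES_TEXT = (
    "\t".join(GENOME_COLUMNS) + "\n"
    "OSU18\tOSU18\tGCF_000014605.1\tGCA_000014605.1\t"
    "ASM1460v1\tNC_008369.1\tCP000437.1\n"
    "FSC200\n"
)


def read_genomes(handle):
    """Return (genome_id, missing_columns) pairs for every row.

    Raises ValueError if the first row is not a header starting with
    genome_id, since the header row is mandatory.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    if not reader.fieldnames or reader.fieldnames[0] != "genome_id":
        raise ValueError("first row must be a header starting with genome_id")
    genomes = []
    for row in reader:
        missing = [col for col in GENOME_COLUMNS[1:] if not row.get(col)]
        genomes.append((row["genome_id"], missing))
    return genomes


for genome_id, missing in read_genomes(io.StringIO(GENOMES_TEXT)):
    if missing:
        print(f"{genome_id}: supply the reference manually (empty: {', '.join(missing)})")
    else:
        print(f"{genome_id}: all columns present")
```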
- Genome reference file
| header | row |
|---|---|
| genome | OSU18 |
| strain | OSU18 |
| genbank_id | GCA_000014605.1 |
| refseq_id | GCF_000014605.1 |
| assembly_name | ASM1460v1 |