5. FAQs and Common Errors

Thom Booth edited this page Nov 11, 2025 · 5 revisions

Read this section if you are running into problems using getphylo. If you still can't fix your problem, please submit a help request.

FAQs

"What is the difference between --ignore-bad-annotations and --ignore-bad-records?"

Annotations in data from public databases are often inconsistent or incomplete, and these flags help to deal with such cases. A common example is pseudogenes, which are often annotated as CDS but have no translation. It is important that users are aware of these quirks in their data so, by default, getphylo will stop with an error when it encounters these annotations. To get around this, we added the -ia (--ignore-bad-annotations) and -ir (--ignore-bad-records) parameters. First, -ia ignores annotations that would otherwise trigger an error. This allows you to skip over these genes but still include your genome. Second, -ir ignores the whole genome if it contains 'bad' annotations. Please note that -ir requires that you also set -ia. IMPORTANT: These flags are marked as 'NOT RECOMMENDED' because it is important that you understand exactly what is being missed when you use them. Please use them with caution and take extra steps to confirm the validity of your final results!
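
Before reaching for these flags, it can help to see how many CDS features in a file actually lack a translation. The following is a rough sketch, not part of getphylo: it builds a tiny, made-up GenBank fragment and compares the number of CDS features with the number of /translation qualifiers. The file contents and variable names are hypothetical, and the counts are only a heuristic (they assume standard GenBank feature-table indentation).

```shell
# Hypothetical GenBank fragment: two CDS features, one of them a pseudogene
# with no /translation qualifier.
demo=$(mktemp)
cat > "$demo" <<'EOF'
     CDS             1..90
                     /locus_tag="ABC_0001"
                     /translation="MKT"
     CDS             100..190
                     /locus_tag="ABC_0002"
                     /pseudo
EOF

# Count CDS feature lines (five leading spaces, then the feature key).
cds=$(grep -c '^     CDS ' "$demo")
# Count translation qualifiers.
translations=$(grep -c '/translation=' "$demo")

echo "CDS features: $cds, translations: $translations"
```

If the two numbers differ, some CDS features have no translation, and these are the annotations that -ia would skip.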

Common Errors

1. "I keep getting an unrecognized argument error!"

This error is common if you forget to wrap your input string in quotation marks. The shell (e.g. bash) expands the wildcard character (*) into separate arguments before getphylo receives it. This means that if you write getphylo -g *.gb, the terminal will interpret this as getphylo -g file1.gb file2.gb file3.gb. As file2.gb and file3.gb are not arguments for getphylo, this results in an unrecognized argument error. To avoid it, quote the search string: getphylo -g "*.gb". Also note that Macs use an older version of bash, so you need to ensure that you use double quotation marks (") to avoid this error.
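
The effect of quoting can be checked without running getphylo at all. This is a minimal sketch using a hypothetical helper, count_args, that just reports how many arguments it received; the scratch directory and file names are made up for illustration.

```shell
# Hypothetical helper: print the number of arguments it was given.
count_args() { echo "$#"; }

# Scratch directory containing three empty, made-up GenBank files.
demo_dir=$(mktemp -d)
touch "$demo_dir/file1.gb" "$demo_dir/file2.gb" "$demo_dir/file3.gb"

# Unquoted: the shell expands *.gb into three file names before the command runs.
unquoted=$(cd "$demo_dir" && count_args -g *.gb)
# Quoted: the pattern is passed through as a single string.
quoted=$(cd "$demo_dir" && count_args -g "*.gb")

echo "unquoted: $unquoted arguments"   # -g plus three file names
echo "quoted:   $quoted arguments"     # -g plus one search string
```

The unquoted call receives four arguments and the quoted call receives two, which is why only the quoted form is read as a single search string.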

2. "Getphylo is telling me I don't have any files - but I have files!"

This error occurs when getphylo's search string does not find any files. By default, getphylo's search string is '*.gbk'. This means it will search the current directory for all files ending in '.gbk'. You can see what getphylo sees by using ls (e.g. ls *.gbk).
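
As a small sketch of the same check, the hypothetical helper below (count_matches, not part of getphylo) counts how many files in the current directory match a given search string. It assumes bash or a POSIX sh, and it only approximates how getphylo itself resolves the pattern.

```shell
# Hypothetical helper: count files in the current directory matching a pattern.
count_matches() {
    # Left unquoted on purpose so the shell expands the pattern here.
    set -- $1
    # If nothing matched, the pattern itself is left as the only argument.
    if [ -e "$1" ]; then echo "$#"; else echo 0; fi
}

# Scratch directory whose files end in .gb rather than the default .gbk.
demo_dir=$(mktemp -d)
touch "$demo_dir/genome1.gb" "$demo_dir/genome2.gb"

gbk_count=$(cd "$demo_dir" && count_matches '*.gbk')
gb_count=$(cd "$demo_dir" && count_matches '*.gb')

echo "*.gbk matches: $gbk_count"
echo "*.gb matches:  $gb_count"
```

Here the default pattern finds nothing, while '*.gb' finds both files, so the run would need -g "*.gb".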

3. "Getphylo keeps telling me that files were not parsed correctly!"

Inconsistency in annotation is the biggest cause of errors in getphylo. If you want more details on what in particular went wrong, run getphylo -l INFO.

The most common cause of this error is that the CDS annotation getphylo is searching for is not present. For example: WARNING : Missing locus_tag in BGC0000001. In this case, you need to specify an annotation that is present in all files using -t. In a pinch, you can usually switch between locus_tag and protein_id safely (e.g. sed -i 's/locus_tag/protein_id/g' my_file.gbk). However, this may still cause some problems. Ideally, you should make sure that all your data is uniformly annotated. Unfortunately, we cannot do this for you!
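
To find out which files lack a given annotation before choosing a value for -t, one option is grep -L, which lists files containing no match. The sketch below uses two made-up GenBank fragments; the file names and contents are hypothetical. Note that it only shows whether a qualifier appears anywhere in a file, not whether every CDS carries it.

```shell
# Scratch directory with two hypothetical GenBank fragments.
demo_dir=$(mktemp -d)
printf '     CDS             1..90\n                     /locus_tag="ABC_0001"\n' \
    > "$demo_dir/tagged.gbk"
printf '     CDS             1..90\n                     /protein_id="XYZ12345.1"\n' \
    > "$demo_dir/untagged.gbk"

# grep -L prints the names of files that contain no match for the pattern.
missing=$(cd "$demo_dir" && grep -L '/locus_tag=' *.gbk)

echo "Files without locus_tag: $missing"
```

Any file listed here would need a different annotation type, or re-annotation, before it can be parsed with locus_tag.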

4. "MUSCLE isn't working!"

Older versions of getphylo do not support MUSCLE v5. If you try to run a version of getphylo earlier than v1.0.1 with MUSCLE v5, it will produce a RuntimeError. We have now added MUSCLE v5 support, so please update getphylo to the latest version and try again. If the error persists, please contact us.

5. "Getphylo says my output directory already exists!"

For safety, getphylo will not overwrite any existing directories. This means that if an 'output' directory already exists, by default getphylo will not run. To avoid this error, you should either:

a. delete the output directory before starting the run (rm -r output; getphylo), or

b. point getphylo to a new output directory (getphylo -o new_output_dir).
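
If you would rather keep earlier results, a third possibility is to generate an unused directory name and pass it to -o. The helper below, next_free_dir, is a hypothetical sketch and not part of getphylo; it appends a counter to a base name until it finds a path that does not exist. The final line only prints the command rather than running it.

```shell
# Hypothetical helper: print the first name based on $1 that does not exist yet.
next_free_dir() {
    base=$1
    candidate=$base
    n=1
    while [ -e "$candidate" ]; do
        candidate="${base}_${n}"
        n=$((n + 1))
    done
    echo "$candidate"
}

# Scratch directory that already contains an 'output' folder from a previous run.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/output"

out_dir=$(cd "$demo_dir" && next_free_dir output)

# Print the command that would be run, without invoking getphylo here.
echo "getphylo -o $out_dir"
```

Because 'output' already exists in the scratch directory, the helper settles on output_1, leaving the earlier results untouched.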