reform:
Convert basecaller's move table to ss string format.realign:
Realign signal to reference using cigar string and the move table.plot:
Plot read/reference - signal alignments.plot_pileup:
Plot a reference - signal alignment pileup.plot_tracks:
Plot multiple reference - signal alignment pileup tracks.calculate_offsets:
A utility program to calculate the most significant base index given a kmer model or a read - signal alignment.metric:
A utility program to calculate some statistics of the signal alignment.
squigualiser reform [OPTIONS] -c --bam basecall_moves.sam -o reform.paf
squigualiser reform [OPTIONS] --bam basecall_moves.sam -o reform.tsv
Convert basecaller's move table to ss string format. For more information refer reform.
--bam FILE:
The basecaller's move table output in sam or bam format. Guppy outputs multiple sam/bam files. These files have to be merged to create a single sam/bam file.-o, --output FILE:
Specifies name/location of the output file. A valid relative or absolute path can be provided. Data will be overwritten. The extension should be either.tsvor.paf.-k, --kmer_length INT:
(optional) kmer length to consider for the calculation.-m, --sig_move_offset INT:
(optional) signal move offset to consider for the calculation.-c:
(optional) write move table in paf format. [default is tsv]--rna:
(optional) Specify if RNA reads are used. By default, only DNA is accepted [default value: false]. The tool will not error out if the data is RNA but if the user failed to specify it. It will still generate an incorrect output.--profile STR:
(optional) This is used to determine -k and -m values using preset values. The available profiles can be listed using--list_profile--list_profile:
(optional) Print the available profiles and exit.
squigualiser realign [OPTIONS] -c --bam minimap2.sam --paf reform_output.paf -o realign_output.paf
squigualiser realign [OPTIONS] -c --bam minimap2.bam --paf reform_output.paf -o realign_output.paf
squigualiser realign [OPTIONS] --bam minimap2.bam --paf reform_output.paf -o realign_output.sam
squigualiser realign [OPTIONS] --bam minimap2.bam --paf reform_output.paf -o realign_output.bam
Realign signal to reference using cigar string and the move table. For more information refer realign.
--paf FILE:
The read-signal alignment file generated usingreformwith-coption.--bam FILE:
The read-reference alignment sam/bam file containing cigar strings (minimap2 sam output).-o, --output FILE:
Specifies name/location of the output file. A valid relative or absolute path can be provided. Data will be overwritten. The extension should be.sam,.bam, or.paf.-c:
(optional) write move table in paf format. [default is sam or bam depending on the extension provided]--rna:
(optional) Specify if RNA reads are used. By default, only DNA is accepted [default value: false].
squigualiser plot [OPTIONS] -f reads.fasta -s reads.slow5 -a reform.paf -o output_dir
squigualiser plot [OPTIONS] -f reads.fastq -s reads.slow5 -a reform.paf -o output_dir
squigualiser plot [OPTIONS] -f reads.fasta -s reads.slow5 -a realign.sam -o output_dir
squigualiser plot [OPTIONS] -f reads.fasta -s reads.slow5 -a realign.paf.gz -o output_dir
Plot read/reference - signal alignments.
-f, --file FILE:
The sequence file infasta/fa/fastq/fq/fq.gzformat.-r, --read_id STR(optional) Plot only the read with the read id specified.-s, --slow5 FILEPath to slow5 file containing raw signals.-a, --alignment FILEFor read-signal alignment plots provide the path to.paffile generated usingreform. For reference-signal alignment plots provide the.sam/.bamor.paf.gzfile (generated usingrealginorf5c eventalign).--region STR[start-end] 1-based closed interval region to plot. For read-signal alignment eg:100-200. For reference-signal alignment eg:chr1:6811428-6811467orchr1:6,811,428-6,811,467.-o, --output_dir DIR:
Specifies name/location of the output directory. A valid relative or absolute path can be provided. Data will be overwritten but the directory will not be recreated.--tag_name STR(optional) A tag name to easily identify the plot.-k, --kmer_length INT(optional) kmer length to consider for the base shift calculation and plotting.--plot_reverse(optional) Plot only the reverse mapped reads [default value: false].--rna(optional) Specify if RNA reads are used. By default, only DNA is accepted [default value: false].--base_limit INT(optional) Maximum number of bases to plot.--sig_ref(optional) Plot signal to reference mapping. Can be mostly discarded. Act as a flag to avoid ploting reference-signal alignment using a.pafalignment file [default value: false].--fixed_width(optional) Plot with fixed base width. By default, the base widht is stretched to have equal distance between the sample points. By activating this flag, sample points of a base will be squeezed to a fixed width. [default value: false]--sig_scale(optional) Plot the scaled signal. By default, the signal is not scaled but converted to pA values. Supported scalings are: [medmad, znorm, scaledpA].medmadis median absolute deviation scaling.
znormis zscore normalization scaling.scaledpAusesscandshtags to scale the raw signal to the pore model. The implementation of each method can be found atsrc/plot_utils.py/scale_signal()--no_pa(optional) Do not convert the signal to pA levels. By default, the raw signal is converted to pA levels [default value: false].--loose_bound(optional) Also plot alignments not completely within the specified region but at least part is [default value: false].--point_size INT(optional) Radius of the signal point drawn in the plot [default value: 0.5].--base_width INT(optional) The base width when plotting with fixed base width [default value: 10].--base_shift INT(optional) The number of bases to shift to align fist signal move [default value: 0]. More information on this can be found at here--auto(optional) Calculate and automatically set the base shift. [default value: false]--profile STR:
(optional) This is used to determine base shift using preset values. The available profiles can be listed using--list_profile. The precedence is in the order of--base_shift < --auto < --profile.--list_profile:
(optional) Print the available profiles and exit.--plot_limit INT(optional) the number of plots to be generated [default value: 1000].--sig_plot_limit INT(optional) The maximum number of signal samples to draw on a plot [default value: 20000].--bed FILE(optional) The bed file with annotations.--no_colours(optional) hide base colours [default value: false].--no_samples(optional) hide sample points [default value: false].--save_svg(optional) save as svg. tweak --region and --xrange to capture the necessary part of the plot [default value: false].--xrange(optional) initial x range [default value: 350].--remove_signal_outliers(optional) remove signal outliers that are outside the raw value range [0, 2000].
squigualiser plot_pileup [OPTIONS] -f genome.fasta -s reads.slow5 -a realign.sam -o output_dir --region chr1:20000-20400
squigualiser plot_pileup [OPTIONS] -f genome.fasta -s reads.slow5 -a realign.paf.gz -o output_dir --region chr1:20000-20400
Plot reference - signal alignment pileups.
-f, --file FILE:
The sequence file infasta/fa/fastq/fq/fq.gzformat.-s, --slow5 FILEPath to slow5 file containing raw signals.-a, --alignment FILEProvide the.sam/.bamor.paf.gzfile (generated usingrealginorf5c eventalign).--region STR[start-end] 1-based closed interval region to plot. For read-signal alignment eg:100-200. For reference-signal alignment eg:chr1:6811428-6811467orchr1:6,811,428-6,811,467.-o, --output_dir DIR:
Specifies name/location of the output directory. A valid relative or absolute path can be provided. Data will be overwritten but the directory will not be recreated.-r, --read_id STR(optional) Plot only the read with the read id specified.-l, --read_list STR(optional) Path to a file containing a list of read_ids to plot only the reads listed. The file can have two tab separated columns (readid and colour)--tag_name STR(optional) A tag name to easily identify the plot-k, --kmer_length INT(optional) kmer length to consider for the base shift calculation and plotting.--plot_reverse(optional) Plot only the reverse mapped reads.--rna(optional) Specify if RNA reads are used. By default, only DNA is accepted.--base_limit INT(optional) Maximum number of bases to plot.--sig_scale(optional) Plot the scaled signal. By default, the signal is not scaled but converted to pA values. Supported scalings are: [medmad, znorm, scaledpA].medmadis median absolute deviation scaling.
znormis zscore normalization scaling.scaledpAusesscandshtags to scale the raw signal to the pore model. The implementation of each method can be found atsrc/plot_utils.py/scale_signal()--no_pa(optional) Do not convert the signal to pA levels. By default, the raw signal is converted to pA levels [default value: false].--point_size INT(optional) Radius of the signal point drawn in the plot [default value: 0.5].--base_width INT(optional) The base width when plotting with fixed base width [default value: 10].--base_shift INT(optional) The number of bases to shift to align fist signal move [default value: 0]. More information on this can be found at here--auto(optional) Calculate and automatically set the base shift. [default value: false]--profile STR:
(optional) This is used to determine base shift using preset values. The available profiles can be listed using--list_profile. The precedence is in the order of--base_shift < --auto < --profile.--list_profile:
(optional) Print the available profiles and exit.--plot_limit INT(optional) the number of plots to be generated [default value: 1000].--sig_plot_limit INT(optional) The maximum number of signal samples to draw on a plot [default value: 20000].--bed FILE(optional) The bed file with annotations.--plot_num_samples(optional) Annotate the number of samples for each move. By default, it is not annotated. Annotation can make the plots less responsive.--overlap_bottom(optional) Plot the overlap at the bottom. By default, it is plotted at the top [default value: false].--no_overlap(optional) Do not plot the overlap. By default, it is plotted at the top [default value: false].--overlap_only(optional) Plot only the overlap [default value: false].--cprofile(optional) Create a log file using python cprofile profiler [default value: false].--return_plot(optional) Return the plot object without saving to output. This is used inplot_trackstool [default value: false]. Cannot be used in conjunction with--output_dir.--no_colours(optional) hide base colours [default value: false].--save_svg(optional) save as svg. tweak --region and --xrange to capture the necessary part of the plot [default value: false].--xrange(optional) initial x range [default value: 350].--remove_signal_outliers(optional) remove signal outliers that are outside the raw value range [0, 2000].
squigualiser plot_tracks [OPTIONS] -f commands.txt -s reads.slow5 -a reform.paf -o output_dir
Plot multiple reference - signal alignment pileup tracks.
-f, --file FILE:
The file that contains thepileup_plotcommands.-o, --output_dir DIR:
Specifies name/location of the output directory. A valid relative or absolute path can be provided. Data will be overwritten but the directory will not be recreated.--tag_name STR(optional) A tag name to easily identify the plot--shared_x(optional) Share x-axis so that all the plots move together [default value: false].--auto_height(optional) Adjust track height automatically using the number of plots available in each track [default value: false].
squigualiser calculate_offsets [OPTIONS] -f reads.fasta -s reads.slow5 -a reform.paf -o output_dir
A utility program to calculate the most significant base index given a kmer model or a read - signal alignment.
-k, --kmer_length INT:
(optional) kmer length to consider for the calculation.--paf FILE:
The read-signal alignment file generated usingreformwith-coption.-f, --file FILE:
The sequence file infasta/fa/fastq/fq/fq.gzformat.-s, --slow5 FILEPath to slow5 file containing raw signals.--model(optional) Path to the kmer model file if--use_modelis true.--use_model(optional) Calculate offset using the model file [default value: false]. By default, calculation is done for the reads using move table annotation.--read_limit(optional) Limit the number of reads considered for the calculation [default value: 1000].-r, --read_id STR(optional)Plot only the read with the read id specified.-o, --output FILE:
Specifies name/location of the output file. A valid relative or absolute path can be provided. Data will be overwritten. The extension should be.pdf. Output file can be used in--use_modelmode or when a read id is specified (--read_id).--tag_name STR(optional) A tag name to easily identify the plot--rna:
(optional) Specify if RNA reads are used. By default, only DNA is accepted [default value: false]. The tool will not error out if the data is RNA but if the user failed to specify it. It will still generate an incorrect output.
squigualiser metric [OPTIONS] -f reads.fasta -s reads.slow5 -a reform.paf -o output.tsv
squigualiser metric [OPTIONS] -f reads.fastq -s reads.slow5 -a reform.paf -o output.tsv
squigualiser metric [OPTIONS] -f reads.fasta -s reads.slow5 -a realign.sam -o output.tsv
squigualiser metric [OPTIONS] -f reads.fasta -s reads.slow5 -a realign.paf.gz -o output.tsv
Parse the ss string to calculate basic statistics of the read/reference - signal alignments.
The arguments are almost the same as for plot and plot_pileup tools.
Instead of generating figures metric will generate statistics after parsing the ss string.
These statistics are written to the output file.
-f, --file FILE:
The sequence file infasta/fa/fastq/fq/fq.gzformat.-r, --read_id STR(optional) Plot only the read with the read id specified.-s, --slow5 FILEPath to slow5 file containing raw signals.-a, --alignment FILEFor read-signal alignment plots provide the path to.paffile generated usingreform. For reference-signal alignment plots provide the.sam/.bamor.paf.gzfile (generated usingrealginorf5c eventalign).--region STR[start-end] 1-based closed interval region to plot. For read-signal alignment eg:100-200. For reference-signal alignment eg:chr1:6811428-6811467orchr1:6,811,428-6,811,467.-o, --output FILE:
Specifies name/location of the output file. A valid relative or absolute path can be provided. Data will be overwritten. These statistics are written to the output file.--tag_name STR(optional) A tag name to easily identify the plot.--plot_reverse(optional) Plot only the reverse mapped reads [default value: false].--rna(optional) Specify if RNA reads are used. By default, only DNA is accepted [default value: false].--sig_ref(optional) Plot signal to reference mapping. Can be mostly discarded. Act as a flag to avoid ploting reference-signal alignment using a.pafalignment file [default value: false].--sig_scale(optional) Plot the scaled signal. By default, the signal is not scaled but converted to pA values. Supported scalings are: [medmad, znorm, scaledpA].medmadis median absolute deviation scaling.
znormis zscore normalization scaling.scaledpAusesscandshtags to scale the raw signal to the pore model. The implementation of each method can be found atsrc/plot_utils.py/scale_signal()--no_pa(optional) Do not convert the signal to pA levels. By default, the raw signal is converted to pA levels [default value: false].--loose_bound(optional) Also plot alignments not completely within the specified region but at least part is [default value: false].--base_shift INT(optional) The number of bases to shift to align fist signal move [default value: 0]. More information on this can be found at here--profile STR:
(optional) This is used to determine base shift using preset values. The available profiles can be listed using--list_profile--list_profile:
(optional) Print the available profiles and exit.--plot_limit INT(optional) the number of plots to be generated [default value: 1000].--sig_plot_limit INT(optional) The maximum number of signal samples to draw on a plot [default value: 20000].--remove_signal_outliers(optional) remove signal outliers that are outside the raw value range [0, 2000].
squigualiser plot_signal [OPTIONS] -r reads_id_ -s reads.slow5 -o output_dir
Plot signal for a read_id.
-r, --read_id STRPlot the read id specified.-s, --slow5 FILEPath to slow5 file containing raw signals.--region STR[start-end] 1-based closed interval region to plot. eg:100-200.-o, --output_dir DIR:
Specifies name/location of the output directory. A valid relative or absolute path can be provided. Data will be overwritten but the directory will not be recreated.--tag_name STR(optional) A tag name to easily identify the plot.--sig_scale(optional) Plot the scaled signal. By default, the signal is not scaled but converted to pA values. Supported scalings are: [medmad, znorm, scaledpA].medmadis median absolute deviation scaling.
znormis zscore normalization scaling. The implementation of each method can be found atsrc/plot_utils.py/scale_signal()--reverse_signal(optional) reverse the signal [default value: false].--no_pa(optional) Do not convert the signal to pA levels. By default, the raw signal is converted to pA levels [default value: false].--remove_signal_outliers(optional) remove signal outliers that are outside the raw value range [0, 2000].--point_size INT(optional) Radius of the signal point drawn in the plot [default value: 0.5].--sig_plot_limit INT(optional) The maximum number of signal samples to draw on a plot [default value: 20000].--no_samples(optional) hide sample points [default value: false].--save_svg(optional) save as svg. tweak --region and --xrange to capture the necessary part of the plot [default value: false].--xrange(optional) initial x range [default value: 350].