@@ -123,6 +123,8 @@ same loci, the phenotype will instead be predicted as 'O1ab'.
123123This logic is applied during the :ref:`phenotype prediction <Phenotype-prediction>` step of typing and is reported in
124124the `Type` column of the Kaptive tabular output.
125125
126+ .. _Distributed-databases:
127+
126128Databases distributed with Kaptive
127129====================================
128130
@@ -331,51 +333,29 @@ Database Keywords
331333
332334Extract
333335====================================
334- Kaptive 3.0.0 and above include a new command-line option ``--extract`` that allows you to extract features
336+ Kaptive 3.0.0 and above include a new command-line mode, ``extract``, that allows you to extract features
335337from a Kaptive database in the following formats:
336338
337- * **loci**: Locus nucleotide sequences in fasta (fna) format.
338- * **genes**: Gene nucleotide sequences in fasta (fna) format.
339- * **proteins**: Protein sequences in fasta (faa) format.
340- * **genbank**: Genbank format (same as input but optionally filtered).
341-
342- Simple usage is as follows::
343-
344- kaptive extract <db> <format> [options]
345-
346- For example, if I wanted to extract the gene nucleotide sequences from the *Klebsiella pneumoniae* K locus primary
347- reference database in fasta format, I would run::
348-
349- kaptive extract kp_k loci > k_loci.fna
350-
351- If I wanted to extract all protein sequences from KL1 and KL2, I would run::
339+ * **fna**: Locus nucleotide sequences in fasta format.
340+ * **ffn**: Gene nucleotide sequences in fasta format.
341+ * **faa**: Protein sequences in fasta format.
352342
353- kaptive extract kp_k proteins --filter "^KL(1|2)$" > KL1_KL2_proteins.faa
343+ Usage
344+ ----------
345+ General usage is as follows::
354346
355- If I wanted to do the same but output each locus to a separate file, I would run::
347+ kaptive extract <db> [formats] [options]
356348
357- kaptive extract kp_k proteins --filter "^KL(1|2)$" -d KL1_KL2_proteins
358-
359- Which would create two files: ``KL1.faa`` and ``KL2.faa``.
360-
361- Inputs::
362-
363- db path/keyword Kaptive database path or keyword
364- format Format to extract database
365- - loci: Loci (fasta nucleotide)
366- - genes: Genes (fasta nucleotide)
367- - proteins: Proteins (fasta amino acid)
368- - genbank: Genbank format
369-
370- .. note::
371- Combine with ``--filter`` to select loci
349+ Formats::
372350
373- Output options::
351+ Note, text outputs accept '-' for stdout
374352
375- -o , --out Output file to write/append loci to (default: stdout)
376- -d , --outdir Output directory for converted results
377- - Note: This forces the output to be written to files (instead of stdout)
378- and one file will be written per locus.
353+ --fna [] Convert to locus nucleotide sequences in fasta format
354+ Accepts a single file or a directory (default: cwd)
355+ --ffn [] Convert to locus gene nucleotide sequences in fasta format
356+ Accepts a single file or a directory (default: cwd)
357+ --faa [] Convert to locus gene protein sequences in fasta format
358+ Accepts a single file or a directory (default: cwd)
379359
380360.. _Database-options:
381361
@@ -394,3 +374,27 @@ Other options::
394374 -V, --verbose Print debug messages to stderr
395375 -v , --version Show version number and exit
396376 -h , --help Show this help message and exit
377+
378+ For example, to extract the locus nucleotide sequences from the *Klebsiella pneumoniae* K locus primary
379+ reference database in fasta format, run::
380+
381+ kaptive extract kp_k --fna k_loci.fna
382+
383+ To extract all protein sequences from KL1 and KL2, run either of the following::
384+
385+ kaptive extract kp_k --filter "^KL(1|2)$" --faa KL1_KL2_proteins.faa
386+ kaptive extract kp_k --filter "^KL(1|2)$" --faa - > KL1_KL2_proteins.faa
387+
388+ To do the same but output each locus to a separate file, run either::
389+
390+ kaptive extract kp_k --filter "^KL(1|2)$" --faa
391+ kaptive extract kp_k --filter "^KL(1|2)$" --faa protein_files/
392+
393+ Each command creates two files, ``KL1.faa`` and ``KL2.faa``, in the current directory or in ``protein_files/`` respectively.
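
The anchors in the ``--filter`` pattern used above matter. The following is a minimal Python sketch of why, assuming the filter is applied as a regular expression against locus names (an assumption for illustration, not a statement about Kaptive's internals). The locus names in the list are hypothetical examples.

```python
import re

# Hypothetical locus names, for illustration only.
locus_names = ["KL1", "KL2", "KL10", "KL21", "KL102"]

# Anchored pattern, as used in the examples: matches the whole name only.
anchored = re.compile(r"^KL(1|2)$")
selected = [name for name in locus_names if anchored.search(name)]

# Unanchored pattern: also matches any name that merely contains KL1 or KL2.
unanchored = re.compile(r"KL(1|2)")
loose = [name for name in locus_names if unanchored.search(name)]

print(selected)  # only the exact names KL1 and KL2
print(loose)     # every name in the list, since each contains KL1 or KL2
```

Without ``^`` and ``$``, a pattern intended for two loci could select many more.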
394+
395+ kaptive assembly kpsc_k assembly.fasta -j kaptive_results.json
396+
397+ .. warning::
398+ It is possible to write **all** text formats (FNA, FAA and FFN) to the same file (including stdout); however,
399+ this is not recommended for downstream analysis.
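
To illustrate why a mixed file is awkward downstream, here is a minimal Python sketch that separates nucleotide from protein records in FASTA text using a simple alphabet heuristic. The record names and sequences are made up, the functions ``read_fasta`` and ``looks_like_nucleotide`` are hypothetical helpers and not part of Kaptive, and the heuristic would misclassify a protein composed only of A, C, G, T, U or N.

```python
NUCLEOTIDE_ALPHABET = set("ACGTUNacgtun")


def read_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def looks_like_nucleotide(sequence):
    """Heuristic: True if every character is a nucleotide code."""
    return bool(sequence) and set(sequence) <= NUCLEOTIDE_ALPHABET


# Made-up mixed content: one nucleotide record and one protein record.
mixed = ">rec1\nATGCGT\nAAC\n>rec2\nMKLVFS\n"
kinds = {
    header: "nucleotide" if looks_like_nucleotide(seq) else "protein"
    for header, seq in read_fasta(mixed)
}
print(kinds)
```

Writing each format to its own file avoids needing this kind of guesswork.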
400+