bactopia_sketcher

Tags: taxonomy classification minhash sketch mash sourmash refseq gtdb sample-scope

Create genomic sketches and perform rapid taxonomic classification.

This subworkflow generates MinHash sketches from assembled genomes using Mash and Sourmash, then compares them against reference databases to assign a taxonomic classification and find closely related genomes. Mash screens each sample against RefSeq, while Sourmash performs LCA classification against GTDB.
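
To make the idea of a MinHash sketch concrete, here is a minimal Python illustration. It is a sketch under stated assumptions, not the algorithm as implemented by Mash or Sourmash: the function names, the SHA-1 based hash, and the default sketch size of 1000 are all hypothetical choices made for readability, and the real tools differ in how they hash and handle k-mers. It keeps the smallest k-mer hashes of each sequence and estimates the Jaccard similarity of two sequences from those small sets alone.

```python
import hashlib


def kmers(seq, k=21):
    """Yield every overlapping k-mer of a sequence (uppercased)."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (SHA-1 is an illustrative choice only)."""
    return int.from_bytes(hashlib.sha1(kmer.encode()).digest()[:8], "big")


def sketch(seq, k=21, size=1000):
    """Bottom-s MinHash sketch: the `size` smallest distinct k-mer hashes, sorted."""
    return sorted({hash_kmer(km) for km in kmers(seq, k)})[:size]


def jaccard_estimate(a, b, size=1000):
    """Estimate Jaccard similarity from two bottom-s sketches.

    Takes the `size` smallest hashes of the combined sketches and counts
    how many of them occur in both inputs.
    """
    merged = sorted(set(a) | set(b))[:size]
    if not merged:
        return 0.0
    shared = set(a) & set(b)
    return sum(1 for h in merged if h in shared) / len(merged)
```

Calling `jaccard_estimate(sketch(x), sketch(y))` on two contig strings returns a value between 0 and 1; identical inputs score 1.0, and sequences that share no k-mers score 0.0.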

Take

assembly: Channel<Record>

Field     Description
meta      Groovy Record containing sample information
assembly  Assembled contigs in FASTA format

mash_db: Path
sourmash_db: Path

Name         Type  Description
mash_db      Path  Path to the Mash RefSeq database for taxonomic classification
sourmash_db  Path  Path to the Sourmash GTDB LCA database for taxonomic classification

Emit

Published

The sample_outputs and run_outputs emissions are aggregates of output files that will be published in the entry workflow.

sample_outputs

Output    Description
sig       Sourmash signature file
msh       Mash sketch files for k=21 and k=31
mash      Mash Screen classification report against RefSeq
sourmash  Sourmash LCA classification report against GTDB

run_outputs

No run-scope outputs.

Module Composition

This subworkflow calls the following modules:

  • bactopia_sketcher - Create genomic sketches and perform rapid taxonomic classification.

Used By

This subworkflow is used by the following workflows:

  • bactopia - Comprehensive bacterial analysis pipeline for complete genomic characterization.
  • staphopia - Comprehensive analysis pipeline for Staphylococcus aureus isolates.

Citations

If you use this in your analysis, please cite the following.
