merlindist

Tags: species identification mash distance classification taxonomy sample-scope

Identify species from assembly and read data using Mash distances.

This subworkflow performs rapid species identification by calculating Mash distances against a reference database. It is a core component of the MERLIN (MinmER assisted species-specific bactopia tool seLectIoN) pipeline and determines which species-specific typing tools to run based on the detected organism. The workflow outputs channels filtered by detected genera for downstream species-specific analysis.
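
As a rough illustration of the idea, the sketch below parses the five-column, tab-separated output of `mash dist` (reference ID, query ID, distance, p-value, shared hashes) and keeps the genera whose references fall within a distance cutoff. The cutoff value, the function name `detect_genera`, and the assumption that each reference ID begins with the genus name are all hypothetical; this page does not state how the subworkflow itself decides.

```python
import csv
import io

# Hypothetical cutoff; the threshold used by the real workflow is not stated here.
MAX_DISTANCE = 0.1


def detect_genera(mash_dist_tsv: str, max_distance: float = MAX_DISTANCE) -> list[str]:
    """Return sorted, lower-cased genera whose references are within max_distance.

    Expects the tab-separated columns written by `mash dist`:
    reference ID, query ID, distance, p-value, shared hashes.
    Assumes (for illustration only) that each reference ID starts with the
    genus name followed by an underscore, e.g. "Escherichia_coli_K12".
    """
    genera = set()
    for row in csv.reader(io.StringIO(mash_dist_tsv), delimiter="\t"):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        reference_id, distance = row[0], float(row[2])
        if distance <= max_distance:
            genera.add(reference_id.split("_", 1)[0].lower())
    return sorted(genera)


# Made-up rows in `mash dist` layout, used only to exercise the function.
example = (
    "Escherichia_coli_K12\tsample.fna\t0.002\t0\t950/1000\n"
    "Salmonella_enterica_LT2\tsample.fna\t0.12\t0\t60/1000\n"
    "Klebsiella_pneumoniae_HS11286\tsample.fna\t0.21\t1e-20\t10/1000\n"
)
print(detect_genera(example))
```

Returning a sorted list rather than a single best hit keeps the sketch compatible with samples that match more than one genus, which is consistent with the per-genus outputs listed further down.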

Take

ch_seqs: Channel<Record>

Field  Description
meta   Groovy Record containing sample information
fna    Assembled contigs in FASTA format for species identification
r1     Illumina R1 reads (paired-end) or null
r2     Illumina R2 reads (paired-end) or null
se     Single-end Illumina reads or null
lr     Long reads (ONT/PacBio) or null

ch_mash_db: Path

Name     Description
mash_db  Mash sketch database for rapid species identification
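
To make the shape of one ch_seqs element concrete, here is a minimal Python stand-in for the record, using the field names listed above. The class name `SeqRecord`, the helper `present_reads`, the `id` key inside `meta`, and the file names are hypothetical and only serve the illustration; the actual record is a Groovy Record inside a Nextflow channel.

```python
from dataclasses import dataclass
from typing import Optional

# Read fields that may be null, in the order they are listed above.
READ_FIELDS = ("r1", "r2", "se", "lr")


@dataclass
class SeqRecord:
    """Illustrative model of one ch_seqs element (not the real Groovy type)."""
    meta: dict                 # sample information
    fna: str                   # assembled contigs in FASTA format
    r1: Optional[str] = None   # Illumina R1 reads (paired-end) or None
    r2: Optional[str] = None   # Illumina R2 reads (paired-end) or None
    se: Optional[str] = None   # single-end Illumina reads or None
    lr: Optional[str] = None   # long reads (ONT/PacBio) or None

    def present_reads(self) -> list[str]:
        """Return the names of the read fields that are not None."""
        return [name for name in READ_FIELDS if getattr(self, name) is not None]


# Hypothetical paired-end sample; the "id" key and paths are made up.
paired = SeqRecord(
    meta={"id": "sample01"},
    fna="sample01.fna",
    r1="sample01_R1.fastq.gz",
    r2="sample01_R2.fastq.gz",
)
print(paired.present_reads())
```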

Emit

Published

The sample_outputs and run_outputs emissions are aggregates of output files that will be published in the entry workflow.

sample_outputs

Output          Description
dist            The raw Mash distance results
fna             Passthrough of assembled contigs
r1              Passthrough of Illumina R1 reads
r2              Passthrough of Illumina R2 reads
se              Passthrough of single-end reads
lr              Passthrough of long reads
escherichia     Conditional marker file triggering Escherichia analysis tools
haemophilus     Conditional marker file triggering Haemophilus analysis tools
klebsiella      Conditional marker file triggering Klebsiella analysis tools
legionella      Conditional marker file triggering Legionella analysis tools
listeria        Conditional marker file triggering Listeria analysis tools
mycobacterium   Conditional marker file triggering Mycobacterium analysis tools
neisseria       Conditional marker file triggering Neisseria analysis tools
pseudomonas     Conditional marker file triggering Pseudomonas analysis tools
salmonella      Conditional marker file triggering Salmonella analysis tools
staphylococcus  Conditional marker file triggering Staphylococcus analysis tools
streptococcus   Conditional marker file triggering Streptococcus analysis tools
genus           A marker file indicating the detected genus (for debugging)
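
The conditional marker files can be thought of as one empty file per detected genus that has downstream tools, plus a debugging file recording everything that was detected. The sketch below shows that pattern under stated assumptions: the genus list comes from the outputs above, but the function name `write_markers` and the file names (`<genus>.marker`, `genus.txt`) are hypothetical and may not match what the module actually writes.

```python
import tempfile
from pathlib import Path

# Genera with conditional marker outputs, taken from the list above.
SUPPORTED_GENERA = (
    "escherichia", "haemophilus", "klebsiella", "legionella", "listeria",
    "mycobacterium", "neisseria", "pseudomonas", "salmonella",
    "staphylococcus", "streptococcus",
)


def write_markers(detected: list[str], outdir: Path) -> list[str]:
    """Create an empty marker per supported detected genus; return those genera.

    File names are hypothetical and chosen only for this illustration.
    """
    outdir.mkdir(parents=True, exist_ok=True)
    unique = sorted({genus.lower() for genus in detected})
    written = []
    for genus in unique:
        if genus in SUPPORTED_GENERA:
            (outdir / f"{genus}.marker").touch()  # empty file acts as a trigger
            written.append(genus)
    # Record every detected genus, supported or not, to help with debugging.
    (outdir / "genus.txt").write_text("\n".join(unique) + "\n")
    return written


with tempfile.TemporaryDirectory() as tmp:
    print(write_markers(["Escherichia", "Vibrio"], Path(tmp)))
```

A genus without species-specific tools (here `Vibrio`) produces no marker, so nothing downstream is triggered for it, while the debugging file still shows that it was seen.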

run_outputs

No run-scope outputs.

Module Composition

This subworkflow calls the following modules:

  • merlin_dist - Identify species to trigger genus-specific downstream analyses (Merlin).

Citations

If you use this in your analysis, please cite the following.
