merlin
Tags: species identification typing serotype virulence sample-scope
MinmER assisted species-specific bactopia tool seLectIoN.
This subworkflow identifies the species of each sample and selects the appropriate species-specific typing tools for the detected organism. It first identifies candidate species using MinHash distance estimation (Mash), then runs the matching species-specific subworkflows for detailed characterization, including serotyping, MLST, virulence factor detection, and antimicrobial resistance profiling.
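To make the MinHash step concrete, the following is a minimal sketch of how a Mash-style distance can be estimated from two sequences. It is an illustration only, not the code that merlindist or Mash runs: the function names (`kmers`, `sketch`, `mash_distance`) are hypothetical, the hash function is BLAKE2 rather than the MurmurHash3 used by Mash, and canonical (strand-independent) k-mers are ignored for brevity. The distance formula D = -(1/k) ln(2j / (1 + j)), where j is the estimated Jaccard index, follows the Mash publication cited below.

```python
import hashlib
import math


def kmers(seq, k):
    """Return the set of all k-mers in a sequence (no canonicalization)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sketch(seq, k=21, size=1000):
    """Bottom-s MinHash sketch: the `size` smallest 64-bit k-mer hashes."""
    hashes = {
        int.from_bytes(
            hashlib.blake2b(kmer.encode(), digest_size=8).digest(), "big"
        )
        for kmer in kmers(seq, k)
    }
    return sorted(hashes)[:size]


def mash_distance(seq_a, seq_b, k=21, size=1000):
    """Estimate the Mash distance between two sequences from their sketches."""
    sketch_a = set(sketch(seq_a, k, size))
    sketch_b = set(sketch(seq_b, k, size))
    # Jaccard estimate: of the `size` smallest hashes in the union,
    # count how many are present in both sketches.
    merged = sorted(sketch_a | sketch_b)[:size]
    if not merged:
        return 1.0
    shared = sum(1 for h in merged if h in sketch_a and h in sketch_b)
    jaccard = shared / len(merged)
    if jaccard == 0:
        return 1.0  # no shared k-mers: report the maximum distance
    return max(0.0, -math.log(2 * jaccard / (1 + jaccard)) / k)
```

A smaller distance indicates a closer match, so a species call amounts to picking the reference genomes in the sketch database with the lowest distances to the query assembly.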
Take
assembly: Channel<Record>
| Field | Description |
|---|---|
| meta | Groovy Record containing sample information |
| fna | Assembly file for species identification and typing |
| r1 | Illumina R1 reads (paired-end) or null |
| r2 | Illumina R2 reads (paired-end) or null |
| se | Single-end Illumina reads or null |
| lr | Long reads (ONT/PacBio) or null |
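As a rough picture of the record shape described above, the fields can be modeled as a small data class in which every read field is nullable. This is a hypothetical Python stand-in for the Groovy Record, written only to illustrate which fields are required and which may be null; the class name `AssemblyRecord` and the helper `read_inputs` are assumptions and do not exist in Bactopia.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AssemblyRecord:
    """Hypothetical stand-in for one element of the `assembly` channel."""

    meta: dict                 # sample information
    fna: str                   # assembly file (always present)
    r1: Optional[str] = None   # Illumina R1 reads (paired-end) or None
    r2: Optional[str] = None   # Illumina R2 reads (paired-end) or None
    se: Optional[str] = None   # single-end Illumina reads or None
    lr: Optional[str] = None   # long reads (ONT/PacBio) or None

    def read_inputs(self):
        """Return the names of the read fields that are populated."""
        return [
            name for name in ("r1", "r2", "se", "lr")
            if getattr(self, name) is not None
        ]
```

For example, a paired-end sample would populate `r1` and `r2` and leave `se` and `lr` as null, while an assembly-only sample would leave all four read fields null.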
mash_db: Path
emmtyper_blastdb: Path?
hicap_database_dir: Path?
hicap_model_fp: Path?
staphtyper_repeats: Path?
staphtyper_repeat_order: Path?
| Name | Type | Description |
|---|---|---|
| mash_db | Path | Mash sketch database for rapid species identification |
| emmtyper_blastdb | Path? | EMMTyper BLAST database for Streptococcus pyogenes emm typing (optional) |
| hicap_database_dir | Path? | HiCAP database directory for Haemophilus influenzae serotyping (optional) |
| hicap_model_fp | Path? | HiCAP HMM model file for improved detection (optional) |
| staphtyper_repeats | Path? | Staphylococcus aureus repeat sequences for spa typing (optional) |
| staphtyper_repeat_order | Path? | Staphylococcus aureus repeat order file for spa typing (optional) |
Emit
Published
The sample_outputs and run_outputs emissions are aggregates of output files that will be published in the entry workflow.
sample_outputs
Mixed per-sample records from merlindist and all activated species-specific typing subworkflows (e.g., ectyper, sistr, kleborate). Each record carries tool-specific fields.
run_outputs
Mixed aggregated results from all activated species-specific typing subworkflows. Each record contains tool-specific cross-sample summaries.
Subworkflow Composition
This subworkflow calls the following subworkflows:
- merlindist - Identify species from assembly and read data using Mash distances.
- clermontyping - Predict phylogroups of Escherichia coli from genome assemblies.
- ectyper - In silico prediction of Escherichia coli serotype.
- emmtyper - Predict emm types of Streptococcus pyogenes from genome assemblies.
- genotyphi - Assign genotypes to Salmonella Typhi genomes.
- hicap - In silico serotyping of the Haemophilus influenzae capsule locus.
- hpsuissero - Rapid Haemophilus parasuis serotyping.
- kleborate - Genotyping tool for Klebsiella pneumoniae and its related species complex.
- legsta - In silico Legionella pneumophila Sequence Based Typing.
- lissero - In silico serotype prediction for Listeria monocytogenes.
- ngmaster - Perform multi-antigen sequence typing of Neisseria gonorrhoeae from genome assemblies.
- pasty - Predict serogroups of Pseudomonas aeruginosa from assemblies.
- pbptyper - Predict penicillin binding protein (PBP) types of Streptococcus pneumoniae from genome assemblies.
- seqsero2 - Predict Salmonella serotypes from genome assemblies.
- seroba - k-mer based pipeline to identify the serotype of Streptococcus pneumoniae.
- shigapass - Predict serotypes of Shigella from assemblies.
- shigatyper - Predict serotypes of Shigella from reads or assemblies.
- shigeifinder - Predict serotypes of Shigella and EIEC from assemblies.
- sistr - Salmonella In Silico Typing Resource command-line tool.
- ssuissero - Predict serotypes of Streptococcus suis from genome assemblies.
- staphtyper - Determine the agr, spa and SCCmec types for Staphylococcus aureus genomes.
- stecfinder - Identify and serotype Shiga toxin-producing E. coli (STEC) from assemblies.
- tbprofiler - Profiling tool for Mycobacterium tuberculosis to detect resistance and strain type.
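The organism-to-tool relationship implied by the list above can be sketched as a simple lookup. This is an illustrative reading of the tool descriptions, not the selection logic merlin actually implements: the mapping `TOOLS_BY_ORGANISM`, the function `select_tools`, and the prefix-matching rule are all assumptions made for the example.

```python
# Hypothetical mapping built from the tool descriptions in the list above.
# Keys are organism name prefixes; values are the typing subworkflows whose
# descriptions mention that organism.
TOOLS_BY_ORGANISM = {
    "Escherichia coli": ["clermontyping", "ectyper", "stecfinder"],
    "Haemophilus influenzae": ["hicap"],
    "Haemophilus parasuis": ["hpsuissero"],
    "Klebsiella": ["kleborate"],
    "Legionella pneumophila": ["legsta"],
    "Listeria monocytogenes": ["lissero"],
    "Mycobacterium tuberculosis": ["tbprofiler"],
    "Neisseria gonorrhoeae": ["ngmaster"],
    "Pseudomonas aeruginosa": ["pasty"],
    # genotyphi is described as targeting Salmonella Typhi specifically.
    "Salmonella": ["genotyphi", "seqsero2", "sistr"],
    "Shigella": ["shigapass", "shigatyper", "shigeifinder"],
    "Staphylococcus aureus": ["staphtyper"],
    "Streptococcus pneumoniae": ["pbptyper", "seroba"],
    "Streptococcus pyogenes": ["emmtyper"],
    "Streptococcus suis": ["ssuissero"],
}


def select_tools(organism):
    """Return the sorted tool names whose key is a prefix of `organism`."""
    selected = set()
    for prefix, tools in TOOLS_BY_ORGANISM.items():
        if organism.startswith(prefix):
            selected.update(tools)
    return sorted(selected)
```

Under these assumptions, `select_tools("Salmonella enterica")` returns the three Salmonella tools, and an organism with no matching entry returns an empty list, meaning no species-specific typing would run for it.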
Used By
This subworkflow is used by the following workflows:
- bactopia - Comprehensive bacterial analysis pipeline for complete genomic characterization.
- merlin - MinMER-assisted species-specific tool selection and execution.
Citations
If you use this in your analysis, please cite the following.
- Bactopia
  Petit III RA, Read TD. Bactopia - a flexible pipeline for complete analysis of bacterial genomes. mSystems 5 (2020)
- Mash
  Ondov BD, Treangen TJ, Melsted P, Mallonee AB, Bergman NH, Koren S, Phillippy AM. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol 17, 132 (2016)