merlin

Tags: species identification typing serotype virulence sample-scope

MinmER assisted species-specific bactopia tool seLectIoN.

This subworkflow identifies the likely species of each sample and then runs the typing tools appropriate for the detected organism. It first estimates candidate species from MinHash (Mash) distances, then runs the matching species-specific subworkflows for detailed characterization, including serotyping, MLST, virulence factor detection, and antimicrobial resistance profiling.
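The MinHash distance step described above can be illustrated with a minimal sketch. This is not the code merlin or Mash runs: it assumes SHA-1 as a stand-in hash, forward-strand k-mers only, and hypothetical function names (`sketch`, `jaccard_estimate`, `mash_distance`). It only shows how a Jaccard estimate from two bottom-s sketches is converted into a Mash-style distance, D = -(1/k) ln(2j / (1 + j)).

```python
import hashlib
import math


def kmers(seq, k):
    """Yield every k-mer in seq (forward strand only, a simplification)."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def sketch(seq, k=21, s=1000):
    """Bottom-s MinHash sketch: the s smallest distinct k-mer hashes."""
    hashes = {
        # SHA-1 is an assumption made for this sketch, chosen for portability.
        int.from_bytes(hashlib.sha1(km.encode()).digest()[:8], "big")
        for km in kmers(seq, k)
    }
    return sorted(hashes)[:s]


def jaccard_estimate(a, b, s=1000):
    """Estimate the Jaccard index from the s smallest hashes of both sketches."""
    merged = sorted(set(a) | set(b))[:s]
    if not merged:
        return 0.0
    shared = set(a) & set(b)
    return sum(1 for h in merged if h in shared) / len(merged)


def mash_distance(j, k=21):
    """Convert a Jaccard estimate j into a Mash-style distance in [0, 1]."""
    if j <= 0:
        return 1.0  # no shared hashes: treat as maximally distant
    return max(0.0, -math.log(2 * j / (1 + j)) / k)


if __name__ == "__main__":
    # Toy sequences; real inputs would be whole assemblies or read sets.
    query = sketch("ACGTTGCAAGGCTTAACCGGATCGATCGTAGCTAGCTAACG", k=5, s=50)
    reference = sketch("ACGTTGCAAGGCTTAACCGGATCGATCGTAGCTAGCTTTCG", k=5, s=50)
    print(mash_distance(jaccard_estimate(query, reference, s=50), k=5))
```

In practice the query sketch is compared against every entry in the reference database, and the closest matches suggest which species-specific tools are worth running.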

Take

assembly: Channel<Record>
Field  Description
meta   Groovy Record containing sample information
fna    Assembly file for species identification and typing
r1     Illumina R1 reads (paired-end) or null
r2     Illumina R2 reads (paired-end) or null
se     Single-end Illumina reads or null
lr     Long reads (ONT/PacBio) or null
mash_db: Path
emmtyper_blastdb: Path?
hicap_database_dir: Path?
hicap_model_fp: Path?
staphtyper_repeats: Path?
staphtyper_repeat_order: Path?
Name                     Type   Description
mash_db                  Path   Mash sketch database for rapid species identification
emmtyper_blastdb         Path?  EMMTyper BLAST database for Streptococcus pyogenes emm typing (optional)
hicap_database_dir       Path?  HiCAP database directory for Haemophilus influenzae serotyping (optional)
hicap_model_fp           Path?  HiCAP HMM model file for improved detection (optional)
staphtyper_repeats       Path?  Staphylococcus aureus repeat sequences for spa typing (optional)
staphtyper_repeat_order  Path?  Staphylococcus aureus repeat order file for spa typing (optional)

Emit

Published

The sample_outputs and run_outputs emissions are aggregates of output files that will be published in the entry workflow.

sample_outputs

Mixed per-sample records from merlindist and all activated species-specific typing subworkflows (e.g., ectyper, sistr, kleborate). Each record carries tool-specific fields.

run_outputs

Mixed aggregated results from all activated species-specific typing subworkflows. Each record contains tool-specific cross-sample summaries.

Subworkflow Composition

This subworkflow calls the following subworkflows:

  • merlindist - Identify species from assembly and read data using Mash distances.
  • clermontyping - Predict phylogroups of Escherichia coli from genome assemblies.
  • ectyper - In silico prediction of Escherichia coli serotype.
  • emmtyper - Predict emm types of Streptococcus pyogenes from genome assemblies.
  • genotyphi - Assign genotypes to Salmonella Typhi genomes.
  • hicap - In silico serotyping of the Haemophilus influenzae capsule locus.
  • hpsuissero - Rapid Haemophilus parasuis serotyping.
  • kleborate - Genotyping tool for Klebsiella pneumoniae and its related species complex.
  • legsta - In silico Legionella pneumophila Sequence Based Typing.
  • lissero - In silico serotype prediction for Listeria monocytogenes.
  • ngmaster - Perform multi-antigen sequence typing of Neisseria gonorrhoeae from genome assemblies.
  • pasty - Predict serogroups of Pseudomonas aeruginosa from assemblies.
  • pbptyper - Predict penicillin binding protein (PBP) types of Streptococcus pneumoniae from genome assemblies.
  • seqsero2 - Predict Salmonella serotypes from genome assemblies.
  • seroba - k-mer based pipeline to identify the serotype of Streptococcus pneumoniae.
  • shigapass - Predict serotypes of Shigella from assemblies.
  • shigatyper - Predict serotypes of Shigella from reads or assemblies.
  • shigeifinder - Predict serotypes of Shigella and EIEC from assemblies.
  • sistr - Salmonella In Silico Typing Resource command-line tool.
  • ssuissero - Predict serotypes of Streptococcus suis from genome assemblies.
  • staphtyper - Determine the agr, spa and SCCmec types for Staphylococcus aureus genomes.
  • stecfinder - Identify and serotype Shiga toxin-producing E. coli (STEC) from assemblies.
  • tbprofiler - Profiling tool for Mycobacterium tuberculosis to detect resistance and strain type.
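The relationship between a detected organism and the subworkflows listed above can be pictured as a lookup table. The sketch below is hypothetical: the table is derived only from the tool descriptions in this list, and the prefix matching in `select_tools` is an assumption made for illustration, not merlin's actual selection rules.

```python
# Hypothetical organism-to-tool table, built from the descriptions above.
TOOLS_BY_ORGANISM = {
    "escherichia coli": ["clermontyping", "ectyper", "stecfinder"],
    "haemophilus influenzae": ["hicap"],
    "haemophilus parasuis": ["hpsuissero"],
    "klebsiella pneumoniae": ["kleborate"],
    "legionella pneumophila": ["legsta"],
    "listeria monocytogenes": ["lissero"],
    "mycobacterium tuberculosis": ["tbprofiler"],
    "neisseria gonorrhoeae": ["ngmaster"],
    "pseudomonas aeruginosa": ["pasty"],
    # genotyphi targets Salmonella Typhi specifically; grouped here for brevity.
    "salmonella": ["genotyphi", "seqsero2", "sistr"],
    "shigella": ["shigapass", "shigatyper", "shigeifinder"],
    "staphylococcus aureus": ["staphtyper"],
    "streptococcus pneumoniae": ["pbptyper", "seroba"],
    "streptococcus pyogenes": ["emmtyper"],
    "streptococcus suis": ["ssuissero"],
}


def select_tools(species):
    """Return the tools whose organism key is a prefix of the species name."""
    name = species.strip().lower()
    selected = []
    for organism, tools in TOOLS_BY_ORGANISM.items():
        if name.startswith(organism):
            selected.extend(tools)
    return selected


if __name__ == "__main__":
    for hit in ["Staphylococcus aureus", "Shigella sonnei", "Bacillus subtilis"]:
        print(hit, select_tools(hit))
```

A species with no entry in the table returns an empty list, which corresponds to a sample for which no species-specific typing is run.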

Used By

This subworkflow is used by the following workflows:

  • bactopia - Comprehensive bacterial analysis pipeline for complete genomic characterization.
  • merlin - MinmER-assisted species-specific tool selection and execution.

Citations

If you use this in your analysis, please cite the following.
