bactopia_assembler

Tags: bacteria assembly hybrid shovill dragonflye unicycler illumina nanopore sample-scope

Assemble bacterial genomes using automated assembler selection.

This subworkflow selects the assembly strategy automatically from the read types provided for each sample:

  • Short Paired-End Reads: Uses Shovill (SKESA/SPAdes wrapper)
  • Short Single-End Reads: Uses Shovill-SE (SKESA/SPAdes wrapper)
  • Long Reads: Uses Dragonflye (Flye/Miniasm wrapper)
  • Hybrid Assembly: Uses Unicycler or Dragonflye with short-read polishing
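The selection rules above can be sketched in a few lines of Python. This is a minimal illustration rather than the subworkflow's actual Nextflow code: the `SampleReads` class and `choose_strategy` function are hypothetical names, and the precedence order (hybrid first, then long reads, then paired-end, then single-end) is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SampleReads:
    """Hypothetical stand-in for one sample; fields mirror r1, r2, se, lr."""
    r1: Optional[str] = None
    r2: Optional[str] = None
    se: Optional[str] = None
    lr: Optional[str] = None


def choose_strategy(reads: SampleReads) -> str:
    """Return a label for the assembly strategy implied by the reads present."""
    has_pe = bool(reads.r1 and reads.r2)
    has_se = bool(reads.se)
    has_lr = bool(reads.lr)

    # Assumed precedence: paired-end plus long reads means hybrid assembly.
    if has_pe and has_lr:
        return "hybrid"       # Unicycler, or Dragonflye with short-read polishing
    if has_lr:
        return "dragonflye"   # long reads only
    if has_pe:
        return "shovill"      # short paired-end reads
    if has_se:
        return "shovill-se"   # short single-end reads
    raise ValueError("sample has no usable reads")


if __name__ == "__main__":
    sample = SampleReads(r1="s1_R1.fastq.gz", r2="s1_R2.fastq.gz")
    print(choose_strategy(sample))
```

Returning a label instead of running a tool keeps the decision separate from execution, which makes the rule easy to check for each combination of inputs.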

The workflow performs individual assemblies per sample and aggregates assembly statistics across all samples using assembly-scan for comprehensive quality assessment.

Take

samples: Channel<Record>
Field  Description
meta   Groovy Record containing sample information
r1     Illumina R1 reads (paired-end forward)
r2     Illumina R2 reads (paired-end reverse)
se     Single-end Illumina reads
lr     Long reads (ONT/PacBio) for long-read or hybrid assembly

Emit

Published

The sample_outputs and run_outputs emissions are aggregates of output files that will be published in the entry workflow.

sample_outputs

Output        Description
tsv           Tab-delimited report of assembly statistics (N50, length, coverage)
supplemental  Supplemental files including assembly graphs and tool-specific logs
error         Captured error messages if assembly fails
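For readers unfamiliar with N50, one of the statistics reported in the tsv output, the following sketch shows the standard definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembly length. It illustrates the metric only and is not taken from assembly-scan; the function name `n50` is chosen for this example.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    if not contig_lengths:
        return 0
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # not reached for non-empty input with positive lengths


if __name__ == "__main__":
    lengths = [100, 80, 50, 30, 20]  # total 280, half is 140
    print(n50(lengths))  # prints 80, since 100 + 80 = 180 >= 140
```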

run_outputs

Output  Description
csv     Aggregated assembly statistics from all samples

Downstream Inputs

The following emissions are meant to be used as inputs to downstream subworkflows.

assembly

Output  Description
fna     Assembled contigs for downstream annotation and analysis

assembly_reads

Output  Description
fna     Assembled contigs
r1      Illumina R1 reads (paired-end forward)
r2      Illumina R2 reads (paired-end reverse)
se      Single-end Illumina reads
lr      Long reads (ONT/PacBio)

Module Composition

This subworkflow calls the following modules:

  • bactopia_assembler - Assemble bacterial genomes using short read, long read, or hybrid strategies.
  • csvtk_concat - Concatenate multiple CSV or TSV files into a single table.
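The aggregation step amounts to stacking the per-sample tables under a single header. The sketch below shows that idea in plain Python, using only the standard library. It does not reproduce csvtk's behavior or options: `concat_tables` is a hypothetical helper, the column names in the usage example are invented, and the requirement that all inputs share an identical header is a simplification made for this example.

```python
import csv
import io


def concat_tables(tables, delimiter="\t"):
    """Concatenate delimited tables (given as text), writing the header once."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    header = None
    for text in tables:
        rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
        if not rows:
            continue  # skip empty inputs
        if header is None:
            header = rows[0]
            writer.writerow(header)
        elif rows[0] != header:
            # Simplification for this sketch: headers must match exactly.
            raise ValueError("tables do not share the same header")
        writer.writerows(rows[1:])
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical per-sample reports with invented column names.
    sample_a = "sample\tn50\nA\t80\n"
    sample_b = "sample\tn50\nB\t120\n"
    print(concat_tables([sample_a, sample_b]), end="")
```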

Used By

This subworkflow is used by the following workflows:

  • bactopia - Comprehensive bacterial analysis pipeline for complete genomic characterization.
  • staphopia - Comprehensive analysis pipeline for Staphylococcus aureus isolates.

Citations

If you use this in your analysis, please cite the following.
