bactopia_qc

Tags: fastq qc adapter-removal error-correction subsampling fastp bbduk lighter porechop nanoq fastqc nanoplot sample-scope

Automated quality control, error correction, and read subsampling.

A comprehensive QC pipeline that adapts to the input read type:

  • Illumina: Adapter/PhiX removal (Fastp or BBDuk), Error Correction (Lighter), and Subsampling (Rasusa)
  • Nanopore: Adapter removal (Porechop), Quality filtering (Nanoq), and Subsampling (Rasusa)
  • Hybrid: Processes both short and long reads through their respective pipelines
  • Assembly: Passes through simulated reads from assemblies

Generates quality metrics using fastq-scan and optional quality reports using FastQC (Illumina) and NanoPlot (ONT).
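
The read-type dispatch described above can be sketched in Python. This is an illustration only, not the module's actual implementation: the runtype labels (`illumina`, `nanopore`, `hybrid`, `assembly`) and the `qc_steps` function are assumptions made for the example, and only the tool names come from the description above.

```python
# Illustrative only: maps an assumed runtype label to the QC steps listed above.
# The labels and the qc_steps() helper are hypothetical, not part of bactopia_qc.

ILLUMINA_STEPS = [
    "adapter/PhiX removal (Fastp or BBDuk)",
    "error correction (Lighter)",
    "subsampling (Rasusa)",
]
NANOPORE_STEPS = [
    "adapter removal (Porechop)",
    "quality filtering (Nanoq)",
    "subsampling (Rasusa)",
]


def qc_steps(runtype: str) -> dict:
    """Return the QC steps applied to each read set for a given runtype."""
    if runtype == "illumina":
        return {"short_reads": list(ILLUMINA_STEPS)}
    if runtype == "nanopore":
        return {"long_reads": list(NANOPORE_STEPS)}
    if runtype == "hybrid":
        # Hybrid samples send each read set through its own pipeline.
        return {
            "short_reads": list(ILLUMINA_STEPS),
            "long_reads": list(NANOPORE_STEPS),
        }
    if runtype == "assembly":
        # Reads simulated from an assembly are passed through unchanged.
        return {"simulated_reads": []}
    raise ValueError(f"unknown runtype: {runtype!r}")
```

Calling `qc_steps("hybrid")` returns both the short-read and long-read step lists, mirroring the "respective pipelines" behaviour described for hybrid samples.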

Inputs

record (
meta: Record,
r1: Path?,
r2: Path?,
se: Path?,
lr: Path?,
fna: Path?
)
| Field | Type | Description |
| --- | --- | --- |
| meta | Record | Groovy Record containing sample information (must include runtype, genome_size, species) |
| r1 | Path? | Illumina R1 reads (paired-end forward) |
| r2 | Path? | Illumina R2 reads (paired-end reverse) |
| se | Path? | Single-end Illumina reads |
| lr | Path? | Long reads (ONT) |
| fna | Path? | Assembly file (FASTA) for assembly-based simulations |

adapters: Path?
phix: Path?
| Name | Type | Description |
| --- | --- | --- |
| adapters | Path? | Filepath for custom adapter sequences (FASTA) |
| phix | Path? | Filepath for custom PhiX sequences (FASTA) |
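
As a rough model of the input record, the sketch below mirrors its fields with a Python dataclass and checks the one requirement stated above, that `meta` includes `runtype`, `genome_size`, and `species`. The `QcInput` class and its `validate` method are hypothetical names for illustration, and the rule that `r1` and `r2` must be supplied together is an assumption, not something the module documents.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

# Keys the documentation says meta must include.
REQUIRED_META_KEYS = ("runtype", "genome_size", "species")


@dataclass
class QcInput:
    """Hypothetical Python mirror of the bactopia_qc input record."""

    meta: dict
    r1: Optional[Path] = None   # Illumina R1 (paired-end forward)
    r2: Optional[Path] = None   # Illumina R2 (paired-end reverse)
    se: Optional[Path] = None   # single-end Illumina reads
    lr: Optional[Path] = None   # long reads (ONT)
    fna: Optional[Path] = None  # assembly FASTA

    def validate(self) -> None:
        """Raise ValueError if the record is missing required information."""
        missing = [key for key in REQUIRED_META_KEYS if key not in self.meta]
        if missing:
            raise ValueError(f"meta is missing required keys: {missing}")
        # Assumption for this sketch: paired-end reads come as a pair.
        if (self.r1 is None) != (self.r2 is None):
            raise ValueError("r1 and r2 must be provided together")
```

A record built with all three `meta` keys and both `r1` and `r2` passes `validate()` silently, while one missing `species` raises a `ValueError` naming the absent key.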

Outputs

record (
meta: Record,
r1: Path?,
r2: Path?,
se: Path?,
lr: Path?,
fna: Path?,
reads_grouped: Set<Path?>,
error: Set<Path?>,
skipped: Path?,
results: Set<Path>,
logs: Set<Path?>,
nf_logs: Set<Path>,
versions: Set<Path>
)
| Field | Type | Description |
| --- | --- | --- |
| meta | Record | Sample information record |
| r1 | Path? | QC'd Illumina R1 reads (paired-end forward) |
| r2 | Path? | QC'd Illumina R2 reads (paired-end reverse) |
| se | Path? | QC'd single-end Illumina reads |
| lr | Path? | QC'd long reads (ONT) |
| fna | Path? | Assembly file (FASTA) |
| reads_grouped | Set<Path?> | All output FASTQs for publishing |
| error | Set<Path?> | Captured error messages if QC failed (e.g., reads empty after trimming) |
| skipped | Path? | Marker file indicating QC was skipped for this sample |
| results | Set<Path> | All output files to be published |
| logs | Set<Path?> | Optional program-specific log files |
| nf_logs | Set<Path> | Nextflow-specific log files (e.g. .command.{begin |
| versions | Set<Path> | A YAML formatted file with program versions |
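
One way a downstream consumer might interpret the `error` and `skipped` outputs is sketched below. The `qc_status` function and its three status labels are hypothetical, and giving `skipped` precedence over `error` is an assumption of this sketch rather than documented behaviour.

```python
from pathlib import Path
from typing import Iterable, Optional


def qc_status(error: Iterable[Optional[Path]], skipped: Optional[Path]) -> str:
    """Classify a sample from its `error` and `skipped` outputs.

    Returns "skipped" if a marker file is present, "failed" if any error
    file was captured, and "passed" otherwise.
    """
    # Assumption: a skip marker takes precedence over captured errors.
    if skipped is not None:
        return "skipped"
    # The error set may contain empty (None) entries, so filter them out.
    if any(path is not None for path in error):
        return "failed"
    return "passed"
```

For example, a sample with an empty `error` set and no `skipped` marker is classified as passed, and its QC'd reads in `r1`, `r2`, `se`, or `lr` can then be used by later steps.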

Parameters

Used By

Subworkflows

  • bactopia_qc - Perform comprehensive quality control on sequencing reads.

Workflows

  • bactopia - Comprehensive bacterial analysis pipeline for complete genomic characterization.
  • cleanyerreads - Quality control and optional host read removal from raw sequencing reads.
  • staphopia - Comprehensive analysis pipeline for Staphylococcus aureus isolates.

Citations

If you use this in your analysis, please cite the following.

Source

View source on GitHub

Version

BACTOPIA_QC:
- bactopia-qc: 1.0.4