kraken2

Tags: metagenomics taxonomy classification contamination scrubbing k-mer lca sample-scope

Taxonomic classification and host filtering of sequence reads.

Uses Kraken2 to assign taxonomic labels to short DNA reads by matching exact k-mers against a large reference database. Each k-mer maps to the Lowest Common Ancestor (LCA) of the genomes that contain it, which gives high-precision classification and makes the tool well suited to metagenomics or to removing host contamination (scrubbing).
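
To make the k-mer and LCA idea concrete, here is a toy sketch in Python. It is not Kraken2's implementation: the real tool uses minimizers, a compact hash table, and an optional confidence threshold. The taxonomy, the k-mer index, and the function names below are all hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy taxonomy: child taxon -> parent taxon (the root has no parent).
PARENT = {
    "root": None,
    "Bacteria": "root",
    "Escherichia": "Bacteria",
    "E. coli": "Escherichia",
    "Salmonella": "Bacteria",
}

# Hypothetical k-mer index: each k-mer is stored with the LCA of every
# reference genome that contains it.
KMER_TO_TAXON = {
    "ACGT": "E. coli",
    "CGTA": "Escherichia",
    "GTAC": "Bacteria",
    "TACG": "Salmonella",
}


def lineage(taxon):
    """Return the path from a taxon up to the root, starting with the taxon."""
    path = []
    while taxon is not None:
        path.append(taxon)
        taxon = PARENT[taxon]
    return path


def classify(read, k=4):
    """Label a read with the taxon whose root-to-node path collects the most k-mer hits."""
    hits = Counter()
    for i in range(len(read) - k + 1):
        taxon = KMER_TO_TAXON.get(read[i:i + k])
        if taxon is not None:
            hits[taxon] += 1
    if not hits:
        return None  # no k-mer matched: the read is unclassified

    def score(taxon):
        path = lineage(taxon)
        # Sum the hits along the lineage; deeper taxa win ties
        # (a simplification made for this sketch).
        return (sum(hits[t] for t in path), len(path))

    return max(hits, key=score)
```

With this toy index, `classify("ACGTAC")` collects one hit each for "E. coli", "Escherichia", and "Bacteria", all on the same lineage, so the read is labelled "E. coli". A read with no matching k-mers returns `None`, which corresponds to the unclassified output.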

Uses explicit positional record fields for reads:

  • Input: record(meta, r1, r2, se, lr) where each read slot is Path?

Database Required

Requires a standard Kraken2 database (directory or tarball). Memory usage depends on database size (the Standard database needs about 50 GB).

Inputs

record (
meta: Record,
r1: Path?,
r2: Path?,
se: Path?,
lr: Path?
)

| Field | Type | Description |
|-------|------|-------------|
| meta | Record | Groovy Record containing sample information |
| r1 | Path? | Illumina R1 reads (paired-end) |
| r2 | Path? | Illumina R2 reads (paired-end) |
| se | Path? | Single-end Illumina reads |
| lr | Path? | Long reads (ONT/PacBio), not typically used by Kraken2 |

db: Path

| Name | Type | Description |
|------|------|-------------|
| db | Path | Kraken2 database (directory or compressed tarball) |
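
Because `db` may be either a directory or a tarball, a caller has to normalise it before Kraken2 can use it. The sketch below shows one way that could look. It is not the module's actual staging logic: the function name `resolve_kraken2_db` and the `kraken2_db` extraction folder are hypothetical. The three `.k2d` file names are the index files that a built Kraken2 database contains.

```python
import tarfile
from pathlib import Path

# Index files found in a built Kraken2 database directory.
REQUIRED_FILES = ("hash.k2d", "opts.k2d", "taxo.k2d")


def resolve_kraken2_db(db: Path, workdir: Path) -> Path:
    """Return a directory holding the Kraken2 index files, extracting a tarball if needed."""
    if db.is_file():
        # Assumption: any regular file passed as the database is a tarball.
        target = workdir / "kraken2_db"  # hypothetical extraction folder
        target.mkdir(parents=True, exist_ok=True)
        with tarfile.open(db) as archive:  # default mode auto-detects compression
            archive.extractall(target)
        db = target

    # The index files may sit at the top level or inside a nested folder.
    for hash_file in sorted(db.rglob("hash.k2d")):
        candidate = hash_file.parent
        if all((candidate / name).is_file() for name in REQUIRED_FILES):
            return candidate
    raise FileNotFoundError(f"no Kraken2 index files found under {db}")
```

Searching with `rglob` means the sketch tolerates tarballs that wrap the database in a parent folder as well as those that store the index files at the top level.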

Outputs

record (
meta: Record,
special_meta: Record,
kraken2_report: Path,
scrub_report: Path?,
classified: Set<Path?>,
unclassified: Set<Path?>,
results: Set<Path>,
logs: Set<Path?>,
nf_logs: Set<Path>,
versions: Set<Path>
)

| Field | Type | Description |
|-------|------|-------------|
| meta | Record | Sample information record |
| special_meta | Record | A simplified metadata record for internal use |
| kraken2_report | Path | Standard Kraken2 report containing taxonomic abundance counts |
| scrub_report | Path? | Summary report of reads removed during host scrubbing |
| classified | Set<Path?> | Reads assigned to a taxon in the database (FASTQ) |
| unclassified | Set<Path?> | Reads NOT assigned to any taxon (FASTQ) |
| results | Set<Path> | All output files to be published |
| logs | Set<Path?> | Optional program-specific log files |
| nf_logs | Set<Path> | Nextflow-specific log files (e.g. .command.{begin |
| versions | Set<Path> | A YAML-formatted file with program versions |
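
The `kraken2_report` file is easy to post-process. The sketch below assumes the default six-column, tab-delimited Kraken2 report (clade percentage, clade read count, directly assigned read count, rank code, NCBI taxonomy ID, indented name), not the MPA-style variant. The sample rows, `ReportRow`, `parse_report`, and `top_taxa` are made up for illustration.

```python
from typing import List, NamedTuple


class ReportRow(NamedTuple):
    percent: float     # percentage of reads in the clade rooted at this taxon
    clade_reads: int   # reads in the clade rooted at this taxon
    direct_reads: int  # reads assigned directly to this taxon
    rank: str          # rank code, e.g. U, R, D, P, C, O, F, G, S
    taxid: int         # NCBI taxonomy ID
    name: str          # scientific name with indentation removed
    depth: int         # indentation level, assuming two spaces per level


def parse_report(text: str) -> List[ReportRow]:
    """Parse a default-format Kraken2 report into rows."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        pct, clade, direct, rank, taxid, name = line.split("\t")
        stripped = name.lstrip(" ")
        depth = (len(name) - len(stripped)) // 2
        rows.append(ReportRow(float(pct), int(clade), int(direct),
                              rank, int(taxid), stripped, depth))
    return rows


def top_taxa(rows: List[ReportRow], rank: str = "S", n: int = 5) -> List[ReportRow]:
    """Return the n most abundant taxa at one rank, by clade read count."""
    matching = [row for row in rows if row.rank == rank]
    return sorted(matching, key=lambda row: row.clade_reads, reverse=True)[:n]


# Made-up sample report, abbreviated to a few ranks.
SAMPLE = (
    " 10.00\t10\t10\tU\t0\tunclassified\n"
    " 90.00\t90\t0\tR\t1\troot\n"
    " 90.00\t90\t5\tD\t2\t  Bacteria\n"
    " 60.00\t60\t60\tS\t562\t    Escherichia coli\n"
    " 25.00\t25\t25\tS\t28901\t    Salmonella enterica\n"
)
```

Calling `top_taxa(parse_report(SAMPLE))` on the made-up sample lists the two species rows, most abundant first.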

Parameters

Kraken2 Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| --kraken2_db | string | | A single tarball or path to a Kraken2-formatted database |
| --kraken2_confidence | number | 0.0 | Confidence score threshold between 0 and 1 |
| --kraken2_use_mpa_style | boolean | false | Format report output like Kraken 1's kraken-mpa-report |
| --kraken2_report_zero_counts | boolean | false | Report counts for ALL taxa, even if counts are zero |
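
As a rough illustration of how these parameters might translate into a Kraken2 invocation, the sketch below assembles an argument list. The flags (`--db`, `--confidence`, `--report`, `--use-mpa-style`, `--report-zero-counts`, `--paired`) are Kraken2 command-line options. The mapping from pipeline parameters to those flags and the helper `build_kraken2_command` are assumptions, not the module's actual script.

```python
from pathlib import Path
from typing import List, Optional


def build_kraken2_command(
    db: Path,
    report: Path,
    r1: Optional[Path] = None,
    r2: Optional[Path] = None,
    se: Optional[Path] = None,
    confidence: float = 0.0,
    use_mpa_style: bool = False,
    report_zero_counts: bool = False,
) -> List[str]:
    """Assemble a kraken2 argument list from pipeline-style parameters (hypothetical helper)."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")

    cmd = ["kraken2", "--db", str(db),
           "--confidence", str(confidence),
           "--report", str(report)]
    if use_mpa_style:
        cmd.append("--use-mpa-style")
    if report_zero_counts:
        cmd.append("--report-zero-counts")

    # Assumption: paired-end reads take priority when both layouts are present.
    if r1 is not None and r2 is not None:
        cmd += ["--paired", str(r1), str(r2)]
    elif se is not None:
        cmd.append(str(se))
    else:
        raise ValueError("provide r1 and r2, or se")
    return cmd
```

The returned list can be passed to `subprocess.run` without shell quoting. Rejecting a confidence outside the range 0 to 1 before launching catches a bad parameter early, before a large database is loaded.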

Used By

Subworkflows

  • kraken2 - Classify metagenomic reads using Kraken2.

Workflows

  • kraken2 - Taxonomic classification of metagenomic sequence reads.

Citations

If you use this in your analysis, please cite the following.

Source

View source on GitHub

Version

KRAKEN2:
- bactopia-teton: 1.1.3