bracken

Tags: metagenomics taxonomic-classification abundance-estimation kraken2 bracken sample-scope

Estimate species abundance from metagenomic reads.

This subworkflow performs taxonomic classification with Kraken2 and abundance estimation with Bracken. It classifies metagenomic reads against a reference database and produces abundance estimates at different taxonomic levels, with optional abundance correction.

Reads are passed as a record with explicit positional fields:

Input: record(meta, r1, r2, se, lr), where each read slot is an optional path (`Path?`)

Take

reads: Channel<Record>

Field	Description
`meta`	Groovy Record containing sample information
`r1`	Illumina R1 reads (paired-end)
`r2`	Illumina R2 reads (paired-end)
`se`	Single-end Illumina reads
`lr`	Long reads (ONT/PacBio)
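
As an illustration of the record shape described above, here is a minimal Python sketch. The subworkflow itself is written in Nextflow, so the `ReadsRecord` class and its `populated_slots` helper are hypothetical names introduced only for this example. They model the five fields and the fact that each read slot may be empty.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class ReadsRecord:
    """Hypothetical stand-in for record(meta, r1, r2, se, lr)."""

    meta: dict                    # sample information
    r1: Optional[Path] = None     # Illumina R1 reads (paired-end)
    r2: Optional[Path] = None     # Illumina R2 reads (paired-end)
    se: Optional[Path] = None     # single-end Illumina reads
    lr: Optional[Path] = None     # long reads (ONT/PacBio)

    def populated_slots(self) -> list:
        """Return the names of the read slots that hold a path."""
        return [name for name in ("r1", "r2", "se", "lr")
                if getattr(self, name) is not None]


# A paired-end sample fills r1 and r2 and leaves se and lr empty.
paired = ReadsRecord(
    meta={"id": "sample1"},
    r1=Path("sample1_R1.fastq.gz"),
    r2=Path("sample1_R2.fastq.gz"),
)
print(paired.populated_slots())
```

Keeping every slot present but optional means downstream code can check which slots are populated rather than inferring the read type from the number of files.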

database: Path

Name	Type	Description
`database`	`Path`	Path to the Kraken2 database for taxonomic classification.

Emit

Published

The `sample_outputs` and `run_outputs` emissions aggregate the output files that the entry workflow publishes.

`sample_outputs`

Output	Description
`tsv`	Tab-delimited summary of Bracken primary and secondary species abundances
`special_meta`	A simplified metadata record for internal use
`classified`	Reads classified as belonging to any taxon in the Kraken2 database
`unclassified`	Reads not classified as belonging to any taxon in the Kraken2 database
`kraken2_report`	Kraken2 report containing statistics on classified and unclassified reads
`kraken2_output`	Kraken2 output file containing the taxonomic classification of each read
`bracken_report`	Bracken report containing statistics on classified and unclassified reads
`krona`	Interactive Krona HTML visualization
`abundances`	Bracken abundance estimates for each taxon
`classification`	Bracken per-read classification details
`adjusted_abundances`	Bracken abundance estimates adjusted for unclassified reads
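
To make the idea of abundances "adjusted for unclassified reads" concrete, the following Python sketch shows one plausible adjustment: recomputing each taxon's fraction against a total that also counts the unclassified reads. This is an assumption made for illustration and is not confirmed to be the calculation this subworkflow performs. The column names follow Bracken's standard abundance output. The function name `adjust_for_unclassified` and the read counts are made up for the example.

```python
import csv
import io

# Made-up example rows in Bracken's abundance table layout.
BRACKEN_TSV = (
    "name\ttaxonomy_id\ttaxonomy_lvl\tkraken_assigned_reads\t"
    "added_reads\tnew_est_reads\tfraction_total_reads\n"
    "Escherichia coli\t562\tS\t500\t100\t600\t0.75\n"
    "Staphylococcus aureus\t1280\tS\t150\t50\t200\t0.25\n"
)


def adjust_for_unclassified(rows, unclassified_reads):
    """Recompute fractions so that unclassified reads count toward the total.

    Hypothetical helper: assumes the adjustment is a simple renormalization.
    """
    total = sum(int(r["new_est_reads"]) for r in rows) + unclassified_reads
    if total == 0:
        return []
    adjusted = [
        {
            "name": r["name"],
            "reads": int(r["new_est_reads"]),
            "fraction": int(r["new_est_reads"]) / total,
        }
        for r in rows
    ]
    # Report the unclassified reads as their own row.
    adjusted.append(
        {
            "name": "unclassified",
            "reads": unclassified_reads,
            "fraction": unclassified_reads / total,
        }
    )
    return adjusted


rows = list(csv.DictReader(io.StringIO(BRACKEN_TSV), delimiter="\t"))
for entry in adjust_for_unclassified(rows, unclassified_reads=200):
    print(f"{entry['name']}\t{entry['reads']}\t{entry['fraction']:.3f}")
```

With 200 unclassified reads added to the 800 estimated reads, the two species fractions shrink from 0.75 and 0.25 to 0.6 and 0.2, and the remaining 0.2 is reported as unclassified.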

`run_outputs`

Output	Description
`csv`	Aggregated results in CSV format
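
The run-level aggregate can be pictured as a row-wise concatenation of the per-sample tables. The Python sketch below illustrates that idea only. In the subworkflow this step is handled by the csvtk_concat module, and the sketch does not reproduce csvtk's behavior or options. The function name `concat_tables`, the choice to take the union of columns, and the sample rows are assumptions made for the example.

```python
import csv
import io


def concat_tables(texts, delimiter="\t"):
    """Concatenate delimited tables (given as strings) into one CSV string.

    Columns are kept in first-seen order; values missing from a table
    are written as empty strings.
    """
    header = []
    rows = []
    for text in texts:
        reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
        for column in reader.fieldnames or []:
            if column not in header:
                header.append(column)
        rows.extend(reader)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=header, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Two made-up per-sample summaries, one row each.
sample1 = "sample\tspecies\nsample1\tEscherichia coli\n"
sample2 = "sample\tspecies\nsample2\tStaphylococcus aureus\n"
print(concat_tables([sample1, sample2]), end="")
```

Taking the union of columns keeps the sketch tolerant of tables whose headers differ. A stricter version could instead reject any table whose header does not match the first.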

Module Composition

This subworkflow calls the following modules:

bracken - Taxonomic classification and abundance estimation.
csvtk_concat - Concatenate multiple CSV or TSV files into a single table.

Used By

This subworkflow is used by the following workflows:

bracken - Estimate taxonomic abundance of metagenomic samples.

Citations

If you use this in your analysis, please cite the following.

Bactopia
Petit III RA, Read TD. Bactopia: a flexible pipeline for complete analysis of bacterial genomes. mSystems 5 (2020).
Kraken2
Wood DE, Lu J, Langmead B. Improved metagenomic analysis with Kraken 2. Genome Biology, 20(1), 257 (2019).
Bracken
Lu J, Breitwieser FP, Thielen P, Salzberg SL. Bracken: estimating species abundance in metagenomics data. PeerJ Computer Science, 3, e104 (2017).

Source

View source on GitHub
