checkm2

Tags: metagenome bin completeness contamination mag quality machine-learning sample-scope

Assess metagenome bin completeness using CheckM2.

This subworkflow evaluates the quality of metagenome-assembled genomes (MAGs) using CheckM2. CheckM2 estimates completeness and contamination with machine learning models trained on high-quality reference genomes. The workflow can either download the required database or use a user-provided database path.

Take

assembly: Channel<Record>
Field     Description
meta      Groovy Record containing sample information
assembly  Metagenome-assembled genome bins to evaluate. Each record contains metadata
database: Path
download_checkm2: Boolean
Name              Type     Description
database          Path     Path to CheckM2 database directory. If download_checkm2 is true, this can be a placeholder as the database will be downloaded automatically.
download_checkm2  Boolean  Boolean flag to automatically download the CheckM2 database if not available. When true, downloads the required reference database before prediction.

Emit

Published

The sample_outputs and run_outputs emissions are aggregates of output files that will be published in the entry workflow.

sample_outputs

Output        Description
tsv           A tab-delimited report of quality metrics (Completeness, Contamination)
supplemental  Directory containing intermediate protein files and Diamond alignments
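
As a rough illustration of how the tsv report might be consumed downstream, the sketch below reads a tab-delimited report and keeps the bins that pass a completeness and contamination cutoff. The Completeness and Contamination columns come from the description above. The Name column, the filter_bins helper, the thresholds, and the sample rows are assumptions made for this example, not part of the subworkflow.

```python
import csv
import io


def filter_bins(report, min_completeness=90.0, max_contamination=5.0):
    """Return the names of bins that meet both thresholds.

    `report` is any file-like object holding a tab-delimited table with
    Name, Completeness and Contamination columns (Name is an assumption).
    """
    passing = []
    for row in csv.DictReader(report, delimiter="\t"):
        completeness = float(row["Completeness"])
        contamination = float(row["Contamination"])
        if completeness >= min_completeness and contamination <= max_contamination:
            passing.append(row["Name"])
    return passing


# Made-up rows, for illustration only.
example = io.StringIO(
    "Name\tCompleteness\tContamination\n"
    "bin.1\t98.2\t1.1\n"
    "bin.2\t71.5\t0.4\n"
    "bin.3\t95.0\t12.3\n"
)
print(filter_bins(example))  # ['bin.1']
```

The default cutoffs are illustrative only. Pass min_completeness and max_contamination explicitly to match the quality standard used in your analysis.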

run_outputs

Output  Description
csv     Aggregated results in CSV format
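
The layout of the aggregated CSV is not documented here. As a minimal sketch of what aggregating per-sample reports can look like, the example below merges several tab-delimited reports into one CSV and prefixes each row with a sample identifier. The aggregate_reports helper, the sample column, and the input rows are hypothetical, and the sketch assumes every report shares the same columns. It is not the subworkflow's own implementation.

```python
import csv
import io


def aggregate_reports(reports):
    """Merge per-sample TSV reports into a single CSV string.

    `reports` maps a sample identifier to a file-like object containing a
    tab-delimited report. All reports are assumed to share the same columns.
    """
    out = io.StringIO()
    writer = None
    for sample, handle in reports.items():
        for row in csv.DictReader(handle, delimiter="\t"):
            if writer is None:
                # Build the header from the first row seen, with the
                # hypothetical "sample" column in front.
                writer = csv.DictWriter(
                    out, fieldnames=["sample", *row.keys()], lineterminator="\n"
                )
                writer.writeheader()
            writer.writerow({"sample": sample, **row})
    return out.getvalue()


# Made-up inputs, for illustration only.
reports = {
    "sampleA": io.StringIO("Name\tCompleteness\tContamination\nbin.1\t98.2\t1.1\n"),
    "sampleB": io.StringIO("Name\tCompleteness\tContamination\nbin.7\t71.5\t0.4\n"),
}
print(aggregate_reports(reports))
```

Values are copied through as strings, so the numbers in the merged table appear exactly as they do in the per-sample reports.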

Module Composition

This subworkflow calls the following modules:

Used By

This subworkflow is used by the following workflows:

  • checkm2 - Machine learning-based assessment of microbial genome assembly quality.

Citations

If you use this in your analysis, please cite the following.
