busco

Tags: assembly completeness quality assessment orthologs evaluation sample-scope

Assess genome assembly completeness using BUSCO.

This subworkflow evaluates genome assembly completeness by searching each assembly for the single-copy orthologs defined in a BUSCO lineage dataset. It reports how many of the expected orthologs are complete (single-copy or duplicated), fragmented, or missing. The workflow produces an individual assessment for each sample and a merged summary report across all samples.

Take

assembly: Channel<Record>
Field     Description
meta      Groovy Record containing sample information
assembly  Genome assemblies to evaluate for completeness. Each record contains metadata
busco_lineage: String
Name           Type    Description
busco_lineage  String  BUSCO lineage dataset to use for assessment (e.g., bacteria_odb10).

Emit

Published

The sample_outputs and run_outputs emissions are aggregates of output files that will be published in the entry workflow.

sample_outputs

Output        Description
tsv           A text summary report of the completeness score (C/S/D/F/M%)
supplemental  Directory containing full tables, missing gene lists, and lineage data
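
The C/S/D/F/M% score refers to BUSCO's categories: Complete, Single-copy, Duplicated, Fragmented, and Missing, where Complete is the sum of Single-copy and Duplicated. As a minimal sketch, assuming the report contains BUSCO's usual one-line score notation, the percentages could be extracted as below. The exact layout of the tsv file emitted by this subworkflow is not described here, and the function name parse_busco_score and the example values are hypothetical.

```python
import re

# Pattern for BUSCO's one-line score notation, for example:
#   C:98.4%[S:97.6%,D:0.8%],F:0.8%,M:0.8%,n:124
# (assumed format; adjust if the report is laid out differently)
SCORE_RE = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco_score(line):
    """Return the C/S/D/F/M percentages and the ortholog count n as a dict."""
    match = SCORE_RE.search(line)
    if match is None:
        raise ValueError(f"no BUSCO score found in: {line!r}")
    # Percentages become floats; n (number of orthologs searched) is an int.
    scores = {key: float(val) for key, val in match.groupdict().items() if key != "n"}
    scores["n"] = int(match.group("n"))
    return scores


# Illustrative values only, not real results.
example = parse_busco_score("C:98.4%[S:97.6%,D:0.8%],F:0.8%,M:0.8%,n:124")
```

A quick sanity check on any parsed score is that S + D equals C and that C + F + M is close to 100, allowing for rounding in the report.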

run_outputs

Output  Description
csv     Aggregated results in CSV format
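
The aggregated CSV is conceptually a row-wise concatenation of the per-sample summary tables. The following sketch illustrates that idea in plain Python. It is not the subworkflow's implementation, which delegates this step to a csvtk module, and the column names (sample, complete, missing) and the function name concat_tables are hypothetical.

```python
import csv
import io


def concat_tables(tsv_texts):
    """Concatenate tab-separated tables row-wise and return CSV text."""
    rows = []
    fieldnames = []
    for text in tsv_texts:
        reader = csv.DictReader(io.StringIO(text), delimiter="\t")
        # Keep the union of column names in first-seen order.
        for name in reader.fieldnames or []:
            if name not in fieldnames:
                fieldnames.append(name)
        rows.extend(reader)
    out = io.StringIO()
    # restval fills columns that are absent from a given input table.
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical per-sample summaries with made-up values.
sample_a = "sample\tcomplete\tmissing\nA\t98.4\t0.8\n"
sample_b = "sample\tcomplete\tmissing\nB\t95.1\t3.2\n"
merged = concat_tables([sample_a, sample_b])
```

Here merged holds a single header row followed by one row per sample, which is the shape a run-level summary table would typically take.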

Module Composition

This subworkflow calls the following modules:

  • csvtk_concat - Concatenate multiple CSV or TSV files into a single table.
  • busco - Assess genome assembly completeness using single-copy orthologs.

Used By

This subworkflow is used by the following workflows:

  • busco - Assessment of genome assembly completeness using evolutionarily informed expectations.

Citations

If you use this in your analysis, please cite the following.
