panaroo

Tags: pangenome pan-genome comparative-genomics core-genome alignment run-scope

Build a pangenome from GFF3 annotations using Panaroo.

This subworkflow runs Panaroo on a set of bacterial genome annotations. Panaroo is a graph-based pangenome pipeline that produces a polished pangenome by removing errors and contamination from the input annotations. Its outputs include gene presence/absence matrices and a core-genome alignment suitable for downstream phylogenetic analysis.

Take

gff: Channel<Record>

Field	Description
`meta`	Groovy Record containing sample information
`gff`	Set of GFF3 annotation files representing the genomic annotations for each sample

Emit

Published

The `sample_outputs` and `run_outputs` emissions are aggregates of output files that will be published in the entry workflow.

`sample_outputs`

No sample-scope outputs.

`run_outputs`

Output	Description
`aln`	Core genome alignment in FASTA format (optional)
`filtered_aln`	Core genome alignment with recombinant regions filtered out (optional)
`csv`	Gene presence/absence matrix in Roary-compatible CSV format (optional)
`panaroo_csv`	Gene presence/absence matrix in Panaroo's native CSV format (optional)
`supplemental`	Directory containing Panaroo intermediate files and data structures
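
As an illustration of how a presence/absence matrix such as `panaroo_csv` might be consumed downstream, the following minimal Python sketch lists the genes found in nearly all samples. It is not part of the subworkflow. It assumes the CSV has three leading descriptive columns (gene name, non-unique gene name, annotation) followed by one column per sample, and that an empty cell means the gene is absent; the Roary-compatible `csv` output carries additional leading columns, so `n_meta_cols` would need adjusting there. The function name `core_genes`, the 99% threshold, and the example data are all hypothetical.

```python
import csv
import io


def core_genes(csv_text, n_meta_cols=3, threshold=0.99):
    """Return names of genes present in at least `threshold` of the samples.

    Assumes the first `n_meta_cols` columns describe the gene and every
    remaining column is one sample; an empty cell is treated as absence.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    samples = header[n_meta_cols:]
    core = []
    for row in reader:
        calls = row[n_meta_cols:]
        present = sum(1 for cell in calls if cell.strip())
        if samples and present / len(samples) >= threshold:
            core.append(row[0])
    return core


# Made-up data for illustration only (not real Panaroo output).
example = """Gene,Non-unique Gene name,Annotation,sampleA,sampleB,sampleC
dnaA,,chromosomal replication initiator,a_0001,b_0001,c_0001
blaZ,,beta-lactamase,a_0417,,
gyrB,,DNA gyrase subunit B,a_0005,b_0005,c_0006
"""

print(core_genes(example))
```

For a real run, the text of the published CSV would be passed in place of `example`. Lowering `threshold` relaxes the definition of "core" to include genes missing from some samples.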

Module Composition

This subworkflow calls the following modules:

panaroo_run - Fast and scalable bacterial pangenome analysis using a graph-based approach.

Citations

If you use this in your analysis, please cite the following.

Bactopia
Petit III RA, Read TD. Bactopia - a flexible pipeline for complete analysis of bacterial genomes. mSystems 5 (2020)
Panaroo
Tonkin-Hill G, MacAlasdair N, Ruis C, Weimann A, Horesh G, Lees JA, Gladstone RA, Lo S, Beaudoin C, Floto RA, Frost SDW, Corander J, Bentley SD, Parkhill J. Producing polished prokaryotic pangenomes with the Panaroo pipeline. Genome Biology 21(1), 180 (2020)

Source

View source on GitHub
