panaroo_run

Tags: pan-genome orthologs core-genome gene-presence-absence graph-based annotation run-scope

Fast and scalable bacterial pangenome analysis using a graph-based approach.

Uses Panaroo to cluster genes from multiple annotated bacterial genomes into orthologous groups, correcting for split and merged gene annotations. The primary outputs are a gene presence/absence matrix (the pan-genome) and a core-genome alignment for phylogenetics.

Inputs

record (
    meta: Record,
    gff: Set<Path>
)

Field	Type	Description
`meta`	`Record`	Groovy Record containing sample information
`gff`	`Set<Path>`	A set of annotated genome files in GFF3 format (required)

Outputs

record (
    meta: Record,
    aln: Path?,
    filtered_aln: Path?,
    csv: Path?,
    panaroo_csv: Path?,
    results: Set<Path>,
    logs: Set<Path?>,
    nf_logs: Set<Path>,
    versions: Set<Path>
)

Field	Type	Description
`meta`	`Record`	Sample information record
`aln`	`Path?`	The core-genome alignment (*core-genome.aln.gz), suitable for phylogenetic tree building
`filtered_aln`	`Path?`	The core-genome alignment with highly recombinant regions filtered out
`csv`	`Path?`	Gene presence/absence matrix in Roary-compatible CSV format
`panaroo_csv`	`Path?`	Gene presence/absence matrix in Panaroo's native CSV format
`results`	`Set<Path>`	All output files to be published
`logs`	`Set<Path?>`	Optional program-specific log files
`nf_logs`	`Set<Path>`	Nextflow-specific log files (e.g. .command.{begin
`versions`	`Set<Path>`	A YAML-formatted file with program versions
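
As an illustration of how the presence/absence matrix in `csv` or `panaroo_csv` might be consumed downstream, the sketch below counts how many samples carry each gene cluster and keeps those above a cutoff. It is not part of this module. The function name `core_genes`, the `n_meta_cols` parameter, the 0.95 cutoff, and the toy table are all assumptions made for the example. The number of leading annotation columns differs between the Roary-compatible and native formats, so check the header of your own file before choosing `n_meta_cols`.

```python
import csv
import io


def core_genes(csv_text, n_meta_cols=3, threshold=0.95):
    """Return names of gene clusters present in at least `threshold` of samples.

    n_meta_cols is the number of leading non-sample columns. The default of 3
    matches the toy table below and is an assumption, not a guarantee about
    any particular output file.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    n_samples = len(header) - n_meta_cols
    core = []
    for row in reader:
        # A non-empty cell means the cluster was found in that sample.
        present = sum(1 for cell in row[n_meta_cols:] if cell.strip())
        if n_samples > 0 and present / n_samples >= threshold:
            core.append(row[0])
    return core


# Hypothetical three-sample table used only for demonstration.
example = """Gene,Non-unique Gene name,Annotation,sampleA,sampleB,sampleC
dnaA,,chromosomal replication initiator,A_001,B_001,C_001
blaZ,,beta-lactamase,A_002,,
gyrB,,DNA gyrase subunit B,A_003,B_003,C_003
"""

print(core_genes(example))
```

For a real file, read its contents with `open(path).read()` and pass the text to `core_genes`, adjusting `n_meta_cols` to match the header.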

Parameters

Panaroo Run Parameters

Parameter	Type	Default	Description
`--panaroo_merge_paralogs`	boolean	`false`	Do not split paralogs
`--panaroo_opts`	string		Additional options to pass to panaroo
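
To make the effect of these two parameters concrete, the following sketch shows one way they could map onto a Panaroo command line. It is a hypothetical helper, not the module's actual implementation: `build_panaroo_args` and its arguments are invented for the example, and the real module may pass additional options. The Panaroo flags used (`-i`, `-o`, `--merge_paralogs`, `--remove-invalid-genes`) are existing Panaroo options.

```python
import shlex


def build_panaroo_args(gff_files, out_dir, merge_paralogs=False, extra_opts=""):
    """Assemble an argument list for Panaroo (illustrative sketch only)."""
    args = ["panaroo", "-i", *sorted(str(f) for f in gff_files), "-o", out_dir]
    if merge_paralogs:
        # Mirrors --panaroo_merge_paralogs: keep paralogs in a single cluster.
        args.append("--merge_paralogs")
    # Mirrors --panaroo_opts: a free-form string split the way a shell would.
    args.extend(shlex.split(extra_opts))
    return args


print(build_panaroo_args(
    ["b.gff", "a.gff"],
    "results",
    merge_paralogs=True,
    extra_opts="--remove-invalid-genes",
))
```

Splitting the free-form string with `shlex.split` keeps quoted values intact, which matters if an extra option takes an argument containing spaces.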

Used By

Subworkflows

panaroo - Build a pangenome from GFF3 annotations using Panaroo.

Citations

If you use this in your analysis, please cite the following.

Bactopia
Petit III RA, Read TD. Bactopia - a flexible pipeline for complete analysis of bacterial genomes. mSystems 5 (2020)
Panaroo
Tonkin-Hill G, MacAlasdair N, Ruis C, Weimann A, Horesh G, Lees JA, Gladstone RA, Lo S, Beaudoin C, Floto RA, Frost SDW, Corander J, Bentley SD, Parkhill J. Producing polished prokaryotic pangenomes with the Panaroo pipeline. Genome Biology 21(1), 180 (2020)
MAFFT
Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772-780 (2013)

Source

View source on GitHub

Version

PANAROO_RUN:
    - panaroo: 1.6.0
