panaroo_run
Tags: pan-genome orthologs core-genome gene-presence-absence graph-based annotation run-scope
Fast and scalable bacterial pangenome analysis using a graph-based approach.
Runs Panaroo to cluster genes from multiple annotated bacterial genomes into orthologous groups, correcting for split and merged genes. The primary outputs are the gene presence/absence matrix (the pan-genome) and a core-genome alignment (for phylogenetics).
Inputs
record (
meta: Record,
gff: Set<Path>
)
| Field | Type | Description |
|---|---|---|
| meta | Record | Groovy Record containing sample information |
| gff | Set<Path> | A set of annotated genome files in GFF3 format (required input) |
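
The `gff` input is a set of GFF3 paths, so it can help to gather and sanity-check those files before handing them to the module. The following is a minimal sketch, not part of the module itself: the helper name `collect_gffs`, the accepted extensions, and the "at least two genomes" check are all assumptions made for illustration.

```python
from pathlib import Path

# Assumed extensions for GFF3 annotations; adjust to match your files.
GFF_SUFFIXES = {".gff", ".gff3"}


def collect_gffs(directory):
    """Return the set of GFF3 files found directly inside `directory`.

    Hypothetical helper: mirrors the `gff: Set<Path>` input shape.
    """
    gffs = {
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in GFF_SUFFIXES
    }
    if len(gffs) < 2:
        # Assumption: a pangenome needs at least two genomes to be meaningful.
        raise ValueError(
            f"expected at least two GFF3 files in {directory}, found {len(gffs)}"
        )
    return gffs
```

Using a set rather than a list matches the declared input type and makes duplicate paths collapse automatically.
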
Outputs
record (
meta: Record,
aln: Path?,
filtered_aln: Path?,
csv: Path?,
panaroo_csv: Path?,
results: Set<Path>,
logs: Set<Path?>,
nf_logs: Set<Path>,
versions: Set<Path>
)
| Field | Type | Description |
|---|---|---|
| meta | Record | Sample information record |
| aln | Path? | The core-genome alignment (*core-genome.aln.gz), suitable for phylogenetic tree building |
| filtered_aln | Path? | The core-genome alignment with highly recombinant regions filtered out |
| csv | Path? | Gene presence/absence matrix in Roary-compatible CSV format |
| panaroo_csv | Path? | Gene presence/absence matrix in Panaroo's native CSV format |
| results | Set<Path> | All output files to be published |
| logs | Set<Path?> | Optional program-specific log files |
| nf_logs | Set<Path> | Nextflow-specific log files (e.g. .command.{begin |
| versions | Set<Path> | A YAML formatted file with program versions |
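
A gene presence/absence matrix is commonly summarized by listing the genes found in every sample (or nearly every sample). The sketch below shows one way to do that with the standard library. It assumes each row is a gene cluster, the first column holds the gene name, sample columns follow a fixed number of leading metadata columns, and an empty cell means "absent". The function name `core_genes` and the three-column demo header are illustrative assumptions; the Roary-compatible file carries more leading columns than the demo, so check the header of your own file before choosing `n_meta_cols`.

```python
import csv
import io


def core_genes(handle, n_meta_cols, min_fraction=1.0):
    """Return gene names present in at least `min_fraction` of samples.

    `handle` is an open text file; `n_meta_cols` is the number of leading
    non-sample columns (an assumption the caller must verify).
    """
    reader = csv.reader(handle)
    header = next(reader)
    n_samples = len(header) - n_meta_cols
    if n_samples <= 0:
        raise ValueError("no sample columns found")
    core = []
    for row in reader:
        cells = row[n_meta_cols:]
        # A non-empty cell is treated as "gene present in this sample".
        present = sum(1 for cell in cells if cell.strip())
        if present / n_samples >= min_fraction:
            core.append(row[0])
    return core


# Tiny made-up matrix, for illustration only.
demo = io.StringIO(
    "Gene,Non-unique Gene name,Annotation,s1,s2,s3\n"
    "dnaA,,replication initiator,s1_0001,s2_0001,s3_0001\n"
    "blaZ,,beta-lactamase,s1_0042,,\n"
)
print(core_genes(demo, n_meta_cols=3))  # prints ['dnaA']
```

Lowering `min_fraction` (for example to 0.95) gives a "soft core" style summary instead of a strict one.
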
Parameters
Panaroo Run Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| --panaroo_merge_paralogs | boolean | false | Do not split paralogs |
| --panaroo_opts | string | | Additional options to pass to panaroo |
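
When these parameters are assembled programmatically, the main pitfall is quoting: the value of `--panaroo_opts` is itself a string of options and should travel as a single argument. The sketch below builds just the parameter fragment of a command line. The helper name `panaroo_params` is hypothetical, the launcher command is deliberately left out, and `--clean-mode strict` is only an illustrative Panaroo option, not a recommendation.

```python
import shlex


def panaroo_params(merge_paralogs=False, extra_opts=""):
    """Build the parameter arguments for this module as an argv fragment."""
    args = []
    if merge_paralogs:
        # Boolean parameter: the bare flag switches it from its default of false.
        args.append("--panaroo_merge_paralogs")
    if extra_opts:
        # Keep the whole option string as ONE argument so it is passed on intact.
        args.extend(["--panaroo_opts", extra_opts])
    return args


fragment = panaroo_params(merge_paralogs=True, extra_opts="--clean-mode strict")
# shlex.join adds the shell quoting needed around the option string.
print(shlex.join(fragment))
# prints: --panaroo_merge_paralogs --panaroo_opts '--clean-mode strict'
```

Building the arguments as a list and quoting only at display time avoids the option string being split into separate arguments by the shell.
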
Used By
Subworkflows
- panaroo - Build a pangenome from GFF3 annotations using Panaroo.
Citations
If you use this in your analysis, please cite the following.
- Bactopia - Petit III RA, Read TD. Bactopia - a flexible pipeline for complete analysis of bacterial genomes. mSystems 5 (2020)
- Panaroo - Tonkin-Hill G, MacAlasdair N, Ruis C, Weimann A, Horesh G, Lees JA, Gladstone RA, Lo S, Beaudoin C, Floto RA, Frost SDW, Corander J, Bentley SD, Parkhill J. Producing polished prokaryotic pangenomes with the Panaroo pipeline. Genome Biology 21(1), 180 (2020)
- MAFFT - Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772-780 (2013)
Source
Version
PANAROO_RUN:
- panaroo: 1.6.0