pirate
Tags: pangenome pan-genome comparative-genomics core-genome alignment run-scope
Build a pangenome from GFF3 annotations using PIRATE.
This subworkflow creates a pangenome from bacterial genome annotations using PIRATE, a scalable pangenome toolbox that clusters orthologous genes at multiple identity thresholds. It is particularly useful for highly diverse datasets because it can handle divergent gene families, and its clustering options can be adjusted to suit the analysis.
Take
gff: Channel<Record>
| Field | Description |
|---|---|
| meta | Groovy Record containing sample information |
| gff | Set of GFF3 annotation files representing the genomic annotations for each sample |
Emit
Published
The sample_outputs and run_outputs emissions are aggregates of output files that will be published in the entry workflow.
sample_outputs
No sample-scope outputs.
run_outputs
| Output | Description |
|---|---|
| aln | Core genome alignment in FASTA format (optional) |
| csv | Gene presence/absence matrix in CSV format |
| supplemental | Directory containing PIRATE intermediate files and detailed outputs |
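
As a rough illustration of how a gene presence/absence matrix might be used downstream, the sketch below lists gene families found in a chosen fraction of samples. It assumes a simplified layout (first column is the gene family ID, every other column is one sample, and a non-empty cell means the gene is present); the actual columns of the published `csv` output may differ, so inspect the header before adapting it. The function name `core_gene_families`, the `min_fraction` parameter, and the toy data are hypothetical.

```python
import csv
import io


def core_gene_families(csv_text, min_fraction=1.0):
    """Return gene families present in at least `min_fraction` of samples.

    Assumed layout (not taken from PIRATE itself): column 0 holds the gene
    family ID, each remaining column is one sample, and any non-empty cell
    counts as "present".
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    n_samples = len(header) - 1
    core = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        present = sum(1 for cell in row[1:] if cell.strip())
        if n_samples and present / n_samples >= min_fraction:
            core.append(row[0])
    return core


# Toy matrix with three samples; the values are made-up locus tags.
example = """gene_family,sampleA,sampleB,sampleC
g0001,A_0001,B_0001,C_0001
g0002,A_0002,,C_0002
g0003,,,C_0003
"""

core = core_gene_families(example)                        # present in all samples
relaxed = core_gene_families(example, min_fraction=0.6)   # present in >= 60% of samples
print(core, relaxed)
```

Lowering `min_fraction` gives a relaxed "soft core" definition, which can be a practical choice when some assemblies in the run are incomplete.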
Module Composition
This subworkflow calls the following modules:
- pirate - Pangenome Iterative Refinement and Threshold Evaluation (PIRATE).
Citations
If you use this in your analysis, please cite the following.
- Bactopia
  Petit III RA, Read TD. Bactopia - a flexible pipeline for complete analysis of bacterial genomes. mSystems 5 (2020)
- PIRATE
  Bayliss SC, Thorpe HA, Coyle NM, Sheppard SK, Feil EJ. PIRATE: A fast and scalable pangenomics toolbox for clustering diverged orthologues in bacteria. GigaScience 8 (2019)