pirate
Tags: pan-genome orthologs core-genome gene-presence-absence epidemiology annotation run-scope
Pangenome Iterative Refinement and Threshold Evaluation (PIRATE).
Uses PIRATE to construct the pangenome of a collection of bacterial isolates. It clusters orthologous genes and generates a core-genome alignment and a gene presence/absence matrix; the matrix is compatible with downstream analysis tools such as Scoary for association testing.
Inputs
record (
meta: Record,
gff: Set<Path>
)
| Field | Type | Description |
|---|---|---|
| meta | Record | Groovy Record containing sample information |
| gff | Set<Path> | A list of annotated genome files in GFF3 format |
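
Because the `gff` input expects GFF3 files, it can help to check candidate files before handing them to the module. The following is a minimal sketch, not part of the module itself: the function names `is_gff3` and `collect_gff3`, the file extensions searched, and the gzip handling are all assumptions made for illustration. It relies only on the GFF3 convention that the first line is a `##gff-version 3` directive.

```python
import gzip
from pathlib import Path


def is_gff3(path: Path) -> bool:
    """Return True if the first line is a '##gff-version 3' directive."""
    # Assumption: files ending in .gz are gzip-compressed text.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt") as handle:
        parts = handle.readline().split()
    return len(parts) >= 2 and parts[0] == "##gff-version" and parts[1].startswith("3")


def collect_gff3(directory: Path) -> set[Path]:
    """Collect files in `directory` that look like GFF3 (hypothetical helper)."""
    # Assumption: these extensions cover the annotation files of interest.
    patterns = ("*.gff", "*.gff3", "*.gff.gz", "*.gff3.gz")
    candidates = [p for pattern in patterns for p in directory.glob(pattern)]
    return {p for p in candidates if is_gff3(p)}
```

A set is returned to mirror the `Set<Path>` type of the `gff` field; files without the version directive are silently skipped.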
Outputs
record (
meta: Record,
aln: Path?,
csv: Path?,
results: Set<Path>,
logs: Set<Path?>,
nf_logs: Set<Path>,
versions: Set<Path>
)
| Field | Type | Description |
|---|---|---|
| meta | Record | Sample information record |
| aln | Path? | The core-genome alignment (*core-genome.aln.gz), suitable for phylogenetic tree building |
| csv | Path? | Gene presence/absence matrix in CSV format, compatible with Scoary |
| results | Set<Path> | All output files to be published |
| logs | Set<Path?> | Optional program specific log files |
| nf_logs | Set<Path> | Nextflow-specific log files (e.g. .command.{begin |
| versions | Set<Path> | A YAML formatted file with program versions |
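
As an illustration of how the `csv` output might be inspected downstream, here is a minimal sketch that splits gene families into core and accessory sets. It assumes a Roary-style layout (the format Scoary reads), in which the first 14 columns hold per-gene metadata and each remaining column is one sample, with an empty cell meaning the gene is absent. The function name, the `n_metadata` default, and the 99% core threshold are assumptions; check the header of your own file before relying on them.

```python
import csv

# Assumption: Roary-style layout with 14 leading metadata columns.
N_METADATA_COLUMNS = 14


def summarize_presence_absence(csv_path, n_metadata=N_METADATA_COLUMNS, core_fraction=0.99):
    """Split gene families into core and accessory by sample prevalence."""
    core, accessory = [], []
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        samples = header[n_metadata:]
        for row in reader:
            if not row:
                continue  # skip blank lines
            # A non-empty cell means the gene family is present in that sample.
            present = sum(1 for cell in row[n_metadata:] if cell.strip())
            if samples and present / len(samples) >= core_fraction:
                core.append(row[0])
            else:
                accessory.append(row[0])
    return {"samples": samples, "core": core, "accessory": accessory}
```

Keeping `n_metadata` and `core_fraction` as arguments makes it easy to adapt the sketch if the file carries a different number of leading columns or if a different core definition is preferred.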
Parameters
PIRATE Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| --use_pirate | boolean | false | Use PIRATE instead of panaroo in the 'pangenome' subworkflow |
| --pirate_steps | string | 50,60,70,80,90,95,98 | Percent identity thresholds to use for pangenome construction |
| --pirate_features | string | CDS | Comma-delimited features to use for pangenome construction |
| --pirate_para_off | boolean | false | Switch off paralog identification |
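
Since `--pirate_steps` is passed as a comma-delimited string, a wrapper script may want to validate it before launching a run. The sketch below is one possible check, not PIRATE's own parsing: the function name `parse_pirate_steps` is hypothetical, and the restrictions to whole-number percentages between 1 and 100, along with sorting and de-duplicating the result, are assumptions chosen to match the shape of the default value.

```python
def parse_pirate_steps(value: str) -> list[int]:
    """Parse a comma-delimited list of percent identity thresholds."""
    steps = []
    for token in value.split(","):
        token = token.strip()
        # Assumption: thresholds are whole-number percentages.
        if not token.isdigit():
            raise ValueError(f"not an integer threshold: {token!r}")
        step = int(token)
        if not 1 <= step <= 100:
            raise ValueError(f"threshold out of range (1-100): {step}")
        steps.append(step)
    # Sorting and de-duplicating is a convenience of this sketch,
    # not documented PIRATE behavior.
    return sorted(set(steps))
```

For example, `parse_pirate_steps("50,60,70,80,90,95,98")` returns the thresholds as a list of integers, while a value such as `"50,abc"` raises `ValueError`.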
Used By
Subworkflows
- pirate - Build a pangenome from GFF3 annotations using PIRATE.
Citations
If you use this in your analysis, please cite the following.
- Bactopia
  Petit III RA, Read TD Bactopia - a flexible pipeline for complete analysis of bacterial genomes. mSystems 5 (2020)
- PIRATE
  Bayliss SC, Thorpe HA, Coyle NM, Sheppard SK, Feil EJ PIRATE: A fast and scalable pangenomics toolbox for clustering diverged orthologues in bacteria. Gigascience 8 (2019)
Source
Version
PIRATE:
- pirate: 1.0.5