snippy_core
Tags: variant-calling core-genome snp alignment phylogenetics run-scope
Generate a core-genome SNP alignment from per-sample Snippy outputs.
This subworkflow aggregates individual Snippy variant calls to produce a core-genome alignment using snippy-core. It identifies core SNPs (variant sites at reference positions that are present in every sample), generates a clean alignment suitable for phylogenetic analysis, and calculates pairwise SNP distances using snp-dists. The output can be used directly with tree-building tools like IQ-TREE, RAxML, or Gubbins.
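To make the idea of a core SNP concrete, here is a minimal Python sketch. It is an illustration only, not the snippy-core implementation: it assumes a site is "core" when every sample has an unambiguous A, C, G, or T at that column, and the sample names and sequences are hypothetical.

```python
def core_snp_columns(alignment):
    """Return indices of columns that are core (A/C/G/T in every sample)
    and polymorphic (more than one distinct base)."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all sequences must have the same length")

    keep = []
    for i in range(length):
        column = {alignment[name][i].upper() for name in names}
        # Assumption: gaps, N, and other ambiguity codes make a site non-core.
        if column <= set("ACGT") and len(column) > 1:
            keep.append(i)
    return keep


def extract_columns(alignment, columns):
    """Build a reduced alignment containing only the selected columns."""
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


# Hypothetical full alignment: column 2 has an N, so it is not core.
full = {
    "ref":     "ACGTACGT",
    "sampleA": "ACGTTCGT",
    "sampleB": "ACNTTCGA",
}
cols = core_snp_columns(full)
core = extract_columns(full, cols)
print(cols)
print(core)
```

In this toy input only columns 4 and 7 vary while being called in every sample, so the reduced alignment has two sites per sequence.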
Take
alignments: Channel<Record>
reference: Path
mask: Path?
| Name | Type | Description |
|---|---|---|
| alignments | `Channel<Record>` | Channel containing per-sample aligned FASTA files and VCFs from Snippy runs |
| reference | Path | Reference genome in GenBank or FASTA format used for variant calling |
| mask | Path? | Optional BED file of regions to mask from the core alignment (e.g., recombinant regions, repeat regions) |
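As a rough illustration of what a mask file describes, the Python sketch below parses BED intervals and checks whether a reference position falls inside one. It assumes standard BED coordinates (0-based start, exclusive end); the contig name `contig_1` and the region labels are hypothetical, and in practice the first column would need to match the sequence IDs of the reference.

```python
def parse_bed(text):
    """Parse BED text into (chrom, start, end) tuples, skipping header lines."""
    intervals = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        intervals.append((chrom, int(start), int(end)))
    return intervals


def is_masked(intervals, chrom, pos_1based):
    """True if a 1-based position lies inside any interval.

    BED intervals are 0-based and half-open, so [start, end) covers
    1-based positions start+1 through end.
    """
    return any(
        c == chrom and start < pos_1based <= end
        for c, start, end in intervals
    )


# Hypothetical mask with two regions on one contig.
bed_text = "contig_1\t1000\t2000\tphage_region\ncontig_1\t5000\t5100\trepeat\n"
mask = parse_bed(bed_text)
print(is_masked(mask, "contig_1", 1001))  # first masked base of region 1
print(is_masked(mask, "contig_1", 2001))  # first base after region 1
```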
Emit
Published
The sample_outputs and run_outputs emissions are aggregates of output files that will be published in the entry workflow.
sample_outputs
No sample-scope outputs.
run_outputs
| Output | Description |
|---|---|
| aln | Core SNP alignment in FASTA format (polymorphic sites only) |
| full_aln | Full core alignment including monomorphic sites |
| clean_full_aln | Cleaned full alignment with constant sites for phylogenetic inference |
| tab | Core SNPs in TAB format |
| vcf | Core SNPs in VCF format |
| txt | Core summary statistics (number of SNPs, core genome size) |
| samples | List of samples included in the core alignment |
| supplemental | Individual sample alignments and intermediate files |
| tsv | Pairwise SNP distance matrix from snp-dists |
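The pairwise distance matrix can be understood through a short Python sketch. This is not the snp-dists code: it assumes that only positions where both sequences carry A, C, G, or T are counted as differences, and the three-sample alignment is hypothetical.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences differ.

    Assumption: a position counts only if both bases are A, C, G, or T.
    """
    valid = set("ACGT")
    return sum(
        1
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if x != y and x in valid and y in valid
    )


def distance_matrix(alignment):
    """Build a symmetric sample-by-sample matrix of SNP distances."""
    names = list(alignment)
    matrix = {a: {b: 0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        d = snp_distance(alignment[a], alignment[b])
        matrix[a][b] = matrix[b][a] = d
    return matrix


# Hypothetical core SNP alignment with two sites.
aln = {"ref": "AT", "sampleA": "TT", "sampleB": "TA"}
matrix = distance_matrix(aln)

# Print as a tab-separated matrix with sample names as row and column labels.
names = list(aln)
print("\t".join(["sample"] + names))
for a in names:
    print("\t".join([a] + [str(matrix[a][b]) for b in names]))
```

Because the matrix is symmetric with a zero diagonal, each pair of samples only needs to be compared once.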
Downstream Inputs
The following emissions are meant to be used as inputs to downstream subworkflows.
alignment
| Output | Description |
|---|---|
| aln | Core-SNP alignment for downstream phylogenetic analysis |
Subworkflow Composition
This subworkflow calls the following subworkflows:
- snpdists - Calculate pairwise SNP distances from sequence alignments.
Module Composition
This subworkflow calls the following modules:
- snippy_core - Core-SNP alignment from Snippy outputs.
Used By
This subworkflow is used by the following workflows:
- snippy - Rapid haploid variant calling and core genome alignment.
Citations
If you use this in your analysis, please cite the following.
- Bactopia: Petit III RA, Read TD. Bactopia - a flexible pipeline for complete analysis of bacterial genomes. mSystems 5 (2020).
- Snippy: Seemann T. Snippy: fast bacterial variant calling from NGS reads (GitHub).
- snp-dists: Seemann T. snp-dists - Pairwise SNP distance matrix from a FASTA sequence alignment (GitHub).