snippy_run
Tags: variant-calling snp reference-mapping phylogenetics outbreak sample-scope
Call variants against a reference genome using Snippy.
This subworkflow performs rapid haploid variant calling from bacterial sequence reads using Snippy. It maps reads to a reference genome, identifies SNPs and indels, and generates consensus sequences. The tool produces multiple output formats including VCF, aligned FASTA, and annotated variants for downstream phylogenetic analysis with snippy-core.
Uses explicit positional record fields for reads:
- Input: record(meta, r1, r2, se, lr), where each read slot (r1, r2, se, lr) is an optional path (Path?)
Take
reads: Channel<Record>
| Field | Description |
|---|---|
| meta | Groovy Record containing sample information |
| r1 | Illumina R1 reads (paired-end) |
| r2 | Illumina R2 reads (paired-end) |
| se | Single-end Illumina reads |
| lr | Long reads (ONT/PacBio) |
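As an illustration of the record layout above, the following Python sketch mirrors record(meta, r1, r2, se, lr) with every read slot optional. It is not the subworkflow's own code: the `ReadsRecord` class, the `populated_slots` helper, the `id` key in `meta`, and the file names are all hypothetical.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


@dataclass
class ReadsRecord:
    """Hypothetical Python mirror of record(meta, r1, r2, se, lr)."""

    meta: dict                  # sample information (keys are assumptions)
    r1: Optional[Path] = None   # Illumina R1 reads (paired-end)
    r2: Optional[Path] = None   # Illumina R2 reads (paired-end)
    se: Optional[Path] = None   # single-end Illumina reads
    lr: Optional[Path] = None   # long reads (ONT/PacBio)


def populated_slots(rec: ReadsRecord) -> List[str]:
    """Return the names of the read slots that actually hold a path."""
    return [name for name in ("r1", "r2", "se", "lr") if getattr(rec, name) is not None]


# A paired-end sample fills r1 and r2 and leaves se and lr empty.
paired = ReadsRecord(
    meta={"id": "sample01"},
    r1=Path("sample01_R1.fastq.gz"),
    r2=Path("sample01_R2.fastq.gz"),
)
print(populated_slots(paired))
```

Keeping each slot at a fixed, named position means a consumer can check which read types are present without inspecting file names.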
reference: Path
| Name | Type | Description |
|---|---|---|
| reference | Path | Reference genome in GenBank format (preferred, for annotation) or FASTA format |
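Because the reference may be supplied as either GenBank or FASTA, it can be useful to check which one a file is before a run. The sketch below relies only on two general format conventions: GenBank flat files begin with a `LOCUS` line, and FASTA records begin with `>`. The function name `guess_reference_format` is hypothetical, and the sketch assumes an uncompressed text file.

```python
from pathlib import Path


def guess_reference_format(path: Path) -> str:
    """Return 'genbank', 'fasta', or 'unknown' from the first non-blank line."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            stripped = line.strip()
            if not stripped:
                continue  # skip leading blank lines
            if stripped.startswith("LOCUS"):
                return "genbank"
            if stripped.startswith(">"):
                return "fasta"
            return "unknown"  # first real line matches neither convention
    return "unknown"  # empty file
```

A result of `'fasta'` indicates that the reference carries no annotation, which is why the GenBank format is listed as preferred.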
Emit
Published
The sample_outputs and run_outputs emissions are aggregates of output files that will be published in the entry workflow.
sample_outputs
| Output | Description |
|---|---|
| aligned_fa | A version of the reference with - at zero coverage positions |
| vcf | The final annotated variants in VCF format |
| aligned_fa_error | Aligned FASTA file generated during error state |
| vcf_error | VCF file generated during error state |
| error | Error log text file |
| annotated_vcf | Annotated VCF file |
| bam | The alignments in BAM format (includes unmapped/multimapping) |
| bai | Index for the BAM file |
| bed | The variants in BED format |
| consensus_fa | Reference genome with all variants instantiated |
| consensus_subs_fa | Reference genome with only substitution variants instantiated |
| consensus_subs_masked_fa | Reference genome with substitutions instantiated and low coverage masked |
| coverage | Per-base coverage depth information |
| csv | A comma-separated summary of variants |
| filt_vcf | The filtered variant calls from Freebayes |
| gff | The variants in GFF3 format |
| html | An HTML summary of the variants |
| raw_vcf | The unfiltered variant calls from Freebayes |
| subs_vcf | VCF containing only substitution variants |
| tab | A simple tab-separated summary of all variants |
| txt | Tab-separated columnar list of alignment statistics |
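Since aligned_fa marks zero coverage positions with `-`, a quick per-sequence summary of uncovered reference can be read straight from that file. The following is a minimal sketch under that single assumption; the function name `zero_coverage_fraction` and the example sequences are hypothetical.

```python
from typing import Dict, Iterable


def zero_coverage_fraction(fasta_lines: Iterable[str]) -> Dict[str, float]:
    """Map each sequence ID to the fraction of its positions written as '-'."""
    gaps: Dict[str, int] = {}
    lengths: Dict[str, int] = {}
    current = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The sequence ID is the first whitespace-delimited token of the header.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            gaps[current] = 0
            lengths[current] = 0
        elif current is not None:
            gaps[current] += line.count("-")
            lengths[current] += len(line)
    # Sequences with no residues are skipped to avoid dividing by zero.
    return {name: gaps[name] / lengths[name] for name in lengths if lengths[name] > 0}


example = [">contig1 example", "ACGT--", "AC", ">contig2", "----"]
print(zero_coverage_fraction(example))
```

For a real file, pass an open handle instead of a list, for example `zero_coverage_fraction(open(path))`, where `path` points at a sample's aligned_fa output.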
run_outputs
No run-scope outputs.
Downstream Inputs
The following emissions are meant to be used as inputs to downstream subworkflows.
variants
Per-sample VCF and aligned FASTA files, restricted to samples that produced variant data
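The filtering described for this emission can be pictured with a short Python sketch. It assumes, purely for illustration, that a sample counts as having variant data when both its VCF and its aligned FASTA are present; the actual criterion used by the subworkflow is not stated here. `SampleOutputs`, `with_variant_data`, and the file names are hypothetical.

```python
from typing import List, NamedTuple, Optional


class SampleOutputs(NamedTuple):
    """Hypothetical per-sample pairing of the vcf and aligned_fa outputs."""

    sample: str
    vcf: Optional[str]
    aligned_fa: Optional[str]


def with_variant_data(samples: List[SampleOutputs]) -> List[SampleOutputs]:
    """Keep samples where both files exist (an assumed definition of 'variant data')."""
    return [s for s in samples if s.vcf is not None and s.aligned_fa is not None]


samples = [
    SampleOutputs("sample01", "sample01.vcf", "sample01.aligned.fa"),
    SampleOutputs("sample02", None, None),  # e.g. a sample that ended in the error state
]
print([s.sample for s in with_variant_data(samples)])
```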
Module Composition
This subworkflow calls the following modules:
- snippy_run - Rapid haploid variant calling and core genome alignment.
Used By
This subworkflow is used by the following workflows:
- snippy - Rapid haploid variant calling and core genome alignment.
Citations
If you use this in your analysis, please cite the following.
- Bactopia
  Petit III RA, Read TD. Bactopia - a flexible pipeline for complete analysis of bacterial genomes. mSystems 5 (2020)
- Snippy
  Seemann T. Snippy: fast bacterial variant calling from NGS reads (GitHub)