sylph

Tags: metagenome profiling composition abundance kmer taxonomic sample-scope

Profile microbial composition using Sylph.

This subworkflow estimates microbial composition directly from sequencing reads using Sylph. It provides rapid and accurate abundance estimates by comparing k-mer signatures against a reference genome database. Sylph can process both short and long reads, offering taxonomic profiling from species to strain level with confidence estimates for each identification.

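Reads are supplied as a record with one explicit field per read type:

Input: `record(meta, r1, r2, se, lr)`, where each read slot (`r1`, `r2`, `se`, `lr`) has type `Path?`, meaning the path is optional and may be unset.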

Take

reads: Channel<Record>

Field	Description
`meta`	Groovy Record containing sample information
`r1`	Illumina R1 reads (paired-end)
`r2`	Illumina R2 reads (paired-end)
`se`	Single-end Illumina reads
`lr`	Long reads (ONT/PacBio)
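
As a rough illustration of the record layout above, the following Python sketch models the five fields and reports which read slots are populated. It is not part of the subworkflow, which is written in Nextflow. The class name `ReadsRecord`, its helper methods, the layout labels, and the file names are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


@dataclass
class ReadsRecord:
    """Hypothetical stand-in for record(meta, r1, r2, se, lr)."""

    meta: dict                    # sample information, e.g. an identifier
    r1: Optional[Path] = None     # Illumina R1 reads (paired-end)
    r2: Optional[Path] = None     # Illumina R2 reads (paired-end)
    se: Optional[Path] = None     # single-end Illumina reads
    lr: Optional[Path] = None     # long reads (ONT/PacBio)

    def populated_slots(self) -> List[str]:
        """Return the names of the read slots that hold a path."""
        return [name for name in ("r1", "r2", "se", "lr")
                if getattr(self, name) is not None]

    def layout(self) -> str:
        """Give an illustrative label for the populated slots."""
        slots = set(self.populated_slots())
        if slots == {"r1", "r2"}:
            return "paired-end"
        if slots == {"se"}:
            return "single-end"
        if slots == {"lr"}:
            return "long-read"
        return "other"  # empty or mixed; the real handling is not shown here


# Example file names are made up.
paired = ReadsRecord(
    meta={"id": "sampleA"},
    r1=Path("sampleA_R1.fastq.gz"),
    r2=Path("sampleA_R2.fastq.gz"),
)
long_only = ReadsRecord(meta={"id": "sampleB"}, lr=Path("sampleB.fastq.gz"))

print(paired.populated_slots(), paired.layout())
print(long_only.populated_slots(), long_only.layout())
```

Keeping every read type in its own optional slot means a consumer can check which slots are set instead of inferring the read type from the number or order of files.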

database: Path

Name	Type	Description
`database`	`Path`	Path to Sylph reference database directory containing pre-computed k-mer signatures of reference genomes for taxonomic classification.

Emit

Published

The `sample_outputs` and `run_outputs` emissions are aggregates of output files that will be published in the entry workflow.

`sample_outputs`

Output	Description
`tsv`	TSV file with profiling results

`run_outputs`

Output	Description
`csv`	Aggregated profiling results in CSV format

Module Composition

This subworkflow calls the following modules:

sylph_profile - Profile metagenome samples against a database using Sylph.
csvtk_concat - Concatenate multiple CSV or TSV files into a single table.
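
The aggregation step can be pictured as stacking per-sample tables that share a header into one run-level table. The Python sketch below shows that idea only; it does not reproduce how csvtk works internally. The function name `concat_tables`, the column names, and the values are hypothetical and are not Sylph's real output columns. The comma output delimiter is an assumption based on the `csv` output label.

```python
import csv
import io
from typing import Iterable


def concat_tables(tables: Iterable[str],
                  in_delim: str = "\t",
                  out_delim: str = ",") -> str:
    """Stack delimited tables that share a header, writing the header once."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter=out_delim, lineterminator="\n")
    header = None
    for text in tables:
        rows = list(csv.reader(io.StringIO(text), delimiter=in_delim))
        if not rows:
            continue  # skip empty inputs
        if header is None:
            header = rows[0]
            writer.writerow(header)
        elif rows[0] != header:
            raise ValueError("tables do not share the same header")
        writer.writerows(rows[1:])  # data rows only
    return out.getvalue()


# Made-up per-sample tables with hypothetical columns.
sample_a = "sample\tgenome\tabundance\nA\tg1\t60.0\nA\tg2\t40.0\n"
sample_b = "sample\tgenome\tabundance\nB\tg1\t100.0\n"

merged = concat_tables([sample_a, sample_b])
print(merged)
```

Raising an error when headers differ is a deliberate choice for this sketch: silently stacking tables with different columns would produce a misaligned result.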

Used By

This subworkflow is used by the following workflows:

sylph - Taxonomic profiling by abundance-corrected MinHash.

Citations

If you use this in your analysis, please cite the following.

Bactopia
Petit III RA, Read TD. Bactopia: a flexible pipeline for complete analysis of bacterial genomes. mSystems 5 (2020)
Sylph
Shaw J, Yu YW. Rapid species-level metagenome profiling and containment estimation with sylph. Nature Biotechnology (2024)

Source

View source on GitHub
