
nohuman

Tags: human contamination decontamination scrubbing reads nohuman kraken2 sample-scope

Remove human reads from sequencing data using nohuman.

This subworkflow uses nohuman to identify and remove human reads from FASTQ files using a Kraken2 database built from Human Pangenome Reference Consortium (HPRC) genomes. If download_nohuman is true, the database is downloaded; otherwise the path supplied as database is used.

Take

reads: Channel<Record>
Field  Description
meta   Groovy Record containing sample information
r1     Illumina R1 reads (paired-end forward)
r2     Illumina R2 reads (paired-end reverse)
se     Single-end Illumina reads
lr     Long reads (ONT/PacBio)
database: Path?
download_nohuman: Boolean
save_as_tarball: Boolean
Name              Type     Description
database          Path?    Path to the nohuman database directory or tarball (ignored if download_nohuman is true)
download_nohuman  Boolean  Boolean flag to download the database instead of using the provided path
save_as_tarball   Boolean  Boolean flag to save the downloaded database as a tarball
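
The way the three database inputs interact can be modelled with a short sketch. This is not the subworkflow's code: it is a minimal Python illustration of the rules in the table, and the names DatabasePlan and plan_database are hypothetical. Two behaviours are assumptions and are marked as such in the comments: that a missing database with download_nohuman set to false is an error, and that save_as_tarball has no effect on a user-supplied database.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class DatabasePlan:
    """Hypothetical result type for illustration; not part of the subworkflow."""
    action: str               # "download" or "use_existing"
    source: Optional[Path]    # the provided path, or None when downloading
    save_as_tarball: bool     # only meaningful for a downloaded database


def plan_database(database: Optional[Path],
                  download_nohuman: bool,
                  save_as_tarball: bool) -> DatabasePlan:
    """Decide where the nohuman database comes from, following the inputs table."""
    # Documented rule: `database` is ignored when download_nohuman is true.
    if download_nohuman:
        return DatabasePlan("download", None, save_as_tarball)
    # Assumption: without a download, a database path has to be supplied.
    if database is None:
        raise ValueError("provide `database` or set download_nohuman=True")
    # Assumption: save_as_tarball does not apply to a user-supplied database.
    return DatabasePlan("use_existing", database, False)


if __name__ == "__main__":
    # The provided path is ignored here because a download was requested.
    print(plan_database(Path("db/nohuman"), download_nohuman=True, save_as_tarball=True))
    # Here the provided path is used as-is.
    print(plan_database(Path("db/nohuman"), download_nohuman=False, save_as_tarball=True))
```

Returning a small immutable record rather than acting immediately keeps the decision easy to check separately from any download or file handling.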

Emit

Published

The sample_outputs and run_outputs emissions are aggregates of output files that will be published in the entry workflow.

sample_outputs

Output        Description
scrubbed      FASTQ files with human reads removed
scrub_report  Kraken2 classification report (optional)

run_outputs

No run-scope outputs.

Module Composition

This subworkflow calls the following modules:

Citations

If you use this in your analysis, please cite the following.
