blast_blastx
Tags: blast blastx alignment translation protein dna search fasta sample-scope
Search a protein database using a translated nucleotide query.
Uses BLASTX to translate nucleotide query sequences (FASTA) in all six reading frames and align them against a protein BLAST database. This is useful for identifying potential coding regions in unannotated DNA.
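To make the six-frame translation concrete, here is a minimal Python sketch of the idea. It is an illustration only, not how BLASTX is implemented: it assumes the standard genetic code (NCBI translation table 1), the function names are hypothetical, and any codon containing a non-ACGT base is reported as `X`.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from the start of seq; unknown codons become X."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Map frame labels (1, 2, 3, -1, -2, -3) to their protein translations."""
    reverse = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(seq[offset:])         # forward strand
        frames[-(offset + 1)] = translate(reverse[offset:])  # reverse strand
    return frames
```

For example, `six_frame_translations("ATGGCCTAA")[1]` gives `"MA*"`, where `*` marks a stop codon. Each of the six translations is a candidate protein sequence that can be aligned against the protein database.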
Inputs
record (
meta: Record,
blastdb: Path
)
| Field | Type | Description |
|---|---|---|
| meta | Record | Groovy Record containing sample information |
| blastdb | Path | A compressed tarball containing the protein BLAST database |
query: Path
| Name | Type | Description |
|---|---|---|
| query | Path | FASTA file containing nucleotide query sequences |
Outputs
record (
meta: Record,
tsv: Path,
results: Set<Path>,
logs: Set<Path?>,
nf_logs: Set<Path>,
versions: Set<Path>
)
| Field | Type | Description |
|---|---|---|
| meta | Record | Sample information record |
| tsv | Path | Tab-delimited translated nucleotide-to-protein alignment results (BLAST outfmt 6) |
| results | Set<Path> | All output files to be published |
| logs | Set<Path?> | Optional program-specific log files |
| nf_logs | Set<Path> | Nextflow-specific log files (e.g. .command.{begin |
| versions | Set<Path> | A YAML-formatted file with program versions |
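As a hedged sketch of how the `tsv` output might be consumed downstream, the Python below reads a tab-delimited file using the default `--blastx_outfmt` column order listed under Parameters. It assumes the file has no header row (a header is skipped if one is present), and the helper names and sample row are made up for illustration. The coverage helper only approximates per-HSP query coverage; BLAST's own rounding may differ.

```python
import csv
import io

# Default column order, copied from the --blastx_outfmt default.
DEFAULT_COLUMNS = (
    "sseqid qseqid pident qlen slen length nident positive mismatch "
    "gapopen gaps qstart qend sstart send evalue bitscore"
).split()
FLOAT_COLUMNS = {"pident", "evalue", "bitscore"}
INT_COLUMNS = {
    "qlen", "slen", "length", "nident", "positive", "mismatch",
    "gapopen", "gaps", "qstart", "qend", "sstart", "send",
}


def convert(name, value):
    """Cast a raw string field to float or int based on its column name."""
    if name in FLOAT_COLUMNS:
        return float(value)
    if name in INT_COLUMNS:
        return int(value)
    return value


def read_hits(handle, columns=DEFAULT_COLUMNS):
    """Return one dict per alignment row, keyed by column name."""
    hits = []
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if not row or row[0] == columns[0]:  # skip blank lines or a header row
            continue
        hits.append({name: convert(name, value) for name, value in zip(columns, row)})
    return hits


def query_coverage_hsp(hit):
    """Approximate percent of the query covered by this HSP."""
    return 100.0 * (abs(hit["qend"] - hit["qstart"]) + 1) / hit["qlen"]


# Made-up example row, for illustration only.
SAMPLE_ROW = "prot1\tcontig1\t98.5\t900\t300\t290\t286\t288\t4\t0\t0\t1\t870\t5\t294\t1e-150\t540\n"
hits = read_hits(io.StringIO(SAMPLE_ROW))
print(hits[0]["sseqid"], round(query_coverage_hsp(hits[0]), 1))
```

If `--blastx_outfmt` was changed for a run, pass the same column list as the `columns` argument so that names and values stay aligned.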
Parameters
BLASTX Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| --blastx_query | string | | A FASTA file containing the query sequences to BLAST against the database |
| --blastx_outfmt | string | sseqid qseqid pident qlen slen length nident positive mismatch gapopen gaps qstart qend sstart send evalue bitscore | The columns to include with -outfmt 6 |
| --blastx_opts | string | | Additional options to pass to BLASTX |
| --blastx_qcov_hsp_perc | integer | 50 | Percent query coverage per HSP |
| --blastx_max_target_seqs | integer | 2000 | Maximum number of aligned sequences to keep |
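The parameters above correspond to standard `blastx` command-line options (`-outfmt`, `-qcov_hsp_perc`, `-max_target_seqs`). The Python sketch below shows one plausible way they could be assembled into a command. How this module actually builds its command is not shown here, so treat the function name, argument names, and file names as hypothetical. It also assumes the database tarball has already been extracted and that `db` is the database prefix.

```python
import shlex

# Defaults copied from the parameter table.
DEFAULT_OUTFMT = (
    "sseqid qseqid pident qlen slen length nident positive mismatch "
    "gapopen gaps qstart qend sstart send evalue bitscore"
)


def build_blastx_command(query, db, outfmt=DEFAULT_OUTFMT,
                         qcov_hsp_perc=50, max_target_seqs=2000, opts=""):
    """Return a blastx argument list built from the module-style parameters."""
    command = [
        "blastx",
        "-query", str(query),
        "-db", str(db),
        "-outfmt", f"6 {outfmt}",  # tabular output with the chosen columns
        "-qcov_hsp_perc", str(qcov_hsp_perc),
        "-max_target_seqs", str(max_target_seqs),
    ]
    command.extend(shlex.split(opts))  # extra options, e.g. "-evalue 1e-5"
    return command


# Hypothetical file names, for illustration only.
print(shlex.join(build_blastx_command("query.fasta", "proteins")))
```

Returning a list rather than a single string keeps the multi-word `-outfmt` value as one argument, which avoids shell quoting mistakes when the command is run with `subprocess.run`.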
Used By
Subworkflows
- blastx - Translate nucleotide sequences and search protein database.
Workflows
- blastx - Search against protein BLAST databases using translated nucleotide queries.
Citations
If you use this in your analysis, please cite the following.
- Bactopia
  Petit III RA, Read TD. Bactopia - a flexible pipeline for complete analysis of bacterial genomes. mSystems 5 (2020)
- BLAST
  Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. BLAST+: architecture and applications. BMC Bioinformatics 10, 421 (2009)
Source
Version
BLAST_BLASTX:
- blast: 2.17.0