csvtk_concat
Tags: utility table merge concat csv tsv csvtk run-scope
Concatenate multiple CSV or TSV files into a single table.
Uses csvtk concat to merge a set of delimited files row by row. Only one header row is kept in the merged table, and the output format may differ from the input format (e.g., merging CSVs but writing a TSV).
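The row-wise merge described above can be illustrated with a minimal Python sketch. It is not the module's implementation (the module calls csvtk); the function name `concat_tables` and its parameters are hypothetical, and the sketch assumes every input shares the same header, whereas csvtk itself may be more flexible.

```python
import csv
import io


def concat_tables(texts, in_delimiter=",", out_delimiter=","):
    """Concatenate delimited tables by row, keeping a single header.

    Hypothetical helper for illustration only; assumes identical headers.
    """
    header = None
    rows = []
    for text in texts:
        reader = csv.reader(io.StringIO(text), delimiter=in_delimiter)
        file_header = next(reader, None)
        if file_header is None:
            continue  # skip empty inputs
        if header is None:
            header = file_header  # keep the first header only
        elif file_header != header:
            raise ValueError("inputs do not share the same header")
        rows.extend(reader)  # data rows from this input

    out = io.StringIO()
    writer = csv.writer(out, delimiter=out_delimiter, lineterminator="\n")
    if header is not None:
        writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()


# Two CSV inputs merged and written as TSV
first = "sample,st\ns1,5\n"
second = "sample,st\ns2,8\n"
print(concat_tables([first, second], in_delimiter=",", out_delimiter="\t"))
```

Reading with one delimiter and writing with another is what makes the format conversion fall out of the same pass as the merge.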
Inputs
record (
meta: Record,
csv: Set<Path>
)
| Field | Type | Description |
|---|---|---|
| meta | Record | Groovy Record containing sample information |
| csv | Set<Path> | A set of CSV/TSV files to be concatenated |
in_format: String
out_format: String
| Name | Type | Description |
|---|---|---|
| in_format | String | Input format string ('csv', 'tsv', or a specific delimiter character) |
| out_format | String | Output format string ('csv', 'tsv', or a specific delimiter character) |
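As a rough illustration of the three accepted forms of a format string, the sketch below maps 'csv', 'tsv', or a single character onto a delimiter. How the module actually translates these values for csvtk is not shown on this page; the function name `resolve_delimiter` and the rejection of longer strings are assumptions.

```python
def resolve_delimiter(fmt):
    """Map a format string to a one-character delimiter.

    Hypothetical helper: 'csv' and 'tsv' are named formats, any other
    single character is taken as the delimiter itself.
    """
    named = {"csv": ",", "tsv": "\t"}
    if fmt in named:
        return named[fmt]
    if len(fmt) == 1:
        return fmt
    raise ValueError(f"unsupported format string: {fmt!r}")


# Named formats and a custom delimiter
print(repr(resolve_delimiter("csv")))
print(repr(resolve_delimiter("tsv")))
print(repr(resolve_delimiter(";")))
```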
Outputs
record (
meta: Record,
csv: Path,
results: Set<Path>,
logs: Set<Path?>,
nf_logs: Set<Path>,
versions: Set<Path>
)
| Field | Type | Description |
|---|---|---|
| meta | Record | Sample information record |
| csv | Path | Concatenated results from all samples in the specified output format |
| results | Set<Path> | All output files to be published |
| logs | Set<Path?> | Optional program-specific log files |
| nf_logs | Set<Path> | Nextflow-specific log files (e.g. .command.{begin |
| versions | Set<Path> | A YAML-formatted file with program versions |
Parameters
Used By
Subworkflows
- abritamr - Identify antimicrobial resistance genes using AMRFinderPlus.
- agrvate - Identify Staphylococcus aureus agr locus type and operon variants.
- amrfinderplus - Find antimicrobial resistance genes and point mutations.
- ariba - Rapidly identify genes by creating local assemblies from paired-end reads.
- bactopia_assembler - Assemble bacterial genomes using automated assembler selection.
- bactopia_gather - Search, validate, gather, and standardize input samples.
- blastn - Search a nucleotide database using nucleotide query sequences.
- blastp - Search protein sequences against protein database.
- blastx - Translate nucleotide sequences and search protein database.
- bracken - Estimate species abundance from metagenomic reads.
- btyper3 - In silico taxonomic classification of Bacillus cereus group genomes.
- busco - Assess genome assembly completeness using BUSCO.
- checkm - Assess metagenome bin completeness using CheckM.
- checkm2 - Assess metagenome bin completeness using CheckM2.
- clermontyping - Predict phylogroups of Escherichia coli from genome assemblies.
- defensefinder - Systematically search for anti-phage defense systems.
- ectyper - In silico prediction of Escherichia coli serotype.
- emmtyper - Predict emm types of Streptococcus pyogenes from genome assemblies.
- fastani - Calculate Average Nucleotide Identity (ANI) between genomes.
- gamma - Gene Allele Mutation Microbial Assessment.
- genotyphi - Assign genotypes to Salmonella Typhi genomes.
- gigatyper - Run all available MLST schemes for a species against an assembly.
- gtdb - Taxonomic classification with the Genome Taxonomy Database.
- hicap - In silico serotyping of the Haemophilus influenzae capsule locus.
- hpsuissero - Rapid Haemophilus parasuis serotyping.
- kleborate - Genotyping tool for Klebsiella pneumoniae and its related species complex.
- legsta - In silico Legionella pneumophila Sequence Based Typing.
- lissero - In silico serotype prediction for Listeria monocytogenes.
- mashdist - Calculate Mash distances between sequences and a reference.
- mcroni - Scripts for finding and processing promoter variants upstream of mcr-1.
- meningotype - Predict serotypes of Neisseria meningitidis from genome assemblies.
- midas - Species-level profiling from metagenomic data.
- mlst - Determine multilocus sequence types (MLST) from bacterial assemblies.
- mobsuite - Reconstruct and type plasmids from bacterial genome assemblies.
- mykrobe - Predict antibiotic resistance from sequence reads.
- ngmaster - Perform multi-antigen sequence typing of Neisseria gonorrhoeae from genome assemblies.
- pasty - Predict serogroups of Pseudomonas aeruginosa from assemblies.
- pbptyper - Predict penicillin binding protein (PBP) types of Streptococcus pneumoniae from genome assemblies.
- phispy - Prediction of prophages from bacterial genomes.
- plasmidfinder - Identify plasmid replicons in bacterial genome assemblies.
- quast - Evaluate assembly quality using QUAST.
- rgi - Predict antimicrobial resistance from protein or nucleotide data.
- sccmec - Identify SCCmec elements in Staphylococcus aureus genomes.
- scrubber - Remove contaminant sequences from metagenomic data.
- seqsero2 - Predict Salmonella serotypes from genome assemblies.
- seroba - k-mer based pipeline to identify the serotype of Streptococcus pneumoniae.
- shigapass - Predict serotypes of Shigella from assemblies.
- shigatyper - Predict serotypes of Shigella from reads or assemblies.
- shigeifinder - Predict serotypes of Shigella and EIEC from assemblies.
- sistr - Salmonella In Silico Typing Resource command-line tool.
- spatyper - Predict spa types of Staphylococcus aureus from genome assemblies.
- ssuissero - Predict serotypes of Streptococcus suis from genome assemblies.
- staphopiasccmec - Identify SCCmec elements in Staphylococcus aureus genomes using Staphopia method.
- stecfinder - Identify and serotype Shiga toxin-producing E. coli (STEC) from assemblies.
- sylph - Profile microbial composition using Sylph.
- tblastn - Search protein query sequences against nucleotide database.
- tblastx - Translate nucleotide query sequences and search nucleotide database.
- teton - Perform taxonomic classification and estimate bacterial genome sizes.
Workflows
- abritamr - A NATA accredited tool for reporting the presence of antimicrobial resistance genes.
- agrvate - Rapid identification of Staphylococcus aureus agr locus type and agr operon variants.
- amrfinderplus - Bactopia Tool: Amrfinderplus.
- ariba - Gene identification through local assemblies.
- bactopia - Comprehensive bacterial analysis pipeline for complete genomic characterization.
- blastn - Search against nucleotide BLAST databases using nucleotide queries.
- blastp - Search against protein BLAST databases using protein queries.
- blastx - Search against protein BLAST databases using translated nucleotide queries.
- bracken - Estimate taxonomic abundance of metagenomic samples.
- btyper3 - Taxonomic classification of Bacillus cereus group isolates.
- busco - Assessment of genome assembly completeness using evolutionarily informed expectations.
- checkm - Assessment of microbial genome assembly quality.
- checkm2 - Machine learning-based assessment of microbial genome assembly quality.
- cleanyerreads - Quality control and optional host read removal from raw sequencing reads.
- clermontyping - In silico phylotyping of Escherichia genus.
- defensefinder - Systematic identification of anti-phage defense systems.
- ectyper - In silico prediction of Escherichia coli serotype.
- emmtyper - emm-typing of Streptococcus pyogenes assemblies.
- fastani - Fast alignment-free computation of whole-genome Average Nucleotide Identity.
- gamma - Identification, classification, and annotation of translated gene matches.
- genotyphi - Salmonella Typhi genotyping with lineage assignment.
- gigatyper - Run all available MLST schemes for a species against an assembly.
- gtdb - Identify marker genes and assign taxonomic classifications using GTDB.
- hicap - Identify cap locus serotype and structure in Haemophilus influenzae assemblies.
- hpsuissero - Serotype prediction of Haemophilus parasuis assemblies.
- kleborate - Comprehensive screening of Klebsiella genomes for virulence and resistance determinants.
- legsta - Sequence Based Typing (SBT) of Legionella pneumophila.
- lissero - Serogroup typing prediction for Listeria monocytogenes.
- mashdist - Calculate Mash distances between sequences and reference genomes.
- mcroni - Sequence variation analysis of mcr-1 genes (mobilized colistin resistance).
- meningotype - Comprehensive typing of Neisseria meningitidis.
- midas - Estimate species abundances from metagenomic samples.
- mlst - Automatic Multi-Locus Sequence Type (MLST) calling from assembled contigs.
- mobsuite - Reconstruction and annotation of plasmids from bacterial genome assemblies.
- mykrobe - Antimicrobial resistance detection for specific bacterial species.
- ngmaster - Multi-antigen sequence typing of Neisseria gonorrhoeae.
- pasty - In silico serogrouping of Pseudomonas aeruginosa isolates.
- pbptyper - Penicillin Binding Protein (PBP) typing for Streptococcus pneumoniae.
- phispy - Prediction of prophages in bacterial and archaeal genomes.
- plasmidfinder - Bactopia Tool: Plasmidfinder.
- quast - Quality assessment of assembled contigs using QUAST.
- rgi - Prediction of antibiotic resistance genes using RGI.
- sccmec - Typing of SCCmec cassettes in Staphylococcus aureus assemblies.
- scrubber - Removal of human and contaminant sequences from metagenomic reads.
- seqsero2 - Salmonella serotype prediction from sequencing reads or assemblies.
- seroba - Serotyping of Streptococcus pneumoniae from Illumina paired-end reads.
- shigapass - Prediction of Shigella serotypes and differentiation from EIEC.
- shigatyper - Rapid determination of Shigella serotypes from sequencing reads.
- shigeifinder - In silico serotype prediction for Shigella and Enteroinvasive E. coli (EIEC).
- sistr - Serovar prediction of Salmonella enterica from assemblies.
- spatyper - spa typing of Staphylococcus aureus assemblies.
- ssuissero - Serotype prediction of Streptococcus suis assemblies.
- staphopia - Comprehensive analysis pipeline for Staphylococcus aureus isolates.
- stecfinder - Serotype identification of Shiga toxin-producing E. coli.
- sylph - Taxonomic profiling by abundance-corrected MinHash.
- tblastn - Search against translated nucleotide databases using protein queries.
- tblastx - Search against translated nucleotide databases using translated nucleotide queries.
- teton - Taxonomic classification and abundance profiling of metagenomic reads.
Citations
If you use this in your analysis, please cite the following.
- Bactopia
  Petit III RA, Read TD. Bactopia - a flexible pipeline for complete analysis of bacterial genomes. mSystems 5 (2020)
- csvtk
  Shen W. csvtk: A cross-platform, efficient and practical CSV/TSV toolkit in Golang. (GitHub)
Source
Version
CSVTK_CONCAT:
- csvtk: 0.31.0