csvtk_concat

Tags: utility table merge concat csv tsv csvtk run-scope

Concatenate multiple CSV or TSV files into a single table.

Uses csvtk concat to merge a list of delimited files by row. It handles header processing (keeping only one header) and supports format conversion (e.g., merging CSVs but outputting a TSV).
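
To illustrate the row-wise merge described above, here is a minimal Python sketch of the same idea: read several delimited tables, write the header once, and emit the rows with a different delimiter. It is not how csvtk is implemented. The function name `concat_tables`, the use of in-memory text streams, and the choice to take column names from the first table and leave missing fields blank are all assumptions made for the example.

```python
import csv
import io


def concat_tables(sources, in_delim=",", out_delim="\t"):
    """Concatenate delimited text tables row-wise, writing a single header.

    `sources` is an iterable of text streams (a hypothetical interface;
    csvtk itself works on file paths). Column names come from the first
    non-empty table, and fields missing from later tables are left blank
    (an assumption for this sketch).
    """
    out = io.StringIO()
    writer = None
    for handle in sources:
        reader = csv.DictReader(handle, delimiter=in_delim)
        if reader.fieldnames is None:
            continue  # empty input: nothing to add
        if writer is None:
            # The first non-empty table defines the output columns.
            writer = csv.DictWriter(
                out,
                fieldnames=reader.fieldnames,
                delimiter=out_delim,
                restval="",
                extrasaction="ignore",
                lineterminator="\n",
            )
            writer.writeheader()
        for row in reader:
            writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    first = io.StringIO("sample,st\ns1,5\n")
    second = io.StringIO("sample,st\ns2,8\n")
    # Two CSV inputs in, one TSV table out, header written once.
    print(concat_tables([first, second]), end="")
```

The header of every input after the first is consumed by the reader and never written again, which is the "keeping only one header" behavior in miniature.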

Inputs

record (
meta: Record,
csv: Set<Path>
)
Field  Type       Description
meta   Record     Groovy Record containing sample information
csv    Set<Path>  A list of CSV/TSV files to be concatenated
in_format: String
out_format: String
Name        Type    Description
in_format   String  Input format string ('csv', 'tsv', or a specific delimiter character)
out_format  String  Output format string ('csv', 'tsv', or a specific delimiter character)
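
Each format string can be read as shorthand for a delimiter. The sketch below shows one way such a mapping might work, assuming 'csv' stands for a comma, 'tsv' stands for a tab, and any other single character is used as the delimiter itself. `resolve_delimiter` is a hypothetical helper written for illustration; it is not part of this module or of csvtk, and the module's actual handling may differ.

```python
def resolve_delimiter(fmt: str) -> str:
    """Map a format string to a one-character delimiter.

    Assumed rules (not taken from the module source):
    'csv' -> ',', 'tsv' -> tab, any other single character -> itself.
    """
    named = {"csv": ",", "tsv": "\t"}
    if fmt in named:
        return named[fmt]
    if len(fmt) != 1:
        raise ValueError(f"expected 'csv', 'tsv', or one character, got {fmt!r}")
    return fmt


if __name__ == "__main__":
    for fmt in ("csv", "tsv", ";"):
        print(repr(fmt), "->", repr(resolve_delimiter(fmt)))
```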

Outputs

record (
meta: Record,
csv: Path,
results: Set<Path>,
logs: Set<Path?>,
nf_logs: Set<Path>,
versions: Set<Path>
)
Field     Type        Description
meta      Record      Sample information record
csv       Path        Concatenated results from all samples in the specified output format
results   Set<Path>   All output files to be published
logs      Set<Path?>  Optional program specific log files
nf_logs   Set<Path>   Nextflow-specific log files (e.g. .command.{begin
versions  Set<Path>   A YAML formatted file with program versions

Parameters

Used By

Subworkflows

  • abritamr - Identify antimicrobial resistance genes using AMRFinderPlus.
  • agrvate - Identify Staphylococcus aureus agr locus type and operon variants.
  • amrfinderplus - Find antimicrobial resistance genes and point mutations.
  • ariba - Rapidly identify genes by creating local assemblies from paired-end reads.
  • bactopia_assembler - Assemble bacterial genomes using automated assembler selection.
  • bactopia_gather - Search, validate, gather, and standardize input samples.
  • blastn - Search a nucleotide database using nucleotide query sequences.
  • blastp - Search protein sequences against protein database.
  • blastx - Translate nucleotide sequences and search protein database.
  • bracken - Estimate species abundance from metagenomic reads.
  • btyper3 - In silico taxonomic classification of Bacillus cereus group genomes.
  • busco - Assess genome assembly completeness using BUSCO.
  • checkm - Assess metagenome bin completeness using CheckM.
  • checkm2 - Assess metagenome bin completeness using CheckM2.
  • clermontyping - Predict phylogroups of Escherichia coli from genome assemblies.
  • defensefinder - Systematically search for anti-phage defense systems.
  • ectyper - In silico prediction of Escherichia coli serotype.
  • emmtyper - Predict emm types of Streptococcus pyogenes from genome assemblies.
  • fastani - Calculate Average Nucleotide Identity (ANI) between genomes.
  • gamma - Gene Allele Mutation Microbial Assessment.
  • genotyphi - Assign genotypes to Salmonella Typhi genomes.
  • gigatyper - Run all available MLST schemes for a species against an assembly.
  • gtdb - Taxonomic classification with the Genome Taxonomy Database.
  • hicap - In silico serotyping of the Haemophilus influenzae capsule locus.
  • hpsuissero - Rapid Haemophilus parasuis serotyping.
  • kleborate - Genotyping tool for Klebsiella pneumoniae and its related species complex.
  • legsta - In silico Legionella pneumophila Sequence Based Typing.
  • lissero - In silico serotype prediction for Listeria monocytogenes.
  • mashdist - Calculate Mash distances between sequences and a reference.
  • mcroni - Scripts for finding and processing promoter variants upstream of mcr-1.
  • meningotype - Predict serotypes of Neisseria meningitidis from genome assemblies.
  • midas - Species-level profiling from metagenomic data.
  • mlst - Determine multilocus sequence types (MLST) from bacterial assemblies.
  • mobsuite - Reconstruct and type plasmids from bacterial genome assemblies.
  • mykrobe - Predict antibiotic resistance from sequence reads.
  • ngmaster - Perform multi-antigen sequence typing of Neisseria gonorrhoeae from genome assemblies.
  • pasty - Predict serogroups of Pseudomonas aeruginosa from assemblies.
  • pbptyper - Predict penicillin binding protein (PBP) types of Streptococcus pneumoniae from genome assemblies.
  • phispy - Prediction of prophages from bacterial genomes.
  • plasmidfinder - Identify plasmid replicons in bacterial genome assemblies.
  • quast - Evaluate assembly quality using QUAST.
  • rgi - Predict antimicrobial resistance from protein or nucleotide data.
  • sccmec - Identify SCCmec elements in Staphylococcus aureus genomes.
  • scrubber - Remove contaminant sequences from metagenomic data.
  • seqsero2 - Predict Salmonella serotypes from genome assemblies.
  • seroba - k-mer based pipeline to identify the serotype of Streptococcus pneumoniae.
  • shigapass - Predict serotypes of Shigella from assemblies.
  • shigatyper - Predict serotypes of Shigella from reads or assemblies.
  • shigeifinder - Predict serotypes of Shigella and EIEC from assemblies.
  • sistr - Salmonella In Silico Typing Resource command-line tool.
  • spatyper - Predict spa types of Staphylococcus aureus from genome assemblies.
  • ssuissero - Predict serotypes of Streptococcus suis from genome assemblies.
  • staphopiasccmec - Identify SCCmec elements in Staphylococcus aureus genomes using Staphopia method.
  • stecfinder - Identify and serotype Shiga toxin-producing E. coli (STEC) from assemblies.
  • sylph - Profile microbial composition using Sylph.
  • tblastn - Search protein query sequences against nucleotide database.
  • tblastx - Translate nucleotide query sequences and search nucleotide database.
  • teton - Perform taxonomic classification and estimate bacterial genome sizes.

Workflows

  • abritamr - A NATA accredited tool for reporting the presence of antimicrobial resistance genes.
  • agrvate - Rapid identification of Staphylococcus aureus agr locus type and agr operon variants.
  • amrfinderplus - Bactopia Tool: Amrfinderplus.
  • ariba - Gene identification through local assemblies.
  • bactopia - Comprehensive bacterial analysis pipeline for complete genomic characterization.
  • blastn - Search against nucleotide BLAST databases using nucleotide queries.
  • blastp - Search against protein BLAST databases using protein queries.
  • blastx - Search against protein BLAST databases using translated nucleotide queries.
  • bracken - Estimate taxonomic abundance of metagenomic samples.
  • btyper3 - Taxonomic classification of Bacillus cereus group isolates.
  • busco - Assessment of genome assembly completeness using evolutionarily informed expectations.
  • checkm - Assessment of microbial genome assembly quality.
  • checkm2 - Machine learning-based assessment of microbial genome assembly quality.
  • cleanyerreads - Quality control and optional host read removal from raw sequencing reads.
  • clermontyping - In silico phylotyping of Escherichia genus.
  • defensefinder - Systematic identification of anti-phage defense systems.
  • ectyper - In silico prediction of Escherichia coli serotype.
  • emmtyper - emm-typing of Streptococcus pyogenes assemblies.
  • fastani - Fast alignment-free computation of whole-genome Average Nucleotide Identity.
  • gamma - Identification, classification, and annotation of translated gene matches.
  • genotyphi - Salmonella Typhi genotyping with lineage assignment.
  • gigatyper - Run all available MLST schemes for a species against an assembly.
  • gtdb - Identify marker genes and assign taxonomic classifications using GTDB.
  • hicap - Identify cap locus serotype and structure in Haemophilus influenzae assemblies.
  • hpsuissero - Serotype prediction of Haemophilus parasuis assemblies.
  • kleborate - Comprehensive screening of Klebsiella genomes for virulence and resistance determinants.
  • legsta - Sequence Based Typing (SBT) of Legionella pneumophila.
  • lissero - Serogroup typing prediction for Listeria monocytogenes.
  • mashdist - Calculate Mash distances between sequences and reference genomes.
  • mcroni - Sequence variation analysis of mcr-1 genes (mobilized colistin resistance).
  • meningotype - Comprehensive typing of Neisseria meningitidis.
  • midas - Estimate species abundances from metagenomic samples.
  • mlst - Automatic Multi-Locus Sequence Type (MLST) calling from assembled contigs.
  • mobsuite - Reconstruction and annotation of plasmids from bacterial genome assemblies.
  • mykrobe - Antimicrobial resistance detection for specific bacterial species.
  • ngmaster - Multi-antigen sequence typing of Neisseria gonorrhoeae.
  • pasty - In silico serogrouping of Pseudomonas aeruginosa isolates.
  • pbptyper - Penicillin Binding Protein (PBP) typing for Streptococcus pneumoniae.
  • phispy - Prediction of prophages in bacterial and archaeal genomes.
  • plasmidfinder - Bactopia Tool: Plasmidfinder.
  • quast - Quality assessment of assembled contigs using QUAST.
  • rgi - Prediction of antibiotic resistance genes using RGI.
  • sccmec - Typing of SCCmec cassettes in Staphylococcus aureus assemblies.
  • scrubber - Removal of human and contaminant sequences from metagenomic reads.
  • seqsero2 - Salmonella serotype prediction from sequencing reads or assemblies.
  • seroba - Serotyping of Streptococcus pneumoniae from Illumina paired-end reads.
  • shigapass - Prediction of Shigella serotypes and differentiation from EIEC.
  • shigatyper - Rapid determination of Shigella serotypes from sequencing reads.
  • shigeifinder - In silico serotype prediction for Shigella and Enteroinvasive E. coli (EIEC).
  • sistr - Serovar prediction of Salmonella enterica from assemblies.
  • spatyper - spa typing of Staphylococcus aureus assemblies.
  • ssuissero - Serotype prediction of Streptococcus suis assemblies.
  • staphopia - Comprehensive analysis pipeline for Staphylococcus aureus isolates.
  • stecfinder - Serotype identification of Shiga toxin-producing E. coli.
  • sylph - Taxonomic profiling by abundance-corrected MinHash.
  • tblastn - Search against translated nucleotide databases using protein queries.
  • tblastx - Search against translated nucleotide databases using translated nucleotide queries.
  • teton - Taxonomic classification and abundance profiling of metagenomic reads.

Citations

If you use this in your analysis, please cite the following.

Source

View source on GitHub

Version

CSVTK_CONCAT:
- csvtk: 0.31.0