csvtk_join

Tags: utility table join merge csv tsv csvtk relational run-scope

Join two CSV or TSV files based on common fields.

Uses csvtk join to merge two tabular files horizontally by matching values in a specified key column (similar to a SQL JOIN). It supports inner, left, right, and outer joins via optional arguments.
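To make the join semantics concrete, here is a minimal Python sketch of matching rows on a key column. It is only an illustration of the idea, not how csvtk is implemented. The function name `join_rows`, its `how` argument, and the sample data are all made up for this example, and only the inner and left variants are shown.

```python
import csv
import io


def join_rows(left_text, right_text, key, how="inner", delimiter=","):
    """Join two delimited tables on `key`; `how` is 'inner' or 'left'.

    Hypothetical helper for illustration only (not part of csvtk).
    """
    left_rows = list(csv.DictReader(io.StringIO(left_text), delimiter=delimiter))
    right_reader = csv.DictReader(io.StringIO(right_text), delimiter=delimiter)
    right_rows = list(right_reader)
    # Columns contributed by the right table, excluding the shared key.
    extra = [c for c in (right_reader.fieldnames or []) if c != key]

    # Index right-hand rows by key value so each lookup is a dict access.
    index = {}
    for row in right_rows:
        index.setdefault(row[key], []).append(row)

    joined = []
    for row in left_rows:
        matches = index.get(row[key], [])
        if matches:
            for match in matches:
                joined.append({**row, **{c: match[c] for c in extra}})
        elif how == "left":
            # Keep unmatched left rows, filling right-hand columns with "".
            joined.append({**row, **{c: "" for c in extra}})
    return joined


# Made-up sample tables sharing a "sample_id" column.
left_csv = "sample_id,species\ns1,E. coli\ns2,S. aureus\n"
right_csv = "sample_id,genome_size\ns1,5000000\n"

inner = join_rows(left_csv, right_csv, "sample_id")
left_join = join_rows(left_csv, right_csv, "sample_id", how="left")
```

With this sample data, the inner join keeps only `s1`, while the left join also keeps `s2` with an empty `genome_size`.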

Inputs

record (
    meta: Record,
    csv1: Path,
    csv2: Path
)

Field	Type	Description
`meta`	`Record`	Groovy Record containing sample information
`csv1`	`Path`	The first CSV/TSV file (Left table)
`csv2`	`Path`	The second CSV/TSV file (Right table)

in_format: String
out_format: String
key: String

Name	Type	Description
`in_format`	`String`	Input format string ('csv', 'tsv', or a specific delimiter character)
`out_format`	`String`	Output format string ('csv', 'tsv', or a specific delimiter character)
`key`	`String`	The column name(s) or index(es) to use as the join key (e.g., "sample_id" or "1")
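Because `key` may be given as a column name or as an index, the sketch below shows one way such a value could be resolved against a header row. It assumes indexes are 1-based and that an all-digit key is always treated as an index. `resolve_key` is a hypothetical helper for illustration, not part of csvtk or this module.

```python
def resolve_key(header, key):
    """Return the 0-based position of `key` in `header`.

    `key` may be a column name ("sample_id") or a 1-based index ("1").
    Assumption: an all-digit key is interpreted as an index, not a name.
    """
    if key.isdigit():
        position = int(key) - 1
        if not 0 <= position < len(header):
            raise ValueError(f"column index {key} is out of range")
        return position
    if key not in header:
        raise ValueError(f"column {key!r} not found in header")
    return header.index(key)


# Made-up header for illustration.
header = ["sample_id", "species", "genome_size"]
by_name = resolve_key(header, "sample_id")
by_index = resolve_key(header, "1")
```

Here both calls point at the same first column, so `"sample_id"` and `"1"` would select the same join key for this header.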

Outputs

record (
    meta: Record,
    csv: Path,
    results: Set<Path>,
    logs: Set<Path?>,
    nf_logs: Set<Path>,
    versions: Set<Path>
)

Field	Type	Description
`meta`	`Record`	Sample information record
`csv`	`Path`	The joined tabular file (.csv or .tsv)
`results`	`Set<Path>`	All output files to be published
`logs`	`Set<Path?>`	Optional program-specific log files
`nf_logs`	`Set<Path>`	Nextflow-specific log files (e.g. .command.{begin
`versions`	`Set<Path>`	A YAML-formatted file with program versions

Parameters

Used By

Subworkflows

teton - Perform taxonomic classification and estimate bacterial genome sizes.

Workflows

teton - Taxonomic classification and abundance profiling of metagenomic reads.

Citations

If you use this in your analysis, please cite the following.

Bactopia
Petit III RA, Read TD. Bactopia: a flexible pipeline for complete analysis of bacterial genomes. mSystems 5 (2020)
csvtk
Shen W. csvtk: A cross-platform, efficient and practical CSV/TSV toolkit in Golang. (GitHub)

Source

View source on GitHub

Version

CSVTK_JOIN:
    - csvtk: 0.31.0
