Enhancements to Open-Source Software

Sustaining open source software is a difficult challenge that often demands substantial time and effort, usually without the benefit of recognition or support. The field of bioinformatics is no exception, as it heavily depends on tools maintained by individuals with little to no support. Bactopia is no different.

Recognizing these challenges, I designed Bactopia with an explicit goal of giving back to the community. To fulfill this aim, I incorporated several key design requirements:

  1. Tools must be open source and free to use.
  2. Tools must be available from Conda.
  3. Bactopia Tools must be available on nf-core/modules.

Bactopia has provided 181+ contributions to the bioinformatics community:
  • 11 stand-alone tools, each available from Bioconda
  • 30 new Conda recipes, 46 updated recipes, and 2,000+ Bioconda pull requests reviewed.
  • 68 contributions to nf-core/modules
  • 26 contributions to other tools

These contributions are to the wider community, and do not require you to use Bactopia to take advantage of them.

Stand-Alone Tools

Sometimes tools are developed to enhance Bactopia's capabilities, such as Dragonflye, which was developed to add Nanopore support. These tools are designed to function as stand-alone tools. Below are 11 such tools, originally built for Bactopia, that you can also use independently of Bactopia.

| Tool | Description |
|------|-------------|
| assembly-scan | Generate basic stats for an assembly |
| dragonflye | Assemble bacterial isolate genomes from Nanopore reads |
| fastq-dl | Download FASTQ files from SRA or ENA repositories. |
| fastq-scan | Output FASTQ summary statistics in JSON format |
| GOBLIN | Generate trusted prOteins to supplement BacteriaL annotatIoN |
| pasty | A tool for in silico serogrouping of Pseudomonas aeruginosa isolates |
| pbptyper | In silico Penicillin Binding Protein typer for Streptococcus pneumoniae |
| pmga | A fork of PMGA for all Neisseria species and Haemophilus influenzae |
| shovill-se | A fork of Shovill that includes support for single end reads |
| staphopia-sccmec | A standalone version of Staphopia’s SCCmec typing method |
| vcf-annotator | Add biological annotations to variants in a given VCF file |
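
Because each of these tools is available from Bioconda, installing one on its own usually comes down to a single Conda command. The sketch below only builds that command as a list of arguments. It assumes the channel order Bioconda recommends (conda-forge, then bioconda), and the helper name `conda_create_command` is hypothetical, introduced here for illustration.

```python
import shlex

# Assumed channel order: conda-forge first, then bioconda.
CHANNELS = ("conda-forge", "bioconda")


def conda_create_command(tool, env_name=None):
    """Build a `conda create` command that installs one tool into its own environment.

    `tool` is a package name from the table above (for example "fastq-scan").
    `env_name` defaults to the tool name. This helper is illustrative only.
    """
    env_name = env_name or tool
    command = ["conda", "create", "--yes", "--name", env_name]
    for channel in CHANNELS:
        command += ["--channel", channel]
    command.append(tool)
    return command


if __name__ == "__main__":
    # Print the command rather than running it, so nothing is installed by accident.
    print(shlex.join(conda_create_command("fastq-scan")))
```

To actually create the environment, the returned list could be passed to `subprocess.run(command, check=True)`.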

Bioconda Contributions

Bactopia requires that tools be installable with Conda to simplify installation for users. This requirement led to an unintended, but welcome, deeper involvement with the Bioconda community. Bioconda is more than conda install; it is a valuable resource that makes bioinformatics tools more accessible to the community. Every time a tool is added to Bioconda, a Docker container is created by BioContainers and a Singularity image is created by the Galaxy Project. In essence, a single recipe contributes significantly to the broader community.

Bactopia has led to 30 new recipes, 46 updated recipes, and more than 2,000 reviewed pull requests.

New Recipes

Bactopia has led to the addition of 30 new recipes to Bioconda and conda-forge. These new recipes allow users to rapidly begin using these tools for their own analyses, and include:

| Tool | Description | Pull Request |
|------|-------------|--------------|
| Aspera Connect | high-performance transfer client | anaconda/rpetit3 |
| assembly-scan | Generate basic stats for an assembly | bioconda/bioconda-recipes#11425 |
| bactopia | A flexible pipeline for complete analysis of bacterial genomes | bioconda/bioconda-recipes#17434 |
| Dragonflye | Assemble bacterial isolate genomes from Nanopore reads | bioconda/bioconda-recipes#29696 |
| ena-dl | Download FASTQ files from ENA | bioconda/bioconda-recipes#17354 |
| EToKi | all methods related to Enterobase | bioconda/bioconda-recipes#37069 |
| executor | programmer friendly Python subprocess wrapper | conda-forge/staged-recipes#9457 |
| fastq-dl | Download FASTQ files from SRA or ENA repositories. | bioconda/bioconda-recipes#18252 |
| fastq-scan | Output FASTQ summary statistics in JSON format | bioconda/bioconda-recipes#11415 |
| GenoTyphi | assign genotypes to Salmonella Typhi genomes | bioconda/bioconda-recipes#25674 |
| GOBLIN | Generate trusted prOteins to supplement BacteriaL annotatIoN | bioconda/bioconda-recipes#38922 |
| illumina-cleanup | A simple pipeline for pre-processing Illumina FASTQ files | bioconda/bioconda-recipes#11481 |
| ISMapper | insertion sequence mapping software | bioconda/bioconda-recipes#14180 |
| mashpit | Sketch-based surveillance platform | bioconda/bioconda-recipes#35199 |
| NextPolish | Fast and accurately polish the genome generated by long reads | bioconda/bioconda-recipes#36582 |
| ParallelTask | A simple and lightweight parallel task engine | conda-forge/staged-recipes#19616 |
| pasty | A tool for in silico serogrouping of Pseudomonas aeruginosa isolates | bioconda/bioconda-recipes#35930 |
| pbptyper | In silico Penicillin Binding Protein typer for Streptococcus pneumoniae | bioconda/bioconda-recipes#36222 |
| pHierCC | Hierarchical clustering of cgMLST | bioconda/bioconda-recipes#37070 |
| pmga | Command-line version of PMGA (PubMLST Genome Annotator) | bioconda/bioconda-recipes#32801 |
| property-manager | useful property variants for Python programming | conda-forge/staged-recipes#9442 |
| RFPlasmid | predicting plasmid contigs from assemblies | bioconda/bioconda-recipes#25849 |
| SerotypeFinder | Identifies the serotype in total or partial sequenced isolates of E. coli | bioconda/bioconda-recipes#29718 |
| shovill-se | A fork of Shovill that includes support for single end reads | bioconda/bioconda-recipes#26040 |
| spaTyper | computational method for finding spa types | bioconda/bioconda-recipes#26044 |
| sra-human-scrubber | Identify and remove human reads from FASTQ files | bioconda/bioconda-recipes#29926 |
| staphopia-sccmec | A standalone version of Staphopia's SCCmec typing method | bioconda/bioconda-recipes#28214 |
| tbl2asn-forever | use tbl2asn forever by pretending that it's still 2019 | bioconda/bioconda-recipes#20073 |
| vcf-annotator | Add biological annotations to variants in a given VCF file | bioconda/bioconda-recipes#13417 |

Every recipe gets a Docker and Singularity container

It is sometimes overlooked, so it is worth reiterating: every recipe added to Bioconda has a Docker container created by BioContainers and a Singularity container created by the Galaxy Project. These containers allow for version-controlled, reproducible analyses.
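
As a rough illustration of how these containers are addressed, the sketch below builds both container locations from a package name, version, and Conda build string. It assumes the commonly used naming pattern: BioContainers images on quay.io and Singularity images on the Galaxy Project depot, both tagged `<version>--<build>`. The helper name `container_uris` and the version and build values are placeholders, not real releases. Look up the actual tag for the tool you need.

```python
def container_uris(tool, version, build):
    """Return the assumed Docker and Singularity locations for one Bioconda package.

    The tag pins both the tool version and the Conda build string, which is
    what makes an analysis reproducible against one exact container.
    """
    tag = f"{tool}:{version}--{build}"
    return {
        # Docker image published by BioContainers (assumed registry path).
        "docker": f"quay.io/biocontainers/{tag}",
        # Singularity image hosted by the Galaxy Project (assumed depot URL).
        "singularity": f"https://depot.galaxyproject.org/singularity/{tag}",
    }


if __name__ == "__main__":
    # Placeholder version and build string, for illustration only.
    for kind, uri in container_uris("fastq-scan", "0.0.0", "build_0").items():
        print(f"{kind}: {uri}")
```

Pinning the full tag, rather than a floating one, keeps a workflow tied to one exact build of the tool.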

Enhancements and Fixes

A common issue with Bioconda recipes is that a tool works well in a Conda environment but fails for various reasons once containerized. When this happens with a tool used by Bactopia, an effort is made to improve or fix the Bioconda recipe. Below is a list of fixes and improvements to some Bioconda recipes:

| Tool | Description | Pull Request |
|------|-------------|--------------|
| abriTAMR | fix amrfinderplus pinning in abritamr | bioconda/bioconda-recipes#46714 |
| Gubbins | adjust python pinning in gubbins | bioconda/bioconda-recipes#46713 |
| SISTR | fix issue with sistr container | bioconda/bioconda-recipes#46712 |
| RGI | Update rgi pinning for pyrodigal | bioconda/bioconda-recipes#46669 |
| Snippy | pin tabix version in snippy | bioconda/bioconda-recipes#46458 |
| ncbi-genome-download | Patch ncbi-genome-download recipe | bioconda/bioconda-recipes#41640 |
| GTDB-Tk | Update GTDB-tk recipe | bioconda/bioconda-recipes#40333 |
| mlst | update midas pinnings to match docs | bioconda/bioconda-recipes#38826 |
| MIDAS | update midas pinnings to match docs | bioconda/bioconda-recipes#38566 |
| smoove | rebuild smoove container | bioconda/bioconda-recipes#37394 |
| fasta3 | update fasta3 to latest version | bioconda/bioconda-recipes#37306 |
| pggb | Update pinnings in pggb | bioconda/bioconda-recipes#35734 |
| Nullarbor | Rebuild nullarbor container | bioconda/bioconda-recipes#35687 |
| GenoTyphi | Update genotyphi recipe for mykrobe based analysis | bioconda/bioconda-recipes#35388 |
| Seroba | Add database to Seroba recipe | bioconda/bioconda-recipes#35378 |
| Ariba | Update ariba dependencies for latest pymummer | bioconda/bioconda-recipes#35383 |
| pymummer | patch pymummer recipe to use system/user TMP | bioconda/bioconda-recipes#35379 |
| PlasmidFinder | Update PlasmidFinder for better container support | bioconda/bioconda-recipes#35314 |
| GTDB-Tk | Allow GTDB-Tk database download with container | bioconda/bioconda-recipes#35174 |
| ShigaTyper | update shigatyper recipe for better container support | bioconda/bioconda-recipes#35161 |
| FastANI | Remove fastani from build fail list | bioconda/bioconda-recipes#33556 |
| FastANI | update FastANI recipe | bioconda/bioconda-recipes#33433 |
| Prokka | Update Prokka bioperl pinning | bioconda/bioconda-recipes#33411 |
| SsuisSero | update SsuisSero dependency | bioconda/bioconda-recipes#33268 |
| RGI | Improve RGI docker container | bioconda/bioconda-recipes#33249 |
| legsta | Improve dockerbuild for Legsta | bioconda/bioconda-recipes#33246 |
| fastq-scan | Update fastq-scan recipe to include jq | bioconda/bioconda-recipes#32650 |
| Ariba | Patch ariba recipe with minor bug fixes | bioconda/bioconda-recipes#32258 |
| PIRATE | Update PIRATE recipe to include post-analysis scripts | bioconda/bioconda-recipes#31629 |
| ngmaster | rebuild ngmaster to get docker container | bioconda/bioconda-recipes#31376 |
| AgrVATE | add missing dependency for agrvate | bioconda/bioconda-recipes#31035 |
| spaTyper | Patch spatyper for entrypoint support | bioconda/bioconda-recipes#30824 |
| spaTyper | Patch spatyper for better container support | bioconda/bioconda-recipes#30622 |
| Kleborate | Update kleborate recipe to build DB | bioconda/bioconda-recipes#30582 |
| cyvcf2 | Loosen htslib version requirement for cyvcf2 | bioconda/bioconda-recipes#30044 |
| Kleborate | Patch Kleborate's method for discovering Kaptive | bioconda/bioconda-recipes#29623 |
| spaTyper | update spatyper - drop blake_sha256 requirement | bioconda/bioconda-recipes#27321 |
| ISMapper | ISMapper - Fix BioPython pinning | bioconda/bioconda-recipes#26599 |
| CheckM | checkm-genome - fix broken pinning by older pysam version | bioconda/bioconda-recipes#25856 |
| ISMapper | Update ISMapper - Pin BioPython version | bioconda/bioconda-recipes#24314 |
| Ariba | Patches for third party links used by Ariba | bioconda/bioconda-recipes#24010 |
| Seroba | Add pysam pinning for Seroba | bioconda/bioconda-recipes#17568 |
| Ariba | Update pysam pinning for Ariba | bioconda/bioconda-recipes#17448 |
| tbl2asn | Previous version of tbl2asn has expired, updated to 25.7 | bioconda/bioconda-recipes#16131 |
| ISMapper | Rebuild ismapper for GCC7 migration | bioconda/bioconda-recipes#14276 |
| MentaLiST | MentaLiST v0.2.4 patch for Julia | bioconda/bioconda-recipes#13137 |

nf-core/modules Contributions

When Bactopia transitioned to Nextflow DSL2, it opened the door to adopting modules from nf-core/modules. Users can integrate these modules seamlessly into their own Nextflow DSL2 pipelines. To support this integration, I decided to require that each Bactopia Tool have a corresponding module available from nf-core/modules. If such a module is not already available, it will be added.

By adopting this practice, there have been 68 contributions to nf-core/modules in the form of new modules, module updates, and testing adjustments.

| Tool | Description | Pull Request |
|------|-------------|--------------|
| BTyper3 | add module for btyper3 | nf-core/modules#3817 |
| abriTAMR | add module for abritamr_run | nf-core/modules#3725 |
| PneumoCaT | add module for pneumocat | nf-core/modules#3592 |
| STECFinder | add module for stecfinder | nf-core/modules#2702 |
| MIDAS | add module for midas/run | nf-core/modules#2696 |
| SRA Human Scrubber | add modules for sra-human-scrubber | nf-core/modules#2694 |
| ShigEiFinder | add shigeifinder module | nf-core/modules#2523 |
| nf-core/modules | fix a few tests after restructure | nf-core/modules#2234 |
| Biohansel | add biohansel module | nf-core/modules#2234 |
| pbptyper | add pbptyper module | nf-core/modules#2005 |
| pasty | add module for pasty | nf-core/modules#2003 |
| snippy-core | add snippy/core module | nf-core/modules#1855 |
| Mykrobe | add module for mykrobe/predict | nf-core/modules#1818 |
| GenoTyphi | add module for genotyphi/parse | nf-core/modules#1818 |
| Seroba | add module for seroba | nf-core/modules#1816 |
| PlasmidFinder | add plasmidfinder module | nf-core/modules#1773 |
| mcroni | add mcroni module | nf-core/modules#1750 |
| Ariba | add ariba module | nf-core/modules#1731 |
| snippy | add snippy module | nf-core/modules#1643 |
| ShigaTyper | add shigatyper module | nf-core/modules#1548 |
| panaroo | add module for panaroo, fix pirate tests | nf-core/modules#1444 |
| Dragonflye | Update dragonflye to latest version | nf-core/modules#1442 |
| Bakta | update bakta to latest version (v1.4.0) | nf-core/modules#1428 |
| Roary | Update test.yml for Roary module | nf-core/modules#1419 |
| HpsuisSero | add hpsuisero module | nf-core/modules#1331 |
| SsuisSero | add ssuisero module | nf-core/modules#1329 |
| SISTR | add sistr module | nf-core/modules#1322 |
| RGI | add rgi module | nf-core/modules#1321 |
| legsta | add legsta module | nf-core/modules#1319 |
| AMRFinder+ | add amrfindplus module | nf-core/modules#1284 |
| abricate | add abricate module | nf-core/modules#1280 |
| mobsuite/recon | add mobsuite/recon module | nf-core/modules#1270 |
| mash/dist | add mash/dist module | nf-core/modules#1193 |
| Kleborate | Fix kleborate inputs | nf-core/modules#1172 |
| nf-core/modules | fix test data path for ClonalFrameML,roary,pirate | nf-core/modules#1085 |
| Bakta | add bakta module | nf-core/modules#1085 |
| nf-core/modules | use underscores in anchors and references | nf-core/modules#1080 |
| Scoary | add scoary module | nf-core/modules#1034 |
| emmtyper | add emmtyper module | nf-core/modules#1028 |
| LisSero | add lissero module | nf-core/modules#1026 |
| ngmaster | add ngmaster module | nf-core/modules#1024 |
| meningotype | add meningotype module | nf-core/modules#1022 |
| SeqSero2 | add seqsero2 module | nf-core/modules#1016 |
| ncbi-genome-download | add ncbi-genome-download module | nf-core/modules#980 |
| ClonalFrameML | add clonalframeml module | nf-core/modules#974 |
| AgrVATE | Update agrvate version | nf-core/modules#970 |
| ECTyper | add ectyper module | nf-core/modules#948 |
| TBProfiler | add tbprofiler module | nf-core/modules#947 |
| spaTyper | Update spatyper module (cleanup debug) | nf-core/modules#938 |
| hicap | [fix] hicap module allow optional outputs | nf-core/modules#937 |
| fastq-scan | add fastq-scan module | nf-core/modules#935 |
| csvtk | patch output extension in csvtk/concat | nf-core/modules#797 |
| csvtk | add csvtk/concat module | nf-core/modules#785 |
| spaTyper | add spatyper module | nf-core/modules#784 |
| PIRATE | add pirate module | nf-core/modules#777 |
| Roary | add roary module | nf-core/modules#776 |
| ISMapper | add ismapper module | nf-core/modules#773 |
| hicap | add hicap module | nf-core/modules#772 |
| mashtree | add mashtree module | nf-core/modules#767 |
| nf-core/modules | update tests for 12 modules for new config | nf-core/modules#758 |
| AgrVATE | Update agrvate to v1.0.1 | nf-core/modules#728 |
| staphopia-sccmec | add staphopia-sccmec module | nf-core/modules#702 |
| Dragonflye | add module for dragonflye | nf-core/modules#633 |
| nf-core/modules | update tests for 21 modules for new config | nf-core/modules#384 |
| Prokka | Update Prokka modules - add process label | nf-core/modules#350 |
| nf-core/modules | README - Fix link describing process labels | nf-core/modules#349 |
| Shovill | Update shovill module | nf-core/modules#337 |
| Prokka | add prokka module | nf-core/modules#298 |

Other Contributions

In addition to Bioconda and nf-core/modules, Bactopia has made 26 contributions to other tools including:

| Tool | Description | Pull Request |
|------|-------------|--------------|
| MOB-suite | fix hostrange() missing 1 required positional argument: 'database_directory' | phac-nml/mob-suite#149 |
| bioconda-utils | chore: update change visibility action | bioconda/bioconda-utils#873 |
| Prokka | Convert Travis CI to Github Actions | tseemann/prokka#662 |
| bioconda-utils | chore: add CI to changevisibility of private containers | bioconda/bioconda-utils#835 |
| bioconda-containers | Patch - small fix on merge command and quay toggle visibility | bioconda/bioconda-containers#54 |
| Shigatyper | Incorporate patches from Bioconda | CFSAN-Biostatistics/shigatyper#14 |
| EToKi | let tempfile determine where to put temp files | lskatz/EToKi#2 |
| EToKi | Allow multiple path parameters on the configure step | lskatz/EToKi#1 |
| Seroba | let tempfile determine temp dir location | sanger-pathogens/seroba#68 |
| pymummer | allow the user to specify temp dir or use the system default | sanger-pathogens/pymummer#36 |
| ShigaTyper | Fix install process | CFSAN-Biostatistics/shigatyper#10 |
| legsta | use grep -q to play nice with bioconda docker build | tseemann/legsta#17 |
| ShigaTyper | Add single-end and ONT support, add GitHub Actions, update readme | CFSAN-Biostatistics/shigatyper#9 |
| Ariba | Ignore comments column and drop Bio.Alphabet | sanger-pathogens/ariba#319 |
| BioContainers | Add ClonalFrameML and maskrc-svg multipackage | BioContainers/multi-package-containers#1923 |
| Kleborate | Add --kaptive_path to specify path to kaptive data | katholt/Kleborate#59 |
| Ariba | fix SPAdes version capture | sanger-pathogens/ariba#315 |
| AgrVATE | Fix for dots in sample names | VishnuRaghuram94/AgrVATE#9 |
| PIRATE | Add minimum feature length option | SionBayliss/PIRATE#53 |
| Ariba | Fix for changes in PubMLST url | sanger-pathogens/ariba#305 |
| Ariba | Solution 1: for fixing CARD download | sanger-pathogens/ariba#302 |
| bowtie2 | Rename VERSION to BOWTIE2_VERSION | BenLangmead/bowtie2#302 |
| phyloFlash | Improved single end support | HRGV/phyloFlash#102 |
| ISMapper | set min_range and max_range args to be a float | jhawkey/IS_mapper#38 |
| maskrc-svg | Add requirements.txt for python modules | kwongj/maskrc-svg#2 |
| Shovill | Added shovill-se for processing single-end reads | tseemann/shovill#105 |