With Bactopia v2 come __a lot__ of changes! I would like to extend a huge thanks to Davi Marcon and Abhinav Sharma for their work initially converting Bactopia to DSL2. Your efforts were the momentum I needed to get the ball rolling on Bactopia v2. Thank you very much for taking the time to make such a significant contribution!
`Added`
- support for Nanopore reads
- `staphopia` as a named pipeline (alias for `bactopia --wf staphopia`) for _S. aureus_ genomes
- `bactopia/bactopia-tests` repo with test data
- walkthrough for testing
- `bactopia-datasets/staphylococcus_aureus` repo with curated _S. aureus_ datasets
- per-module testing via `pytest` (100+ tests and 7000+ outputs tested)
- per-module `meta.yml` and `params.json` for auto-building docs site
- framework for adding new Bactopia Tools
- 19 total Bactopia Tools (`bactopia --wf <NAME>`)
    - Subworkflows (3):
        - `eggnog`: Functional annotation of proteins using orthologous groups and phylogenies
        - `pangenome`: Pangenome analysis with optional core-genome phylogeny
        - `staphtyper`: Determine the agr, spa and SCCmec types for _Staphylococcus aureus_ genomes
    - Modules (16):
        - `agrvate`: Rapid identification of _Staphylococcus aureus_ agr locus type and agr operon variants
        - `bakta`: Rapid annotation of bacterial genomes and plasmids
        - `ectyper`: In-silico prediction of _Escherichia coli_ serotype
        - `emmtyper`: emm-typing of _Streptococcus pyogenes_ assemblies
        - `fastani`: Fast alignment-free computation of whole-genome Average Nucleotide Identity (ANI)
        - `hicap`: Identify cap locus serotype and structure in your _Haemophilus influenzae_ assemblies
        - `ismapper`: Identify insertion site positions in bacterial genomes
        - `kleborate`: Screening _Klebsiella_ genome assemblies for MLST, sub-species, and other related genes of interest
        - `lissero`: Serogroup typing prediction for _Listeria monocytogenes_
        - `mashtree`: Quickly create a tree using Mash distances
        - `meningotype`: Serotyping of _Neisseria meningitidis_
        - `ngmaster`: Multi-antigen sequence typing for _Neisseria gonorrhoeae_
        - `seqsero2`: _Salmonella_ serotype prediction from reads or assemblies
        - `spatyper`: Computational method for finding spa types in _Staphylococcus aureus_
        - `staphopiasccmec`: Primer based SCCmec typing of _Staphylococcus aureus_ genomes
        - `tbprofiler`: Detect resistance and lineages of _Mycobacterium tuberculosis_ genomes
- Use `mamba` instead of conda for env building
- Reduced total Conda envs/Docker containers to 7 (previously 12, not including Bactopia Tools)
- default to compressed outputs (`--skip_compression` to output uncompressed outputs)
- Tutorial outputs made available
- updated GitHub Actions
`Fixed`
- Cache issue causing `-resume` to fail
- AMRFinder+ "database not compatible" error
- incorrectly parsed system memory
Adapted from `nf-core`
- nf-core pytest setup
- nf-core/modules for bactopia tools
    - Bactopia v2 release contributed 20+ modules to nf-core/modules
- nf-core/tools arg parser
    - adapted to import params and usage based on config file
Process Consolidation
- `makeblastdb` -> `assemble_genome`
- `call_variants`, `download_reference` -> `call_variants`
- `fastq_status`, `estimate_genome_size` -> `gather_samples`
- `count_31mers` -> `minmer_sketch`
`Removed`
- `bactopia tools` -> Handled by Nextflow now (`bactopia --wf <NAME>`)
- `bactopia versions` -> Program versions are now output with every run.