Chewbbaca

Latest version: v3.3.10

Page 2 of 7

3.3.4

- Improved BLAST exception capturing.

- CreateSchema and AlleleCall exit if any input filename contains blank spaces.

- Removed global variable that could lead to issues during multiprocessing.
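
The filename check mentioned for 3.3.4 can be illustrated with a minimal sketch. The function name `find_names_with_spaces` and the example paths are hypothetical, and this is not chewBBACA's actual implementation; it only shows one way to detect blank spaces in the filename portion of a path.

```python
from pathlib import Path


def find_names_with_spaces(paths):
    """Return the input paths whose filename contains a blank space."""
    # Only the final path component is checked, not the parent directories.
    return [p for p in paths if " " in Path(p).name]


# Hypothetical input list; the second filename contains a blank space.
inputs = ["genomes/sample_1.fasta", "genomes/sample 2.fasta"]
invalid = find_names_with_spaces(inputs)
if invalid:
    # A command line tool would probably exit at this point instead.
    print("Filenames with blank spaces:", invalid)
```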

3.3.3

- Fixed a warning related to the BLASTp `-seqidlist` parameter. For BLAST>=2.10, the TXT file with the sequence IDs is converted to binary format with `blastdb_aliastool`.

- The `Bio.Application` modules are deprecated and might be removed from future Biopython versions. Modified the function that calls MAFFT so that it uses the subprocess module instead of `Bio.Align.Applications.MafftCommandline`. Changed the Biopython version requirement to >=1.79.

- Added a `pyproject.toml` configuration file and simplified the instructions in `setup.py`. The use of `setup.py` as a command line tool is deprecated, and the `pyproject.toml` configuration file allows installing and building packages through the recommended method.

- Updated the Dockerfile to install chewBBACA with `python3 -m pip install .` instead of the deprecated `python setup.py install` command.

- Removed FASTA header integer conversion before running BLASTp. This was done to avoid a warning from BLAST related to sequence header length exceeding 50 characters.

- The seqids and coordinates of the CDSs closest to contig tips are stored in a dictionary during gene prediction to simplify LOTSC and PLOT5/3 determination (in many cases this reduces runtime by ~20%).

- Limited the number of values stored in memory while creating the `results_contigsInfo.tsv` and `results_alleles.tsv` output files to reduce memory usage.

- Data is now added to the FASTA and TSV files for the missing classes per locus, instead of storing the complete per-input data, to reduce memory usage.

- The data for novel alleles is saved to files to reduce memory usage.

- Fixed the in-frame stop codon count values displayed in the reports created by the SchemaEvaluator module.

- The `UniprotFinder` module now exits cleanly if the output directory already exists.

- Improved the information printed to stdout by the CreateSchema and AlleleCall modules, added comments, and changed variable names to better match the data being stored.
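
As an illustration of the MAFFT change in 3.3.3, the following is a minimal sketch of calling MAFFT through the `subprocess` module. The function names, the choice of the `--auto` option, and the file handling are assumptions made for this example and may differ from what chewBBACA does.

```python
import subprocess


def build_mafft_command(input_fasta, mafft_path="mafft"):
    """Build the argument list for a basic MAFFT run (alignment goes to stdout)."""
    return [mafft_path, "--auto", input_fasta]


def run_mafft(input_fasta, output_fasta, mafft_path="mafft"):
    """Align input_fasta with MAFFT and write the alignment to output_fasta."""
    command = build_mafft_command(input_fasta, mafft_path)
    with open(output_fasta, "w") as outfile:
        # MAFFT writes the alignment to stdout and progress messages to stderr.
        result = subprocess.run(command, stdout=outfile,
                                stderr=subprocess.PIPE, text=True)
    # True when MAFFT finished with exit code 0.
    return result.returncode == 0
```

Assuming MAFFT is installed and on the `PATH`, a call such as `run_mafft("locus.fasta", "locus_aligned.fasta")` would write the alignment to the second file and return whether the run succeeded.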

3.3.2

- Changed FASTA file validation to reduce memory usage.

- Removed legacy schema conversion. Users should use the `PrepExternalSchema` module to adapt schemas created with chewBBACA<=2.1.0.

- Added prints about output files created by the `PrepExternalSchema` module.

3.3.1

- Fixed an issue that caused errors during allele calling when running in the default mode (4) and all CDSs were classified before representative determination.

- Fixed schema name assignment in the DownloadSchema module.

- Fixed a bug related to gene prediction parallelization when running Pyrodigal in meta mode. Processes were hanging if `multiprocessing.pool.Pool` was used. Using `multiprocessing.pool.ThreadPool` fixes the issue. The solution was described in an [issue](https://github.com/althonos/pyrodigal/issues/46) in Pyrodigal's repository.
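
The switch to `multiprocessing.pool.ThreadPool` described for 3.3.1 can be sketched as follows. The worker `count_atg` is a hypothetical stand-in for a per-genome gene prediction call, and `map_with_threads` is an illustrative helper, not a function from chewBBACA or Pyrodigal.

```python
from multiprocessing.pool import ThreadPool


def count_atg(sequence):
    """Hypothetical stand-in for a per-genome gene prediction call."""
    return sequence.count("ATG")


def map_with_threads(function, inputs, workers=2):
    """Apply function to each input using a thread pool; results keep input order."""
    with ThreadPool(workers) as pool:
        return pool.map(function, inputs)


results = map_with_threads(count_atg, ["ATGAAATGA", "CCCATG", "TTT"])
print(results)
```

`ThreadPool` exposes the same `map` interface as `Pool`, so swapping one for the other usually requires changing only the class that is instantiated.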

3.3.0

- Added the AlleleCallEvaluator module. This module generates an interactive HTML report for the allele calling results. The report provides summary statistics to evaluate results per sample and per locus (with the option to provide a TSV file with loci annotations to include in a table). The report includes components to display a heatmap representing the loci presence-absence matrix, a heatmap representing the distance matrix based on allelic differences, and a Neighbor-Joining tree based on the MSA of the core genome loci.

- Added [pyrodigal](https://github.com/althonos/pyrodigal) for gene prediction. This simplified the processing of the gene prediction results and reduced runtime.

- Fixed an issue where the AlleleCall module would try to create results files for excluded inputs.

- Fixed exception capturing during multiprocessing when using Python>=3.11.

- Fixed PLOT5/3 identification when coding sequences are in the reverse strand.

- Fixed the computation of the representative self-scores when performing allele calling for a subset of the loci in a schema (previously, the self-scores were computed only for the subset of loci if the `self_scores` file had not yet been created).

- Fixed an issue related to the classification of single EXC/INF and single/multiple ASM/ALM (some inputs were classified as NIPH instead of EXC/INF).

- Fixed issue related to protein exact match classification when multiple pre-computed PROTEINtable files include the same protein hash.

- Changed the `-i`, `--input-files` parameter in the PrepExternalSchema and UniprotFinder modules to `-g`, `--schema-directory` and added the `--gl`, `--genes-list` parameter to enable adapting or annotating a subset of the loci in the schema.

3.2.0

- New version of the SchemaEvaluator module. The updated version fixes several issues related to outdated dependencies that were leading to errors in the previous version. The new version also includes new features and components. Read the [docs page](https://chewbbaca.readthedocs.io/en/latest/user/modules/SchemaEvaluator.html) to learn more about the latest version of the SchemaEvaluator module.
- Updated the link to the UniProt FTP used by the UniprotFinder module.
- Added the `.fas` file extension to the list of file extensions accepted by chewBBACA. chewBBACA accepts genome assemblies and external schemas with FASTA files that use any of the following file extensions: `.fasta`, `.fna`, `.ffn`, `.fa` and `.fas`. The FASTA files created by chewBBACA use the `.fasta` extension.
- Fixed issue in the PrepExternalSchema module where it would only detect FASTA files if they ended with the `.fasta` extension.
- Added the `--size-filter` parameter to the PrepExternalSchema module to define whether the adaptation process should filter out alleles based on the minimum length and size threshold values.
- Added the `--output-novel` parameter to the AlleleCall module. If this parameter is used, the AlleleCall module creates a FASTA file with the novel alleles inferred during the allele calling. This file is created even if the `--no-inferred` parameter is used and the novel alleles are not added to the schema.
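
The list of accepted FASTA extensions from 3.2.0 lends itself to a short sketch of extension-based file detection. The names `FASTA_EXTENSIONS` and `filter_fasta_files` are hypothetical, and the case-sensitive match is an assumption; chewBBACA's own detection logic may differ.

```python
from pathlib import Path

# Extensions listed in the 3.2.0 changelog entry.
FASTA_EXTENSIONS = {".fasta", ".fna", ".ffn", ".fa", ".fas"}


def filter_fasta_files(paths):
    """Keep only the paths whose final extension is in FASTA_EXTENSIONS."""
    # Assumption: the comparison is case-sensitive and uses only the last suffix.
    return [p for p in paths if Path(p).suffix in FASTA_EXTENSIONS]
```

Checking against a set of extensions, rather than a single hard-coded `.fasta` suffix, avoids the kind of detection issue fixed in the PrepExternalSchema module.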
