Phold

Latest version: v0.2.0

Safety actively analyzes 682487 Python packages for vulnerabilities to keep your Python projects secure.

Scan your dependencies

0.2.0

* Improved sensitivity and faster runtime for the `foldseek` search. This is achieved by clustering the Phold database at `--min-seq-id 0.3 -c 0.8` and creating a cluster db before running with `foldseek` which significantly improves runtime
* Overall, just over 1.1M structures are clustered into around 372k clusters
* `--cluster-search 1` parameter is added to `foldseek search` to search against the cluster representatives first and then within each cluster, which increases sensitivity and reduces resource usage compared to `phold v0.1.4`
* Changed default `--max_seqs` from 1000 to 10000 to improve sensitivity at little resource usage cost
* Phold database is expanded adding:
* Extremely conservative high confidence [efam](https://doi.org/10.1093/bioinformatics/btab451) proteins with hits to PHROGs.
* 95% dereplicated diversity-generating retroelements (DGRs) from [Roux et al](https://www.nature.com/articles/s41467-021-23402-7).
* 7153 netflax toxin-antitoxin system proteins from [Ernits et al](https://doi.org/10.1073/pnas.2305393120).
* Adds `--ultra_sensitive` flag which turns off Foldseek prefiltering for maximum sensitivity. Recommended for small datasets/single phages only.
* This passes the `--exhaustive-search` parameter to `foldseek search`
* Adds the ability to save ProstT5 embeddings with `--save_per_residue_embeddings` and `--save_per_protein_embeddings`
* Adds `.cif` support (e.g. from Alphafold3 server) for structures, not just `.pdb` file format
* Removes some experimental parameters from v0.1.4 (`--split` etc)

Breaking CLI parameter changes

* `--pdb` has changed to `--structures`
* `--pdb_dir` has changed to `--structure_dir`
* `--filter_pdbs` has changed to `--filter_structures`

0.1.4

------------------

* Fixes 31 issue with older Pharokka genbank input (prior to v1.5.0) that lacked 'transl_table' field
* All Pharokka genbank input prior to v1.5.0 will be transl_table 11 (it is before pyrodigal-gv was added)
* Fixes genbank parsing bug that would occur if the ID/locus tag of the features in the inout genbank were longer than 54 characters

0.1.3

------------------

* Adds compability with Apple Silicon (M1/M2/M3) GPUs
* Fixes memory issue for `phold plot` with many contigs

0.1.2

------------------

* Fixes `phold compare` cds_id issue where input file was FASTA
* Fixes issues with `phold remote` where input file was FASTA
* Improved documentation with conda/mamba install

0.1.1

------------------

* Restructuring for pip and conda installation
* No substantive changes to the code base

0.1.0

------------------

* Initial beta release

Links

Releases

© 2024 Safety CLI Cybersecurity Inc. All Rights Reserved.