Esmecata

Latest version: v0.4.2

0.4.2

Fix

* Issue in `pyproject.toml` and how the package is installed with `pip`.

0.4.1

Fix

* Some issues on the PyPI page of the package (due to change from `setup.py` to `pyproject.toml`).

0.4.0

WARNING:
* Change in the intermediary files of `clustering` and `annotation` to reduce the disk space used by EsMeCaTa and the number of operations performed by the methods. Instead of assigning one file per observation name, this version assigns one file per taxon used by EsMeCaTa. This removes a lot of redundant work that slowed EsMeCaTa and could lead to issues.
* Annotation with eggnog-mapper is now the default workflow method of EsMeCaTa. The previous annotation method with UniProt has been moved to `annotation_uniprot` and `workflow_uniprot`.
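
The move from one file per observation name to one file per taxon can be pictured with a minimal sketch. This is not EsMeCaTa's code: the observation names, taxon names, the `.tsv` file naming and the helper `group_observations_by_taxon` are all hypothetical and only illustrate why fewer files and fewer operations are needed.

```python
from collections import defaultdict

# Hypothetical mapping: observation name -> taxon selected for it.
observation_to_taxon = {
    "Cluster_1": "Escherichia",
    "Cluster_2": "Escherichia",
    "Cluster_3": "Bacillus",
}


def group_observations_by_taxon(mapping):
    """Return {taxon: [observation names]} so work is done once per taxon."""
    taxon_to_observations = defaultdict(list)
    for observation, taxon in mapping.items():
        taxon_to_observations[taxon].append(observation)
    return dict(taxon_to_observations)


taxon_groups = group_observations_by_taxon(observation_to_taxon)

# Old layout (illustrative naming): one file per observation name.
old_files = [f"{observation}.tsv" for observation in observation_to_taxon]
# New layout (illustrative naming): one file per taxon.
new_files = [f"{taxon}.tsv" for taxon in taxon_groups]
```

In this toy case three observation names share two taxa, so three files become two and the work for `Escherichia` is done only once.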


Add

* Add sub-commands `annotation_uniprot` and `workflow_uniprot` to use the old method of protein annotation.
* Add `check` subcommand that performs the first step of EsMeCaTa without downloading the proteomes. This is helpful when you want to get a glimpse of the available knowledge for your dataset.
* Add an error message if an input file with an incorrect extension is given to esmecata.

Fix

* Missing import in proteomes.
* GitHub Actions.

Modify

* Modify intermediary files to associate them with the taxon name selected by EsMeCaTa instead of the observation name (based on an idea of PaulineGHG). Previously, a tsv file was created for each observation name; the files are now created for each taxon instead. Observation names that share a taxon are therefore associated with the same file. This reduces file redundancy and the number of operations performed by EsMeCaTa.
* Modify how the log json files are created so that if a run fails, a new log json file is created instead of overwriting the previous ones.
* Move from `setup.py` and `setup.cfg` to `pyproject.toml`.
* Update readme and tutorial.
* Update license year.
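
The log json behaviour described in this list (a new file per run rather than overwriting earlier ones) could be sketched as follows. The function `write_run_log` and the `run_log_<n>.json` naming scheme are assumptions made for illustration, not the names used by EsMeCaTa.

```python
import json
import tempfile
from pathlib import Path


def write_run_log(log_dir, metadata):
    """Write metadata to a new JSON file, never overwriting an existing log."""
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    index = 0
    # Hypothetical naming scheme: run_log_0.json, run_log_1.json, ...
    while (log_dir / f"run_log_{index}.json").exists():
        index += 1
    log_path = log_dir / f"run_log_{index}.json"
    with open(log_path, "w") as handle:
        json.dump(metadata, handle, indent=4)
    return log_path


with tempfile.TemporaryDirectory() as tmp_dir:
    # A failed run followed by a second run: both logs are kept.
    first = write_run_log(tmp_dir, {"status": "failed"})
    second = write_run_log(tmp_dir, {"status": "success"})
    log_names = sorted(path.name for path in Path(tmp_dir).iterdir())
```

Keeping the log of the failed run next to the log of the rerun makes it possible to compare the two afterwards.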

Remove

* Remove sub-commands `annotation_eggnog` and `workflow_eggnog` which are now the default sub-commands `annotation` and `workflow`.

0.3.0

Add a new way to annotate protein clusters using eggnog-mapper. In tests on metagenomics data, it is more accurate than the method using UniProt.
Also change the default options of EsMeCaTa to values that gave better results on the tested data (minimal number of proteomes from 1 to 5 and clustering threshold from 0.95 to 0.5).

Add

* Add a new method to annotate protein clusters using eggnog-mapper: new script `eggnog.py`, new commands `annotation_eggnog` and `workflow_eggnog`.
* Add an option to query UniProt dat files during annotation (`--annotation-files`, requires `biopython`>=`1.81`).
* Add an option to use bioservices for annotation queries (`--bioservices`, requires `bioservices`>=`1.11.2`).
* Add more tests for proteomes selection.
* Add an option to update taxonomic affiliations (`--update-affiliations`).
* Show the failedIDs during mapping for annotation.
* Add an option to specify the eggnog-mapper tmp folder (`--eggnog-tmp`). By default, it is in the esmecata output folder.
* Add KEGG reactions to the annotation_reference file when using eggnog-mapper.
* Add a function to compare input taxa information with esmecata taxa information (taxa name, taxa ID, taxa rank) and to specify the associated OTUs. Thanks to PaulineGHG.

Fix

* Do not use already annotated proteins when using annotation files.
* Fix issue in esmecata proteomes, not using non-reference proteome.
* Fix issue with missing reference proteome when parsing SPARQL results.
* Fix issue in main with cli.
* Fix an issue with minimal-nb-proteomes and non-reference proteomes.

Modify

* Modify default options: `--minimal-nb-proteomes` is now 5 (from 1 in the previous version) and `-t` (clustering threshold) is now 0.5 (from 0.95).
* Modify rank_limit option to make it more understandable. Only taxon ranks lower than or equal to the one given will be kept.
* Remove several output folders (proteomes `result`, clustering `fasta_consensus` and clustering `fasta_representative`) to reduce size of EsMeCaTa results.
* Rename `tmp_proteome` into `proteomes`.
* Remove FROM in SPARQL queries, which could speed up the queries.
* Change header for annotation files, especially: 'gos', 'ecs', 'interpros', 'rhea_ids' into 'GO', 'EC', 'InterPro', 'Rhea'.
* Add column `cluster_members` in the annotation reference file and rename column `protein` into `protein_cluster`.
* Do not create fasta file when there are no protein clusters.
* Update license year.
* Update esmecata workflow picture.
* Update the doc of esmecata.
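
Scripts that read annotation files produced by an older version may need to account for the header change listed above ('gos', 'ecs', 'interpros', 'rhea_ids' into 'GO', 'EC', 'InterPro', 'Rhea'). The sketch below is one possible way to do so; the helper `rename_annotation_headers`, the tab-separated layout and the `name` column are assumptions for illustration, only the four header pairs come from this changelog.

```python
import csv
import io

# Header renames listed in the 0.3.0 changelog entry.
HEADER_RENAMES = {
    "gos": "GO",
    "ecs": "EC",
    "interpros": "InterPro",
    "rhea_ids": "Rhea",
}


def rename_annotation_headers(tsv_text):
    """Return the TSV text with old-style column names replaced by new ones."""
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    if not rows:
        return tsv_text
    # Only the first row (the header) is modified; unknown columns are kept.
    rows[0] = [HEADER_RENAMES.get(column, column) for column in rows[0]]
    output = io.StringIO()
    csv.writer(output, delimiter="\t", lineterminator="\n").writerows(rows)
    return output.getvalue()


# Hypothetical old-style table with an illustrative `name` column.
old_table = "name\tgos\tecs\nP12345\tGO:0008150\t1.1.1.1\n"
new_table = rename_annotation_headers(old_table)
```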

0.2.12

Fix

* esmecata clustering should now avoid reclustering already clustered data.

0.2.11

Modify

* Remove already downloaded proteomes from the list of proteomes to download.
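
The idea of skipping proteomes that are already on disk can be sketched as a simple filter. The function `proteomes_to_download`, the proteome identifiers and the `.faa.gz` suffix are hypothetical and chosen for illustration; the actual file layout used by esmecata may differ.

```python
import tempfile
from pathlib import Path


def proteomes_to_download(proteome_ids, download_dir, suffix=".faa.gz"):
    """Keep only the proteome IDs that have no file yet in download_dir."""
    download_dir = Path(download_dir)
    return [
        proteome_id
        for proteome_id in proteome_ids
        if not (download_dir / f"{proteome_id}{suffix}").exists()
    ]


with tempfile.TemporaryDirectory() as tmp_dir:
    # Simulate a proteome downloaded during a previous run.
    (Path(tmp_dir) / "UP000000001.faa.gz").touch()
    remaining = proteomes_to_download(["UP000000001", "UP000000002"], tmp_dir)
```

Note that this sketch only checks that a file exists, so a partially downloaded file would also be skipped.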

Fix

* esmecata annotation should now avoid reannotating already annotated data (as it was not correctly done in previous version).
