Funannotate

Latest version: v1.8.17

1.0.2

* update to GFF to TBL parser to catch some "common" errors in GFF files
* added `funannotate iprscan`, which runs InterProScan searches either through Docker or locally. It splits the job into chunks and runs them in parallel, which seems to be a faster way to run InterProScan. By default it chunks the proteins into bins of 1000 proteins and runs each bin with 4 cpus, using up to as many cpus as you give the script.
* fix to docker build (hopefully)
* bug fixes for parsing the ncbi error report, properly outputting which genes are causing errors
* fix for antiSMASH parsing of plantismash data
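
The chunking behavior described for `funannotate iprscan` can be sketched in a few lines of Python. This is a minimal illustration and not funannotate's actual code: the function names `read_fasta`, `chunk_records`, and `parallel_jobs` are hypothetical, and only the defaults (bins of 1000 proteins, 4 cpus per bin) come from the notes above.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, seq = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        else:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)


def chunk_records(records: Iterable[Record], size: int = 1000) -> Iterator[List[Record]]:
    """Group records into bins of at most `size` proteins (assumed default: 1000)."""
    chunk: List[Record] = []
    for rec in records:
        chunk.append(rec)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:  # final, possibly smaller, bin
        yield chunk


def parallel_jobs(total_cpus: int, cpus_per_job: int = 4) -> int:
    """Number of bins that can run at once when each bin gets `cpus_per_job` cpus."""
    return max(1, total_cpus // cpus_per_job)
```

With 16 cpus available, `parallel_jobs(16)` gives 4 concurrent bins; a proteome of 2500 proteins would be split into bins of 1000, 1000, and 500.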

1.0.1

* Wrote a new GFF to TBL parser to accommodate running `funannotate annotate` on a fasta + GFF file.
* Added COGs output to `funannotate compare`; these annotations are parsed from eggnog-mapper data
* several minor bug fixes

1.0.0

Major update to funannotate with new RNA-seq modules, new database download and management, new gene name/product definition module, many bug fixes.

RNA-seq modules:
1. `funannotate train`: Module will run RNA-seq mediated methods for training of GeneMark/Augustus in gene prediction. It will take single-end or paired-end (PE) RNA-seq FASTQ files, run Trimmomatic quality trimming, run Trinity-mediated read normalization, run Trinity genome-guided RNA-seq assembly, and run PASA alignment methods. Output is a BAM file, Trinity transcripts, and a PASA GFF3 for use in `funannotate predict`.
2. `funannotate update`: Module will run PASA mediated gene model updates. It can be run as the last step of the train --> predict --> update workflow, where it will add UTR models and refine gene models. The script can also be run on a pre-existing GenBank assembly, where it will run the `funannotate train` methods (quality trimming, normalization, Trinity, PASA) followed by the `update` specific methods to add UTRs, refine models, etc.

`funannotate predict` enhancements:
1. Dropped use of GAG to write the NCBI tbl file, as it was making mistakes on some partial gene models --> funannotate now has its own functions to do this natively.
2. Simplified NCBI tbl generation and gene model filtering --> only running tbl2asn a single time now, as bad gene models are properly filtered (previously a regex search was not working perfectly, resulting in some gene models being removed arbitrarily)
3. tRNA gene length filter is now in compliance with NCBI rules (you can safely ignore tbl2asn tRNA gene length warnings --> they will eventually update tbl2asn source code)
4. Numbers of gene models for each "source" are now printed to terminal prior to running Evidence Modeler.
5. Script parses the NCBI error reports and shows the user which gene models need to be manually fixed. After the tbl file is updated, the GBK output files can be regenerated with the new `funannotate fix` command.

`funannotate annotate` enhancements:
1. Diamond search has replaced Blast wherever possible, resulting in a large increase in speed.
2. HMMer searches are now split across multiple CPUs, resulting in increased speed.
3. Gene names and product definitions are now parsed from UniProtKB/SwissProt results and EggNog-Mapper results. The product definitions are cross-referenced against a community resource called [gene2product](https://github.com/nextgenusfs/gene2product), which will serve as a database of curated gene product definitions.
4. Native NCBI tbl generation results in proper annotation of partial gene models.
5. Script will parse tbl2asn errors and alert user of gene models that need to be fixed.

New Database Management modules:
1. Environment variable addition: `FUNANNOTATE_DB` allows the user to install databases locally, e.g. in a user's home directory on an HPC.
2. `funannotate setup` script has been re-written from scratch to control the databases, keep track of versions, and allow the user to update databases.
3. `funannotate database` is a new command that shows you currently installed databases.
4. Databases have been trimmed down and now occupy ~4 GB of space.
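
As a rough illustration of how a script might locate the databases through `FUNANNOTATE_DB`, here is a minimal Python sketch. Only the variable name comes from the notes above; the helper `resolve_db_dir` and its behavior when the variable is unset are assumptions for the example, not funannotate's own logic.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_db_dir(environ: Optional[Mapping[str, str]] = None) -> Path:
    """Return the database directory named by FUNANNOTATE_DB.

    `environ` defaults to the process environment; passing a dict makes
    the function easy to try out without changing real settings.
    """
    env = os.environ if environ is None else environ
    value = env.get("FUNANNOTATE_DB")
    if not value:
        # Assumed behavior for this sketch: fail loudly rather than guess a path.
        raise RuntimeError(
            "FUNANNOTATE_DB is not set; point it at a writable directory, "
            "for example one inside your home directory"
        )
    return Path(value).expanduser()
```

On a shared cluster this lets each user keep a private copy of the databases, for example by exporting `FUNANNOTATE_DB` in a shell profile before running `funannotate setup`.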

I would recommend that all users upgrade. After upgrading, you will need to re-download the databases from scratch. As always, many bugs have been fixed and likely some new ones introduced. Please let me know if you encounter errors.

Docs/Manual/Tutorials will be available soon at http://funannotate.readthedocs.io

0.7.2

* fix bug in `funannotate compare` where string conversion to int failed on a check for the number of genes
* added better error message for duplicate locus_tag ids in `funannotate compare`
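
A check for duplicate locus_tag ids of the kind mentioned above could look roughly like the following. This is a hypothetical sketch, not the code in `funannotate compare`: the function names and the wording of the error message are invented for illustration.

```python
from collections import Counter
from typing import Iterable, List


def find_duplicate_locus_tags(locus_tags: Iterable[str]) -> List[str]:
    """Return the locus_tag ids that occur more than once, sorted."""
    counts = Counter(locus_tags)
    return sorted(tag for tag, n in counts.items() if n > 1)


def check_locus_tags(locus_tags: Iterable[str]) -> None:
    """Raise a descriptive error if any locus_tag id is repeated."""
    duplicates = find_duplicate_locus_tags(locus_tags)
    if duplicates:
        raise ValueError(
            "Duplicate locus_tag ids found: " + ", ".join(duplicates)
        )
```

Naming the offending ids in the message, rather than only reporting that a duplicate exists, is what makes the error actionable for the user.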

0.7.1

* fix menu in `funannotate annotate` that still had `--email` as an option -> it is no longer an option; all remote searches moved to `funannotate remote`
* fix eggnog parsing issue where COG and Description are blank -> this happens if you run `diamond` search with eggnog-mapper. You should run HMM search with the appropriate EggNog database, i.e. for fungi that is the fuNOG database.

0.7.0

funannotate predict
* unified genbank conversion method
* added support for `repeatmasker_species` option
* added support for strain flag for genbank conversion
* improved filtering of problematic gene models


funannotate annotate
* removed all remote searches from script (now `funannotate remote` see below)
* dropped EggNog search, instead `--eggnog` option will parse the results from eggnog-mapper. Eggnog-mapper does a more comprehensive search and provides some more functional annotation information than the simple HMMer search of EggNog 4.5 database.
* now outputs a tsv annotation file into the `annotate_results` output folder
* improved functional annotation for Gene and Product names
* added support for strain flag for genbank conversion

funannotate compare
* increased speed of parsing GBK files
* remove EggNog description mapping
* fix links to MEROPS database in html output

funannotate remote
* new sub command that will run remote searches
* currently supports Phobius, antiSMASH, and InterProScan
* Note: these searches are a free service, don't abuse them. If you can install this software locally, it will significantly decrease your run time. They are included here as some are Linux only and/or setup is very difficult.


funannotate setup
* Eggnog 4.5 database no longer required
