Pyamilyseq

Latest version: v1.1.1



1.1.1

v1.1.1 introduces a number of small fixes and quality-of-life improvements, along with additional auxiliary scripts to help interrogate the clusters and gene families produced by PyamilySeq and other pangenome tools.

**Key Updates:**

*Auxiliary Clustering Tools Added:*

- Introduced compare_gpa.py and compare_trees.py scripts to enhance clustering analysis capabilities.

*Bug Fixes:*

- Resolved various input/output and minor runtime issues to improve overall stability and performance.

**Full Changelog**: https://github.com/NickJD/PyamilySeq/compare/v1.1.0...v1.1.1

1.1.0

- Users can now specify input GFF and FASTA file affixes
- Numerous additional checks to avoid overwriting existing files, plus more error handling
- Refined menu parameter names
- Ability to output sequences and alignments for specific groups
- Fixed more issues with gene_presence_absence files

Specifically, these changes resolve three previously raised GitHub bugs and/or feature requests:

- [Multiple alignment options with -w](https://github.com/NickJD/PyamilySeq/issues/3)
- [Options to specify input file extensions in separate input mode](https://github.com/NickJD/PyamilySeq/issues/4)
- [Seq-Combiner will read in and overwrite the output file.](https://github.com/NickJD/PyamilySeq/issues/5)

Thanks to ecampbell50 for the requests.

0.9.0

- Group-Splitter:
  - Replaced the similarity calculation with Levenshtein distance.
  - The distance can be calculated using the Levenshtein library, with a fallback to a much slower pure-Python implementation.
  - Fixed handling of the AA/DNA clustering output of CD-HIT.
  - Calculating the representative sequence for each new subgroup now normalises length and pident.
  - If reclustering of a Group results in multiple CD-HIT clusters, each cluster is processed separately. It is therefore important to understand the reclustering options.
  - Added an option to process only user-defined Groups, or 'auto' to detect which groups to subgroup.
  - Added a required user parameter stating the number of genomes (or genera) in the analysis.
  - Fixed some minor bugs and cleaned up output handling.
  - Fixed a bug where total_genomes was naively calculated on a per-cluster basis. The user must now provide the number of genomes in the analysis.
  - Added an option to keep temp files.
  - Cleaned up some user parameters to match those used in CD-HIT.

- General:
  - A number of general bug fixes, user-menu improvements, additional output in 'verbose' mode, and code clean-up.

- Cluster-Summary:
  - A new sub-tool that summarises CD-HIT .clstr files.
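
The "library or slower Python fallback" pattern mentioned for Group-Splitter can be sketched roughly as follows. This is an illustrative sketch only, not PyamilySeq's actual code: the names `levenshtein_distance` and `percent_identity` are hypothetical, and normalising by the longer sequence length is an assumption made for the example.

```python
try:
    # Optional C-accelerated implementation from the "Levenshtein" package.
    from Levenshtein import distance as levenshtein_distance
except ImportError:
    def levenshtein_distance(a: str, b: str) -> int:
        """Pure-Python fallback: dynamic programming, O(len(a) * len(b)) time."""
        if len(a) < len(b):
            a, b = b, a  # keep the inner row as short as possible
        previous = list(range(len(b) + 1))
        for i, char_a in enumerate(a, start=1):
            current = [i]
            for j, char_b in enumerate(b, start=1):
                cost = 0 if char_a == char_b else 1
                current.append(min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                ))
            previous = current
        return previous[-1]


def percent_identity(a: str, b: str) -> float:
    """Hypothetical helper: edit distance normalised by the longer sequence."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein_distance(a, b) / longest)
```

Both branches expose the same function name, so calling code does not need to know which implementation was loaded; only the running time differs.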

0.8.1

Version v0.8.1 includes small but important fixes to the core-gene alignment system and a few user option improvements.

- Group-Splitter now handles CD-HIT .clstr files correctly; cd-hit and cd-hit-est (AA/DNA) report pident values slightly differently.
- Group-Splitter now requires the user to state whether the input is DNA or AA.
- Fixed a bug where 'all' FASTA files were sometimes included in the core-gene alignment, resulting in very odd alignments.
- PyamilySeq now accepts the -t THREADS parameter and passes it to CD-HIT and MAFFT.
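
To illustrate the cd-hit versus cd-hit-est difference noted above: protein clusters typically report identity as `at 80.58%`, while nucleotide clusters include a strand, as in `at +/99.12%`. The sketch below is a minimal parser that accepts both forms. It is not PyamilySeq's own parser; `parse_clstr` and the `MEMBER` pattern are hypothetical names, and the sample lines in the usage note are made up.

```python
import re
from collections import defaultdict

# One member line, e.g. "1\t2214aa, >seqB... at 80.58%"
# or "1\t290nt, >seqD... at +/99.12%". Representatives end with "*".
MEMBER = re.compile(
    r"^\d+\s+(\d+)(aa|nt), >(.+?)\.\.\. (\*|at (?:.*/)?([\d.]+)%)$"
)


def parse_clstr(lines):
    """Return {cluster_id: [(name, length, pident or None), ...]}.

    pident is None for the representative sequence of each cluster.
    """
    clusters = defaultdict(list)
    current = None
    for raw in lines:
        line = raw.strip()
        if line.startswith(">Cluster"):
            current = int(line.split()[1])
            continue
        match = MEMBER.match(line)
        if match and current is not None:
            length, _unit, name, marker, pident = match.groups()
            identity = None if marker == "*" else float(pident)
            clusters[current].append((name, int(length), identity))
    return dict(clusters)
```

For example, `parse_clstr([">Cluster 0", "0\t300nt, >seqC... *", "1\t290nt, >seqD... at +/99.12%"])` yields one cluster whose second member carries an identity of 99.12. The optional `(?:.*/)?` group skips the strand (or alignment coordinates, if present) before the percentage.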

0.8.0

This major release has two main additions but mainly focuses on a new 'subtool' to help process and investigate potential paralogs that have been collapsed into single gene families/groups. More details will be provided in the future with examples.

- Gene_Presence_Absence.csv output by default: The Gene_Presence_Absence file is now generated automatically, making it easier to assess gene distribution across genomes. This was a suggestion from ecampbell50 (#2).

- Group-Splitter Addition: New 'subtool' to split "paralogous" groups from clustering results, improving handling of gene families with multiple paralogs.

0.7.0

Most changes are reflected in the user menu. A number of bugs were also caught, which should now result in the correct recording of Seconds when more than one First was clustered in the reclustering stage.
