Sourmash

Latest version: v4.8.12


4.2.2

Major new features:

* added functionality to recover original k-mers given hashes - `sourmash sig kmers` et al. (1653, 1695, 1701)
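Because hashing is one-way, recovering the k-mers behind a set of hashes generally means re-hashing the k-mers of the original sequences and keeping the ones whose hash appears in the sketch. The following is a minimal sketch of that idea only, not the `sourmash sig kmers` implementation: `toy_hash`, `iter_kmers`, and `recover_kmers` are hypothetical names, and `hashlib.blake2b` is used as a stand-in hash function.

```python
import hashlib


def toy_hash(kmer: str) -> int:
    """Stand-in 64-bit hash; NOT the hash function sourmash uses."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def iter_kmers(sequence: str, ksize: int):
    """Yield every overlapping k-mer of length ksize."""
    for start in range(len(sequence) - ksize + 1):
        yield sequence[start:start + ksize]


def recover_kmers(sequence: str, ksize: int, hashes: set) -> dict:
    """Map each hash in `hashes` to the k-mer in `sequence` that produced it."""
    found = {}
    for kmer in iter_kmers(sequence, ksize):
        h = toy_hash(kmer)
        if h in hashes:
            found[h] = kmer
    return found


sequence = "ATGGCATTAGC"
ksize = 4
# Pretend these two hashes came from a sketch of the sequence.
sketch_hashes = {toy_hash("GGCA"), toy_hash("TAGC")}
recovered = recover_kmers(sequence, ksize, sketch_hashes)
print(sorted(recovered.values()))
```

The lookup only succeeds when the same sequences, k-mer size, and hash function are used as when the sketch was built.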

Documentation updates:

* Updated picklist docs (1683)
* Updated the 'how to release' doc after 4.2.0 release (1649)

Minor new features:

* Adjusted dayhoff and hp encodings to tolerate stop codons in the protein sequence (1673)

Bug fixes and performance improvements:

* Fixed panic bug in `sourmash sketch dna` with bad input and `--check-sequence` (1702)

Refactoring and cleanup:

* Changed `sourmash compute` to `sourmash sketch` in `tests/test_sourmash.py` (1680, 1687)
* Tested and fixed `sourmash_args.load_many_signatures(...)` and `lca_db.load_single_database` (1684)

4.2.1

This is a bug-fix and performance release of sourmash.

There are no major new features.

`git log --oneline v4.2.0..latest`

Minor new features:

- new picklist coltypes for directly using `gather`, `prefetch`, and `manifest` outputs without specifying column name (1660)
- add `--from-file` to `sig cat` (1657)
- implement a lazy/on-demand `Index` loading class to support low memory tracking of a large index (1661)
- add `sourmash tax prepare` to build SQLite taxonomy databases for use with `tax` commands (1651)
- Support manifests in `MultiIndex` (1654)
- `tax` summarization additions and fixes, including reporting bp and unclassified (1667)
- add `--from-file`, improved sig selection to most `sig` commands (1672)


Bug fixes and performance improvements:

- fix bug in `gather` when run with `scaled=1` (1670)


Documentation updates:

- Add sourmash-bio/community Gitter badge to README (1658)


Refactoring and cleanup:

- add tests for `sourmash tax` `--containment-threshold` arg (1666)
- fix `sourmash tax` usage string (1655)
- add bounds checking for `--scaled` (1650)

Rust interface:

- Rust Core update (tag: r0.11.0) (1643)

4.2.0

This release adds several significant features. First, we've added [`taxonomy` command-line functionality](https://sourmash.readthedocs.io/en/latest/command-line.html#sourmash-tax-subcommands-for-integrating-taxonomic-information-into-gather-results) for combining `sourmash gather` output with taxonomy databases. Second, we've added a new "picklist" feature that enables [flexible selection of _subsets_ of databases](https://sourmash.readthedocs.io/en/latest/command-line.html#using-picklists-to-subset-large-collections-of-signatures). Finally, we've added manifests to databases to support picklists as well as faster database loading and signature selection.

As of this release, we've also formally moved development over to the [sourmash-bio organization](https://github.com/sourmash-bio/) on GitHub, and we've created a new gitter support channel, [sourmash-bio/community](https://gitter.im/sourmash-bio/community#notifications). Please join us there if you have any questions, comments, or feature requests!

Major new features:
* add `tax/taxonomy` submodule (1543, 1628, 1630, 1648)
* add picklists for subsetting databases and results (1587, 1588, 1623, 1590, 1639)
* Add manifests to support fast `Index.select(...)` and lazy loading (https://github.com/sourmash-bio/sourmash/pull/1590)
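To illustrate the picklist idea, the sketch below selects a subset of records by matching one field against a column of a CSV file. It is a simplified stand-in under stated assumptions, not sourmash's picklist code: the in-memory `signatures` list, the CSV contents, and the helpers `load_pick_values` and `select` are all hypothetical.

```python
import csv
import io

# Hypothetical in-memory "database": one record per signature.
signatures = [
    {"name": "genomeA", "md5": "aaa111"},
    {"name": "genomeB", "md5": "bbb222"},
    {"name": "genomeC", "md5": "ccc333"},
]

# A picklist is a CSV file; here one column lists the names to keep.
picklist_csv = io.StringIO("name,comment\ngenomeA,keep\ngenomeC,keep\n")


def load_pick_values(handle, column: str) -> set:
    """Collect the distinct values of one CSV column."""
    return {row[column] for row in csv.DictReader(handle)}


def select(records, field: str, wanted: set) -> list:
    """Keep only the records whose `field` value is in the picklist."""
    return [rec for rec in records if rec[field] in wanted]


wanted = load_pick_values(picklist_csv, "name")
subset = select(signatures, "name", wanted)
print([rec["name"] for rec in subset])
```

Matching against a small per-record summary like this is also why manifests help: the selection can run over lightweight metadata without loading every full signature.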

Documentation updates:
* Add new GTDB databases description to docs and start legacy databases page (1581)
* Change `dib-lab/` URLs to new `sourmash-bio/` URLs. (1629)
* Add notice for sustainable open source study (1580)

Minor new features:
* alias `--nucleotide`, `--no-nucleotide` for moltype args. (1632)
* add signature names to known/unknown hash sigs output by `sourmash prefetch` (1646)

Bug fixes and performance improvements:
* Speed up `sourmash gather` with prefetch by ignoring unidentifiable hashes (1613)
* Check for `MinHash` compatibility in `MinHash.intersection_and_union(...)` (1627)
* Fix selection w/abund and manifest column type conversions (1645)

Refactoring and cleanup:
* Fix Rust 1.59 lints (1600)
* Minor cleanup in `sourmash_args` & `sig` submodules (1586)
* Minor cleanup in minhash module (1585)
* Fix needless borrows as suggested by clippy (1636)

4.1.2

This is a bug-fix and performance release of sourmash.

There are no major new features.

Minor new features:
* add query info to gather CSV output (1565)

Bug fixes and performance improvements:

* Improved `MinHash.remove_many(...)` performance by five orders of magnitude (1571)
* Fix SBT index saving bug that arbitrarily replaced names (but not content) of identical signatures in `.sbt.zip` files (1568)
* Empty zipfiles should not cause `AssertionError` (https://github.com/dib-lab/sourmash/pull/1546)

Major refactoring and new internal functionality:
* update `MinHash.set_abundances` to remove hash if 0 abund; handle negative abundances (1575)
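The zero-abundance behavior can be pictured with a plain dictionary from hash value to abundance. This is a hedged sketch rather than the library code: `apply_abundances` is a hypothetical helper, and rejecting negative values with `ValueError` is an assumption made here, since the entry above does not say how negatives are handled.

```python
def apply_abundances(current: dict, updates: dict) -> dict:
    """Return a copy of `current` with `updates` applied.

    A hash set to 0 is dropped; a negative value raises ValueError
    (an assumption made for this sketch).
    """
    result = dict(current)
    for hashval, abund in updates.items():
        if abund < 0:
            raise ValueError(f"negative abundance for hash {hashval}: {abund}")
        if abund == 0:
            result.pop(hashval, None)  # zero abundance removes the hash
        else:
            result[hashval] = abund
    return result


abunds = {10: 3, 20: 1, 30: 7}
updated = apply_abundances(abunds, {20: 0, 40: 2})
print(updated)
```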

Refactoring and cleanup:
* Fix tests that fail to close files that they open (1550)
* Add `&` and `|` as alternate syntax for MinHash intersection and merge (1533)
* Fix missing bracket in docs (1566)
* Updates for coverage tracking (1558)
* Provide a .copy() method for both `SourmashSignature()` and `MinHash` (1551, 1570)
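Operator syntax of this kind is typically provided in Python through the `__and__` and `__or__` special methods. The toy class below shows the pattern on plain sets of hashes; `ToySketch` is a hypothetical stand-in and does not reproduce the `MinHash` class.

```python
class ToySketch:
    """Hypothetical stand-in for a sketch holding a set of hashes."""

    def __init__(self, hashes):
        self.hashes = frozenset(hashes)

    def __and__(self, other: "ToySketch") -> "ToySketch":
        # `a & b` keeps only the hashes present in both sketches.
        return ToySketch(self.hashes & other.hashes)

    def __or__(self, other: "ToySketch") -> "ToySketch":
        # `a | b` merges the hashes of both sketches.
        return ToySketch(self.hashes | other.hashes)


left = ToySketch({1, 2, 3})
right = ToySketch({3, 4})
print(sorted((left & right).hashes))
print(sorted((left | right).hashes))
```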

4.1.1

This release fixes a minor bug, provides some refactorings, and dramatically decreases memory consumption for `sourmash gather --linear` (which is, admittedly, a niche use case :).

No major new features.

Bug fixes and performance improvements:

* Unload data with `sourmash gather --linear` on SBTs (https://github.com/dib-lab/sourmash/pull/1534)
* Fix `sourmash gather --no-prefetch` when used w/abund signatures (1528)
* Fix `sourmash index` to not create directory for .sbt.zip output (1539)

Major refactoring and new internal functionality:

* Add `FrozenMinHash` to better support separation of frozen and mutable data actions (1508)

Refactoring and cleanup:
* Improved error handling and testing for pathlist loading (1469)
* Updated some tests to use `sourmash sketch` instead of `sourmash compute` (1536)
* Refactor `sourmash lca summarize` to remove unnecessary if statements, improve tests (1540)

4.1.0

This release provides several convenient features for users, including zipfile collections on input and output and a new `prefetch` command. `sourmash gather` has also received a considerable speed/memory upgrade (twice as fast, 80-90% lower memory). You should upgrade! As a reminder, v4.x has several incompatibilities with v3.x, and if you are upgrading from v3.x you should consult [our migration guide](https://sourmash.readthedocs.io/en/latest/support.html#id12).

Major new features:

* Support zipped collections of signatures (1349)
* Refactor `gather` functionality for speed & modularity (1370, 1512, 1513)
* Provide new command, `prefetch`. (1370)
* Add flexible & iterative support for outputting signatures in variety of collection formats - directories, zipfiles, etc. (1493)
* Add `max_containment` to API and `--max-containment` to command line (1346)
* Add `--from-file` option to `sourmash sketch` commands (1362)
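For readers unfamiliar with max containment: containment of A in B is the fraction of A's hashes that also appear in B, and max containment divides the shared hashes by the size of the smaller set, so it equals the larger of the two directional containments. The sketch below computes both on plain Python sets; it illustrates the definition only, and the function names are hypothetical rather than the sourmash API.

```python
def containment(a: set, b: set) -> float:
    """Fraction of the hashes in `a` that are also in `b`."""
    return len(a & b) / len(a) if a else 0.0


def max_containment(a: set, b: set) -> float:
    """Shared hashes divided by the size of the smaller set."""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


query = {1, 2, 3, 4}
match = {3, 4, 5, 6, 7, 8, 9, 10}

print(containment(query, match))
print(containment(match, query))
print(max_containment(query, match))
```

Unlike plain containment, max containment is symmetric, which makes it convenient when the two sketches differ greatly in size and it is not known in advance which one is the subset.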

Bug fixes that break backwards compatibility:

* Require scaled signatures for containment (1381)
* Fix CSV output for `sourmash lca classify` when `.name` is empty (1401)
* Really old SBTs (pre-v2.0) no longer load (v1 and v2 SBTs) (changed in 1392)

Other bug fixes:

* Add proper newline output for csv module (1319) - important for Windows!

Other new features:

* `--best-only` searches now work for both similarity AND containment (fixed in 1392)
* `sourmash categorize` now takes all database types
* add `--name` to `sourmash sig merge` (1480)
* decline to load really large files for LCA databases if they're not valid JSON (1495)

Major refactoring and new internal functionality:

* Add a `MultiIndex` class that wraps multiple `Index` classes (1374)
* Refactor and dramatically simplify database loading and compatibility checking (1406, 1420)
* Rework the `find` functionality for `Index` classes (1392, 1477).
* Improved intersection and union calculations (1475)

Documentation enhancements:

* Update the sourmash `__init__.py` docstring, provide `__all__` for imports (1364)
* Add '-h/--help' usage instructions to 'sourmash sketch' CLI (1400)
* Add ORCID to contribution checklist (1405)
* Add information about updating the developer environment to the developer docs (1432)
* Docs: Partial fix for doc build issues with notebooks (1516)

Refactoring and cleanup:

* Refactor the database loading code in `sourmash_args` (1373, 1380)
* Pin needletail version to keep MSRV at 1.37 (1393)
* Rename `load_file_list_of_signatures` to `load_pathlist_from_file` (1423)
* Update call to notify in `src/sourmash/search.py` with f-strings (1422)
* Bump MSRV to 1.42 (and other dep fixes) (1461)
* CI/Rust: update and fix cbindgen config (1473)
* Refactor MinHash.downsample (1458)
* Make `MinHash.downsample(...)` require keyword arguments & fix newly revealed buggy test. (1448)
* Add a check for LCA database error text in `tests/test_lca.py` (1445)
* pin docutils version to last working (1444)
* add codecov configuration to fix paths (1422, 1449)
* provide new test fixtures for cleaner testing (1487)
* Fix small papercuts: SyntaxWarning and coverage reports (1488)
* Clean up clippy lints from 1.52 (1505)
* Bump docutils from 0.16 to 0.17.1 (1499)
* Update myst-parser requirement from ~=0.13.7 to >=0.13.7,<0.15.0 (1520)
* replace utils.TempDirectory with runtmp in some tests (1502)
