This release contains many new features! Of particular note:
* sourmash now estimates and outputs average nucleotide identity (ANI) based on k-mer measures;
* `sourmash sketch translate` is no longer unusably slow;
* we provide macOS `arm64` wheels for the new M1 Macs;
* we've added a number of support features for managing large collections of signatures and building very large databases;
* and we've added support for SQLite databases that can be used for storing and searching signatures and doing Kraken-style LCA analysis of genomes and metagenomes.
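
To give a sense of how a k-mer measure can map to ANI, here is a minimal sketch of a commonly used point estimate: containment `C` at k-mer size `k` gives ANI ≈ `C^(1/k)`, and Jaccard similarity `J` gives ANI ≈ `(2J / (1 + J))^(1/k)`. The function names are hypothetical and this is not sourmash's API. It assumes a simple independent-mutation model and ignores sketching effects, so the values sourmash reports may differ.

```python
def ani_from_containment(containment: float, ksize: int) -> float:
    """Point estimate of ANI from k-mer containment (assumed model: ANI ~= C ** (1/k))."""
    if not 0.0 <= containment <= 1.0:
        raise ValueError("containment must be in [0, 1]")
    if ksize < 1:
        raise ValueError("ksize must be a positive integer")
    # A k-mer survives only if all k of its bases are unchanged,
    # so the per-base identity is the k-th root of the containment.
    return containment ** (1.0 / ksize)


def ani_from_jaccard(jaccard: float, ksize: int) -> float:
    """Point estimate of ANI from Jaccard similarity (assumed model)."""
    if not 0.0 <= jaccard <= 1.0:
        raise ValueError("jaccard must be in [0, 1]")
    if ksize < 1:
        raise ValueError("ksize must be a positive integer")
    # 2J / (1 + J) converts Jaccard to a containment-like quantity
    # for two sets of equal size, then the same k-th root applies.
    return (2.0 * jaccard / (1.0 + jaccard)) ** (1.0 / ksize)
```

Under this model, a containment of 0.5 at `k=31` corresponds to an estimated ANI of roughly 0.978.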

In addition, we have built updated Genbank genome databases (with contents from March 2022) as well as GTDB R07-RS207 databases; see [the prepared databases page](https://sourmash.readthedocs.io/en/latest/databases.html). We've also made some benchmarks available for these databases, so you can estimate the computational resources your searches will need.

Last but by no means least, we have begun providing a number of examples and recipes for using sourmash; see the new [sourmash examples](https://sourmash-bio.github.io/sourmash-examples/) website!

---
Major new features:
* add ANI output to search, prefetch, and gather (1934, 1952, 1955, 1966, 1967, 2011, 2031, 2032)
* new GTDB and Genbank database releases (2013, 2038)
* provide macOS arm64 wheels (1935)
* support for SQLite databases (1808)
* implement `sourmash sketch fromfile` (1884, 1885, 1886, 2009)
* add `sourmash sig check` for comparing picklists and databases (1907, 1915, 1917)
* add `sig collect` command for building standalone manifests from many databases (2036)
* add direct loading of manifest CSVs as sourmash indices (1891)
* add `-A/--abundance-from` to `sig subtract` and add `sig inflate` (1889)
* advanced database format documentation (2025)
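
`sourmash sketch fromfile` builds signatures from a CSV file that lists the inputs. The sketch below writes such a file with Python's `csv` module, assuming the columns are `name`, `genome_filename`, and `protein_filename`; confirm the exact column names against the `sketch fromfile` documentation for your version. The helper name and the example paths are hypothetical.

```python
import csv
import io

# Assumed column names for the fromfile input CSV.
FIELDNAMES = ["name", "genome_filename", "protein_filename"]


def build_fromfile_csv(records):
    """Return CSV text for an iterable of (name, genome_path, protein_path) tuples.

    Use an empty string for a path that is not available.
    """
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(buf, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    for name, genome_path, protein_path in records:
        writer.writerow(
            {
                "name": name,
                "genome_filename": genome_path,
                "protein_filename": protein_path,
            }
        )
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical inputs: one genome-only entry, one with both files.
    rows = [
        ("sample_A", "genomes/sample_A.fna.gz", ""),
        ("sample_B", "genomes/sample_B.fna.gz", "proteins/sample_B.faa.gz"),
    ]
    print(build_fromfile_csv(rows), end="")
```

Generating the CSV from a script keeps names and paths in sync when a collection holds thousands of genomes.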
Minor new features:
* add `-d/--debug` to `sourmash sig describe`; upgrade output errors (1782)
* add `sum_hashes` to `sourmash sig describe` output (1882)
Bug fixes:
* catch `TypeError` when searching with abundance-weighted signatures against flat signatures at the command line (1928)
* speed up `SeqToHashes` `translate` (1938, 1946)
Cleanup and documentation fixes:
* better handle some picklist file errors (1924)
* remove unnecessary downsampling warnings (1971)
* use same wording for dayhoff/hp as for dna/protein (1929)
* rename `covered_bp` property to better reflect function (2050)
Developer updates:
* provide "protocol" tests for `Index`, `CollectionManifest`, and `LCA_Database` classes (1936)
* remove khmer CI tests (1950)
* benchmarks for `seq_to_hashes` in protein mode (1944)
* add some tests for Jaccard output ordering (1926)
* Oxidize ZipStorage (1909)
* clean up and comment `test_index.py` tests (1898, 1900)
* rationalize `_signatures_with_internal` (1896)
* Convert nix to flakes (1904)
* fix docs build (1897)
* Fix build/CI and unused imports papercuts (1974)
* fix hypothesis CI (2028)
* dependabot version updates (1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1991, 1993, 1994, 1995, 1996, 1997, 1998, 2017, 2019, 2020, 2021, 2022, 2023, 2042)