Datasketch

Latest version: v1.6.5

Page 3 of 6

1.5.2

* Performance improvement for MinHash's update method.
* Make MinHash updates 4.5X faster by using the `update_batch` method for bulk updates. See [API doc](http://ekzhu.com/datasketch/documentation.html#datasketch.MinHash.update_batch).
* Further performance gains from bulk generation of MinHash objects using `MinHash.bulk` or `MinHash.generator`. See [API doc](http://ekzhu.com/datasketch/documentation.html#datasketch.MinHash.bulk) and [pull request](https://github.com/ekzhu/datasketch/pull/142).
* Optional compression for the MinHash LSH index by hashing the bucket key produced by `MinHashLSH._H`. See [pull request](https://github.com/ekzhu/datasketch/pull/143). This reduces the memory and storage space used by the index.

Thank you Sinusoidal36!

1.5.0

* Minor bug fixes
* Cassandra storage layer, thanks ostefano! You can now specify the Cassandra config just like the Redis one.

```python
from datasketch import MinHashLSH

lsh = MinHashLSH(
    threshold=0.5, num_perm=128, storage_config={
        'type': 'cassandra',
        'cassandra': {
            'seeds': ['127.0.0.1'],
            'keyspace': 'lsh_test',
            'replication': {
                'class': 'SimpleStrategy',
                'replication_factor': '1',
            },
            'drop_keyspace': False,
            'drop_tables': False,
        }
    }
)
```

1.4.0

MinHash and HyperLogLog now support a `hashfunc` parameter. The old `hashobj` parameter has been removed.

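```python
from datasketch import MinHash

# Let's use MurmurHash3.
import mmh3

# We need to define a new hash function that outputs an integer that
# can be encoded in 32 bits.
def _hash_func(d):
    # mmh3.hash returns a 32-bit value; signed=False keeps it non-negative.
    return mmh3.hash(d, signed=False)

# Use this function in MinHash constructor.
m = MinHash(hashfunc=_hash_func)
```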


1.3.0

Use dynamic programming to create the optimal partition, allowing the LSH Ensemble index to adapt to any set size distribution.
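
To give a feel for what partitioning by dynamic programming means here, the following is a minimal, self-contained sketch that splits sorted set sizes into `k` contiguous groups with minimal total cost. It is not the library's implementation: the cost function (number of sets times the size spread within a group) and the names `partition_cost` and `optimal_partitions` are assumptions chosen only for illustration.

```python
def partition_cost(sizes, i, j):
    """Hypothetical cost of grouping sizes[i:j]: set count times size spread."""
    return (j - i) * (sizes[j - 1] - sizes[i])


def optimal_partitions(sizes, k):
    """Split sorted sizes into exactly k contiguous groups of minimal total cost."""
    sizes = sorted(sizes)
    n = len(sizes)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of sizes")
    inf = float("inf")
    # best[p][j]: minimal cost of splitting sizes[:j] into exactly p groups.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    # cut[p][j]: start index of the last group in that optimal split.
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0
    for p in range(1, k + 1):
        for j in range(p, n + 1):
            for i in range(p - 1, j):
                cost = best[p - 1][i] + partition_cost(sizes, i, j)
                if cost < best[p][j]:
                    best[p][j] = cost
                    cut[p][j] = i
    # Walk the cut table backwards to recover the groups.
    groups = []
    j = n
    for p in range(k, 0, -1):
        i = cut[p][j]
        groups.append(sizes[i:j])
        j = i
    groups.reverse()
    return groups, best[k][n]


groups, total = optimal_partitions([1, 2, 3, 100, 101, 102, 1000], k=3)
print(groups)  # [[1, 2, 3], [100, 101, 102], [1000]]
print(total)   # 12
```

With a skewed size distribution like the one above, the boundaries fall at the large gaps rather than at equal-count positions, which is the kind of adaptation the release note describes.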

1.2.10

* Added batch removal functionality for Async MinHashLSH
* Removed Redis support from Async MinHashLSH, because Redis does not support async operations

For details, see pull request 70.
Thanks aastafiev for the contribution.

1.2.9

Add support for MongoDB replica set

1.2.8
