- New module for informative subset selection, currently implementing three strategies:
  - random selection
  - selection based on nearest neighbors (Cai et al. 2016)
  - selection based on Shannon entropy (Chen and Jin 2020)
- These methods reduce the size of the query space by focusing on informative points, leading to a faster runtime of the query strategies. They can also serve as a warm start for neighborhood-based methods. (An illustrative sketch of the entropy-based idea follows below.)
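As a rough picture of the entropy-based strategy, the sketch below ranks points by the Shannon entropy of soft k-means assignments and keeps the most ambiguous fraction. This is only an illustration of the idea, not the module's actual code; the names `entropy_subset` and `keep_ratio`, and the use of k-means distances as the uncertainty source, are assumptions made here.

```python
# Illustrative sketch only -- not scikit-query's implementation.
# Rank points by the Shannon entropy of soft k-means assignments and keep
# the most ambiguous fraction as the reduced query space.
import numpy as np
from scipy.special import softmax
from sklearn.cluster import KMeans

def entropy_subset(X, n_clusters=3, keep_ratio=0.2, random_state=0):
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state).fit(X)
    # Soft assignments derived from negated distances to the centroids.
    probs = softmax(-km.transform(X), axis=1)
    # Shannon entropy per point: higher means more ambiguous, hence more informative.
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    n_keep = max(1, int(keep_ratio * len(X)))
    return np.argsort(entropy)[::-1][:n_keep]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
subset = entropy_subset(X)  # indices of the points a query strategy would focus on
```

A query strategy would then only consider the points in `subset` when selecting constraints, which is what makes the subsequent queries faster.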
0.3
- Added random sampling of triplet constraints
0.2
- Now with actual documentation at [ReadTheDocs](https://scikit-query.readthedocs.io/en/latest/index.html)!
- `fit` now takes exactly four arguments: data, oracle, partition and number of clusters. The last two are optional. This shouldn't change anymore, to avoid breaking compatibility with previous versions. (A hedged usage sketch follows after this list.)
- Significantly improved performance of FFQS and MinMax
- FFQS and MinMax can now be initialized with a precomputed pairwise distance matrix of the data to avoid unnecessary recomputations
- Added automatic computation of the epsilon threshold in AIPC
- All implementations can now write the selected constraints to a text file
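To picture the `fit` signature above, here is a hedged usage sketch. Only the argument order (data, oracle, partition, number of clusters) comes from this changelog; the oracle class is a stand-in and the exact import of a query strategy is not shown, so the library calls are left as comments.

```python
# Hedged sketch of the fit signature described above; not verbatim library code.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, y = make_blobs(n_samples=200, centers=3, random_state=0)
partition = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

class LabelOracle:
    """Stand-in oracle that answers pairwise queries from ground-truth labels."""
    def __init__(self, labels):
        self.labels = labels
    def query(self, i, j):
        # Must-link if the two points share a label, cannot-link otherwise.
        return self.labels[i] == self.labels[j]

oracle = LabelOracle(y)

# A scikit-query strategy (e.g. FFQS or MinMax) would then be called roughly as:
#   constraints = strategy.fit(X, oracle)                 # the last two arguments are optional
#   constraints = strategy.fit(X, oracle, partition, 3)   # partition and number of clusters given
```

Similarly, the precomputed pairwise distance matrix mentioned above would be built once (for example with `sklearn.metrics.pairwise_distances(X)`) and shared between FFQS and MinMax instead of being recomputed by each strategy.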
0.1.1
- Added options to choose between standard and incremental settings in FFQS, MinMax and NPU.
0.1
- Added random sampling, FFQS, MinMax, NPU, AIPC and SASC algorithms.
- Added a Jupyter notebook giving a use case of the library.