# ClayRS


## 0.5.1

<p align="center">
<img src="https://user-images.githubusercontent.com/26851363/222411280-a1c33b5f-e19b-4e11-abf3-b3fe6cb73e52.svg" alt="ClayRS can see"/>
</p>

Release which includes ***image support*** for the `Content Analyzer` and `RecSys` modules!
* This release was co-developed with m-elio

***NOTE:*** The minimum *Python* version has been bumped from ***Python 3.7*** to ***Python 3.8*** in order to use the `functools.cached_property` decorator
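
For reference, a minimal sketch of how `functools.cached_property` (available since Python 3.8) behaves; the class and values below are purely illustrative:

```python
from functools import cached_property


class ImageItem:
    """Hypothetical class; stands in for any content with an expensive property."""

    def __init__(self, path: str):
        self.path = path

    @cached_property
    def features(self):
        # Computed once on first access, then cached on the instance
        print(f"extracting features for {self.path}...")
        return [0.0, 1.0, 2.0]  # placeholder for an expensive computation


item = ImageItem("img.jpg")
item.features  # triggers the computation and prints the message
item.features  # served from the cache, nothing is printed
```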

---

### Added

#### Content Analyzer

* Implemented visual preprocessors thanks to the `torchvision` library
    * `torch` augmenters were implemented as well
    * All of them are listed in the [docs](https://swapuniba.github.io/ClayRS/content_analyzer/information_preprocessors/visual_preprocessors/torch_preprocessors/)
* Implemented postprocessing techniques, which also work for textual techniques
    * Visual bag of words (with count and tf-idf weighting schemas)
    * Scipy vector quantization
    * Dimensionality reduction techniques from `sklearn` (PCA, Gaussian random projections, feature agglomeration)
* The image paths specified in the raw source can be an *absolute path*, a *relative path*, or an *online URL*!
* Implemented several content techniques which extract embedding features from images (see the sketch after this list)
    * Pre-trained models from `timm`
    * Pre-trained Caffe models via `opencv.dnn`
    * HOG descriptor, Canny edge detector, LBP, SIFT from `skimage`
    * Color histogram
    * Custom filter convolution
* Implemented the `FromNPY` technique, which imports features from a serialized NumPy matrix
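
To give an idea of the building blocks involved, here is a hedged sketch chaining the three steps above with the underlying libraries directly (`torchvision` preprocessing, `timm` feature extraction, `sklearn` post-processing). It does **not** use ClayRS's actual classes, whose names and signatures are documented at the links above:

```python
import timm
import torch
from torchvision import transforms
from sklearn.decomposition import PCA

# Preprocessing pipeline built with torchvision (resize + normalize)
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Pre-trained timm model used as a feature extractor (num_classes=0 drops the head)
model = timm.create_model("resnet18", pretrained=True, num_classes=0)
model.eval()

images = torch.rand(4, 3, 256, 256)  # a fake batch standing in for real images
with torch.no_grad():
    embeddings = model(preprocess(images))  # shape: (4, 512) for resnet18

# Post-processing: dimensionality reduction with sklearn's PCA
reduced = PCA(n_components=2).fit_transform(embeddings.numpy())
print(reduced.shape)  # (4, 2)
```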

#### RecSys

* Implemented the VBPR technique following the [corresponding paper](https://cseweb.ucsd.edu/~jmcauley/pdfs/aaai16.pdf)
    * The implementation has been tested **thoroughly** by experimental comparison with [cornac](https://github.com/PreferredAI/cornac) (the experiment repository can be found [here](https://github.com/Silleellie/VBPR-Reproducibility))
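
For context, VBPR's preference predictor from the linked paper augments matrix factorization with a learned linear embedding of the visual features $f_i$ of item $i$:

```latex
\hat{x}_{u,i} = \alpha + \beta_u + \beta_i
             + \gamma_u^\top \gamma_i
             + \theta_u^\top \left( \mathbf{E} \, f_i \right)
             + \beta'^\top f_i
```

where $\alpha$ is a global offset, $\beta_u, \beta_i$ are user/item biases, $\gamma_u, \gamma_i$ are the usual latent factors, $\mathbf{E}$ projects the visual features into the space of the visual user factors $\theta_u$, and $\beta'$ captures users' overall response to visual appearance.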

### Changed

#### Content Analyzer

* Changed the `Ratings` class to use numpy arrays and integer mappings instead of relying on Python dictionaries and strings (see the sketch after this list)
* Adapted `FieldContentProductionTechnique` to consider the distinction between *textual* and *visual* techniques
* Added the possibility to serialize produced contents using multithreading
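
A minimal sketch of the idea behind this change (illustrative only, not ClayRS's actual internals): string ids are mapped once to contiguous integers, and interactions live in a numpy array rather than nested dictionaries:

```python
import numpy as np

# String ids as they would appear in the raw source
users = ["u1", "u2", "u1"]
items = ["i9", "i9", "i3"]
scores = [4.0, 5.0, 3.0]

# Integer mappings: each distinct string id gets a contiguous index
user_map = {u: idx for idx, u in enumerate(dict.fromkeys(users))}
item_map = {i: idx for idx, i in enumerate(dict.fromkeys(items))}

# Interactions stored as a (user_idx, item_idx, score) numpy array
uir = np.array([(user_map[u], item_map[i], s)
                for u, i, s in zip(users, items, scores)])

print(uir)       # [[0. 0. 4.] [1. 0. 5.] [0. 1. 3.]]
print(user_map)  # {'u1': 0, 'u2': 1}
```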

#### RecSys

* **Vectorized** the computation of the `CentroidVector` algorithm (see the sketch after this list)
* Adapted the *content-based algorithm* abstraction to make room for neural algorithms
* Fixed the Bootstrap partitioning technique missing from the online documentation
* By default, `AllItemsMethodology` now considers the union of train and test set as the items catalog
* `HoldOutPartitioningTechnique` can now accept an integer value representing the number of instances to hold rather than a percentage
* Changed the log of users skipped in partitioning/algorithm fitting: a single message with the total number of skipped users is printed instead of one for each skipped user
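
As a rough illustration of what vectorizing `CentroidVector` means (a numpy sketch under simplified assumptions, not the library's actual code): the similarity of every candidate item to the user centroid is computed in one matrix operation instead of a Python loop:

```python
import numpy as np

def centroid_rank(positive_items: np.ndarray, candidates: np.ndarray) -> np.ndarray:
    """Rank candidates by cosine similarity to the centroid of the liked items."""
    centroid = positive_items.mean(axis=0)                       # user profile vector
    dot = candidates @ centroid                                  # all dot products at once
    norms = np.linalg.norm(candidates, axis=1) * np.linalg.norm(centroid)
    scores = dot / norms                                         # cosine similarities
    return np.argsort(-scores)                                   # best candidates first

liked = np.random.rand(5, 16)      # embeddings of items the user liked
catalog = np.random.rand(100, 16)  # embeddings of candidate items
print(centroid_rank(liked, catalog)[:10])
```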

#### EvalModel

* Changed the `NDCG` implementation to allow the choice of the `gain` weights (`linear` or `exponential`) and the definition of a `discount` function (see the sketch after this list)
* Improved the visualization of statistical test results
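
A hedged sketch of the underlying formula (parameter names in ClayRS may differ): gains are either the raw relevances or $2^{rel} - 1$, and each rank is discounted by a pluggable function, $\log_2(rank + 1)$ by default:

```python
import numpy as np

def ndcg(relevances: np.ndarray, gain: str = "linear",
         discount=lambda ranks: np.log2(ranks + 1)) -> float:
    """NDCG with a configurable gain ('linear' or 'exponential') and discount."""
    g = relevances if gain == "linear" else 2.0 ** relevances - 1
    ranks = np.arange(1, len(relevances) + 1)
    dcg = np.sum(g / discount(ranks))
    ideal = np.sort(g)[::-1]               # best possible ordering
    idcg = np.sum(ideal / discount(ranks))
    return dcg / idcg                      # assumes at least one relevant item

print(ndcg(np.array([3.0, 2.0, 3.0, 0.0, 1.0]), gain="exponential"))
```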

## 0.4.1

### Added

- ClayRS can now be installed on ***Python 3.10*** thanks to m-elio!
    - Support is still guaranteed for *Python 3.7 - 3.8 - 3.9*

## 0.4.0

### Added

- Implemented **different weighting schemas** for tuning the computation of the _personalized PageRank_ (see the sketch after this list)!
    - Check the [documentation](https://swapuniba.github.io/ClayRS/recsys/graph_based/graph_based_algorithms/nx_pagerank/)
- Added the possibility to perform a complete experiment for a specific list of users
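
Given the `nx_pagerank` docs page linked above, the implementation is presumably built on NetworkX. A minimal sketch of a *personalized* PageRank run, where the restart distribution is concentrated on the target user (the graph and weights are illustrative):

```python
import networkx as nx

# Bipartite user/item graph; edge weights could encode the rating score
G = nx.DiGraph()
G.add_weighted_edges_from([("u1", "i1", 5.0), ("u1", "i2", 3.0),
                           ("u2", "i2", 4.0), ("u2", "i3", 1.0)])

# Personalization vector: restart probability mass concentrated on the target user
scores = nx.pagerank(G, alpha=0.85, personalization={"u1": 1.0}, weight="weight")
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```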

### Changed

- Fixed the missing `__repr__` method of the `MAP` metric
- Changed the default number of CPUs from 0 (all available) to 1 (single core) for the `Experiment` class, so that naïve users do not run into the serialization limitations of Python's multiprocessing mechanism (see the sketch below)
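
The serialization limitation in question is pickling: `multiprocessing` ships work to worker processes via `pickle`, which cannot serialize lambdas or locally defined functions, a trap naïve users easily hit. A quick illustration:

```python
import pickle


def top_level(x):
    return x + 1


pickle.dumps(top_level)  # fine: functions defined at module top level are picklable

try:
    pickle.dumps(lambda x: x)  # lambdas can't be pickled...
except (pickle.PicklingError, AttributeError) as exc:
    print(exc)  # ...so shipping them to worker processes fails
```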

## 0.3.1

### Changed

- The query endpoint for the _DBpedia_ ontology has been changed from the one provided by [factforge](http://factforge.net/sparql) to the [Virtuoso one](https://dbpedia.org/sparql/) (see the sketch after this list)
- The SPARQL query definition has been changed in order to improve response times
    - From ~20 min down to **~6 min** to retrieve properties for all _1682_ movies of the MovieLens 100k dataset
- Changed the default number of CPUs from 0 (all available) to 1 (single core) for the recsys phase, so that naïve users do not run into the serialization limitations of Python's multiprocessing mechanism
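
A hedged sketch of querying the new endpoint with the `SPARQLWrapper` package (the query itself is illustrative, not the one ClayRS actually issues):

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Point at the Virtuoso-backed DBpedia endpoint referenced above
sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>

    SELECT ?abstract WHERE {
        dbr:Toy_Story dbo:abstract ?abstract .
        FILTER (lang(?abstract) = "en")
    }
""")

results = sparql.query().convert()
print(results["results"]["bindings"][0]["abstract"]["value"][:80])
```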

## 0.3.0

### Added

- Implemented the `ContentBasedExperiment` and `GraphBasedExperiment` classes, which let you easily compare different algorithms with a simple interface!
    - Check the [documentation](https://swapuniba.github.io/ClayRS/recsys/experiment/)

### Changed

- Fixed the missing `__str__` method of some *Content Based algorithms*
- Fixed an error occurring when a property should be loaded into the graph but some items are missing locally
- Set the default `n_recs` parameter of the rank methods of each RecSys to **10**
- Fixed a `NoneType` error in the `fit()`, `fit_rank()`, `fit_predict()` methods of Content Based recommenders when the algorithm can't be fit for a user
- Fixed the `Exception never retrieved` log of asyncio that appeared only in specific environments
- Re-organized the internal module structure and imports to avoid and prevent circular imports (see the sketch below)
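
One standard pattern for preventing circular imports, when a module only needs another module's names for type annotations, is the `typing.TYPE_CHECKING` guard (module and class names below are hypothetical, not ClayRS's actual layout):

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated only by static type checkers, never at runtime,
    # so no import cycle is created ('ratings' is a hypothetical module)
    from ratings import Ratings


def fit_algorithm(train_set: Ratings) -> None:
    # With 'from __future__ import annotations' the annotation above is
    # lazily evaluated, so the runtime never needs the actual class
    ...
```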

## 0.2.1

### Changed

- This is a **hotfix** for duplicate logging, which may happen in specific scenarios (e.g. _Google Colab_); a common guard against it is sketched below
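
A common guard against this kind of duplicate logging (e.g. a notebook cell that re-runs the logger setup and attaches a second handler) is to check for existing handlers first; the logger name below is hypothetical:

```python
import logging

logger = logging.getLogger("clayrs_demo")  # hypothetical logger name

# Attach a handler only if setup has not already run; re-executing this
# cell/script therefore won't duplicate every log record
if not logger.handlers:
    logger.addHandler(logging.StreamHandler())
    logger.setLevel(logging.INFO)

logger.info("printed once per record, even if this block runs twice")
```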
