<p align="center">
<img src="https://user-images.githubusercontent.com/26851363/222411280-a1c33b5f-e19b-4e11-abf3-b3fe6cb73e52.svg" alt="ClayRS can see"/>
</p>
This release adds ***image support*** to the `Content Analyzer` and `RecSys` modules!
* This release was co-developed with m-elio
***NOTE:*** The minimum *Python* version has been bumped from ***Python 3.7*** to ***Python 3.8*** in order to use the `functools.cached_property` decorator
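
For readers unfamiliar with the decorator that motivates the version bump, here is a minimal sketch of how `functools.cached_property` behaves. The `ImageField` class and its attributes are hypothetical and only serve as illustration; they are not part of ClayRS.

```python
from functools import cached_property  # available from Python 3.8 onwards


class ImageField:
    """Hypothetical class used only for illustration, not a ClayRS class."""

    def __init__(self, pixels):
        self.pixels = pixels
        self.computations = 0  # counts how many times the expensive step runs

    @cached_property
    def mean_intensity(self):
        # Executed on first access only; the result is then stored
        # on the instance and returned directly on later accesses.
        self.computations += 1
        return sum(self.pixels) / len(self.pixels)


field = ImageField([10, 20, 30])
first = field.mean_intensity   # triggers the computation
second = field.mean_intensity  # served from the cached value
```
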
---
Added
Content Analyzer
* Implemented visual preprocessors using the `torchvision` library
* Also `torch` augmenters were implemented
* All of them can be checked in the [docs](https://swapuniba.github.io/ClayRS/content_analyzer/information_preprocessors/visual_preprocessors/torch_preprocessors/)
* Implemented postprocessor techniques, which also work with textual techniques
* Visual bag of words (with count and tfidf weighting schema)
* Scipy vector quantization
* Dimensionality reduction techniques from `sklearn` (PCA, Gaussian random projections, Feature agglomeration)
* Image paths specified in the raw source can be an *absolute path*, a *relative path*, or an *online URL*!
* Implemented several content techniques which extract embedding features from images
* Pre-trained models from `timm`
* Pre-trained caffe models using `opencv.dnn`
* HOG descriptor, Canny edge detector, LBP, SIFT from `skimage`
* Color histogram
* Custom filter convolution
* Implemented `FromNPY` technique, which imports features from a numpy serialized matrix
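
To give an idea of what the visual bag of words postprocessor with count weighting and Scipy vector quantization does conceptually, here is a minimal sketch built directly on `scipy.cluster.vq.vq`. It does not use the ClayRS API: the codebook and the descriptors are hypothetical toy values, and in practice the codebook would typically be learned (for example with k-means) rather than written by hand.

```python
import numpy as np
from scipy.cluster.vq import vq

# Hypothetical codebook of 3 "visual words" in a 2-dimensional descriptor space.
codebook = np.array([[0.0, 0.0],
                     [1.0, 1.0],
                     [5.0, 5.0]])

# Hypothetical local descriptors extracted from a single image.
descriptors = np.array([[0.1, -0.1],
                        [0.9, 1.2],
                        [4.8, 5.1],
                        [5.2, 4.9],
                        [1.1, 0.8]])

# Vector quantization: assign each descriptor to its nearest visual word.
codes, distances = vq(descriptors, codebook)

# Count weighting: the image is represented by how often each word occurs.
bow_counts = np.bincount(codes, minlength=len(codebook))
```

The resulting `bow_counts` vector has one entry per visual word, so images with different numbers of local descriptors end up with representations of the same length.
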
RecSys
* Implemented VBPR technique following the [corresponding paper](https://cseweb.ucsd.edu/~jmcauley/pdfs/aaai16.pdf)
* The implementation has been tested **thoroughly** by experimental comparison with [cornac](https://github.com/PreferredAI/cornac) (experiment repository can be found [here](https://github.com/Silleellie/VBPR-Reproducibility))
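
As a reminder of what VBPR computes, the preference score defined in the linked paper combines bias terms, latent factors, and visual factors obtained by projecting the image features through an embedding matrix. The sketch below illustrates that formula with NumPy; the function name, the argument names, and the toy values are assumptions made for this example and do not reflect the ClayRS implementation.

```python
import numpy as np


def vbpr_score(alpha, beta_u, beta_i, gamma_u, gamma_i, theta_u, E, beta_prime, f_i):
    """Preference score of user u for item i as defined in the VBPR paper.

    alpha, beta_u, beta_i : global offset, user bias, item bias
    gamma_u, gamma_i      : latent factors of user and item
    theta_u               : visual factors of the user
    E                     : embedding matrix projecting image features
    beta_prime            : visual bias vector
    f_i                   : image features of the item
    """
    visual_item_factors = E @ f_i
    return (alpha + beta_u + beta_i
            + gamma_u @ gamma_i
            + theta_u @ visual_item_factors
            + beta_prime @ f_i)


# Toy values, chosen only to keep the arithmetic easy to follow.
score = vbpr_score(
    alpha=0.0, beta_u=0.1, beta_i=0.2,
    gamma_u=np.array([1.0, 0.0]), gamma_i=np.array([0.5, 0.5]),
    theta_u=np.array([0.5, 0.25]),
    E=np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),
    beta_prime=np.array([0.1, 0.0, 0.0]),
    f_i=np.array([1.0, 2.0, 3.0]),
)
```
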
Changed
Content Analyzer
* Changed `Ratings` class to use numpy arrays and integer mappings instead of relying on Python dictionaries and strings
* Adapted `FieldContentProductionTechnique` to consider the distinction between *textual* and *visual* techniques
* Added the possibility to serialize produced contents with multithreading
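
The idea behind storing ratings as numpy arrays with integer mappings can be sketched as follows. This is only an illustration of the general technique under assumed toy data, not the actual internals of the `Ratings` class; all variable names are hypothetical.

```python
import numpy as np

# Hypothetical interactions: one (user, item, score) triple per position.
user_ids = np.array(["u1", "u2", "u1", "u3"])
item_ids = np.array(["i1", "i1", "i2", "i2"])
scores = np.array([4.0, 3.0, 5.0, 2.0])

# Map every string id to an integer; `*_idx` holds the integer-coded column,
# `*_map` converts an integer back to the original string id.
user_map, user_idx = np.unique(user_ids, return_inverse=True)
item_map, item_idx = np.unique(item_ids, return_inverse=True)

# Lookup table from string id to integer, built once.
user_to_int = {user: i for i, user in enumerate(user_map.tolist())}

# Selecting all ratings of a user becomes a vectorized integer comparison.
u1_scores = scores[user_idx == user_to_int["u1"]]
```
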
RecSys
* **Vectorized** computation of `CentroidVector` algorithm
* Adapted *content based algorithm* abstraction to make room for neural algorithms
* Added the Bootstrap partitioning technique, which was missing from the online documentation
* `AllItemsMethodology` now uses, by default, the union of the train and test sets as the item catalog
* `HoldOutPartitioningTechnique` can now accept an integer value representing the number of instances to hold out, rather than a percentage
* Changed the logging of users skipped during partitioning/algorithm fitting: a single message with the total number of skipped users is printed, instead of one message for each skipped user
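
To illustrate what a vectorized centroid computation can look like, the sketch below builds a user profile as the mean of the vectors of liked items and scores all candidate items with a single matrix operation. It is an assumption-laden toy example: cosine similarity is assumed as the similarity measure, and the arrays are hypothetical, so it should not be read as the actual `CentroidVector` code.

```python
import numpy as np

# Hypothetical feature vectors of the items a user liked (one row per item).
liked_items = np.array([[1.0, 0.0],
                        [0.0, 1.0]])

# Hypothetical feature vectors of the candidate items to rank.
candidates = np.array([[1.0, 1.0],
                       [1.0, 0.0],
                       [-1.0, -1.0]])

# The user profile is the centroid of the liked items.
centroid = liked_items.mean(axis=0)

# Cosine similarity between the centroid and every candidate at once,
# without looping over the candidates in Python.
norms = np.linalg.norm(candidates, axis=1) * np.linalg.norm(centroid)
similarities = (candidates @ centroid) / norms

# Candidate indices ordered from most to least similar.
ranking = np.argsort(-similarities)
```
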
EvalModel
* Changed `NDCG` implementation to allow the choice of the `gain` weights (`linear` or `exponential`) and the definition of a `discount` function
* Improved visualization of statistical test results
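
The effect of the `gain` and `discount` choices on NDCG can be sketched in plain Python as follows. The function names and signatures are hypothetical and do not mirror the ClayRS `NDCG` class; the sketch assumes the common definitions where the linear gain is the relevance itself, the exponential gain is `2**rel - 1`, and the discount function receives `rank + 1`.

```python
import math


def dcg(relevances, gain="linear", discount=math.log2):
    """Discounted cumulative gain of a ranked list of relevance scores."""
    if gain == "linear":
        gains = list(relevances)
    elif gain == "exponential":
        gains = [2 ** rel - 1 for rel in relevances]
    else:
        raise ValueError("gain must be 'linear' or 'exponential'")
    # Ranks start at 1, so the first item is divided by discount(2).
    return sum(g / discount(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(relevances, gain="linear", discount=math.log2):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), gain, discount)
    return dcg(relevances, gain, discount) / ideal if ideal > 0 else 0.0


ranked_relevances = [3, 2, 3]
linear_ndcg = ndcg(ranked_relevances, gain="linear")
exponential_ndcg = ndcg(ranked_relevances, gain="exponential")
```

With the exponential gain, highly relevant items placed too low in the ranking are penalized more strongly than with the linear gain, which is why the two values differ for the same ranking.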