**1. Automatic Ensemble Creation: Implement functionality for automatic creation of evaluation ensembles**
- Train your own RAG evaluation model (*i.e., a Gradient Boosting Classifier*) based on the RuRAGE metrics
- Prepare data for the RuRAGE ensemble model training right from the RuRAGE reports
- Train your own RuRAGE ensemble model for any of the evaluation tasks: Correctness, Faithfulness, Relevance
- Save a trained model (only basic functionality is implemented) and use it for inference
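
The workflow above can be illustrated with plain scikit-learn. This is a minimal sketch, not the RuRAGE API: the feature matrix is synthetic stand-in data for metric columns from a report, the labelling rule is made up, and the file name `ensemble.pkl` is hypothetical.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a metrics report: one row per answer, one column per metric.
# The columns and the labelling rule are assumptions made for illustration only.
rng = np.random.default_rng(0)
X = rng.random((400, 3))                  # three metric scores in [0, 1]
y = (X.mean(axis=1) > 0.5).astype(int)    # toy binary label (1 = correct)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Train the ensemble on the metric features.
model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

# Basic persistence: pickle the fitted model, then load it back for inference.
model_path = Path(tempfile.mkdtemp()) / "ensemble.pkl"
with model_path.open("wb") as f:
    pickle.dump(model, f)
with model_path.open("rb") as f:
    restored = pickle.load(f)

predictions = restored.predict(X_test)
```

The restored model produces the same predictions as the original, which is the property a save/load round trip should preserve.
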
**2. Auto-adaptive thresholds: Implement functionality for automatic creation of thresholds for features in the ensemble**
- Automatic ensemble model optimization by selecting the optimal classification threshold(s)
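
One common way to pick a classification threshold is to scan a grid of candidates and keep the one that maximizes a validation metric. The sketch below uses F1 as that metric; the function name `select_threshold`, the grid, and the toy data are assumptions for illustration and may differ from how RuRAGE selects thresholds.

```python
import numpy as np
from sklearn.metrics import f1_score


def select_threshold(y_true, scores, grid=None):
    """Return the (threshold, F1) pair with the highest F1 on the given data."""
    if grid is None:
        grid = np.linspace(0.05, 0.95, 19)  # candidate thresholds in steps of 0.05
    f1_values = [
        f1_score(y_true, (scores >= t).astype(int), zero_division=0) for t in grid
    ]
    best = int(np.argmax(f1_values))  # first maximum wins on ties
    return float(grid[best]), float(f1_values[best])


# Toy validation data: true labels and predicted probabilities of the positive class.
y_true = np.array([0, 0, 0, 1, 0, 1, 1, 1])
scores = np.array([0.12, 0.22, 0.31, 0.37, 0.42, 0.61, 0.73, 0.91])

threshold, best_f1 = select_threshold(y_true, scores)
```

The threshold should be chosen on held-out data rather than on the training set, otherwise the selected value tends to be overly optimistic.
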
**3. Multiclass Labels: Extend support to work with multiclass labels**
- Train a RAG evaluation model and run inference to predict not only binary but also multiclass labels for Correctness/Faithfulness/Relevance estimation
- Automatic threshold selection supports multiclass labels
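
For intuition, the same kind of classifier handles multiclass targets without code changes. The sketch below again uses plain scikit-learn on synthetic data; the three-level label scheme (0 = wrong, 1 = partially correct, 2 = correct) is a hypothetical example, not a scheme defined by RuRAGE.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic metric features; the labelling rule below is an assumption for illustration.
rng = np.random.default_rng(0)
X = rng.random((300, 3))

# Toy three-level label derived from the mean metric score:
# 0 if mean < 0.4, 1 if 0.4 <= mean < 0.6, 2 otherwise.
y = np.digitize(X.mean(axis=1), bins=[0.4, 0.6])

model = GradientBoostingClassifier(random_state=0).fit(X, y)

proba = model.predict_proba(X[:5])   # one probability column per class
labels = model.predict(X[:5])        # most probable class for each row
```
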
**4. Detailed usage examples**
- Notebook for the basic RuRAGE functionality (preparing metric reports for the Correctness, Faithfulness, Relevance tasks)
- Notebook for RAG evaluation ensemble training and inference
**5. Various bug fixes in the metric calculation**