Expanded Metrics Suite
- **Adversarial Robustness**: Evaluates model resilience to adversarial demonstrations and manipulation attempts.
- **Out-of-Distribution Robustness**: Assesses how well models handle inputs that deviate from the training distribution.
- **Privacy Considerations**: Identifies potential personal information disclosure and sensitive data handling issues.
- **Stereotype Bias**: Detects the presence of social, cultural, gender, and racial stereotyping in model outputs.
These new metrics provide a more comprehensive evaluation of LLM safety, fairness, and reliability.
Enhanced Toxicity Discrimination
- Improved the **Toxicity Discriminative** metric, with better classification accuracy and configurable thresholds.
- Added support for strict toxicity evaluation mode with a zero threshold.
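To illustrate how a configurable threshold and a strict mode can interact, here is a minimal sketch. It is not the IndoxJudge API: the function name `is_toxic`, its parameters, the default of `0.5`, and the assumption that scores lie in `[0, 1]` are all hypothetical and chosen only for illustration. It also assumes "zero threshold" means any score above zero is flagged.

```python
def is_toxic(score: float, threshold: float = 0.5, strict: bool = False) -> bool:
    """Hypothetical helper: decide whether a toxicity score counts as toxic.

    Assumes `score` is in [0, 1]. In strict mode the threshold is forced
    to zero, so any non-zero score is flagged.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    effective_threshold = 0.0 if strict else threshold
    return score > effective_threshold


# The same borderline score passes the default check but fails in strict mode.
print(is_toxic(0.3))               # False
print(is_toxic(0.3, strict=True))  # True
```

Check the IndoxJudge documentation for the actual parameter names and defaults before relying on this behavior.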
Visualization Enhancements
- Integration of new metrics into the **interactive visualization dashboard**.
- Additional chart types and analysis capabilities covering the larger set of evaluation criteria.
---
Recap of Recent Improvements
As a reminder, previous releases included:
- **v0.0.8**: Introduction of the Toxicity Discriminative metric for enhanced safety analysis.
- **v0.0.6**: Resolved issues with MCDA score calculation and JSON response bugs.
- **v0.0.4**: LLM-powered chart interpretation and improved visualization experience.
---
Why Upgrade?
Upgrading to **IndoxJudge v0.0.9** provides you with:
- New **safety, fairness, and robustness metrics** for more comprehensive LLM evaluation.
- Enhancements to the **Toxicity Discriminative** metric for better content moderation.
- Expanded visualization capabilities to gain deeper insights into model performance.
- All the improvements from previous releases.
These enhancements make **IndoxJudge** an even more powerful and versatile tool for your LLM evaluation needs.
---
How to Upgrade
Upgrade to the latest version of IndoxJudge using the following command:
```bash
pip install --upgrade indoxjudge
```
We recommend all users upgrade to **v0.0.9** to benefit from the new features and improvements.
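After upgrading, you may want to confirm which version is installed. The sketch below uses only the Python standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_version` is ours, and it assumes the distribution name matches the `pip` package name `indoxjudge`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed version string, or None if the package is missing.
print(installed_version("indoxjudge"))
```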
---
Feedback and Support
Your feedback helps us improve **IndoxJudge**. If you run into issues or have suggestions, please open a GitHub issue or contact our support team.
Thank you for your continued trust in **IndoxJudge**. We're committed to providing you with the best tools for LLM evaluation and look forward to your feedback on this latest release.
**Happy evaluating!**
*The Indox Team*