Breaking Changes
- `MetricValue.metric_values` can now include `None` values when using OpenAI-based metrics. A metric value will be `None` if the OpenAI API rejects the request (e.g. due to OpenAI's moderation policy) or returns an invalid response.
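
Code that aggregates metric values may need to skip the new `None` entries. The following is a minimal sketch, not part of the library: `mean_ignoring_none` is a hypothetical helper, and a plain Python list stands in for `MetricValue.metric_values`.

```python
from typing import Optional, Sequence


def mean_ignoring_none(values: Sequence[Optional[float]]) -> Optional[float]:
    """Average the scored entries, skipping any that are None.

    Hypothetical helper for illustration; returns None when nothing was scored.
    """
    scored = [v for v in values if v is not None]
    if not scored:
        return None
    return sum(scored) / len(scored)


# Stand-in for `MetricValue.metric_values` (assumed to be a list of floats or None).
metric_values = [0.5, None, 1.0]

average = mean_ignoring_none(metric_values)

# Indices of inputs that the API rejected or answered with an invalid response.
unscored_indices = [i for i, v in enumerate(metric_values) if v is None]

print(average, unscored_indices)
```

Tracking the unscored indices separately makes it possible to retry or inspect those inputs instead of silently dropping them.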
New Features
- Significantly improved all OpenAI-based metrics, such as `factual_consistency`, by using chain-of-thought prompting
- All OpenAI-based metrics now output a text explanation for each score in `MetricValue.explanations`
- Added factual consistency benchmark datasets to evaluate different metric configurations
- Thresholds are now visualized when plotting metrics like `(toxicity_values > 0.9).scatter()` – thanks Vela-zz!
- Published new tutorial for email generation app
- Updated JA->EN translation model to `Helsinki-NLP/opus-mt-ja-en`
- (Beta) Added basic English text augmentations; full release with documentation coming soon
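
To read the new explanations alongside their scores, the two can be walked in parallel. This is a rough sketch under assumptions: it assumes `MetricValue.explanations` is a list aligned index-by-index with `MetricValue.metric_values`, uses plain lists as stand-ins for both, and `format_report` is a hypothetical helper rather than a library function.

```python
from typing import List, Optional, Sequence


def format_report(
    scores: Sequence[Optional[float]],
    explanations: Sequence[Optional[str]],
) -> List[str]:
    """Build one readable line per input from a score and its explanation.

    Hypothetical helper; assumes both sequences are aligned by index.
    """
    if len(scores) != len(explanations):
        raise ValueError("scores and explanations must have the same length")
    lines = []
    for i, (score, why) in enumerate(zip(scores, explanations)):
        # A score of None means the input could not be scored.
        score_text = "n/a" if score is None else f"{score:.2f}"
        lines.append(f"[{i}] score={score_text} | {why or 'no explanation'}")
    return lines


# Stand-ins for `MetricValue.metric_values` and `MetricValue.explanations`.
report = format_report(
    [1.0, None],
    ["The output is consistent with the source.", None],
)
for line in report:
    print(line)
```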
Bug Fixes
- Fixed "can't find Rust compiler" bug during installation on Python 3.11, caused by older versions of the `tokenizers` library
- Temporarily pinned `openai<1.0.0` due to breaking changes in their API
- Fixed contributing guide for zsh – thanks shibuiwilliam!
- Cleaned up `noqa` showing up in documentation