Mid-week bug fixes release with an extra feature:
- `run_test` now works
- new function `evaluate`, which evaluates a list of test cases (dataset) on metrics you define, all without having to go through the CLI. More info here: https://docs.confident-ai.com/docs/evaluation-datasets#evaluate-your-dataset-without-pytest
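To illustrate the idea behind evaluating a dataset on metrics without the CLI, here is a minimal self-contained sketch. The names (`TestCase`, `exact_match_metric`, the `evaluate` signature) are illustrative assumptions, not deepeval's actual API; see the linked docs for the real usage.

```python
from dataclasses import dataclass
from typing import Callable, List

# Illustrative sketch only: these names and signatures are assumptions,
# not deepeval's actual API.

@dataclass
class TestCase:
    input: str
    actual_output: str
    expected_output: str

def exact_match_metric(case: TestCase) -> float:
    # Hypothetical metric: 1.0 if the output matches exactly, else 0.0.
    return 1.0 if case.actual_output == case.expected_output else 0.0

def evaluate(dataset: List[TestCase], metrics: List[Callable[[TestCase], float]]):
    # Score every test case with every metric, no CLI involved.
    results = []
    for case in dataset:
        scores = {m.__name__: m(case) for m in metrics}
        results.append((case.input, scores))
    return results

dataset = [
    TestCase("What is 2+2?", "4", "4"),
    TestCase("Capital of France?", "Lyon", "Paris"),
]
for inp, scores in evaluate(dataset, [exact_match_metric]):
    print(inp, scores)
```

The point is that the dataset is just a list of test cases in memory, so evaluation becomes an ordinary function call you can run from a script or notebook.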
0.20.18
In this release, deepeval has added support for:
- JudgementalGPT, a dedicated LLM app developed by Confident AI to perform evaluations more robustly and accurately. JudgementalGPT provides a score and a reason for the score.
- Parallel testing: execute test cases in parallel and speed up evaluation by up to 100x.
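Parallel testing pays off because each test case is typically an I/O-bound LLM call, so many can be in flight at once. A hedged sketch of that fan-out pattern, using only the standard library (the `run_test_case` helper is hypothetical, not deepeval's API, which runs through pytest):

```python
from concurrent.futures import ThreadPoolExecutor

def run_test_case(case_id: int) -> str:
    # Hypothetical stand-in for a single (I/O-bound) LLM evaluation call.
    return f"case {case_id}: passed"

cases = range(8)
# Fan the test cases out across worker threads; results keep input order.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(run_test_case, cases))

print(results)
```

With real network-bound evaluation calls, wall-clock time approaches the slowest single call rather than the sum of all calls, which is where the large speedups come from.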