Highlights
This release consists mainly of bug fixes:
- Adapts to breaking changes in upstream dependencies (e.g. Pydantic 2.10, aviary 0.10.1)
- Makes `GradablePaperQAEnvironment`'s evaluations robust to an empty answer or multiple answers
Because `Complete.NO_ANSWER_PHRASE`, introduced in https://github.com/Future-House/paper-qa/pull/726, will impact system performance, it was requested that we treat this release as a minor version bump.
What's Changed
* Fixed settings `session` into `EnvironmentState`, and suppressing PyMuPDF derived `DeprecationWarning` by jamesbraza in https://github.com/Future-House/paper-qa/pull/713
* Adding assertion `gather_evidence` doesn't populate `session.answer` by jamesbraza in https://github.com/Future-House/paper-qa/pull/716
* Lock file maintenance by renovate in https://github.com/Future-House/paper-qa/pull/715
* Fixes `gather_with_concurrency` typing by maykcaldas in https://github.com/Future-House/paper-qa/pull/714
* Latest tooling dependencies by jamesbraza in https://github.com/Future-House/paper-qa/pull/719
* Lock file maintenance by renovate in https://github.com/Future-House/paper-qa/pull/718
* Fixed `EVAL_PROMPT_TEMPLATE` to handle empty string or multiple match answers by jamesbraza in https://github.com/Future-House/paper-qa/pull/724
* Address missing `GenerateAnswer` in trajectories, no answers after `Complete` tools, and better history by mskarlin in https://github.com/Future-House/paper-qa/pull/726
* Pulling in latest `aviary` for `concurrency` rename by jamesbraza in https://github.com/Future-House/paper-qa/pull/728
* Pulling in latest `aviary` for dependencies fix, and retrying flaky `test_propagate_options` more by jamesbraza in https://github.com/Future-House/paper-qa/pull/729
* Pulling in latest `ldp` for `Callback.before_rollout` by jamesbraza in https://github.com/Future-House/paper-qa/pull/734
* Documenting why we don't handle evaluation failures in `GradablePaperQAEnvironment.step` by jamesbraza in https://github.com/Future-House/paper-qa/pull/738
* Created `LitQAEvaluation.calculate_accuracy_precision` utility by jamesbraza in https://github.com/Future-House/paper-qa/pull/733
* Refreshed test cassettes, fixed flaky test `test_search`, and fixed test type ignores by jamesbraza in https://github.com/Future-House/paper-qa/pull/739
* Unpins pydantic >2.10.2 requirement, removes TYPE_CHECKING by nadolskit in https://github.com/Future-House/paper-qa/pull/725
* Lock file maintenance by renovate in https://github.com/Future-House/paper-qa/pull/737
* Alternative maybe is text by loesinghaus in https://github.com/Future-House/paper-qa/pull/717
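
For readers unfamiliar with the metrics behind the new `LitQAEvaluation.calculate_accuracy_precision` utility, here is a minimal self-contained sketch of one common way to compute them. It assumes accuracy is correct answers over all questions and precision is correct answers over questions actually answered (i.e. not marked unsure). The `Evaluation` enum and `accuracy_precision` function are hypothetical stand-ins for illustration, not the paper-qa API, and the real utility may differ in signature and edge-case handling.

```python
from collections.abc import Sequence
from enum import Enum


class Evaluation(Enum):
    """Hypothetical per-question grade (a stand-in, not the paper-qa enum)."""

    CORRECT = "correct"
    INCORRECT = "incorrect"
    UNSURE = "unsure"


def accuracy_precision(evaluations: Sequence[Evaluation]) -> tuple[float, float]:
    """Return (accuracy, precision) under the assumed definitions above."""
    if not evaluations:
        raise ValueError("Need at least one evaluation.")
    num_correct = sum(e is Evaluation.CORRECT for e in evaluations)
    # Questions graded unsure count against accuracy but not precision
    num_answered = sum(e is not Evaluation.UNSURE for e in evaluations)
    accuracy = num_correct / len(evaluations)
    # Assumption: report 0.0 precision when nothing was answered
    precision = num_correct / num_answered if num_answered else 0.0
    return accuracy, precision


grades = [
    Evaluation.CORRECT,
    Evaluation.CORRECT,
    Evaluation.INCORRECT,
    Evaluation.UNSURE,
]
accuracy, precision = accuracy_precision(grades)
print(f"accuracy={accuracy:.2f}, precision={precision:.2f}")
```

Separating the two numbers lets a system that declines to answer be distinguished from one that answers incorrectly.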
New Contributors
* maykcaldas made their first contribution in https://github.com/Future-House/paper-qa/pull/714
* loesinghaus made their first contribution in https://github.com/Future-House/paper-qa/pull/717
**Full Changelog**: https://github.com/Future-House/paper-qa/compare/v5.5.0...v5.6.0