Paper-qa

Latest version: v5.5.0

5.5.0

Highlights

In every v5 release before this one, the agent loop ended as soon as 1+ generated answers did not contain the substring `"cannot answer"`. This caused the agent loop to terminate early on partial answers like "Based on the sources provided, it appears no one has done x." We have resolved this issue by:

- No longer coupling our done condition to the absence of the substring `"cannot answer"` in 1+ generated answers
- No longer implicitly depending on clients mentioning this `"cannot answer"` sentinel in the input `qa` prompt
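
To illustrate why the substring-based condition ends the loop too early, here is a minimal sketch. The names `old_style_done`, `new_style_done`, and `LoopState` are hypothetical and are not paper-qa's actual API; the sketch only assumes that the new done condition relies on an explicit completion signal rather than on answer text.

```python
from dataclasses import dataclass, field

SENTINEL = "cannot answer"


def old_style_done(answers: list[str]) -> bool:
    """Substring-based check: done once any answer lacks the sentinel."""
    return any(SENTINEL not in answer for answer in answers)


@dataclass
class LoopState:
    """Hypothetical agent-loop state, not paper-qa's real state class."""

    answers: list[str] = field(default_factory=list)
    completed: bool = False  # assumed to be set only by an explicit completion step


def new_style_done(state: LoopState) -> bool:
    """Decoupled check: done only when completion was explicitly signaled."""
    return state.completed


partial = "Based on the sources provided, it appears no one has done x."
state = LoopState(answers=[partial])

# The partial answer lacks the sentinel, so the old check stops the loop,
# while the decoupled check keeps going until completion is signaled.
old_result = old_style_done(state.answers)
new_result = new_style_done(state)
```

Under the old check, any answer text that happens to omit the sentinel counts as final, so the quality of the answer never enters the decision.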

We also fixed several serious bugs:

- We support parallel tool calling (2+ `ToolCall`s in one `action: ToolRequestMessage`). However, our tools (notably `gather_evidence`) are not actually concurrency-safe. Our tool schemas instructed agents not to call certain tools in parallel, but we nonetheless observed agents requesting parallel `gather_evidence` calls. So now we force our tools to execute non-concurrently to work around this race condition
- When using `LitQAEvaluation` and the same `GradablePaperQAEnvironment` 2+ times, we repeatedly added the "unsure" option to the target multiple choice question, degrading performance over time
- When using `PaperQAEnvironment` 2+ times, each `reset` was not properly wiping the `Docs` object
- The reward distribution of `LitQAEvaluation` mixed up the "unsure" reward of `0.1` with the "incorrect" reward of `-1.0`, so learning was not properly incentivized
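
The parallel tool calling race described in the first bullet can be sketched as follows. This is a toy model under stated assumptions: `SharedState`, the toy `gather_evidence`, and the `ordered` parameter of `run_tool_calls` are hypothetical stand-ins, not paper-qa's implementation, and the real tools do far more work between reading and writing state.

```python
import asyncio


class SharedState:
    """Hypothetical stand-in for environment state that tools mutate."""

    def __init__(self) -> None:
        self.evidence_count = 0


async def gather_evidence(state: SharedState, question: str) -> None:
    """Toy tool whose read-modify-write spans an await (not concurrency-safe)."""
    snapshot = state.evidence_count
    await asyncio.sleep(0)  # yield to the event loop, as a network call would
    state.evidence_count = snapshot + 1


async def run_tool_calls(
    state: SharedState, questions: list[str], ordered: bool = True
) -> int:
    """Run one toy tool call per question, either one at a time or concurrently."""
    calls = [gather_evidence(state, q) for q in questions]
    if ordered:
        for call in calls:  # each call finishes before the next one starts
            await call
    else:
        await asyncio.gather(*calls)  # calls interleave at every await
    return state.evidence_count


questions = ["q1", "q2", "q3"]
ordered_count = asyncio.run(run_tool_calls(SharedState(), questions, ordered=True))
concurrent_count = asyncio.run(run_tool_calls(SharedState(), questions, ordered=False))
```

In the concurrent case the calls read the same stale snapshot and overwrite each other's updates, so some updates are lost. Running the calls in order avoids the lost updates at the cost of latency.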

This release also includes a number of other minor features, cleanups, and bug fixes; see the full list below.

What's Changed

* Deprecation cycle for `AgentSettings.should_pre_search` by jamesbraza in https://github.com/Future-House/paper-qa/pull/679
* Moved agent prompts to `prompts.py` by jamesbraza in https://github.com/Future-House/paper-qa/pull/681
* Refactor to remove `skip_system` from `LLMModel.run_prompt` by jamesbraza in https://github.com/Future-House/paper-qa/pull/680
* Resolving `evidence_detailed_citations` and `Answer` deprecations by jamesbraza in https://github.com/Future-House/paper-qa/pull/682
* Fixed agent prompt names and contents after 681 mess up by jamesbraza in https://github.com/Future-House/paper-qa/pull/683
* Removed `tool_names` validation for `gen_answer` being present by jamesbraza in https://github.com/Future-House/paper-qa/pull/685
* Fixing `test_evaluation` logic bugs by jamesbraza in https://github.com/Future-House/paper-qa/pull/686
* Removed `GenerateAnswer.FAILED_TO_ANSWER` as it's unnecessary by jamesbraza in https://github.com/Future-House/paper-qa/pull/691
* Allowing serialized `Settings` in `get_settings` by jamesbraza in https://github.com/Future-House/paper-qa/pull/688
* Fixed LDP runner's `TRUNCATED` not calling `gen_answer`, and documented `AgentStatus` by jamesbraza in https://github.com/Future-House/paper-qa/pull/690
* Removed `gen_answer`'s dead argument `question` by jamesbraza in https://github.com/Future-House/paper-qa/pull/689
* Making sure we copy distractors by sidnarayanan in https://github.com/Future-House/paper-qa/pull/694
* Created `complete` tool to allow unsure answers by jamesbraza in https://github.com/Future-House/paper-qa/pull/684
* Added missing `test_from_question` cassette by jamesbraza in https://github.com/Future-House/paper-qa/pull/696
* Moved `fake` agent to LLM propose `complete` tool by jamesbraza in https://github.com/Future-House/paper-qa/pull/695
* Default to ordered tool calls, with env variable control by mskarlin in https://github.com/Future-House/paper-qa/pull/697
* Lock file maintenance by renovate in https://github.com/Future-House/paper-qa/pull/699
* Refactored `TestGradablePaperQAEnvironment` for DRY code by jamesbraza in https://github.com/Future-House/paper-qa/pull/702
* Fixing `PaperQAEnvironment.reset` respecting `mmr_lambda` and `text_hashes` by jamesbraza in https://github.com/Future-House/paper-qa/pull/703
* Removed `"cannot answer"` literals and added `reset` tool by jamesbraza in https://github.com/Future-House/paper-qa/pull/698
* Update all non-major dependencies by renovate in https://github.com/Future-House/paper-qa/pull/705
* Fixing `LitQAEvaluation` bugs: incorrect reward indices, not using LLM's native knowledge by jamesbraza in https://github.com/Future-House/paper-qa/pull/708
* Adding filters to paper-qa Docs by whitead in https://github.com/Future-House/paper-qa/pull/707
* Fixed mutably defaulted `NumpyVectorStore.texts` by jamesbraza in https://github.com/Future-House/paper-qa/pull/711
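
As background for the last item, a mutable default is a general Python pitfall in which one default object is shared by every instance that does not pass its own. The sketch below uses hypothetical plain classes (`SharedDefaultStore`, `FreshDefaultStore`) to show the pitfall and a common remedy; it does not reproduce the actual `NumpyVectorStore` code or the change made in the PR.

```python
from typing import Optional


class SharedDefaultStore:
    """Hypothetical store whose default list is created once, at definition time."""

    def __init__(self, texts: list[str] = []) -> None:  # pitfall: shared default
        self.texts = texts


class FreshDefaultStore:
    """Hypothetical store that builds a new list for each instance."""

    def __init__(self, texts: Optional[list[str]] = None) -> None:
        self.texts = [] if texts is None else texts


a, b = SharedDefaultStore(), SharedDefaultStore()
a.texts.append("chunk")
leaked = b.texts  # b sees a's text because both hold the same list object

c, d = FreshDefaultStore(), FreshDefaultStore()
c.texts.append("chunk")
isolated = d.texts  # d has its own, still empty, list
```

With dataclasses or Pydantic models, the equivalent remedy is usually a default factory, so that each instance gets a fresh container.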

**Full Changelog**: https://github.com/Future-House/paper-qa/compare/v5.4.0...v5.5.0

5.4.0

What's Changed

* Renamed to PQASession type by whitead in https://github.com/Future-House/paper-qa/pull/653
* Lock file maintenance by renovate in https://github.com/Future-House/paper-qa/pull/657
* Ability to zero-shot `gen_answer` by jamesbraza in https://github.com/Future-House/paper-qa/pull/658
* Lock file maintenance by renovate in https://github.com/Future-House/paper-qa/pull/659
* Moving to `uv` dependency groups by jamesbraza in https://github.com/Future-House/paper-qa/pull/660
* Lock file maintenance by renovate in https://github.com/Future-House/paper-qa/pull/664
* Convert citation to formatted_citation usage where necessary by mskarlin in https://github.com/Future-House/paper-qa/pull/666
* Catch edge case where externalIds field is None by mskarlin in https://github.com/Future-House/paper-qa/pull/668
* Made o1 temperature issue a warning instead of a `ValueError` by whitead in https://github.com/Future-House/paper-qa/pull/669
* Added train and eval splits' questions and DOIs by jamesbraza in https://github.com/Future-House/paper-qa/pull/662
* `fake` agent allowing timeouts or exceptions by jamesbraza in https://github.com/Future-House/paper-qa/pull/672
* Optional `AnswerSetting.max_answer_attempts` to allow a new unsure branch by jamesbraza in https://github.com/Future-House/paper-qa/pull/673
* Made it so you do not die on invalid tool by whitead in https://github.com/Future-House/paper-qa/pull/670
* Allowing latest `pydantic-settings` and regenerated cassettes by jamesbraza in https://github.com/Future-House/paper-qa/pull/674
* Empty tool calls leading to `done` condition by jamesbraza in https://github.com/Future-House/paper-qa/pull/671
* Changed it to be debug for source quality by whitead in https://github.com/Future-House/paper-qa/pull/675

**Full Changelog**: https://github.com/Future-House/paper-qa/compare/v5.3.2...v5.4.0

5.3.4

Prevents parallel tool calls from clobbering the environment state.

5.3.3

**Full Changelog**: https://github.com/Future-House/paper-qa/compare/v5.3.2...v5.3.3

5.3.2

What's Changed

* Printing the `text` in a failed `llm_parse_json` by jamesbraza in https://github.com/Future-House/paper-qa/pull/629
* Change S2 client logic to use arxiv doi if it's defined by mskarlin in https://github.com/Future-House/paper-qa/pull/632
* Increased retry count for `ClientConnectorDNSError` errors by jamesbraza in https://github.com/Future-House/paper-qa/pull/639
* Make string similarity case insensitive by default by mskarlin in https://github.com/Future-House/paper-qa/pull/640
* Pulling in latest `fhaviary`, `mypy`, `ruff` by jamesbraza in https://github.com/Future-House/paper-qa/pull/647
* Add an after model validator ensuring temp=1 for o1 models by dakoner in https://github.com/Future-House/paper-qa/pull/649
* Fixing crash due to `None` author by jamesbraza in https://github.com/Future-House/paper-qa/pull/650
* Fixing flaky test `test_minimal_fields_filtering` by jamesbraza in https://github.com/Future-House/paper-qa/pull/651
* Fixing flaky tests `test_code` and `test_minimal_fields_filtering` by jamesbraza in https://github.com/Future-House/paper-qa/pull/652
* Lock file maintenance by renovate in https://github.com/Future-House/paper-qa/pull/648

New Contributors

* dakoner made their first contribution in https://github.com/Future-House/paper-qa/pull/649

**Full Changelog**: https://github.com/Future-House/paper-qa/compare/v5.3.1...v5.3.2

5.3.1

What's Changed

* Exposed `LDPRolloutCallback` and `on_agent_action_callback` for `fake` agent by jamesbraza in https://github.com/Future-House/paper-qa/pull/612
* Fixed `NumpyVectorStore.__eq__`'s `NotImplemented` case by jamesbraza in https://github.com/Future-House/paper-qa/pull/613
* Implemented `__deepcopy__` on all `Environment`s by jamesbraza in https://github.com/Future-House/paper-qa/pull/614
* Made embedding default by whitead in https://github.com/Future-House/paper-qa/pull/615
* Lock file maintenance by renovate in https://github.com/Future-House/paper-qa/pull/617
* Pulled in latest PyMuPDF for `set_messages` by jamesbraza in https://github.com/Future-House/paper-qa/pull/618
* Fixed crash due to DOI being a `list` by jamesbraza in https://github.com/Future-House/paper-qa/pull/619
* Added configuration to adjust how contexts are displayed by whitead in https://github.com/Future-House/paper-qa/pull/620
* Fixing CI by regenerating `test_pdf_reader_match_doc_details` cassette by jamesbraza in https://github.com/Future-House/paper-qa/pull/625
* Retrying flaky test `test_propagate_options` by jamesbraza in https://github.com/Future-House/paper-qa/pull/626
* Regression protection in `embedding_model_factory` by jamesbraza in https://github.com/Future-House/paper-qa/pull/622
* Added `writer.wait_merging_threads` call by jamesbraza in https://github.com/Future-House/paper-qa/pull/628
* Caching opened `tantivy.Index`es in the package by jamesbraza in https://github.com/Future-House/paper-qa/pull/627

**Full Changelog**: https://github.com/Future-House/paper-qa/compare/v5.3.0...v5.3.1
