Another exciting week with so many improvements from our amazing community. We're thrilled to announce the latest release of GPT Researcher, now featuring evaluations using the SimpleQA dataset by OpenAI. Our latest evaluation run shows a 93% accuracy rate, surpassing all current leading projects in the market.
This achievement underscores the remarkable capabilities of the open-source community, and we're just getting started! In response to extensive feedback, we've made deep research faster, smarter, and more cost-effective, and fixed previously reported bugs. Update to the latest version to try the improvements firsthand!
Here are the results of our latest evals run:
Evaluation Summary
-------------------------
Debug counts:
Total successful: 100
CORRECT: 93
INCORRECT: 7
NOT_ATTEMPTED: 1
{
"correct_rate": 0.93,
"incorrect_rate": 0.07,
"not_attempted_rate": 0.01,
"answer_rate": 0.99,
"accuracy": 0.9292929292929293,
"f1": 0.9246231155778895
}
-------------------------
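For readers wondering how the fields in the summary relate to each other, here is a minimal sketch of how such rates can be derived from grade counts. It assumes the usual SimpleQA conventions: `accuracy` is the share of correct answers among attempted questions, and `f1` is the harmonic mean of `correct_rate` and `accuracy`. The function name `simpleqa_metrics` and the sample counts are hypothetical and are not taken from the GPT Researcher evals code or from the run above.

```python
import json


def simpleqa_metrics(correct: int, incorrect: int, not_attempted: int) -> dict:
    """Aggregate SimpleQA-style grade counts into summary rates (illustrative sketch)."""
    total = correct + incorrect + not_attempted
    if total == 0:
        raise ValueError("no graded answers to aggregate")

    attempted = correct + incorrect
    correct_rate = correct / total
    # Accuracy is measured only over the questions the system tried to answer.
    accuracy = correct / attempted if attempted else 0.0
    # F1 here is the harmonic mean of overall correctness and attempted accuracy.
    denom = correct_rate + accuracy
    f1 = 2 * correct_rate * accuracy / denom if denom else 0.0

    return {
        "correct_rate": correct_rate,
        "incorrect_rate": incorrect / total,
        "not_attempted_rate": not_attempted / total,
        "answer_rate": attempted / total,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts, chosen only to exercise the function.
metrics = simpleqa_metrics(correct=8, incorrect=1, not_attempted=1)
print(json.dumps(metrics, indent=2))
```

Under these definitions a system is rewarded for declining to answer rather than guessing wrong: a skipped question lowers `correct_rate` but leaves `accuracy` untouched, while a wrong answer lowers both.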
What's Changed
* Fix `Key Error` while using Deep Research by kongacute in https://github.com/assafelovic/gpt-researcher/pull/1188
* Update requirements.txt with missing langgraph dep by namin in https://github.com/assafelovic/gpt-researcher/pull/1189
* Fix Docker Build Failure: Updated `combined_query` in `DeepRsearchSkill.run()` to Handle Backslashes in F-Strings by monolok in https://github.com/assafelovic/gpt-researcher/pull/1192
* stabilize docker & frontend upgrades by ElishaKay in https://github.com/assafelovic/gpt-researcher/pull/1191
* Improved overall planning and research performance by assafelovic in https://github.com/assafelovic/gpt-researcher/pull/1195
* Added support for base_url param in create_chat_completions for OpenAI Provider by gaurav3247 in https://github.com/assafelovic/gpt-researcher/pull/1198
* Update llm.py by olipayne in https://github.com/assafelovic/gpt-researcher/pull/1200
* Fix WebSocket timeout issues by luislofer89 in https://github.com/assafelovic/gpt-researcher/pull/1203
* fix: Add missing langgraph module to requirements.txt by hurxxxx in https://github.com/assafelovic/gpt-researcher/pull/1207
* Refactor: typing cleanup by czakop in https://github.com/assafelovic/gpt-researcher/pull/1187
* add async nodriver scraper by ewgdg in https://github.com/assafelovic/gpt-researcher/pull/1170
* Add language requirement to resource report prompt by hurxxxx in https://github.com/assafelovic/gpt-researcher/pull/1208
* Feature:eval metrics by kga245 in https://github.com/assafelovic/gpt-researcher/pull/1183
* README for feat(evals): Add SimpleQA evaluation framework and initial results by kga245 in https://github.com/assafelovic/gpt-researcher/pull/1212
* Polish up loose ends based on feedback by ElishaKay in https://github.com/assafelovic/gpt-researcher/pull/1211
New Contributors
* namin made their first contribution in https://github.com/assafelovic/gpt-researcher/pull/1189
* olipayne made their first contribution in https://github.com/assafelovic/gpt-researcher/pull/1200
* luislofer89 made their first contribution in https://github.com/assafelovic/gpt-researcher/pull/1203
* hurxxxx made their first contribution in https://github.com/assafelovic/gpt-researcher/pull/1207
* czakop made their first contribution in https://github.com/assafelovic/gpt-researcher/pull/1187
**Full Changelog**: https://github.com/assafelovic/gpt-researcher/compare/v3.2.2...v3.2.3