Faster dataset generation and solution traces. No changes to contents of generated datasets.
What's Changed
* New flag `--use-multithreading` for faster phantom-wiki dataset generation https://github.com/kilian-group/phantom-wiki/pull/209. The flag works on Linux and Windows and allows concurrent Prolog database queries from multiple processes.
* Save solution traces to generated dataset JSON files https://github.com/kilian-group/phantom-wiki/pull/194.
* Save Prolog facts when generating datasets https://github.com/kilian-group/phantom-wiki/pull/241.
* Updates to README, Python package dependencies https://github.com/kilian-group/phantom-wiki/pull/199, GitHub workflow for PyPI publishing https://github.com/kilian-group/phantom-wiki/pull/227, Hugging Face model card https://github.com/kilian-group/phantom-wiki/pull/190, pre-commit config https://github.com/kilian-group/phantom-wiki/pull/208, and license https://github.com/kilian-group/phantom-wiki/pull/212.
* Cleanup https://github.com/kilian-group/phantom-wiki/pull/228
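
As a rough illustration of the idea behind `--use-multithreading` (independent queries answered concurrently by several processes), here is a minimal Python sketch. It is not phantom-wiki's implementation and does not involve Prolog: `FACTS`, `answer_query`, and `run_queries` are hypothetical names, and a small dictionary stands in for the fact database.

```python
from concurrent.futures import ProcessPoolExecutor

# Hypothetical stand-in for a fact database: parent -> children.
FACTS = {"alice": ["bob", "carol"], "bob": ["dave"], "carol": [], "dave": []}


def answer_query(person: str) -> list[str]:
    """Toy query (assumption): return the sorted grandchildren of `person`."""
    return sorted(
        grandchild
        for child in FACTS.get(person, [])
        for grandchild in FACTS.get(child, [])
    )


def run_queries(people: list[str], use_multiprocessing: bool = False) -> list[list[str]]:
    """Answer every query, optionally across worker processes.

    Results come back in the same order as `people` in both modes.
    """
    if not use_multiprocessing:
        return [answer_query(person) for person in people]
    with ProcessPoolExecutor(max_workers=2) as pool:
        return list(pool.map(answer_query, people))


if __name__ == "__main__":
    # The guard stops worker processes from re-running this block on
    # platforms that start workers by spawning a fresh interpreter (e.g. Windows).
    print(run_queries(["alice", "bob"], use_multiprocessing=True))
```

Keeping the serial path as the default mirrors an opt-in flag: the parallel path only changes how the work is scheduled, not what each query returns.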
Evaluation Pipeline Changes (independent of dataset release versions)
* Improve tables and figures for paper https://github.com/kilian-group/phantom-wiki/pull/186.
* Add support for deepseek-r1 with CoT https://github.com/kilian-group/phantom-wiki/pull/214, Nshot-RAG https://github.com/kilian-group/phantom-wiki/pull/181, CoT-RAG, and ReAct https://github.com/kilian-group/phantom-wiki/pull/239.
* Add RAG support for API-based models https://github.com/kilian-group/phantom-wiki/pull/187.
* Refactor files in `src/phantom_eval/` https://github.com/kilian-group/phantom-wiki/pull/201 https://github.com/kilian-group/phantom-wiki/pull/196.
* Refactor bash scripts in `eval/` https://github.com/kilian-group/phantom-wiki/pull/222.
* Test how LLMs generate Prolog queries to answer phantom-wiki questions https://github.com/kilian-group/phantom-wiki/pull/189 https://github.com/kilian-group/phantom-wiki/pull/235.
**Full Changelog**: https://github.com/kilian-group/phantom-wiki/compare/v0.5...v0.5.1