What's changed
In this version, we have added more models and released more experiments and evaluation results. Several features have been expanded, including datasets and APIs. The changes are summarized as follows:
1. Newly released evaluation results for multiple models, covering base, lora and qlora, updated in docs/eval_llm_result.md. Models include llama2-7b/13b, codellama2-7b/13b, baichuan2-7b/13b and Qwen7b/14b. Mainly completed by wangzaistone, zhanghy-sketchzh, junewgl, Jian1273, zhoufan and qidanrui.
2. Newly completed fine-tuning development and training of codellama-13b in the project and released SOTA weights, mainly by wangzaistone, with assistance from Jian1273 and zhanghy-sketchzh.
3. Updated evaluation methods and results on the testsuite dataset, mainly by wangzaistone, JBoRu and junewgl.
4. Refactored the code structure for log output, mainly by wangzaistone and zhanghy-sketchzh.
5. Newly added support for including other datasets during training, mainly by Jian1273, assisted by wangzaistone and John-Saxon.
6. Newly added and improved DeepSpeed support, by Jian1273, wangzaistone and zhanghy-sketchzh.
7. Newly added workflow, by qidanrui.
8. Newly added APIs covering the entire process of data processing, training, prediction and evaluation (144), by qidanrui.
9. Summarized the contributors' model results as baselines in the API, by junewgl and qidanrui.
10. Newly added poetry installation and operation methods, by qidanrui.
11. Newly added chatglm3 model support, and released training and evaluation results, by wangzaistone.
12. Continuously maintained the Chinese and English documentation, including adding related parameters, experimental metrics and data descriptions, grammar checks, etc., mainly by wangzaistone and qidanrui, assisted by zhanghy-sketchzh and Jian1273.
13. Clarified future development directions, including interfaces, by csunny.
Thanks to our partners John-Saxon, for starting to contribute code (83, 122: initial multi-round dialog data code), and simonchuzz, for contributing the initial code for database-assisted training data construction.
Thanks to qidanrui for starting to submit code in this version and improving multiple parts.