Alpaca-eval

Latest version: v0.6.6

0.6

What's Changed
* [DATA] Add Gemma by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/242
* [NOTEBOOK] adding final length correction notebook. by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/244
* add Mistral-7B-ReMax-v0.1 by liziniu in https://github.com/tatsu-lab/alpaca_eval/pull/245
* [ENH] add claude 3 by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/247
* [ENH] add contextual by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/250
* [ENH] add mistral large by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/251
* Add Samba-CoE-v0.2 to AlpacaEval by kyleliang919 in https://github.com/tatsu-lab/alpaca_eval/pull/253
* Add Samba-CoE-v0.2-best-of-16 to AlpacaEval by kyleliang919 in https://github.com/tatsu-lab/alpaca_eval/pull/256
* Add Mistral-ORPO-Beta to AlpacaEval by jiwooya1000 in https://github.com/tatsu-lab/alpaca_eval/pull/257
* Yann/length correction by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/258

New Contributors
* liziniu made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/245
* kyleliang919 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/253
* jiwooya1000 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/257

**Full Changelog**: https://github.com/tatsu-lab/alpaca_eval/compare/v0.5.4...v0.6

0.5.4

What's Changed
* Add Qwen1.5-72B-Chat to AlpacaEval by Lukeming-tsinghua in https://github.com/tatsu-lab/alpaca_eval/pull/226
* Add claude-instant-1.2, deepseek-llm-67b-chat, wizardlm-70b, Qwen-14B-Chat (config + outputs without annotations) by gblazex in https://github.com/tatsu-lab/alpaca_eval/pull/228
* [DATA] Adding annotations for the arena models by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/229
* Update README.md - Add missing "Y" to "ou" by yoderj in https://github.com/tatsu-lab/alpaca_eval/pull/230
* [DEV] Analyzing length-controlled metrics. by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/231
* [DOC] add annotation interpretation by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/232
* [DATA] add results from the Arena openai models by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/234
* update ELO for llama-2-13b-chat-hf by gblazex in https://github.com/tatsu-lab/alpaca_eval/pull/235
* [NOTEBOOK] add length-corrected GLM by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/237
* [ENH] add inverse mapper to make sure in and out types are the same by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/240
* [ENH] update to allow AF to use AE by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/241

New Contributors
* Lukeming-tsinghua made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/226
* yoderj made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/230

**Full Changelog**: https://github.com/tatsu-lab/alpaca_eval/compare/v0.5.3...v0.5.4

0.5.3

What's Changed
* [ENH] add mistral-medium by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/205
* [ENH] add internlm2-chat-20b-ppo by C1rN09 in https://github.com/tatsu-lab/alpaca_eval/pull/207
* prettify "pretty_name" of internlm2 by C1rN09 in https://github.com/tatsu-lab/alpaca_eval/pull/208
* [ENH] add outputs & configs from dolphin 2.2.1 by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/209
* Add PairRM 0.4B + Yi-34B-Chat to AlpacaEval 2.0 by jdf-prog in https://github.com/tatsu-lab/alpaca_eval/pull/210
* dolphin 2.1.1 configs.yaml by gblazex in https://github.com/tatsu-lab/alpaca_eval/pull/212
* Update README.md (small typo) by xwinxu in https://github.com/tatsu-lab/alpaca_eval/pull/213
* [TEST]: fix ordering of df by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/214
* Add Snorkel-Mistral-PairRM-DPO (best-of-16) to Alpaca Eval 2.0 by viethoangtranduong in https://github.com/tatsu-lab/alpaca_eval/pull/215
* update InternLM2 chat template by C1rN09 in https://github.com/tatsu-lab/alpaca_eval/pull/216
* Add Starling-LM-7B-alpha, vicuna-13b-v1.5, vicuna-7b-v1.5 to AlpacaEval (config + outputs without annotations) by gblazex in https://github.com/tatsu-lab/alpaca_eval/pull/217
* [RES] add 3 models for arena correlations by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/218
* Add xwinlm-70b-v0.3 to AlpacaEval by nbl97 in https://github.com/tatsu-lab/alpaca_eval/pull/221
* [ENH] add referenced_models locally by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/224

New Contributors
* C1rN09 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/207
* gblazex made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/212
* xwinxu made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/213
* viethoangtranduong made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/215

**Full Changelog**: https://github.com/tatsu-lab/alpaca_eval/compare/v0.5.2...v0.5.3

0.5.2

What's Changed
* [BUG] force openai >1.5.0 by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/202
* [WIP] precompute all leaderboard for AE2 by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/199
* [ENH] add OpenHermes by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/203

**Full Changelog**: https://github.com/tatsu-lab/alpaca_eval/compare/v0.5.1...v0.5.2

0.5.1

What's Changed
* [BUG] fix no OAI org id set by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/200

**Full Changelog**: https://github.com/tatsu-lab/alpaca_eval/compare/v0.5.0...v0.5.1

0.5.0

What's Changed
* Fix mssg check by Muennighoff in https://github.com/tatsu-lab/alpaca_eval/pull/174
* Add MiniChat-1.5-3B to AlpacaEval and Fix MiniChat-3B by GeneZC in https://github.com/tatsu-lab/alpaca_eval/pull/176
* Add 01-ai/Yi-34B-Chat to AlpacaEval by HyperdriveHustle in https://github.com/tatsu-lab/alpaca_eval/pull/175
* feat: add way to verify results by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/177
* show img in readme by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/178
* Add PairRM best-of-16 to AlpacaEval by jdf-prog in https://github.com/tatsu-lab/alpaca_eval/pull/181
* Verify Yi by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/182
* chore: add phi-2 sft by lxuechen in https://github.com/tatsu-lab/alpaca_eval/pull/184
* add cut-13b by wwxu21 in https://github.com/tatsu-lab/alpaca_eval/pull/186
* chore: add phi-2 dpo by lxuechen in https://github.com/tatsu-lab/alpaca_eval/pull/185
* Support phi2, Support SOLAR 10.7B LMCocktail by yhyu13 in https://github.com/tatsu-lab/alpaca_eval/pull/183
* Update openai.py by Muennighoff in https://github.com/tatsu-lab/alpaca_eval/pull/188
* chore: add link for phi-2-sft by lxuechen in https://github.com/tatsu-lab/alpaca_eval/pull/190
* chore: fix links by lxuechen in https://github.com/tatsu-lab/alpaca_eval/pull/191
* Add deita-7b-v1.0 model by VPeterV in https://github.com/tatsu-lab/alpaca_eval/pull/192
* [ENH] Azure OAI client & more general way of switching between client configs by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/193
* [ENH] Weighted win rates by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/189
* [ENH] new models: Gemini / claude2.1 / mistral / mixtral / .. by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/195
* [ENH] alpaca_eval 2.0 by YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/196

New Contributors
* Muennighoff made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/174
* HyperdriveHustle made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/175
* jdf-prog made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/181
* lxuechen made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/184
* wwxu21 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/186
* yhyu13 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/183
* VPeterV made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/192

**Full Changelog**: https://github.com/tatsu-lab/alpaca_eval/compare/v0.3.6...v0.5.0
