lm-eval

Latest version: v0.4.5


0.3.0

HuggingFace Datasets Integration
This release integrates HuggingFace `datasets` as the core dataset management interface, removing the previous custom downloaders.
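
Concretely, task data now comes through the `datasets` loading API rather than per-task download code. A minimal sketch of that interface, using `SWAG` (added in this release) purely as an illustration:

```python
import datasets

# HuggingFace `datasets` now handles the download, caching, and versioning
# that the removed custom downloaders used to implement per task.
swag = datasets.load_dataset("swag", "regular")  # illustrative dataset/config pair
print(swag["validation"][0]["startphrase"])
```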

What's Changed
* Refactor `Task` downloading to use `HuggingFace.datasets` by jon-tow in https://github.com/EleutherAI/lm-evaluation-harness/pull/300
* Add templates and update docs by jon-tow in https://github.com/EleutherAI/lm-evaluation-harness/pull/308
* Add dataset features to `TriviaQA` by jon-tow in https://github.com/EleutherAI/lm-evaluation-harness/pull/305
* Add `SWAG` by jon-tow in https://github.com/EleutherAI/lm-evaluation-harness/pull/306
* Fixes for using lm_eval as a library by dirkgr in https://github.com/EleutherAI/lm-evaluation-harness/pull/309
* Researcher2 by researcher2 in https://github.com/EleutherAI/lm-evaluation-harness/pull/261
* Suggested updates for the task guide by StephenHogg in https://github.com/EleutherAI/lm-evaluation-harness/pull/301
* Add pre-commit by Mistobaan in https://github.com/EleutherAI/lm-evaluation-harness/pull/317
* Decontam import fix by jon-tow in https://github.com/EleutherAI/lm-evaluation-harness/pull/321
* Add `bootstrap_iters` kwarg by Muennighoff in https://github.com/EleutherAI/lm-evaluation-harness/pull/322 (usage sketch after this list)
* Update decontamination.md by researcher2 in https://github.com/EleutherAI/lm-evaluation-harness/pull/331
* Fix key access in squad evaluation metrics by konstantinschulz in https://github.com/EleutherAI/lm-evaluation-harness/pull/333
* Fix make_disjoint_window for tail case by richhankins in https://github.com/EleutherAI/lm-evaluation-harness/pull/336
* Manually concat tokenizer revision with subfolder by jon-tow in https://github.com/EleutherAI/lm-evaluation-harness/pull/343
* [deps] Use minimum versioning for `numexpr` by jon-tow in https://github.com/EleutherAI/lm-evaluation-harness/pull/352
* Remove custom datasets that are in HF by jon-tow in https://github.com/EleutherAI/lm-evaluation-harness/pull/330
* Add `TextSynth` API by jon-tow in https://github.com/EleutherAI/lm-evaluation-harness/pull/299
* Add the original `LAMBADA` dataset by jon-tow in https://github.com/EleutherAI/lm-evaluation-harness/pull/357
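
The `bootstrap_iters` keyword added in #322 controls how many bootstrap resamples are drawn when estimating the standard error of each metric. A minimal sketch, assuming the 0.3.0-era `simple_evaluate` signature; the model and task names are placeholders:

```python
from lm_eval import evaluator

# Sketch only: assumes the 0.3.0-era evaluator API.
results = evaluator.simple_evaluate(
    model="gpt2",          # placeholder registered model name
    tasks=["lambada"],     # placeholder task name
    bootstrap_iters=1000,  # fewer resamples run faster but give noisier stderr estimates
)
print(results["results"])
```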

New Contributors
* dirkgr made their first contribution in https://github.com/EleutherAI/lm-evaluation-harness/pull/309
* Mistobaan made their first contribution in https://github.com/EleutherAI/lm-evaluation-harness/pull/317
* konstantinschulz made their first contribution in https://github.com/EleutherAI/lm-evaluation-harness/pull/333
* richhankins made their first contribution in https://github.com/EleutherAI/lm-evaluation-harness/pull/336

**Full Changelog**: https://github.com/EleutherAI/lm-evaluation-harness/compare/v0.2.0...v0.3.0

0.2.0

0.1.0

- added blimp (#237)
- added qasper (#264)
- added asdiv (#244)
- added truthfulqa (#219)
- added gsm (#260)
- implemented the description dict and deprecated `provide_description` (#226)
- new `--check_integrity` flag to run integrity unit tests at eval time (#290)
- positional arguments to `evaluate` and `simple_evaluate` are now deprecated (see the keyword-argument sketch after this list)
- `_CITATION` attribute on task modules (#292)
- lots of bug fixes and task fixes (always remember to report task versions for comparability!)
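
Several of these items change how an evaluation is invoked: per-task descriptions move into a dict, integrity tests can be requested at eval time, and `evaluate`/`simple_evaluate` should be called with keyword arguments only. A minimal sketch, assuming these all surface as same-named keyword arguments on the 0.1.0-era `simple_evaluate`; the model, task, and description strings are placeholders:

```python
from lm_eval import evaluator

# Sketch under the assumed 0.1.0-era API; all names below are placeholders.
results = evaluator.simple_evaluate(
    model="gpt2",             # keyword arguments only: positional use is deprecated
    tasks=["truthfulqa_mc"],  # placeholder task name
    description_dict={        # replaces the deprecated provide_description
        "truthfulqa_mc": "Answer each question truthfully.",
    },
    check_integrity=True,     # run the task's integrity unit tests before evaluating
)
```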

0.0.1
