Bigcodebench

Latest version: v0.1.9

0.1.8

Features:
- Support `BigCodeBench-Hard` subset: https://github.com/bigcode-project/bigcodebench/pull/17
- Identify and fix tokenizer setup: https://github.com/bigcode-project/bigcodebench/issues/21
- Customize the tokenizer: https://github.com/bigcode-project/bigcodebench/pull/20
- Add the pass rate result log: https://github.com/bigcode-project/bigcodebench/pull/20

Contributors:
- marianna13: https://github.com/bigcode-project/bigcodebench/pull/20

Models:
- A total of 96 models at the time of the release

Acknowledgement:
- ethanc8
- takkyu2
- imamnurby

**Full Changelog**: https://github.com/bigcode-project/bigcodebench/compare/v0.1.7...v0.1.8

0.1.7.post2

- Improved the calculation of the ground truth pass rate, and addressed the issue mentioned in https://github.com/bigcode-project/bigcodebench/pull/12#issuecomment-2199186199.
- Updated the README docs.

0.1.7

Fix some identified issues:
- The ground truth pass rate was previously computed incorrectly.
- RAM limits passed as arguments raised errors because they were set as floats.
- User permissions were not set up correctly in the Evaluate Docker image.
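
The RAM-limit fix can be illustrated with a minimal sketch. The release notes do not say which call rejected the float value, so this assumes the limit ends up in an OS-level call such as Python's `resource.setrlimit`, which expects integer byte counts. The helper names `ram_limit_bytes` and `apply_ram_limit` are hypothetical and are not part of bigcodebench.

```python
def ram_limit_bytes(limit_gib):
    """Convert a RAM limit in GiB (possibly fractional) to a whole number of bytes."""
    if limit_gib <= 0:
        raise ValueError("RAM limit must be positive")
    # Truncate to int so the value is acceptable to integer-only APIs.
    return int(limit_gib * 1024 ** 3)


def apply_ram_limit(limit_gib):
    """Set the soft address-space limit of the current process (Unix only)."""
    import resource  # imported here because the module is unavailable on Windows

    limit = ram_limit_bytes(limit_gib)
    # Keep the existing hard limit; only the soft limit is lowered.
    _, hard = resource.getrlimit(resource.RLIMIT_AS)
    resource.setrlimit(resource.RLIMIT_AS, (limit, hard))
```

Calling `apply_ram_limit` fails if the requested value exceeds the process's existing hard limit, so it is best applied once, early in a worker process.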

Features:
- `check-gt-only` now prints the pass rate when it finishes.

0.1.6

New features:

- The RAM setup is now adjustable via specific arguments.
- Parallel ground truth checking is supported. Checks that fail are skipped rather than stopping execution. A warning is issued if the ground truth pass rate falls below 0.95.
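
As an illustration of the parallel ground truth check, the following is a minimal sketch and not the project's implementation. It assumes each task is a dict with hypothetical `solution` and `test` fields holding Python source, runs the checks in a thread pool, counts any failing check as a non-pass instead of aborting, and warns when the pass rate drops below the 0.95 threshold from the notes above.

```python
import warnings
from concurrent.futures import ThreadPoolExecutor

GT_PASS_RATE_THRESHOLD = 0.95  # threshold stated in the release notes


def check_one(task):
    """Run one reference solution against its test code; return True on success."""
    namespace = {}
    try:
        exec(task["solution"], namespace)  # define the reference solution
        exec(task["test"], namespace)      # assertions raise on failure
        return True
    except Exception:
        # A failing check is recorded as a non-pass and does not stop the run.
        return False


def ground_truth_pass_rate(tasks, max_workers=4):
    """Check all tasks in parallel and return the fraction that pass."""
    if not tasks:
        return 0.0
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(check_one, tasks))
    rate = sum(results) / len(results)
    if rate < GT_PASS_RATE_THRESHOLD:
        warnings.warn(
            f"Ground truth pass rate {rate:.3f} is below {GT_PASS_RATE_THRESHOLD}"
        )
    return rate
```

A real harness would more likely isolate each check in a separate process with time and memory limits, since `exec` in a shared interpreter offers no sandboxing.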

0.1.5

New features:

- The data is downloaded from the HF hub by default.
- The data formats on HF and GitHub have been unified.

0.1.2
