Bigcodebench

Latest version: v0.2.4

0.1.7.post2

- Improved the calculation of the ground truth pass rate and addressed the issue mentioned in https://github.com/bigcode-project/bigcodebench/pull/12#issuecomment-2199186199.
- Updated the README docs.

0.1.7

Fixed some identified issues:
- The ground truth pass rate was previously not computed correctly.
- Passing RAM limits raised errors because they were set as floats.
- User permissions were not set up correctly in the Evaluate Docker.
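
As an illustration of the RAM-limit fix, memory limits handed to OS-level resource APIs are byte counts and therefore need to be integers. The helper below is a minimal sketch of that conversion; the name `ram_limit_bytes` and the choice of GiB as the input unit are assumptions for illustration, not the project's actual code.

```python
def ram_limit_bytes(limit_gib):
    """Convert a RAM limit in GiB (int or float) to a whole number of bytes.

    Hypothetical helper: the name and the GiB unit are assumptions.
    """
    # int() truncates any fractional byte so the result is always an integer.
    return int(limit_gib * 1024 ** 3)
```

For example, `ram_limit_bytes(1.5)` yields an `int` that is safe to pass to an API expecting an integer limit.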

Features:
- `check-gt-only` now prints the pass rate when it finishes.

0.1.6

New features:

- The RAM setup is now adjustable via specific arguments.
- Parallel ground truth checking is supported. Potentially failed checks are skipped during execution. A warning will be issued if the ground truth pass rate falls below 0.95.
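
The parallel check and the 0.95 warning threshold can be sketched roughly as follows. This is not the project's implementation: the function name `check_ground_truth`, the use of a thread pool, and the reading of "skipped" as "checks that raise an exception are left out of the pass rate" are all assumptions made for illustration.

```python
import warnings
from concurrent.futures import ThreadPoolExecutor

GT_PASS_RATE_THRESHOLD = 0.95  # threshold stated in the release note


def check_ground_truth(checks, max_workers=4):
    """Run zero-argument check callables in parallel and return the pass rate.

    Hypothetical sketch: checks that raise are treated as skipped (assumption).
    """
    def run(check):
        try:
            return bool(check())
        except Exception:
            return None  # mark the check as skipped

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run, checks))

    # Only checks that completed count toward the pass rate.
    completed = [r for r in results if r is not None]
    pass_rate = sum(completed) / len(completed) if completed else 0.0

    if pass_rate < GT_PASS_RATE_THRESHOLD:
        warnings.warn(
            f"Ground truth pass rate {pass_rate:.3f} is below "
            f"{GT_PASS_RATE_THRESHOLD}"
        )
    return pass_rate
```

Each element of `checks` is assumed to be a callable that returns a truthy value when the ground truth solution passes its tests.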

0.1.5

New features:

- The data is downloaded from HF hub by default.
- The data formats on HF and on GitHub have been unified.

0.1.2
