Parsbench

Latest version: v0.1.7

Safety actively analyzes 681812 Python packages for vulnerabilities to keep your Python projects secure.

Scan your dependencies

Page 1 of 2

0.1.7

Fixed

- Fix typos in prompt templates.
- Fix error on using formatted targets while scoring matches.
- Improve `from_matches_files` function speed in BenchmarkResult.
- Fix returning list in AnthropicModel completion function.

Added

- Add `formatted_completion` field to task matches.
- Update completion formatters in tasks.
- Add re-score option to `from_matches_files` in BenchmarkResult.
- Add max retries exceeded error in API-based models.
- Add snapshot functionality to save matches on error.
- Add leaderboard builder function.
- Add build from file functions to the BenchmarkResult class.

0.1.6

Fixed

- Fix misspell in ParsiNLUMultipleChoice task name.
- Fix wrong target key in the XLSummary.
- Add org prefix to the sentiment analysis task.
- Fix FarsTailEntailment prompt target key.

Added

- Add `attention_mask` to the transformer model `generate` function.

0.1.5

Added

- Add FarsTail entailment task.
- Add Persian News Summary task.
- Add XL-Sum task.
- Add ParsiNLUBenchmark. It is a sub class of CustomBenchmark with hard-coded ParsiNLU tasks.

0.1.4

Fixed

- Fix `load_all_tasks` returning empty list.

Added

- Add Anthropic model interface.
- Add retry on rate limit to API-based models.
- Add `skip_existing_matches` to the task evaluate function. It skips matches that are already generated and scored.

0.1.3

Fixed

- Fix `model_name` property in PreTrainedTransformerModel.

Added

- Add Persian MMLU (Khayyam Challenge) task.
- Add `select_sub_tasks` to the task class.

0.1.2

Fixed

- Fix misspells and typos.
- Use`name_or_path` parameter as the `model_name` in PreTrainedTransformerModel.

Changed

- Update sentiment analysis task prompt template.

Added

- Add `completion_formatter` to the model interfaces.

Page 1 of 2

© 2024 Safety CLI Cybersecurity Inc. All Rights Reserved.