What's Changed
* Together and HuggingFace SUTs can now return log probs in their responses when requested.
* New CLI option `--plugin-dir` loads local plugins at runtime.
* Increase reliability of downloading test data.
* Prepare modelgauge infra files for safety evaluator testing (new "System" chat role, minor `llama_guard_annotator` refactor).
* Documentation updates, including initial API reference.
* Introduce `Pipeline` and related classes as the base for composable objects that handle common bulk-processing tasks such as running prompts, collecting annotations, and other slow I/O-bound workloads.
* SafeTests now use files from the dev deployment of modellab.
* New `run-csv-items` command quickly runs batches of prompts and/or responses in a CSV file through some SUTs and/or annotators.
* Add a new v1.0 SafeTest class and a placeholder test, `safe-dfm-1.0`. Version 0.5 tests (e.g. `safe-cae`) are not affected.
* Move the Together plugin files and SafeTest into the core modelgauge library.
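
To illustrate the composable bulk-processing idea behind the `Pipeline` item above, here is a minimal sketch in plain Python. It does not use modelgauge's actual classes: `Stage`, `SimplePipeline`, `fake_sut`, and `fake_annotator` are hypothetical names invented for this example, and the real API may be structured quite differently. It assumes only that each step is a slow, I/O-bound call that can run on a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


class Stage:
    """Hypothetical stage: applies `fn` to every item using a thread pool."""

    def __init__(self, fn: Callable, workers: int = 4):
        self.fn = fn
        self.workers = workers

    def run(self, items: Iterable) -> List:
        # Threads suit slow I/O-bound calls; map() preserves input order.
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            return list(pool.map(self.fn, items))


class SimplePipeline:
    """Hypothetical pipeline: chains stages so each output feeds the next."""

    def __init__(self, *stages: Stage):
        self.stages = stages

    def run(self, items: Iterable) -> List:
        for stage in self.stages:
            items = stage.run(items)
        return list(items)


def fake_sut(prompt: str) -> dict:
    # Stand-in for a slow network call to a SUT (assumption, not a real SUT).
    return {"prompt": prompt, "response": prompt.upper()}


def fake_annotator(item: dict) -> dict:
    # Stand-in for a slow annotator call (assumption, not a real annotator).
    return {**item, "is_safe": "bad" not in item["prompt"]}


pipeline = SimplePipeline(Stage(fake_sut), Stage(fake_annotator))
results = pipeline.run(["hello", "bad idea"])
```

Because each stage only needs a callable, new steps can be added or reordered without touching the others, which is the kind of composability the release note describes.
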
New Contributors
* tsunamit made their first contribution in https://github.com/mlcommons/modelgauge/pull/449
* HuaizhengZhang made their first contribution in https://github.com/mlcommons/modelgauge/pull/489
**Full Changelog**: https://github.com/mlcommons/modelgauge/compare/v0.5.1...v0.6.0