What's Changed
* Generalize aggregate() output type and remove unnecessary methods by speed1313 in https://github.com/llm-jp/llm-jp-eval-mm/pull/131
* Improve JDocQA's preparation time, fix JMMMU scoring, add phi4, and refactor by speed1313 in https://github.com/llm-jp/llm-jp-eval-mm/pull/141
* Add visualization script by speed1313 in https://github.com/llm-jp/llm-jp-eval-mm/pull/143
* Fix Heron-bench scoring and add Asagi model by speed1313 in https://github.com/llm-jp/llm-jp-eval-mm/pull/146

**Full Changelog**: https://github.com/llm-jp/llm-jp-eval-mm/compare/v0.2.2...v0.3.0