Vidore-benchmark

Latest version: v5.0.0

5.0.0

Added

- Add CLI eval support for ColQwen2, DSEQwen2, Cohere, ColIdefics3 API embedding models
- Add Pydantic models for storing the ViDoRe benchmark results and metadata (includes `vidore-benchmark` version)
- Add option to create an `EvalManager` instance from `ViDoReBenchmarkResults`
- Add `num_workers` argument when using dataloaders
- Allow the creation of a `VisionRetriever` instance from a PyTorch model and a processor that implements the `process_images` and `process_queries` methods, similarly to the ColVision processors
- Add `dataloader_prebatch_query` and `dataloader_prebatch_passage` arguments to avoid loading entire datasets into memory (this used to cause RAM spikes when loading large image datasets)
- Add QA-to-BEIR dataset format conversion script
- Add support for the BEIR dataset format with `ViDoReEvaluatorBEIR`
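The processor requirement mentioned above (an object exposing `process_images` and `process_queries`) can be pictured as a structural interface. The following is only a minimal sketch: the names `RetrieverProcessor`, `ToyProcessor`, and `check_processor` are hypothetical, and the argument and return types are assumptions, since the changelog does not give the exact signatures expected by `VisionRetriever`.

```python
from typing import Any, List, Protocol, runtime_checkable


@runtime_checkable
class RetrieverProcessor(Protocol):
    """Hypothetical interface: the two methods named in the changelog entry."""

    def process_images(self, images: List[Any]) -> Any: ...

    def process_queries(self, queries: List[str]) -> Any: ...


class ToyProcessor:
    """Stand-in processor; a real one would return model-ready tensors."""

    def process_images(self, images: List[Any]) -> List[str]:
        # Placeholder: tag each image by its position in the batch.
        return [f"image:{i}" for i, _ in enumerate(images)]

    def process_queries(self, queries: List[str]) -> List[str]:
        # Placeholder: normalize whitespace and case.
        return [q.strip().lower() for q in queries]


def check_processor(processor: object) -> bool:
    """Return True if the object exposes both required methods."""
    return isinstance(processor, RetrieverProcessor)
```

A check like `check_processor(my_processor)` only verifies that both methods are present; it says nothing about whether their outputs are compatible with a given model.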

Changed

- [Breaking] Change the CLI argument names
- Add option to load a specific checkpoint for Hugging Face models with `pretrained_model_name_or_path`
- Improve soft dependency handling in retriever classes (the `colpali-engine` is now optional)
- [Breaking] Change the `get_scores` signature
- [Breaking] Rename `forward_documents` to `forward_passages` to match the literature and reduce confusion
- [Breaking] Rename `DSERetriever` to `DSEQwen2Retriever`
- [Breaking] Rename args in the CLI script
- When available, use `processor.get_scores` instead of the custom scoring snippet
- [Breaking] Rename `ColQwenRetriever` to `ColQwen2Retriever`
- [Breaking] Rename `BiQwenRetriever` to `BiQwen2Retriever`
- [Breaking] Revamp the `evaluate` module. Evaluation is now handled by the `ViDoReEvaluatorQA` class
- [Breaking] Rename `ViDoReEvaluator` to `BaseViDoReEvaluator`. The new `ViDoReEvaluator` class allows creating retrievers using the Python API.
- Set default `num_workers` to 0 in retrievers
- Update default checkpoints for ColPali and ColQwen2 retrievers
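When migrating code to 5.0.0, the renames listed above can be checked mechanically. The sketch below is an illustration rather than an official migration tool: the helper `find_outdated_names` is hypothetical, and the mapping only covers the unambiguous renames from this list. `ViDoReEvaluator` is deliberately left out because the name is reused for a new class.

```python
import re
from typing import Dict, List, Tuple

# Old name -> new name, taken from the [Breaking] rename entries above.
RENAMES: Dict[str, str] = {
    "forward_documents": "forward_passages",
    "DSERetriever": "DSEQwen2Retriever",
    "ColQwenRetriever": "ColQwen2Retriever",
    "BiQwenRetriever": "BiQwen2Retriever",
}


def find_outdated_names(source: str) -> List[Tuple[int, str, str]]:
    """Return (line_number, old_name, new_name) for each outdated identifier."""
    hits: List[Tuple[int, str, str]] = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for old, new in RENAMES.items():
            # Word boundaries keep already-migrated names from matching.
            if re.search(rf"\b{re.escape(old)}\b", line):
                hits.append((lineno, old, new))
    return hits
```

Running it over the text of a script reports each line that still uses a pre-5.0.0 name together with its replacement; the changes to `get_scores` and the CLI arguments are not covered and need a manual review.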

Fixed

- Fix `evaluate_dataset` when used with the BM25 retriever
- Fix issue when `pretrained_model_name_or_path = None` in `load_vision_retriever_from_registry`
- Fix `DummyRetriever`'s `get_scores` method
- Fix processor output not being sent to the correct device in `ColQwen2Retriever`
- Fix bugs in `BiQwen2Retriever`
- Fix try-catch block for soft dep check in `SigLIPRetriever`

Removed

- Remove experimental quantization module
- Remove the `interpretability` module. The interpretability code has been moved and improved as part of the [`colpali-engine==0.3.2`](https://github.com/illuin-tech/colpali/releases/tag/v0.3.2) release.
- [Breaking] Remove support for token pooling. This feature will be re-introduced in `colpali-engine>0.3.9`
- Replace `loguru` with built-in `logging` module
- Remove the `retrieve_on_dataset` and `retrieve_on_pdfs` entrypoint scripts
- Remove the `pdf_utils` module
- Remove the `get_top_k` method from the `evaluate` module
- Remove the `plot_utils` and `test_utils` modules
- Remove the `experiments` directory
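Since `loguru` was replaced with the built-in `logging` module, log output can be controlled through the standard library. The sketch below assumes the package logs under a logger named after its import name (`vidore_benchmark`); that logger name and the helper `enable_debug_logs` are assumptions, not something stated in this changelog.

```python
import logging

# Assumption: the package logs under its import name.
LOGGER_NAME = "vidore_benchmark"


def enable_debug_logs(name: str = LOGGER_NAME) -> logging.Logger:
    """Set the named logger to DEBUG and attach one stream handler."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

If the package uses a different logger hierarchy, pass the appropriate name to `enable_debug_logs` instead.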

Tests

- Add tests for all built-in vision retrievers
- Add fixtures in retriever tests to speed up testing
- Add tests for `ViDoReBenchmarkResults`
- Add tests for `EvalManager`
- Add tests and E2E tests for the CLI command `evaluate-retriever`
- Add tests for `ViDoReEvaluatorBEIR`

4.0.2

Deprecated

- Deprecate the `interpretability` module

Build

- Fix and update conflicts for package dependencies

4.0.1

- Rename ColPali model alias to match model name (use `--model-name vidore/colpali` instead of `--model-name vidore/colpali-v1.2` with the `vidore-benchmark evaluate-retriever` CLI)
- Use the ColPali model name to load ColPaliProcessor instead of the PaliGemma one

Fixed

- Add missing `model.eval()` to all vision retrievers to make results deterministic

4.0.0

Added

- Add "Ruff" and "Test" CI pipelines
- Add upper bound for `colpali-engine` to prevent potential breaking changes

Changed

- Remove unused deps from `pyproject`
- Clean `pyproject`
- Bump `colpali-engine` to `v0.3.0` and adapt code for the new API
- Replace black with ruff linter
- Add better ColPali model loading

Removed

- Remove duplicate code with `colpali-engine` (e.g. remove `ColPaliProcessor`, `ColPaliScorer`...)

Fixed

- Change typing to support Python 3.9
- Fix the `generate_similarity_maps` CLI
- Various fixes

3.4.2

- Fix typo when making `model_name` configurable in previous release
- Fix wrong image processing for ColPali

3.4.1

Changed

- Make `model_name` configurable in `ColPaliRetriever` as optional arg
- Tweak `EvalManager`
- Tweak `ColPaliProcessor`
- Improve tests
