Mteb

Latest version: v1.36.35

Safety actively analyzes 723685 Python packages for vulnerabilities to keep your Python projects secure.

Scan your dependencies

Page 54 of 85

1.12.4

Fix

* fix: add model meta to create reproducible workflow (807)

* replace get_tasks as default filtering.

The intention here is to:

1) move complexity away from the MTEB object
2) ensure that the filters are applied in the same way across the benchmark (currently MTEB filters slightly differently due to not handling the new language codes)
3) deprecate filtering in MTEB going forward (only with a warning atm.)
4) doing it in a two step fashion ensure that users are able to inspect the tasks before they are run (also allow for much more custom filtering on the user end)

* add model meta to create reproducible workflow

- Add outline for model meta object
- Added a single model as a an example
- Added test for the reproducible workflow

The intention is that a reproducible workflow should then look like:


assuming the same mteb and sent. trf. version

model_meta = mteb.get_model(model_name)
task = mteb.get_task(task_name)

model = model_meta.load_model() load model either using custom loader or sentence transformer (with revision)

eval = MTEB(tasks=[task])
eval.run(model, output_folder=&34;tests/results&34;, overwrite_results=True)


For running models we can the simply have tasks like:

1) implement model
2) ensures that it runs on all tasks types

Running the models then become simple:


eval = MTEB(tasks=mteb.get_tasks())
for mdl_name in models:
model_meta = mteb.get_model(mdl_name)
mdl = model_meta.load_model()
eval.run(mteb.get_model(mdl)


We can start with this already now e.g. on classification tasks.

* import ISO_LANGUAGE from languages

* fix import

* Apply suggestions from code review

Co-authored-by: Isaac Chung <chungisaac1217gmail.com>

* format

* Apply suggestions from code review

Co-authored-by: Isaac Chung <chungisaac1217gmail.com>

* Updated based on suggestions from review

---------

Co-authored-by: Isaac Chung <chungisaac1217gmail.com> ([`0319105`](https://github.com/embeddings-benchmark/mteb/commit/0319105734444de0626c068a3284832d96233dac))

* fix: Updated CLI to use new task filter (826)

* replace get_tasks as default filtering.

The intention here is to:

1) move complexity away from the MTEB object
2) ensure that the filters are applied in the same way across the benchmark (currently MTEB filters slightly differently due to not handling the new language codes)
3) deprecate filtering in MTEB going forward (only with a warning atm.)
4) doing it in a two step fashion ensure that users are able to inspect the tasks before they are run (also allow for much more custom filtering on the user end)

* tests passing

* Added corrections from review

* Updated CLI

* docs: Added points

---------

Co-authored-by: Isaac Chung <chungisaac1217gmail.com> ([`fb5fec8`](https://github.com/embeddings-benchmark/mteb/commit/fb5fec8b763c107fbcc9bdc853a64d6d8a8d0043))

Unknown

* Update points table ([`f926216`](https://github.com/embeddings-benchmark/mteb/commit/f926216f8427ee514196d200caa089a16a22db48))

* Update tasks table ([`d560c31`](https://github.com/embeddings-benchmark/mteb/commit/d560c31d3d10d6e13cf41d4f3aabff9ef4d37cec))

* Update points table ([`84e6856`](https://github.com/embeddings-benchmark/mteb/commit/84e6856cfb9f251c102a77394fe26eaaa1c01624))

* Add MLQuestions dataset (799)

* mlquestions load script

* more metadata

* add to init

* baseline model results

* add points

* complete metadata

* lint

* Update points and metadata

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsengmail.com>

* clarification of period in comments

* minor fix

* linting

* Fix validation error

---------

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsengmail.com> ([`3a14885`](https://github.com/embeddings-benchmark/mteb/commit/3a14885b8ea0406f9ae0edf7b550bfadb37fcb4e))

1.12.3

Fix

* fix: Convert blurbs to fast for s2s and p2p (832)

* add BlurbsClusteringS2S.v2

* add BlurbsClusteringP2P.v

* make lint

* points

* update n_samples ([`a00fdba`](https://github.com/embeddings-benchmark/mteb/commit/a00fdba44d1fadd0be449b7f988dd08145cd5b87))

Unknown

* Update tasks table ([`04d5494`](https://github.com/embeddings-benchmark/mteb/commit/04d549462482f96066b4a10f828094f3da076d92))

* Update points table ([`4b31692`](https://github.com/embeddings-benchmark/mteb/commit/4b316922499d927672a518f1d44dbf8196ea547e))

1.12.2

Fix

* fix: (1) Add `StatcanDialogueDatasetRetrieval` (2) Fix `DRESModel.encode_conversations` to allow list of dictionaries (779)

* WIP

* Update metadata based on reviewer requested changes

* Finalize metadata

* add statcandialoguedatasetretrival to mteb.tasks.retrieval

* Convert query from JSON string to dictionary (parsed using `json`)

* Fix: DRESModel to allow conversations composed of list of dictionaries, alongside lists of strings (e.g. Topicoqa). Also fix batch_size parameter in encode_conversations

* Add baseline results

* Change revision to hash

* Add points

* Fix incorrect object reference ([`7943ff0`](https://github.com/embeddings-benchmark/mteb/commit/7943ff05d4ec4d4c4da81a5a6c50fd298de907dd))

Unknown

* Update tasks table ([`04f6cdc`](https://github.com/embeddings-benchmark/mteb/commit/04f6cdc8022a1514a55ce20fb38cd11431c11842))

* Update points table ([`3a96c69`](https://github.com/embeddings-benchmark/mteb/commit/3a96c6955e29c0a7a9d7e61e25f89fdd15269d7c))

* Update points table ([`6563218`](https://github.com/embeddings-benchmark/mteb/commit/656321807a3bf132753920d6f076266925613d6f))

* Add fast clustering version to existing files (827)

* Add fast clustering to existing file

* add points ([`a0ebf87`](https://github.com/embeddings-benchmark/mteb/commit/a0ebf87da18c6e94b27b855f136dc3a13a0d7a5d))

1.12.1

Fix

* fix: rename *fast to *v2 for clustering task (825)

* rename *fast to *v2 for clustering task

* added correction from review ([`365c0d4`](https://github.com/embeddings-benchmark/mteb/commit/365c0d4c67162c9d3a50fcd5c38d3dae37a96eac))

Unknown

* Update tasks table ([`e3641ea`](https://github.com/embeddings-benchmark/mteb/commit/e3641ea91c677c44851f2e7fe2680dc495fbc54f))

1.12.0

Feature

* feat: replace get_tasks as default filtering. (806)

* replace get_tasks as default filtering.

The intention here is to:

1) move complexity away from the MTEB object
2) ensure that the filters are applied in the same way across the benchmark (currently MTEB filters slightly differently due to not handling the new language codes)
3) deprecate filtering in MTEB going forward (only with a warning atm.)
4) doing it in a two step fashion ensure that users are able to inspect the tasks before they are run (also allow for much more custom filtering on the user end)

* tests passing

* Added corrections from review

---------

Co-authored-by: Isaac Chung <chungisaac1217gmail.com> ([`0ca3bc1`](https://github.com/embeddings-benchmark/mteb/commit/0ca3bc1219981af0c901b4918e78658453f3949b))

1.11.19

Fix

* fix: Convert AlloProfClustering to Fast (822)

* Convert AlloProfClustering to Fast
* Apply suggestions from code review
---------
Co-authored-by: Isaac Chung <chungisaac1217gmail.com> ([`4cae4f9`](https://github.com/embeddings-benchmark/mteb/commit/4cae4f98a759bed66d83ba54de54ac79439acfcc))

Unknown

* Update tasks table ([`07866a5`](https://github.com/embeddings-benchmark/mteb/commit/07866a5d0eed038c12df735f97e03c68b6b38e51))

* Update points table ([`c41f0e0`](https://github.com/embeddings-benchmark/mteb/commit/c41f0e01dadcf4808748b67eb0ab228cca5f0a52))

Page 54 of 85

Links

Releases

Has known vulnerabilities

© 2025 Safety CLI Cybersecurity Inc. All Rights Reserved.