Mteb

Latest version: v1.34.26

Page 76 of 77

1.3.2

Documentation

* docs: Update links in README.md (296) ([`76056b5`](https://github.com/embeddings-benchmark/mteb/commit/76056b5ba92dcbfe32d629897ab6d5db3a0861c4))

Fix

* fix: Added tasks from SEB (287)

* Added tasks from SEB

* docs: fix link

* fix: ran linting

* fix typing for 3.8

* fixed annotation for v3.8 ([`39cff49`](https://github.com/embeddings-benchmark/mteb/commit/39cff490157ae87d1cf62c77022f325be729bf04))

1.3.1

Fix

* fix: updated version in transition to semantic release ci ([`238ab82`](https://github.com/embeddings-benchmark/mteb/commit/238ab825e9b221c363589eed89273481e058c50f))

1.2.0

Updates
* 🇪🇸 New Spanish datasets thanks to violenil & team 🚀
* 🇫🇷 New French datasets thanks to GabrielSequeira & team + there's a [new French Overall leaderboard tab](https://huggingface.co/spaces/mteb/leaderboard) thanks to their massive benchmarking 🥇
* Retrieval has become much simpler and is now standardized to align with other tasks. You can [inspect all Retrieval datasets on the hub](https://huggingface.co/mteb); it is now much easier to add new Retrieval datasets, and there are fewer dependencies, which makes installing MTEB easier 😊 While this change is backward-compatible, it represents a significant change in how MTEB works, so we decided to increment the minor version for this release (1.1.2 -> 1.2.0).

What's Changed
* Add tasks for Spanish Embedding Evaluation by violenil in https://github.com/embeddings-benchmark/mteb/pull/227
* Extend MTEB with French datasets by GabrielSequeira in https://github.com/embeddings-benchmark/mteb/pull/218
* Remove HAGRID from french benchmark by MathieuCiancone in https://github.com/embeddings-benchmark/mteb/pull/235
* Fixed missing revision error on Norwegian Bitext Mining by x-tabdeveloping in https://github.com/embeddings-benchmark/mteb/pull/221
* Simplify retrieval by Muennighoff in https://github.com/embeddings-benchmark/mteb/pull/233

New Contributors
* GabrielSequeira made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/218
* MathieuCiancone made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/235

**Full Changelog**: https://github.com/embeddings-benchmark/mteb/compare/1.1.2...1.2.0

1.1.2

What's Changed
* fix RerankingEvaluator's compute_metrics_individual by novak2000 in https://github.com/embeddings-benchmark/mteb/pull/165
* Add Long Document Evaluation Datasets by violenil in https://github.com/embeddings-benchmark/mteb/pull/166
* Fix medrxiv mislinkage by zhimin-z in https://github.com/embeddings-benchmark/mteb/pull/187
* Fix Dalaj linkage by zhimin-z in https://github.com/embeddings-benchmark/mteb/pull/195
* Fix SummEval linkage by zhimin-z in https://github.com/embeddings-benchmark/mteb/pull/194
* Fix SweFAQ linkage by zhimin-z in https://github.com/embeddings-benchmark/mteb/pull/193
* Added Norwegian Bokmål-Nynorsk bitext mining task by x-tabdeveloping in https://github.com/embeddings-benchmark/mteb/pull/202
* Add support for cache results by hongjin-su in https://github.com/embeddings-benchmark/mteb/pull/207
* Retrieval benchmark based on GermanQuAD by rasdani in https://github.com/embeddings-benchmark/mteb/pull/197
* Refer to other works by Muennighoff in https://github.com/embeddings-benchmark/mteb/pull/212
* Fix selection of DRES/DRPES by Markus28 in https://github.com/embeddings-benchmark/mteb/pull/179
* Add tasks for German Embedding Evaluation by guenthermi in https://github.com/embeddings-benchmark/mteb/pull/214
* only save top-k by hongjin-su in https://github.com/embeddings-benchmark/mteb/pull/209
* Add MultiLongDocRetrieval task to MTEB. by hanhainebula in https://github.com/embeddings-benchmark/mteb/pull/224
* Add Korean Text Search Tasks to MTEB by taeminlee in https://github.com/embeddings-benchmark/mteb/pull/210
* Update BeIRPLTask.py by kwojtasi in https://github.com/embeddings-benchmark/mteb/pull/225
* Add task list by Muennighoff in https://github.com/embeddings-benchmark/mteb/pull/228

New Contributors
* novak2000 made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/165
* violenil made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/166
* zhimin-z made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/187
* x-tabdeveloping made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/202
* hongjin-su made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/207
* rasdani made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/197
* Markus28 made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/179
* hanhainebula made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/224
* taeminlee made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/210

**Full Changelog**: https://github.com/embeddings-benchmark/mteb/compare/1.1.1...1.1.2

1.1.1

Updates

- 🇨🇳 C-MTEB was released and integrated thanks to staoxiao. Check out the paper [here](https://arxiv.org/pdf/2309.07597.pdf). Together with C-MTEB, the team also released other great embedding resources such as new SoTA models on MTEB & C-MTEB called BGE, as well as datasets and source code 🚀
- 🇵🇱 PL-MTEB & BEIR-PL were released and integrated thanks to rafalposwiata & kwojtasi. Check out the new leaderboard tab for PL-MTEB: https://huggingface.co/spaces/mteb/leaderboard. Some BEIR-PL datasets are still missing and will be added soon cc kwojtasi 😇
- 💻 Clarifications on multi-GPU: Native multi-GPU support for Retrieval thanks to NouamaneTazi. We also added a clarification in the README on how any task can be run in a multi-GPU setup without requiring any changes in MTEB. MTEB abstracts the way the encodings are produced. Whether users use multiple or a single GPU in the `encode` function is completely flexible 😊
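
As a rough illustration of the multi-GPU note above, the sketch below shows one way a model wrapper could hide parallelism behind an `encode` method. Everything except the `encode` name is an assumption made for illustration: `ShardedEncoder`, `toy_embed`, and `num_workers` are hypothetical, threads stand in for GPUs, and plain Python lists stand in for the arrays or tensors a real model would return.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def toy_embed(sentence: str, dim: int = 4) -> List[float]:
    """Hypothetical stand-in for a real model's forward pass (deterministic)."""
    codes = [ord(ch) for ch in sentence] or [0]
    return [float(sum(codes[i::dim])) for i in range(dim)]


class ShardedEncoder:
    """Hypothetical wrapper that splits inputs across workers and re-joins them in order."""

    def __init__(self, num_workers: int = 2):
        self.num_workers = max(1, num_workers)

    def _encode_shard(self, shard: List[str]) -> List[List[float]]:
        # In a real setup each shard would be sent to its own GPU.
        return [toy_embed(s) for s in shard]

    def encode(self, sentences: List[str], batch_size: int = 32, **kwargs) -> List[List[float]]:
        # `batch_size` and `kwargs` are accepted for call compatibility only;
        # this toy version does not use them.
        shard_size = max(1, -(-len(sentences) // self.num_workers))  # ceiling division
        shards = [sentences[i:i + shard_size] for i in range(0, len(sentences), shard_size)]
        with ThreadPoolExecutor(max_workers=self.num_workers) as pool:
            # `map` yields results in shard order, so output stays aligned with input.
            results = list(pool.map(self._encode_shard, shards))
        return [vec for shard in results for vec in shard]
```

The caller only ever sees `encode(sentences)`, so swapping the single-worker version for a sharded one should not require any change on the evaluation side.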

What's Changed
* Code cleanup by NouamaneTazi in https://github.com/embeddings-benchmark/mteb/pull/131
* Replaced prints with logging by KennethEnevoldsen in https://github.com/embeddings-benchmark/mteb/pull/133
* Add BEIR-PL datasets to MTEB by kwojtasi in https://github.com/embeddings-benchmark/mteb/pull/121
* Add Polish tasks (PL-MTEB) by rafalposwiata in https://github.com/embeddings-benchmark/mteb/pull/137
* Add Chinese tasks (C-MTEB) by staoxiao in https://github.com/embeddings-benchmark/mteb/pull/134
* Support Multi-node Evaluation by NouamaneTazi in https://github.com/embeddings-benchmark/mteb/pull/132
* Add multi gpu eval to readme by NouamaneTazi in https://github.com/embeddings-benchmark/mteb/pull/140
* Default to false by Muennighoff in https://github.com/embeddings-benchmark/mteb/pull/143
* Rely on standard encode kwargs only by Muennighoff in https://github.com/embeddings-benchmark/mteb/pull/145
* Fix splits by Muennighoff in https://github.com/embeddings-benchmark/mteb/pull/149
* fix: add missing task-langs attribute by guenthermi in https://github.com/embeddings-benchmark/mteb/pull/152
* Clarify multi-gpu usage by Muennighoff in https://github.com/embeddings-benchmark/mteb/pull/153
* Simplify code snippets by Muennighoff in https://github.com/embeddings-benchmark/mteb/pull/154
* fix: msmarco-v2 uses dev.tsv, not dev1.tsv by garrett361 in https://github.com/embeddings-benchmark/mteb/pull/155
* Fix eval langs by Muennighoff in https://github.com/embeddings-benchmark/mteb/pull/157

New Contributors
* kwojtasi made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/121
* rafalposwiata made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/137
* staoxiao made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/134
* guenthermi made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/152
* garrett361 made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/155

**Full Changelog**: https://github.com/embeddings-benchmark/mteb/compare/1.1.0...1.1.1

1.1.0

Updates
- 🇩🇰🇳🇴🇸🇪 New Danish, Norwegian and Swedish BitextMining & Classification tasks `AngryTweetsClassification`, `BornholmBitextMining`, `DKHateClassification`, `DalajClassification`, `LccSentimentClassification`, `NordicLangClassification`, `NorwegianParliament`, `ScalaDaClassification`, `ScalaNbClassification` & `ScalaSvClassification` thanks to KennethEnevoldsen
- 🇩🇪 New German Clustering tasks `BlurbsClusteringP2P`, `BlurbsClusteringS2S`, `TenKGnadClusteringP2P` & `TenKGnadClusteringS2S` thanks to slvnwhrl
- ❉ Change in cluster initialization from `3` to the sklearn recommended default of `auto`. This leads to tiny changes in clustering scores going forward and hence makes this release not backwards-compatible. See [here](https://github.com/embeddings-benchmark/mteb/pull/104#discussion_r1162880322) for a discussion. Thanks to stephantul for this change.
- ❌ Errors are now directly raised by default. This behavior can be deactivated by passing a kwarg at evaluation. Previously, they were just written to a `.txt` file. Thanks to KennethEnevoldsen for introducing this change.
- 💻 Code cleanups thanks to stephantul, izhx & permutohedra
- 📈 The [leaderboard](https://huggingface.co/spaces/mteb/leaderboard) has also improved a lot with new task-based rankings, better caching and many new models
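
The change in error handling described above can be pictured with the following sketch. It is not MTEB's code: the notes do not name the kwarg, so `run_tasks`, `raise_error`, and `error_log` are hypothetical names chosen only to contrast the two behaviors (raise immediately vs. log to a text file and continue).

```python
from typing import Callable, Dict


def run_tasks(
    tasks: Dict[str, Callable[[], float]],
    raise_error: bool = True,            # hypothetical flag name
    error_log: str = "error_logs.txt",   # hypothetical log path
) -> Dict[str, float]:
    """Run each task; either re-raise failures or append them to a text file."""
    results: Dict[str, float] = {}
    for name, task in tasks.items():
        try:
            results[name] = task()
        except Exception as exc:
            if raise_error:
                # Default behavior in this sketch: surface the failure immediately.
                raise
            # Opt-out behavior: record the failure and move on to the next task.
            with open(error_log, "a", encoding="utf-8") as fh:
                fh.write(f"{name}: {exc!r}\n")
    return results
```

Raising by default makes a broken task visible at once, while the opt-out keeps long multi-task runs going when a single task fails.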

What's Changed
* Fix kNN Multiclass by Muennighoff in https://github.com/embeddings-benchmark/mteb/pull/92
* Fix SemmEval description by ahoho in https://github.com/embeddings-benchmark/mteb/pull/97
* Make inputs always List[str] & call in one by Muennighoff in https://github.com/embeddings-benchmark/mteb/pull/99
* Fix clustering warning by stephantul in https://github.com/embeddings-benchmark/mteb/pull/104
* Fix the extending of language pairs in `MTEB` by izhx in https://github.com/embeddings-benchmark/mteb/pull/106
* Add property annotation to description method of AbsTask by permutohedra in https://github.com/embeddings-benchmark/mteb/pull/111
* Add German clustering datasets by slvnwhrl in https://github.com/embeddings-benchmark/mteb/pull/116
* Added support for Scandinavian Languages by KennethEnevoldsen in https://github.com/embeddings-benchmark/mteb/pull/124
* Bump version ID and update PyPI by KennethEnevoldsen in https://github.com/embeddings-benchmark/mteb/pull/128

New Contributors
* ahoho made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/97
* stephantul made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/104
* izhx made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/106
* permutohedra made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/111
* slvnwhrl made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/116
* KennethEnevoldsen made their first contribution in https://github.com/embeddings-benchmark/mteb/pull/124

**Full Changelog**: https://github.com/embeddings-benchmark/mteb/compare/1.0.1...1.1.0
