Mteb

Latest version: v1.20.0


1.12.54

Fix

* fix: Update Retrieval Statistics Checker (985)

* changes to retrieval stats

* lint

* Update mteb/abstasks/AbsTaskRetrieval.py

Co-authored-by: Isaac Chung <chungisaac1217gmail.com>

* make print functions nicer

* add print and tqdm

---------

Co-authored-by: Isaac Chung <chungisaac1217gmail.com> ([`230c311`](https://github.com/embeddings-benchmark/mteb/commit/230c3110f5dbee38ebd45425973cb262defac3fc))

1.12.53

Fix

* fix: Add MIRACL retrieval (833)

* Adding MIRACL Retrieval (642)

* added initial MIRACL Retrieval for yoruba

* updated langs to take self.langs

* make lint

* update points

* stop always explicitly removing results where doc_id == query_id, since this harms performance on MIRACL and other datasets

* added the nDCG@10 self metric

* set ignore_identical_ids argument

* remove KoMiracl and fix langs

* add linter

* fix tests

* set back default ignore_identical_ids to True

* update retrieval test and set identical_ids to False

* Fix tests for ignore_identical_ids=True

* Update retrieval evaluator and tests

* Keep one set of metrics according to flag ignore_identical_ids

* format

* updated expected values for abstention test

* set ignore_identical_ids to true for required tasks

* Add linting

* add points

* update points

---------

Co-authored-by: Nandan Thakur <nandantgmail.com>
Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsengmail.com> ([`306e480`](https://github.com/embeddings-benchmark/mteb/commit/306e4807c50e49536e0ec34d052e49bab998c0b2))
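
The `ignore_identical_ids` flag mentioned in this release controls whether a retrieved document whose id equals the query id is discarded before scoring. The following is a minimal, standalone sketch of that idea, not the mteb implementation: the function name `prepare_results` and the nested-dict layout (`query_id -> {doc_id: score}`) are assumptions made for illustration.

```python
from typing import Dict

Results = Dict[str, Dict[str, float]]


def prepare_results(results: Results, ignore_identical_ids: bool = True) -> Results:
    """Hypothetical helper: optionally drop hits where doc_id == query_id.

    This only illustrates the filtering idea; it is not mteb's API.
    """
    if not ignore_identical_ids:
        # Keep every hit, including a document that shares the query's id.
        return {qid: dict(hits) for qid, hits in results.items()}
    return {
        qid: {did: score for did, score in hits.items() if did != qid}
        for qid, hits in results.items()
    }


# Toy data: query "q1" retrieves a document that carries the same id.
results = {
    "q1": {"q1": 0.99, "d2": 0.80},
    "q2": {"d1": 0.70, "d3": 0.40},
}
filtered = prepare_results(results, ignore_identical_ids=True)
unfiltered = prepare_results(results, ignore_identical_ids=False)
```

With the flag enabled, the self-match for `q1` is dropped and any metric computed afterwards sees only `d2`. With the flag disabled, the self-match is kept, which matters for datasets where query and document ids legitimately overlap.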

Unknown

* Update tasks table ([`1a62a0b`](https://github.com/embeddings-benchmark/mteb/commit/1a62a0bccc7b055dc6cc4bd6a630793e383419cd))

* Update points table ([`6b68735`](https://github.com/embeddings-benchmark/mteb/commit/6b6873587bb56bc5783571ef29ee05762a45a142))

1.12.52

Fix

* fix: Voyage bclavie/mmarco-japanese-hard-negatives (980)

* no message

* no message

* no message

* update metadata

* change metadata

* add results

* update metadata

* update metadata

* update metadata

* update n_samples and remove line ([`06e0a8b`](https://github.com/embeddings-benchmark/mteb/commit/06e0a8b54885f8886a6ffb57af2f17ed5c1743a7))

Unknown

* Update tasks table ([`4bef147`](https://github.com/embeddings-benchmark/mteb/commit/4bef147c5d9e702414881129476b7fa2e0c68a2c))

1.12.51

Fix

* fix: ensure that results from parallel datasets are formatted correctly (974)

* fix: ensure that results from parallel datasets are formatted correctly.
Additionally updated a few results.
* add pytest coverage
* remove unfinished results file
* Add test for multilingual subset loader
* removed upper bound on numpy
* sped up tests
* add trust remote code for new datasets
---------
Co-authored-by: Isaac Chung <chungisaac1217gmail.com> ([`6004ec7`](https://github.com/embeddings-benchmark/mteb/commit/6004ec7b6e99afb2d31a41784ac0b3d4a6ded935))

Unknown

* Update points table ([`ae27b5c`](https://github.com/embeddings-benchmark/mteb/commit/ae27b5c16bd2655663be28f86657569e049d7ea4))

* LLM2Vec models (926)

* adding llm2vec model loader

* fix merge

* update import error

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsengmail.com>

* llm2vec use instructions

* prompt_name separate out of kwargs

* format

* scores

* use flash attention if available

* fixed bug for retrieval

* user can provide instructions for LLM2Vec

* fix type error

* making code py 3.8 and 3.9 compatible

* proper storing results

* type combination compliant with py 3.8, 3.9

* add points

* add semicolon

* updated scores

* format

---------

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsengmail.com> ([`a6c784c`](https://github.com/embeddings-benchmark/mteb/commit/a6c784c1835ce11a9b1a974f8a6d43439d165805))

1.12.50

Fix

* fix: GritLM Retrieval instructions (981)

* Fix GritLM instructions

* Simplify

* Simplify

* Simplify

* Format

* Added e5 conditional encode for corpus/query

---------

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsengmail.com> ([`f1c9fc7`](https://github.com/embeddings-benchmark/mteb/commit/f1c9fc775ef0edfa83e6d32e365d1e12663273b8))

Unknown

* Update tasks table ([`b9f9ea7`](https://github.com/embeddings-benchmark/mteb/commit/b9f9ea77e3080e061e9da9f08d264d952f8511f6))

* Update points table ([`4f5542a`](https://github.com/embeddings-benchmark/mteb/commit/4f5542af892de26a55c3912c971853e7c80d489c))

* Move speedtask imports (982)

* move speedtask import
* move speedtask import
* add points file
* Update docs/mmteb/points/982.jsonl
Co-authored-by: Isaac Chung <chungisaac1217gmail.com>
---------
Co-authored-by: Isaac Chung <chungisaac1217gmail.com> ([`58c7866`](https://github.com/embeddings-benchmark/mteb/commit/58c78667e4ec4e1d6add18ca2fd57fb01534a5e2))

1.12.49

Fix

* fix: add max_fraction_of_documents_to_embed to clustering datasets with `max_document_to_embed` (977)

* add max_fraction_of_documents_to_embed
* add points file
* add tests for clusteringfast
* review points
---------
Co-authored-by: Isaac Chung <chungisaac1217gmail.com> ([`5367193`](https://github.com/embeddings-benchmark/mteb/commit/536719374e0e179fb1354f02e0a511d7ceefaa70))
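
One way to read this change is that a clustering task can cap the number of embedded documents either by an absolute count or by a fraction of the dataset. The sketch below shows how such a cap might be resolved. It is an assumption-laden illustration rather than mteb's code: the function `resolve_document_cap`, its parameter names, and the rule that the two options are mutually exclusive are all hypothetical.

```python
from typing import Optional


def resolve_document_cap(
    n_documents: int,
    max_documents: Optional[int] = None,
    max_fraction: Optional[float] = None,
) -> int:
    """Hypothetical helper: how many documents to embed under a cap.

    Assumes the absolute and fractional limits are mutually exclusive.
    """
    if max_documents is not None and max_fraction is not None:
        raise ValueError("Set either max_documents or max_fraction, not both.")
    if max_documents is not None:
        # Absolute cap, never more than the dataset actually holds.
        return min(n_documents, max_documents)
    if max_fraction is not None:
        if not 0.0 < max_fraction <= 1.0:
            raise ValueError("max_fraction must be in (0, 1].")
        # Fractional cap, rounded down to a whole number of documents.
        return min(n_documents, int(max_fraction * n_documents))
    # No cap configured: embed everything.
    return n_documents


cap_by_fraction = resolve_document_cap(1000, max_fraction=0.25)
cap_by_count = resolve_document_cap(1000, max_documents=2048)
```

A fractional cap scales with dataset size, so small and large datasets are subsampled proportionally, whereas an absolute cap leaves small datasets untouched.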

Unknown

* Update points table ([`44af47a`](https://github.com/embeddings-benchmark/mteb/commit/44af47a04c953c4c1cdb627be4ed29609bce1f73))

* Update points table ([`e169fb3`](https://github.com/embeddings-benchmark/mteb/commit/e169fb30e4f2cb01cdd1fa0667a325a6a73b5c01))

* add points (979)

* add points for previous PRs.
* Update docs/mmteb/points/906.jsonl
Co-authored-by: Isaac Chung <chungisaac1217gmail.com>
* Update docs/mmteb/points/873.jsonl
Co-authored-by: Isaac Chung <chungisaac1217gmail.com>
---------
Co-authored-by: Isaac Chung <chungisaac1217gmail.com> ([`1c8447d`](https://github.com/embeddings-benchmark/mteb/commit/1c8447dd1e9ad41748ed7bdaacf36cafa6a1754d))

* Fix GritLM instructions (976)

* Fix GritLM instructions

* make lint

---------

Co-authored-by: Isaac Chung <chungisaac1217gmail.com> ([`a3410e9`](https://github.com/embeddings-benchmark/mteb/commit/a3410e919c1987521f5eb4a54ed7464fd44875e6))

* Update points table ([`8b6b08e`](https://github.com/embeddings-benchmark/mteb/commit/8b6b08e699fe74b0de535f847d87a3a733f977b5))

* Improve belebele retrieval task (894)

* Multilingual/BelebeleRetrieval: Fix self.langs to self.hf_subsets
* Multilingual/BelebeleRetrieval: Use lang+script code
* Multilingual/BelebeleRetrieval: Use 7 more scripts
* Multilingual/BelebeleRetrieval: convert to cross lingual task
* Multilingual/BelebeleRetrieval: add sample results
* Multilingual/BelebeleRetrieval: update historic results language codes
* Multilingual/BelebeleRetrieval: code formatting
* Multilingual/BelebeleRetrieval: update results
* Multilingual/BelebeleCrossRetrieval: Creating new version of Belebele task
* Multilingual/BelebeleCrossRetrieval: Restricting language pairs
* Multilingual/BelebeleCrossRetrieval: Formatting changes
* Multilingual/BelebeleRetrieval: Replace old task
* Multilingual/BelebeleRetrieval: Use underscore between language-script in language pair
* Add points
* points
---------
Co-authored-by: Isaac Chung <chungisaac1217gmail.com> ([`3137f96`](https://github.com/embeddings-benchmark/mteb/commit/3137f96008acd07e1db438ce750b2cead45579cd))
