Feature
* feat: Integrating ChemTEB (1708)
* Add SMILES, AI Paraphrase and Inter-Source Paragraphs PairClassification Tasks
* Add chemical subsets of NQ and HotpotQA datasets as Retrieval tasks
* Add PubChem Synonyms PairClassification task
* Update task __init__ for previously added tasks
* Add nomic-bert loader
* Add a script to run the evaluation pipeline for chemical-related tasks
* Add 15 Wikipedia article classification tasks
* Add PairClassification and BitextMining tasks for Coconut SMILES
* Fix naming of some Classification and PairClassification tasks
* Fix naming issues in some Classification tasks
* Integrate WANDB with benchmarking script
* Update .gitignore
* Fix `nomic_models.py` issue with retrieval tasks, similar to issue 1115 in original repo
* Add one chemical model and some SentenceTransformer models
* Fix a naming issue for SentenceTransformer models
* Add OpenAI, bge-m3 and matscibert models
* Add PubChem SMILES Bitext Mining tasks
* Change metric names to be more descriptive
* Add English e5 and bge v1 models in all sizes
* Add two Wikipedia Clustering tasks
* Add a try-except in the evaluation script to skip faulty models during the benchmark
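The skip-on-failure idea can be sketched as follows. This is a minimal illustration only: the model names and the `evaluate` callable are hypothetical stand-ins, not the actual evaluation script.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Hypothetical model list; the real script iterates over its own registry.
MODEL_NAMES = ["good-model", "faulty-model"]

def evaluate(model_name: str) -> float:
    """Stand-in for the real evaluation call; raises for the faulty model."""
    if model_name == "faulty-model":
        raise RuntimeError("model failed to load")
    return 0.75  # dummy score

results = {}
for name in MODEL_NAMES:
    try:
        results[name] = evaluate(name)
    except Exception as exc:  # skip faulty models instead of aborting the run
        logger.warning("Skipping %s: %s", name, exc)

print(results)  # only models that evaluated successfully remain
```

The broad `except Exception` is deliberate here: any per-model failure should be logged and skipped so one bad checkpoint cannot abort a long benchmark run.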
* Add bge v1.5 models and clustering score extraction to json parser
* Add Amazon Titan embedding models
* Add Cohere Bedrock models
* Add two SDS Classification tasks
* Add SDS Classification tasks to classification init and chem_eval
* Add a retrieval dataset, update dataset names and revisions
* Update revision for the CoconutRetrieval dataset: handle duplicate SMILES (documents)
* Update `CoconutSMILES2FormulaPC` task
* Change CoconutRetrieval dataset to a smaller one
* Update some models
- Integrate models added in ChemTEB (such as Amazon, Cohere Bedrock and Nomic BERT) with the latest modeling format in mteb
- Update the metadata for these models
* Fix a typo: the `open_weights` argument was duplicated
* Update ChemTEB tasks
- Rename some tasks for better readability.
- Merge some BitextMining and PairClassification tasks into a single task with subsets (`PubChemSMILESBitextMining` and `PubChemSMILESPC`)
- Add a new multilingual task (`PubChemWikiPairClassification`) covering 12 languages
- Update dataset paths, revisions and metadata for most tasks.
- Add a `Chemistry` domain to `TaskMetadata`
* Remove unnecessary files and tasks for MTEB
* Update some ChemTEB tasks
- Move `PubChemSMILESBitextMining` to `eng` folder
- Add citations for tasks involving SDS, NQ, Hotpot, PubChem data
- Update the `category` field of Clustering tasks
- Change `main_score` for `PubChemAISentenceParaphrasePC`
* Create ChemTEB benchmark
* Remove `CoconutRetrieval`
* Update tasks and benchmarks tables with ChemTEB
* Mention ChemTEB in readme
* Fix some issues, update task metadata, lint
- Fix `eval_langs`
- Fix the dataset path for two datasets
- Complete metadata for all tasks, mainly the following fields: `date`, `task_subtypes`, `dialect`, `sample_creation`
- Run ruff lint
- Rename `nomic_bert_models.py` to `nomic_bert_model.py` and update it
* Remove `nomic_bert_model.py`, as the model is now compatible with SentenceTransformer
* Remove the `WikipediaAIParagraphsParaphrasePC` task because it was trivial
* Merge `amazon_models.py` and `cohere_bedrock_models.py` into `bedrock_models.py`
* Remove unnecessary `load_data` overrides for some tasks
* Update `bedrock_models.py`, `openai_models.py` and two dataset revisions
- Truncate text for Amazon text embedding models
- `text-embedding-ada-002` returns null embeddings for some inputs with 8192 tokens
- Update two datasets, dropping very long samples (length above the 99th percentile)
* Add a layer of dynamic truncation for amazon models in `bedrock_models.py`
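The dynamic-truncation idea can be sketched as below. This is a simplification under stated assumptions: the length limit, the exception type, and the `embed` function are placeholders, not the actual Bedrock API or the code in `bedrock_models.py`.

```python
def embed(text: str) -> list[float]:
    """Stand-in for a Bedrock embedding call; rejects overly long inputs."""
    if len(text) > 100:  # placeholder limit; real providers limit tokens/bytes
        raise ValueError("input too long")
    return [0.0] * 8  # dummy embedding vector

def embed_with_truncation(text: str, shrink: float = 0.8, max_tries: int = 10) -> list[float]:
    """Retry the call, shrinking the input until the provider accepts it."""
    for _ in range(max_tries):
        try:
            return embed(text)
        except ValueError:
            # Drop the tail of the input and retry with a shorter text
            text = text[: int(len(text) * shrink)]
    raise RuntimeError("could not truncate input to an accepted length")

vec = embed_with_truncation("x" * 500)
```

Shrinking geometrically rather than by a fixed amount keeps the number of retries small even when the input is far over the limit.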
* Replace `metadata_dict` with `self.metadata` in `PubChemSMILESPC.py`
* Fix model metadata for Bedrock models
* Add reference comment to original Cohere API implementation ([`4d66434`](https://github.com/embeddings-benchmark/mteb/commit/4d66434c80050ace3b927f3fc1829b8dd377f78a))
Unknown
* Update points table ([`223bf32`](https://github.com/embeddings-benchmark/mteb/commit/223bf324c213f222785bbf2db88e30c8069c610b))