The `similarity_fn_name` can now be specified via the [`SentenceTransformer`](https://sbert.net/docs/package_reference/sentence_transformer/SentenceTransformer.html#sentence_transformers.SentenceTransformer) constructor like so:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/multi-qa-mpnet-base-dot-v1", similarity_fn_name="dot")
```
Valid options include "cosine" (default), "dot", "euclidean", and "manhattan". The chosen `similarity_fn_name` is also saved into the model configuration and loaded automatically when you load the model. For example, the [`msmarco-distilbert-dot-v5`](https://huggingface.co/sentence-transformers/msmarco-distilbert-dot-v5) model was trained to work best with `dot`, so we've configured it to use that `similarity_fn_name` in its [configuration](https://huggingface.co/sentence-transformers/msmarco-distilbert-dot-v5/blob/main/config_sentence_transformers.json#L9):
```python
>>> from sentence_transformers import SentenceTransformer
>>> model = SentenceTransformer("sentence-transformers/msmarco-distilbert-dot-v5")
>>> model.similarity_fn_name
'dot'
```
- Docs: [Semantic Textual Similarity > Similarity Calculation](https://sbert.net/docs/sentence_transformer/usage/semantic_textual_similarity.html)
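For reference, a quick sketch of the configured similarity function in action via the new `similarity` method (the texts here are illustrative):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/msmarco-distilbert-dot-v5")
embeddings = model.encode([
    "How do I bake a cake?",
    "Recipe for baking a chocolate cake",
])
# `model.similarity` applies the configured similarity_fn_name, here dot product
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # torch.Size([2, 2])
```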
Big thanks to ir2718 for helping set up this major feature.
## Allow passing `model_kwargs`, `tokenizer_kwargs`, and `config_kwargs` to `SentenceTransformer` (#2578)
If you're familiar with the internals of Sentence Transformers, you might know that we internally call [`AutoModel.from_pretrained`](https://huggingface.co/docs/transformers/en/main_classes/model#transformers.PreTrainedModel.from_pretrained), [`AutoTokenizer.from_pretrained`](https://huggingface.co/docs/transformers/model_doc/auto#transformers.AutoTokenizer.from_pretrained) and [`AutoConfig.from_pretrained`](https://huggingface.co/docs/transformers/model_doc/auto#transformers.AutoConfig.from_pretrained) from `transformers`.
Each of these is rather powerful, and they are constantly improved with new features. For example, the `AutoModel` keyword arguments include:
* [`torch_dtype`](https://huggingface.co/docs/transformers/en/main_classes/model#transformers.PreTrainedModel.from_pretrained.torch_dtype) - lets you load a model directly in `bfloat16` or `float16` (or `"auto"`, i.e. whatever dtype the model was stored in), which can speed up inference considerably.
* [`quantization_config`](https://huggingface.co/docs/transformers/en/main_classes/model#transformers.PreTrainedModel.from_pretrained.quantization_config) - lets you load the model quantized on the fly, e.g. via `bitsandbytes`.
* [`attn_implementation`](https://huggingface.co/docs/transformers/en/main_classes/model#transformers.PreTrainedModel.from_pretrained.attn_implementation) - all models support `"eager"`, but some also support the much faster `"flash_attention_2"` (Flash Attention 2) and `"sdpa"` (Scaled Dot Product Attention).
These options can speed up model inference considerably. Additionally, via `AutoConfig` you can update the model configuration, e.g. the dropout probability used during training, and with `AutoTokenizer` you can disable the fast Rust-based tokenizer via `use_fast=False` if you're having issues with it.
Because these options can be so useful, the following arguments have been added to `SentenceTransformer`:
* `model_kwargs` for `AutoModel.from_pretrained` keyword arguments
* `tokenizer_kwargs` for `AutoTokenizer.from_pretrained` keyword arguments
* `config_kwargs` for `AutoConfig.from_pretrained` keyword arguments
You can use them like so:
```python
from sentence_transformers import SentenceTransformer
import torch

model = SentenceTransformer(
    "mixedbread-ai/mxbai-embed-large-v1",
    model_kwargs={"torch_dtype": torch.bfloat16, "attn_implementation": "sdpa"},
    config_kwargs={"hidden_dropout_prob": 0.3},
)

embeddings = model.encode(["He drove his yellow car to the beach.", "He played football with his friends."])
print(embeddings.shape)
```
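Similarly, a minimal sketch of disabling the fast tokenizer via `tokenizer_kwargs` (the model name is just an example):

```python
from sentence_transformers import SentenceTransformer

# Fall back to the slow Python tokenizer if the fast Rust-based one misbehaves
model = SentenceTransformer("all-MiniLM-L6-v2", tokenizer_kwargs={"use_fast": False})
```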
Big thanks to satyamk7054 for starting this work.
## Hyperparameter Optimization (#2655)
Sentence Transformers v3.0 introduces Hyperparameter Optimization (HPO) by extending the `transformers` HPO support. We recommend reading the all-new [Hyperparameter Optimization](https://sbert.net/examples/training/hpo/README.html) documentation for many more details.
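As a taste, here is a minimal sketch assuming the Optuna backend (`pip install optuna`); the data, model name, and search space are illustrative:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MultipleNegativesRankingLoss

# Toy data; in practice, use a real (anchor, positive) dataset
train_dataset = Dataset.from_dict({
    "anchor": ["It's very sunny outside", "I was parasailing yesterday"],
    "positive": ["The weather is beautiful today", "I went parasailing"],
})
eval_dataset = train_dataset  # toy setup: reuse the train data for evaluation

def model_init() -> SentenceTransformer:
    # A fresh model is initialized for every trial
    return SentenceTransformer("microsoft/mpnet-base")

def loss_init(model: SentenceTransformer) -> MultipleNegativesRankingLoss:
    # New in v3.0: `loss` may be a callable that accepts the trial's model
    return MultipleNegativesRankingLoss(model)

def hp_space(trial) -> dict:
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-6, 1e-4, log=True),
        "per_device_train_batch_size": trial.suggest_categorical("per_device_train_batch_size", [16, 32]),
    }

trainer = SentenceTransformerTrainer(
    model=None,
    model_init=model_init,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss_init,
)
best_trial = trainer.hyperparameter_search(hp_space=hp_space, n_trials=10, direction="minimize", backend="optuna")
print(best_trial)
```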
## Datasets Release
Alongside Sentence Transformers v3.0, we reformat and release 50+ useful datasets in our [Embedding Model Datasets](https://huggingface.co/collections/sentence-transformers/embedding-model-datasets-6644d7a3673a511914aa7552) Collection on Hugging Face. Each of these can be used out of the box with at least one loss function in Sentence Transformers v3.0. We recommend browsing through them to see if there are datasets akin to your use cases; training a model on them might just produce large gains on your task(s).
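For example, a minimal sketch of loading one of these datasets, here the `sentence-transformers/all-nli` triplet subset:

```python
from datasets import load_dataset

# The "triplet" subset yields (anchor, positive, negative) columns, directly
# compatible with e.g. MultipleNegativesRankingLoss
dataset = load_dataset("sentence-transformers/all-nli", "triplet", split="train")
print(dataset[0])
```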
## MSELoss extension (#2641)
The [MSELoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#mseloss) now accepts multiple text columns for each label (where each label is a target/gold embedding), rather than only a single text column. This is extremely powerful for the excellent [Multilingual Models](https://sbert.net/examples/training/multilingual/README.html) strategy of converting a monolingual model into a multilingual one: you can now conveniently train English texts and their (identical but translated) non-English counterparts to produce the same embedding (generated by a powerful English embedding model).
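A minimal sketch of this setup, assuming the usual teacher-student distillation format; the model names, column names, and texts are illustrative:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MSELoss

# Teacher and student embedding dimensions must match (768 here)
teacher = SentenceTransformer("all-mpnet-base-v2")  # strong English model
student = SentenceTransformer("xlm-roberta-base")   # multilingual student

english = ["The weather is nice today.", "He drove to work."]
german = ["Das Wetter ist heute schön.", "Er fuhr zur Arbeit."]

# Each label is the teacher's (gold) embedding of the English text; both the
# English and German columns are trained to reproduce it
train_dataset = Dataset.from_dict({
    "english": english,
    "non_english": german,
    "label": teacher.encode(english),
})

trainer = SentenceTransformerTrainer(
    model=student,
    train_dataset=train_dataset,
    loss=MSELoss(student),
)
trainer.train()
```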
## Add `local_files_only` argument to `SentenceTransformer` & `CrossEncoder` (#2603)
You can now initialize a `SentenceTransformer` or `CrossEncoder` with `local_files_only=True`. If set, the model will not be downloaded from Hugging Face; it will only be loaded from the local filesystem or cache.
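A minimal sketch (the model name is just an example):

```python
from sentence_transformers import SentenceTransformer

# Loads only from the local filesystem or cache; raises an error if the
# model has not been downloaded before
model = SentenceTransformer("all-MiniLM-L6-v2", local_files_only=True)
```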
Thanks to debanjum for this change.
## All changes
* Minor grammar fix in GPL paragraph by mauricesvp in https://github.com/UKPLab/sentence-transformers/pull/2604
* [feat] Add local_files_only argument to load model from cache by debanjum in https://github.com/UKPLab/sentence-transformers/pull/2603
* Fix broken links by mauricesvp in https://github.com/UKPLab/sentence-transformers/pull/2611
* Updated urls for msmarco dataset by j-dominguez9 in https://github.com/UKPLab/sentence-transformers/pull/2609
* [`v3`] Training refactor - MultiGPU, loss logging, bf16, etc. by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2449
* [`v3`] Add `similarity` and `similarity_pairwise` methods to Sentence Transformers by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2615
* [`v3`] Fix various model card errors by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2616
* [`v3`] Fix trainer `compute_loss` when evaluating/predicting if the `loss` updated the inputs in-place by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2617
* [`v3`] Never return None in infer_datasets, could result in crash by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2620
* [`v3`] Trainer: Implement resume from checkpoint support by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2621
* Fall back to CPU device in case there are no PyTorch parameters by maxfriedrich in https://github.com/UKPLab/sentence-transformers/pull/2614
* Add `trust_remote_code` to `CrossEncoder.tokenizer` by michaelfeil in https://github.com/UKPLab/sentence-transformers/pull/2623
* [`v3`] Update example scripts to the new v3 training format by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2622
* Bug in DenoisingAutoEncoderLoss.py by arun477 in https://github.com/UKPLab/sentence-transformers/pull/2619
* [`v3`] Remove "return_outputs" as it's not strictly necessary. Avoids OOM & speeds up training by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2633
* [`v3`] Fix crash from inferring the dataset_id from a local dataset by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2636
* Enable Sentence Transformer Inference with Intel Gaudi2 GPU Supported ( 'hpu' ) - Follow up for 2557 by ZhengHongming888 in https://github.com/UKPLab/sentence-transformers/pull/2630
* [`v3`] Fix multilingual conversion script; extend MSELoss to multi-column by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2641
* [`v3`] Update evaluation scripts to use HF Datasets by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2642
* Use `b1` quantization for USearch by ashvardanian in https://github.com/UKPLab/sentence-transformers/pull/2644
* [`v3`] Fix `resume_from_checkpoint` by also updating the loss model by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2648
* [`v3`] Fix backwards pass on MSELoss due to in-place update by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2647
* [`v3`] Simplify `load_from_checkpoint` using `load_state_dict` by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2650
* [`v3`] Use `torch.arange` instead of `torch.tensor(range(...))` by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2651
* [`v3`] Resolve inplace modification error in DDP by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2654
* [`v3`] Add hyperparameter optimization support by letting `loss` be a Callable that accepts a `model` by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2655
* [`v3`] Add tag hinting at the number of training samples by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2660
* Allow passing 'precision' when using 'encode_multi_process' to SentenceTransformer by ariel-talent-fabric in https://github.com/UKPLab/sentence-transformers/pull/2659
* Allow passing model_args to ST by satyamk7054 in https://github.com/UKPLab/sentence-transformers/pull/2578
* Fix smart_batching_collate Inefficiency by PrithivirajDamodaran in https://github.com/UKPLab/sentence-transformers/pull/2556
* [`v3`] For the Cached losses; ignore gradients if grad is disabled (e.g. eval) by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2668
* [`docs`] Rewrite the https://sbert.net documentation by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2632
* [`v3`] Chore - include import sorting in ruff by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2672
* [`v3`] Prevent warning with 'model.fit' with transformers >= 4.41.0 due to evaluation_strategy by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2673
* [`v3`] Add various useful Sphinx packages (copy code, link to code, nicer tabs) by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2674
* [`v3`] Make the "primary_metric" for evaluators a bit more robust by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2675
* [`v3`] Set `broadcast_buffers = False` when training with DDP by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2663
* [`v3`] Warn about using DP instead of DDP + set dataloader_drop_last with DDP by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2677
* [`v3`] Add warning that Evaluators only run on 1 GPU when multi-GPU training by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2678
* [`v3`] Move training dependencies into a "train" extra by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2676
* [`v3`] Docs: update references to the API reference by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2679
* [`v3`] Add "dataset_size:" to the tag denoting the number of training samples by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2680
## New Contributors
* mauricesvp made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2604
* debanjum made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2603
* j-dominguez9 made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2609
* michaelfeil made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2623
* arun477 made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2619
* ashvardanian made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2644
* ariel-talent-fabric made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2659
* satyamk7054 made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2578
* PrithivirajDamodaran made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2556
A special shoutout to Jakobhenningjensen, smerrill, b5y, ScottishFold007, pszemraj, bwanglzu, and igorkurinnyi for experimenting with v3.0 prior to its release, and to matthewfranglen for the initial work on the training refactor back in October 2022 in #1733.
cc AlexJonesNLP as I know you are interested in this release!
**Full Changelog**: https://github.com/UKPLab/sentence-transformers/compare/v2.7.0...v3.0.0