Sentence-transformers

Latest version: v4.0.1


2.0.0

Not secure
Models hosted on the hub
All pre-trained models are now hosted on the [Huggingface Models hub](https://huggingface.co/models).

Our pre-trained models can be found here: [https://huggingface.co/sentence-transformers](https://huggingface.co/sentence-transformers)

You can also share your own sentence-transformers model on the hub so that other people can easily access it. Simply upload the model folder and have people load it via:

model = SentenceTransformer('[your_username]/[model_name]')


For more information, see: [Sentence Transformers in the Hugging Face Hub](https://huggingface.co/blog/sentence-transformers-in-the-hub)

Breaking changes

There should be no breaking changes. Old models can still be loaded from disk. However, if you use one of the provided pre-trained models, it will be downloaded again in version 2 of sentence-transformers, as the cache path has changed slightly.

Find sentence-transformer models on the Hub

You can filter the hub for sentence-transformers models: [https://huggingface.co/models?filter=sentence-transformers](https://huggingface.co/models?filter=sentence-transformers)

Add the `sentence-transformers` tag to your model card so that others can find your model.

Widget & Inference API
A widget was added to sentence-transformers models on the hub that lets you interact with the model directly on its page:
https://huggingface.co/sentence-transformers/paraphrase-MiniLM-L6-v2

Further, models can now be used with the [Accelerated Inference API](https://api-inference.huggingface.co/docs/python/html/index.html): send your sentences to the API and get back the embeddings from the respective model.
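For example, embeddings can be requested with a plain HTTP call along these lines (a rough sketch: the feature-extraction endpoint path, the placeholder token, and the example input are assumptions, so check the linked API documentation for the exact format):

python
import requests

# Assumed endpoint shape for the feature-extraction pipeline of the Inference API;
# verify the exact URL and payload format in the API documentation linked above.
API_URL = "https://api-inference.huggingface.co/pipeline/feature-extraction/sentence-transformers/paraphrase-MiniLM-L6-v2"
headers = {"Authorization": "Bearer <your_hf_api_token>"}  # placeholder token

response = requests.post(
    API_URL,
    headers=headers,
    json={"inputs": ["This framework generates embeddings for each input sentence."]},
)
embeddings = response.json()  # a list with one embedding vector per input sentence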

Save Model to Hub

A new method was added to the `SentenceTransformer` class: `save_to_hub`.

Provide the model name and the model is saved on the hub.

The transformers documentation explains how the hub works: [Model sharing and uploading](https://huggingface.co/transformers/model_sharing.html)
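A minimal sketch of the upload step (the repository name is a placeholder, and you need to be authenticated with the Hub, e.g. via `huggingface-cli login`):

python
from sentence_transformers import SentenceTransformer

# Load (or finetune) a model, then upload it under your account
model = SentenceTransformer("paraphrase-MiniLM-L6-v2")
model.save_to_hub("my-paraphrase-model")  # placeholder repository name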

Automatic Model Card

When you save a model with `save` or `save_to_hub`, a `README.md` (also known as model card) is automatically generated with basic information about the respective SentenceTransformer model.
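For example, after a local save the model card sits next to the model files (the output path below is just an example):

python
model.save("output/my-sentence-transformer")
# output/my-sentence-transformer/README.md now contains the auto-generated model card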


New Models
- Several new sentence embedding models have been added, which are much better than the previous models: [Sentence Embedding Models](https://www.sbert.net/docs/pretrained_models.html#sentence-embedding-models)
- Some new models for semantic search based on MS MARCO have been added: [MSMARCO Models](https://www.sbert.net/docs/pretrained-models/msmarco-v3.html)
- The training script for these MS MARCO models has been released as well: [Train MS MARCO Bi-Encoder v3](https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/ms_marco/train_bi-encoder-v3.py)

2b. Alternatively, embeddings can be quantized after they have been computed, using `quantize_embeddings`:

python
from sentence_transformers.quantization import quantize_embeddings

embeddings = model.encode(["I am driving to the lake.", "It is a beautiful day."])
binary_embeddings = quantize_embeddings(embeddings, precision="binary")


References:
* [SentenceTransformer.encode](https://sbert.net/docs/package_reference/SentenceTransformer.html#sentence_transformers.SentenceTransformer.encode)
* [quantize_embeddings](https://sbert.net/docs/package_reference/quantization.html#sentence_transformers.quantization.quantize_embeddings)

GISTEmbedLoss

GISTEmbedLoss, as introduced in [Solatorio (2024)](https://arxiv.org/pdf/2402.16829.pdf), is a guided variant of the more standard in-batch negatives (`MultipleNegativesRankingLoss`) loss. Both loss functions are provided with a list of (anchor, positive) pairs, but while `MultipleNegativesRankingLoss` uses `anchor_i` and `positive_i` as positive pair and all `positive_j` with `i != j` as negative pairs, `GISTEmbedLoss` uses a second model to guide the in-batch negative sample selection.

This can be very useful, because it is plausible that `anchor_i` and `positive_j` are actually quite semantically similar. In this case, `GISTEmbedLoss` would not consider them a negative pair, while `MultipleNegativesRankingLoss` would. When finetuning MPNet-base on the AllNLI dataset, these are the Spearman correlations based on cosine similarity on the STS Benchmark dev set (higher is better):

![312039399-ef5d4042-a739-41f6-a6ca-eddc7f901411](https://github.com/UKPLab/sentence-transformers/assets/37621491/ae99e809-4cc9-4db3-8b00-94cc74d2fe3b)
The blue line is `MultipleNegativesRankingLoss`, whereas the grey line is `GISTEmbedLoss` with the small `all-MiniLM-L6-v2` as the guide model. Note that `all-MiniLM-L6-v2` by itself does not reach 88 Spearman correlation on this dataset, so this is really the effect of two models (`mpnet-base` and `all-MiniLM-L6-v2`) reaching a performance that they could not reach separately.
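A minimal sketch of this setup, matching the configuration described above (treat the exact constructor arguments as an assumption and check the GISTEmbedLoss documentation):

python
from sentence_transformers import SentenceTransformer, losses

model = SentenceTransformer("microsoft/mpnet-base")  # model being finetuned
guide = SentenceTransformer("all-MiniLM-L6-v2")      # small guide model for negative selection
train_loss = losses.GISTEmbedLoss(model, guide)      # guided in-batch negatives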

Soft `save_to_hub` Deprecation
Most codebases that allow for pushing models to the [Hugging Face Hub](https://huggingface.co/) adopt a `push_to_hub` method instead of a `save_to_hub` method, and now Sentence Transformers will follow that convention. The [`push_to_hub`](https://sbert.net/docs/package_reference/SentenceTransformer.html#sentence_transformers.SentenceTransformer.push_to_hub) method will now be the recommended approach, although `save_to_hub` will continue to exist for the time being: it will simply call `push_to_hub` internally.

python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-mpnet-base-v2")

...

# Train the model
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    evaluator=dev_evaluator,
    epochs=num_epochs,
    evaluation_steps=1000,
    warmup_steps=warmup_steps,
)

# Push the model to the Hugging Face Hub
model.push_to_hub("tomaarsen/mpnet-base-nli-stsb")


All changes
* Add GISTEmbedLoss by avsolatorio in https://github.com/UKPLab/sentence-transformers/pull/2535
* [`feat`] Add 'get_config_dict' method to GISTEmbedLoss for better model cards by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2543
* Enable saving modules as pytorch_model.bin by CKeibel in https://github.com/UKPLab/sentence-transformers/pull/2542
* [`deprecation`] Deprecate `save_to_hub` in favor of `push_to_hub`; add safe_serialization support to `push_to_hub` by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2544
* Fix SentenceTransformer encode documentation return type default (numpy vectors) by CKeibel in https://github.com/UKPLab/sentence-transformers/pull/2546
* [`docs`] Update return docstring of encode_multi_process by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2548
* [`feat`] Add binary & scalar embedding quantization support to Sentence Transformers by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2549

New Contributors
* avsolatorio made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2535
* CKeibel made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2542

**Full Changelog**: https://github.com/UKPLab/sentence-transformers/compare/v2.5.1...v2.6.0

2a. Binary embeddings can be produced directly by passing `precision="binary"` to `encode`:

python
binary_embeddings = model.encode(
    ["I am driving to the lake.", "It is a beautiful day."],
    precision="binary",
)

tensor([[-0.0000, -0.7437, -1.3935, -1.3184],
        [-0.7437, -0.0000, -1.3702, -1.3320],
        [-1.3935, -1.3702, -0.0000, -0.9973],
        [-1.3184, -1.3320, -0.9973, -0.0000]])

Additionally, you can compute the similarity between pairs of embeddings, resulting in a 1-dimensional vector of similarities rather than a 2-dimensional matrix:
python
>>> from sentence_transformers import SentenceTransformer
>>> model = SentenceTransformer("all-mpnet-base-v2")
>>> sentences = [
... "The weather is so nice!",
... "It's so sunny outside.",
... "He's driving to the movie theater.",
... "She's going to the cinema.",
... ]
>>> embeddings = model.encode(sentences, normalize_embeddings=True)
>>> model.similarity_pairwise(embeddings[::2], embeddings[1::2])

1.2.1

Not secure
Final release of version 1: Makes v1 of sentence-transformers forward compatible with models from version 2 of sentence-transformers.

1.2.0

Not secure
Unsupervised Sentence Embedding Learning

New methods have been integrated to train sentence embedding models without labeled data. See [Unsupervised Learning](https://github.com/UKPLab/sentence-transformers/tree/master/examples/unsupervised_learning) for an overview of all existing methods.

New methods:
- **[CT](https://github.com/UKPLab/sentence-transformers/tree/master/examples/unsupervised_learning/CT)**: Integration of [Semantic Re-Tuning With Contrastive Tension (CT)](https://openreview.net/pdf?id=Ov_sMNau-PF) to tune models without labeled data
- **[CT_In-Batch_Negatives](https://github.com/UKPLab/sentence-transformers/tree/master/examples/unsupervised_learning/CT_In-Batch_Negatives)**: A modification of CT using in-batch negatives
- **[SimCSE](https://github.com/UKPLab/sentence-transformers/tree/master/examples/unsupervised_learning/SimCSE)**: An unsupervised sentence embedding learning method by [Gao et al.](https://arxiv.org/abs/2104.08821)
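As a rough illustration of the SimCSE idea (each unlabeled sentence is paired with itself, and dropout provides the noise; the sentences and model choice here are placeholders):

python
from torch.utils.data import DataLoader

from sentence_transformers import InputExample, SentenceTransformer, losses, models

word_embedding_model = models.Transformer("distilbert-base-uncased", max_seq_length=64)
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(), pooling_mode="mean")
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# Unlabeled sentences: each one is used both as anchor and as positive
sentences = ["A man is playing guitar.", "The weather is lovely today."]
train_examples = [InputExample(texts=[s, s]) for s in sentences]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, show_progress_bar=True)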

Pre-Training Methods
- **[MLM](https://github.com/UKPLab/sentence-transformers/tree/master/examples/unsupervised_learning/MLM):** An example script to run Masked Language Modeling (MLM). Running MLM on your custom data before supervised training can significantly improve performance. Further, MLM also works well for domain transfer: you first train on your custom data and then train with, e.g., NLI or STS data.
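A rough sketch of such an MLM run with the Hugging Face `transformers` Trainer (file paths, model choice, and hyperparameters are placeholders; the linked example script is the reference):

python
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    LineByLineTextDataset,
    Trainer,
    TrainingArguments,
)

model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# One in-domain sentence per line in a plain-text file (placeholder path)
dataset = LineByLineTextDataset(tokenizer=tokenizer, file_path="domain_sentences.txt", block_size=256)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="output/mlm", num_train_epochs=1, per_device_train_batch_size=32),
    data_collator=collator,
    train_dataset=dataset,
)
trainer.train()
trainer.save_model("output/mlm")  # afterwards, use this checkpoint as the word embedding model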


Training Examples
- **[Paraphrase Data](https://github.com/UKPLab/sentence-transformers/tree/master/examples/training/paraphrases):** In our paper [Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation](https://arxiv.org/abs/2004.09813) we have shown that training on paraphrase data is powerful. That folder provides collections of different paraphrase datasets and scripts to train on them.
- **[NLI with MultipleNegativesRankingLoss](https://www.sbert.net/examples/training/nli/README.html#multiplenegativesrankingloss)**: A dedicated example of how to use MultipleNegativesRankingLoss for training with NLI data, which leads to a significant performance boost.




New models
- **[New NLI & STS models](https://www.sbert.net/docs/pretrained_models.html#semantic-textual-similarity):** Following the [Paraphrase Data training example](https://github.com/UKPLab/sentence-transformers/tree/master/examples/training/paraphrases) we published new models trained on NLI and NLI+STS data. Training code is available: [training_nli_v2.py](https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/nli/training_nli_v2.py).

| Model-Name | STSb-test performance |
| --- | :---: |
| *Previous best models* | |
| nli-bert-large | 79.19 |
| stsb-roberta-large | 86.39 |
| *New v2 models* | |
| nli-mpnet-base-v2 | 86.53 |
| stsb-mpnet-base-v2 | 88.57 |

- **[New MS MARCO model for Semantic Search](https://www.sbert.net/docs/pretrained-models/msmarco-v3.html)**: [Hofstätter et al.](https://arxiv.org/abs/2104.06967) optimized the training procedure on the [MS MARCO dataset](https://www.sbert.net/examples/training/ms_marco/README.html). The resulting model is integrated as **msmarco-distilbert-base-tas-b** and improves the performance on the MS MARCO dataset from 33.13 to 34.43 MRR@10.
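A small usage sketch for semantic search with this model (TAS-B is trained for dot-product scoring, so dot score is used here; the query and passages are made up):

python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("msmarco-distilbert-base-tas-b")

query_embedding = model.encode("How many people live in London?")
passage_embeddings = model.encode([
    "Around 9 million people live in London.",
    "London is known for its museums and parks.",
])
scores = util.dot_score(query_embedding, passage_embeddings)  # higher score = more relevant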

New Functions
- `SentenceTransformer.fit()` **Checkpoints**: The fit() method now allows saving checkpoints during training at a fixed number of steps. [More info](https://www.sbert.net/docs/package_reference/SentenceTransformer.html#sentence_transformers.SentenceTransformer.fit)
- **Pooling-mode as string**: You can now pass the pooling-mode to `models.Pooling()` as string:
python
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(), pooling_mode='mean')

Valid values are mean/max/cls.
- **[NoDuplicatesDataLoader](https://www.sbert.net/docs/package_reference/datasets.html#noduplicatesdataloader)**: When using the [MultipleNegativesRankingLoss](https://www.sbert.net/docs/package_reference/losses.html#multiplenegativesrankingloss), one should avoid having duplicate sentences in the same batch. This data loader simplifies that task and ensures that no duplicate entries end up in the same batch.
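A short sketch of the data loader in use (the training pairs and model choice are placeholders):

python
from sentence_transformers import InputExample, SentenceTransformer, losses
from sentence_transformers.datasets import NoDuplicatesDataLoader

model = SentenceTransformer("nli-mpnet-base-v2")
train_examples = [
    InputExample(texts=["What is the capital of France?", "Paris is the capital of France."]),
    InputExample(texts=["How tall is Mount Everest?", "Mount Everest is 8,849 m high."]),
]
# Guarantees that no sentence appears twice within a batch, avoiding false in-batch negatives
train_dataloader = NoDuplicatesDataLoader(train_examples, batch_size=2)
train_loss = losses.MultipleNegativesRankingLoss(model)
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1)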
