The `similarity_fn_name` can now be specified via the [`SentenceTransformer`](https://sbert.net/docs/package_reference/sentence_transformer/SentenceTransformer.html#sentence_transformers.SentenceTransformer) constructor like so:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/multi-qa-mpnet-base-dot-v1", similarity_fn_name="dot")
```
Valid options include "cosine" (default), "dot", "euclidean", and "manhattan". The chosen `similarity_fn_name` is also saved into the model configuration and loaded automatically when you load the model. For example, the [`msmarco-distilbert-dot-v5`](https://huggingface.co/sentence-transformers/msmarco-distilbert-dot-v5) model was trained to work best with `dot`, so we've configured it to use that `similarity_fn_name` in its [configuration](https://huggingface.co/sentence-transformers/msmarco-distilbert-dot-v5/blob/main/config_sentence_transformers.json#L9):
```python
>>> from sentence_transformers import SentenceTransformer
>>> model = SentenceTransformer("sentence-transformers/msmarco-distilbert-dot-v5")
>>> model.similarity_fn_name
'dot'
```
- Docs: [Semantic Textual Similarity > Similarity Calculation](https://sbert.net/docs/sentence_transformer/usage/semantic_textual_similarity.html)
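For reference, a quick sketch of the configured similarity function in action via the new `similarity` method (the texts here are illustrative):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/msmarco-distilbert-dot-v5")
embeddings = model.encode([
    "How do I bake a cake?",
    "Recipe for baking a chocolate cake",
])
# `model.similarity` applies the configured similarity_fn_name, here dot product
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # torch.Size([2, 2])
```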
Big thanks to ir2718 for helping set up this major feature.
## Allow passing `model_kwargs`, `tokenizer_kwargs`, and `config_kwargs` to `SentenceTransformer` (#2578)
If you're familiar with the internals of Sentence Transformers, you might know that we internally call [`AutoModel.from_pretrained`](https://huggingface.co/docs/transformers/en/main_classes/model#transformers.PreTrainedModel.from_pretrained), [`AutoTokenizer.from_pretrained`](https://huggingface.co/docs/transformers/model_doc/auto#transformers.AutoTokenizer.from_pretrained) and [`AutoConfig.from_pretrained`](https://huggingface.co/docs/transformers/model_doc/auto#transformers.AutoConfig.from_pretrained) from `transformers`.
Each of these is rather powerful, and they are constantly improved with new features. For example, the `AutoModel` keyword arguments include:
* [`torch_dtype`](https://huggingface.co/docs/transformers/en/main_classes/model#transformers.PreTrainedModel.from_pretrained.torch_dtype) - lets you load a model directly in `bfloat16` or `float16` (or `"auto"`, i.e. whatever dtype the model was stored in), which can speed up inference considerably.
* [`quantization_config`](https://huggingface.co/docs/transformers/en/main_classes/model#transformers.PreTrainedModel.from_pretrained.quantization_config) - lets you load the model quantized on the fly, e.g. via `bitsandbytes`.
* [`attn_implementation`](https://huggingface.co/docs/transformers/en/main_classes/model#transformers.PreTrainedModel.from_pretrained.attn_implementation) - all models support `"eager"`, but some also support the much faster `"flash_attention_2"` (Flash Attention 2) and `"sdpa"` (Scaled Dot Product Attention).
These options can speed up model inference considerably. Additionally, via `AutoConfig` you can update the model configuration, e.g. the dropout probability used during training, and with `AutoTokenizer` you can disable the fast Rust-based tokenizer via `use_fast=False` if you're having issues with it.
Because these options can be so useful, the following arguments have been added to `SentenceTransformer`:
* `model_kwargs` for `AutoModel.from_pretrained` keyword arguments
* `tokenizer_kwargs` for `AutoTokenizer.from_pretrained` keyword arguments
* `config_kwargs` for `AutoConfig.from_pretrained` keyword arguments
You can use them like so:
```python
from sentence_transformers import SentenceTransformer
import torch

model = SentenceTransformer(
    "mixedbread-ai/mxbai-embed-large-v1",
    model_kwargs={"torch_dtype": torch.bfloat16, "attn_implementation": "sdpa"},
    config_kwargs={"hidden_dropout_prob": 0.3},
)

embeddings = model.encode(["He drove his yellow car to the beach.", "He played football with his friends."])
print(embeddings.shape)
```
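Similarly, a minimal sketch of disabling the fast tokenizer via `tokenizer_kwargs` (the model name is just an example):

```python
from sentence_transformers import SentenceTransformer

# Fall back to the slow Python tokenizer if the fast Rust-based one misbehaves
model = SentenceTransformer("all-MiniLM-L6-v2", tokenizer_kwargs={"use_fast": False})
```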
Big thanks to satyamk7054 for starting this work.
## Hyperparameter Optimization (#2655)
Sentence Transformers v3.0 introduces Hyperparameter Optimization (HPO) by extending the `transformers` HPO support. We recommend reading the all-new [Hyperparameter Optimization](https://sbert.net/examples/training/hpo/README.html) documentation for many more details.
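As a taste, here is a minimal sketch assuming the Optuna backend (`pip install optuna`); the data, model name, and search space are illustrative:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MultipleNegativesRankingLoss

# Toy data; in practice, use a real (anchor, positive) dataset
train_dataset = Dataset.from_dict({
    "anchor": ["It's very sunny outside", "I was parasailing yesterday"],
    "positive": ["The weather is beautiful today", "I went parasailing"],
})
eval_dataset = train_dataset  # toy setup: reuse the train data for evaluation

def model_init() -> SentenceTransformer:
    # A fresh model is initialized for every trial
    return SentenceTransformer("microsoft/mpnet-base")

def loss_init(model: SentenceTransformer) -> MultipleNegativesRankingLoss:
    # New in v3.0: `loss` may be a callable that accepts the trial's model
    return MultipleNegativesRankingLoss(model)

def hp_space(trial) -> dict:
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-6, 1e-4, log=True),
        "per_device_train_batch_size": trial.suggest_categorical("per_device_train_batch_size", [16, 32]),
    }

trainer = SentenceTransformerTrainer(
    model=None,
    model_init=model_init,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss_init,
)
best_trial = trainer.hyperparameter_search(hp_space=hp_space, n_trials=10, direction="minimize", backend="optuna")
print(best_trial)
```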
## Datasets Release
Alongside Sentence Transformers v3.0, we reformat and release 50+ useful datasets in our [Embedding Model Datasets](https://huggingface.co/collections/sentence-transformers/embedding-model-datasets-6644d7a3673a511914aa7552) Collection on Hugging Face. Each of these can be used out of the box with at least one loss function in Sentence Transformers v3.0. We recommend browsing through them to see if there are datasets akin to your use cases; training a model on them might just produce large gains on your task(s).
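For example, a minimal sketch of loading one of these datasets, here the `sentence-transformers/all-nli` triplet subset:

```python
from datasets import load_dataset

# The "triplet" subset yields (anchor, positive, negative) columns, directly
# compatible with e.g. MultipleNegativesRankingLoss
dataset = load_dataset("sentence-transformers/all-nli", "triplet", split="train")
print(dataset[0])
```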
## MSELoss extension (#2641)
The [MSELoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#mseloss) now accepts multiple text columns for each label (where each label is a target/gold embedding), rather than only a single text column. This is extremely powerful for the excellent [Multilingual Models](https://sbert.net/examples/training/multilingual/README.html) strategy of converting a monolingual model into a multilingual one: you can now conveniently train English texts and their (identical but translated) non-English counterparts to produce the same embedding (generated by a powerful English embedding model).
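A minimal sketch of this setup, assuming the usual teacher-student distillation format; the model names, column names, and texts are illustrative:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MSELoss

# Teacher and student embedding dimensions must match (768 here)
teacher = SentenceTransformer("all-mpnet-base-v2")  # strong English model
student = SentenceTransformer("xlm-roberta-base")   # multilingual student

english = ["The weather is nice today.", "He drove to work."]
german = ["Das Wetter ist heute schön.", "Er fuhr zur Arbeit."]

# Each label is the teacher's (gold) embedding of the English text; both the
# English and German columns are trained to reproduce it
train_dataset = Dataset.from_dict({
    "english": english,
    "non_english": german,
    "label": teacher.encode(english),
})

trainer = SentenceTransformerTrainer(
    model=student,
    train_dataset=train_dataset,
    loss=MSELoss(student),
)
trainer.train()
```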
## Add `local_files_only` argument to `SentenceTransformer` & `CrossEncoder` (#2603)
You can now initialize a `SentenceTransformer` or `CrossEncoder` with `local_files_only=True`. If set, the model will not be downloaded from Hugging Face; it will only be loaded from the local filesystem or cache.
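A minimal sketch (the model name is just an example):

```python
from sentence_transformers import SentenceTransformer

# Loads only from the local filesystem or cache; raises an error if the
# model has not been downloaded before
model = SentenceTransformer("all-MiniLM-L6-v2", local_files_only=True)
```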
Thanks to debanjum for this change.
## All changes
* Minor grammar fix in GPL paragraph by mauricesvp in https://github.com/UKPLab/sentence-transformers/pull/2604
* [feat] Add local_files_only argument to load model from cache by debanjum in https://github.com/UKPLab/sentence-transformers/pull/2603
* Fix broken links by mauricesvp in https://github.com/UKPLab/sentence-transformers/pull/2611
* Updated urls for msmarco dataset by j-dominguez9 in https://github.com/UKPLab/sentence-transformers/pull/2609
* [`v3`] Training refactor - MultiGPU, loss logging, bf16, etc. by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2449
* [`v3`] Add `similarity` and `similarity_pairwise` methods to Sentence Transformers by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2615
* [`v3`] Fix various model card errors by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2616
* [`v3`] Fix trainer `compute_loss` when evaluating/predicting if the `loss` updated the inputs in-place by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2617
* [`v3`] Never return None in infer_datasets, could result in crash by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2620
* [`v3`] Trainer: Implement resume from checkpoint support by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2621
* Fall back to CPU device in case there are no PyTorch parameters by maxfriedrich in https://github.com/UKPLab/sentence-transformers/pull/2614
* Add `trust_remote_code` to `CrossEncoder.tokenizer` by michaelfeil in https://github.com/UKPLab/sentence-transformers/pull/2623
* [`v3`] Update example scripts to the new v3 training format by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2622
* Bug in DenoisingAutoEncoderLoss.py by arun477 in https://github.com/UKPLab/sentence-transformers/pull/2619
* [`v3`] Remove "return_outputs" as it's not strictly necessary. Avoids OOM & speeds up training by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2633
* [`v3`] Fix crash from inferring the dataset_id from a local dataset by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2636
* Enable Sentence Transformer Inference with Intel Gaudi2 GPU Supported ( 'hpu' ) - Follow up for 2557 by ZhengHongming888 in https://github.com/UKPLab/sentence-transformers/pull/2630
* [`v3`] Fix multilingual conversion script; extend MSELoss to multi-column by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2641
* [`v3`] Update evaluation scripts to use HF Datasets by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2642
* Use `b1` quantization for USearch by ashvardanian in https://github.com/UKPLab/sentence-transformers/pull/2644
* [`v3`] Fix `resume_from_checkpoint` by also updating the loss model by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2648
* [`v3`] Fix backwards pass on MSELoss due to in-place update by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2647
* [`v3`] Simplify `load_from_checkpoint` using `load_state_dict` by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2650
* [`v3`] Use `torch.arange` instead of `torch.tensor(range(...))` by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2651
* [`v3`] Resolve inplace modification error in DDP by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2654
* [`v3`] Add hyperparameter optimization support by letting `loss` be a Callable that accepts a `model` by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2655
* [`v3`] Add tag hinting at the number of training samples by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2660
* Allow passing 'precision' when using 'encode_multi_process' to SentenceTransformer by ariel-talent-fabric in https://github.com/UKPLab/sentence-transformers/pull/2659
* Allow passing model_args to ST by satyamk7054 in https://github.com/UKPLab/sentence-transformers/pull/2578
* Fix smart_batching_collate Inefficiency by PrithivirajDamodaran in https://github.com/UKPLab/sentence-transformers/pull/2556
* [`v3`] For the Cached losses; ignore gradients if grad is disabled (e.g. eval) by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2668
* [`docs`] Rewrite the https://sbert.net documentation by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2632
* [`v3`] Chore - include import sorting in ruff by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2672
* [`v3`] Prevent warning with 'model.fit' with transformers >= 4.41.0 due to evaluation_strategy by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2673
* [`v3`] Add various useful Sphinx packages (copy code, link to code, nicer tabs) by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2674
* [`v3`] Make the "primary_metric" for evaluators a bit more robust by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2675
* [`v3`] Set `broadcast_buffers = False` when training with DDP by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2663
* [`v3`] Warn about using DP instead of DDP + set dataloader_drop_last with DDP by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2677
* [`v3`] Add warning that Evaluators only run on 1 GPU when multi-GPU training by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2678
* [`v3`] Move training dependencies into a "train" extra by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2676
* [`v3`] Docs: update references to the API reference by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2679
* [`v3`] Add "dataset_size:" to the tag denoting the number of training samples by tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/2680
## New Contributors
* mauricesvp made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2604
* debanjum made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2603
* j-dominguez9 made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2609
* michaelfeil made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2623
* arun477 made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2619
* ashvardanian made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2644
* ariel-talent-fabric made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2659
* satyamk7054 made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2578
* PrithivirajDamodaran made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/2556
A special shoutout to Jakobhenningjensen, smerrill, b5y, ScottishFold007, pszemraj, bwanglzu, and igorkurinnyi for experimenting with v3.0 prior to its release, and to matthewfranglen for the initial work on the training refactor back in October 2022 in #1733.
cc AlexJonesNLP as I know you are interested in this release!
**Full Changelog**: https://github.com/UKPLab/sentence-transformers/compare/v2.7.0...v3.0.0