Pylate

Latest version: v1.1.7


1.1.7

Contrastive learning relies heavily on large batches to obtain the best possible negative samples and learn effectively. This release therefore introduces two new features that help reach those batch sizes:

- Addition of the CachedContrastive loss, which implements GradCache to scale the batch size without requiring more memory (it can be seen as gradient accumulation, but for contrastive learning).

- Addition of a cross-GPU gathering option for the (Cached)Contrastive loss, which leverages the representations computed by the other GPUs in a multi-GPU setting to increase the effective per-GPU batch size.

![image](https://github.com/user-attachments/assets/792be6f8-e3bd-41c6-b6ac-7fd1366207cf)
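
To make the link between batch size and negatives concrete, here is a minimal sketch in plain Python. It is not PyLate's implementation: the function names `in_batch_contrastive_loss` and `negatives_per_query` are hypothetical, and it assumes the simplest setting, where each query has one positive document and every other document in the (possibly gathered) batch acts as a negative.

```python
import math


def in_batch_contrastive_loss(scores):
    """Mean cross-entropy over a square score matrix.

    scores[i][j] is the similarity between query i and document j;
    the diagonal entry scores[i][i] is the positive pair (assumption).
    """
    total = 0.0
    for i, row in enumerate(scores):
        # Numerically stable log-sum-exp over all documents in the batch.
        row_max = max(row)
        log_z = row_max + math.log(sum(math.exp(s - row_max) for s in row))
        total += log_z - row[i]
    return total / len(scores)


def negatives_per_query(per_gpu_batch_size, num_gpus=1, gather_across_gpus=False):
    """Count in-batch negatives seen by each query (hypothetical helper)."""
    effective = per_gpu_batch_size * (num_gpus if gather_across_gpus else 1)
    return effective - 1  # every other document in the batch is a negative


# Without gathering, each GPU only contrasts against its local batch.
print(negatives_per_query(32, num_gpus=4))
# With gathering, documents encoded on the other GPUs also act as negatives.
print(negatives_per_query(32, num_gpus=4, gather_across_gpus=True))
```

Under these assumptions, gathering across 4 GPUs with 32 pairs each raises the number of negatives per query from 31 to 127 without changing what each GPU has to encode.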

1.1.6

- Addition of NanoBEIREvaluator, which gives a quick signal about learning progress during training.

- Bump of the transformers/Sentence Transformers versions, which allows using ModernBERT in PyLate and fixes an issue when loading models after training them with `trust_remote_code=True`.

- Reading of Stanford-NLP model configurations (markers, attending to expansion tokens, ...), which allows loading models such as Jina-ColBERT with good default parameters.

- Support for Python 3.9.

- Fix for the 1.1.5 error when loading Stanford-NLP model metadata.

1.1.4

This release aims to extend compatibility with existing models on the Hugging Face Hub and to load them properly.

1.1.3

**PyLate Update**

This release introduces several new features and improvements:

**1. Native Stanford-NLP Model Support**
PyLate now supports loading Stanford-NLP models directly, without requiring manual weight conversion. This includes models like [Jina-ColBERTv2](https://huggingface.co/jinaai/jina-colbert-v2) and local models. Use the model name when creating a PyLate model.
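
As an illustration of loading by model name, the following sketch wraps the call in a small helper. The helper name `load_colbert` is hypothetical, and passing `trust_remote_code=True` is an assumption based on Jina-ColBERTv2 shipping custom model code; check the model card before enabling it.

```python
def load_colbert(model_name_or_path="jinaai/jina-colbert-v2"):
    """Load a ColBERT model by Hub name or local path (hypothetical helper)."""
    # Imported lazily so this snippet can be read and imported
    # even where PyLate is not installed.
    from pylate import models

    return models.ColBERT(
        model_name_or_path=model_name_or_path,
        # Assumption: required because this model relies on custom code.
        trust_remote_code=True,
    )
```

Calling `load_colbert()` downloads the weights on first use; a local directory path can be passed in place of the Hub name.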

**2. FastAPI Integration**
PyLate now allows serving embeddings via a FastAPI server. The server supports dynamic batch processing to handle multiple requests efficiently. See the [documentation](https://lightonai.github.io/pylate/documentation/fastapi/) for details.

**3. DictDataset Added**
`DictDataset` has been introduced for handling datasets more effectively during training and inference.

**4. Model Card Generation**
Trained models now include a generated Model Card containing metadata about the model and training setup.

**Fixes and Enhancements**
- Fixed an issue where dataset processing during training could become unresponsive.
- Improved performance and reliability for training and inference.

1.0.0

Release of PyLate 1.0.0.

- ColBERT training: contrastive, knowledge distillation.
- ColBERT retrieval, ranking.
- Documentation and Readme.
- Tests.
- Model loading.
- Various features.
