RAGatouille

Latest version: v0.0.9


0.0.9

What's Changed
* fix: fix inversion method & pytorch Kmeans OOM by bclavie in https://github.com/AnswerDotAI/RAGatouille/pull/179
* performance: Optimize ColBERT index free search with torch.topk by Diegi97 in https://github.com/AnswerDotAI/RAGatouille/pull/219
* Calculate `pid_docid_map.values()` only once in `add_to_index` by vishalbakshi in https://github.com/AnswerDotAI/RAGatouille/pull/267
* Fix/185 return trainer best checkpoint path by GeraudBourdin in https://github.com/AnswerDotAI/RAGatouille/pull/265
* Finally removed the dependency hell by fully getting rid of Poetry. Stay tuned for more updates!
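The `torch.topk` optimization above replaces a full sort of similarity scores with direct top-k selection. A minimal sketch of the idea (the `scores` tensor and `k` here are illustrative, not RAGatouille's actual variables):

```python
import torch

# Hypothetical similarity scores between one query and 10,000 documents.
scores = torch.randn(10_000)

k = 5
# torch.topk returns the k largest scores and their indices in one pass,
# avoiding a full descending sort of the whole tensor.
top_scores, top_indices = torch.topk(scores, k)
```

`torch.topk` returns the results already sorted from highest to lowest, so the indices can be mapped straight back to document IDs.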

New Contributors
* vishalbakshi made their first contribution in https://github.com/AnswerDotAI/RAGatouille/pull/267
* GeraudBourdin made their first contribution in https://github.com/AnswerDotAI/RAGatouille/pull/265

**Full Changelog**: https://github.com/AnswerDotAI/RAGatouille/compare/0.0.8...0.0.9

0.0.8post1

Minor fix: corrects a `from time import time` import introduced in the indexing overhaul, which caused crashes because `time` was then used improperly.

0.0.8

Major changes:

- Indexing overhaul contributed by jlscheerer https://github.com/bclavie/RAGatouille/pull/158
- Relaxed dependencies to ensure lower install load https://github.com/bclavie/RAGatouille/pull/173
- Indexing for under 100k documents will by default no longer use Faiss, performing K-Means in pure PyTorch instead. This is a somewhat experimental change, but benchmark results are encouraging and compatibility is greatly improved. https://github.com/bclavie/RAGatouille/pull/173
- CRUD improvements by anirudhdharmarajan. The feature is still experimental and not fully supported, but rapidly improving!
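The pure-PyTorch K-Means mentioned above can be sketched as a toy Lloyd's-algorithm loop. This is an illustrative assumption about the approach, not RAGatouille's actual implementation:

```python
import torch

def kmeans(x: torch.Tensor, k: int, iters: int = 10) -> torch.Tensor:
    """Toy Lloyd's algorithm: returns k centroids for points x of shape (n, d)."""
    # Initialize centroids from k randomly chosen points.
    centroids = x[torch.randperm(x.shape[0])[:k]].clone()
    for _ in range(iters):
        # Assign each point to its nearest centroid by Euclidean distance.
        assignments = torch.cdist(x, centroids).argmin(dim=1)
        # Recompute each centroid as the mean of its assigned points.
        for j in range(k):
            mask = assignments == j
            if mask.any():
                centroids[j] = x[mask].mean(dim=0)
    return centroids
```

Running entirely on PyTorch tensors means the clustering step shares the same device and dtype handling as the rest of the pipeline, with no Faiss install required.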

Fixes:
- Many small bug fixes, mainly around typing
- Training triplets improvement (already present in 0.0.7 post versions) by JoshuaPurtell

0.0.7post3

- Improvements to data preprocessing and a fix for the broken training example by jonppe (#138) 🙏

0.0.7post2

Fixes & tweaks to the previous release:

- Automatically adjust batch size on longer contexts (32 for 512 tokens, 16 for 1024, 8 for 2048, halving like this down to a minimum of 1)
- Apply dynamic max context length to reranking
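The halving schedule above can be expressed as a small helper. The function name and parameters are illustrative; the 512-token/32-batch baseline and the floor of 1 come from the note above:

```python
def batch_size_for(max_len: int, base_len: int = 512, base_batch: int = 32) -> int:
    """Halve the batch size each time the context length doubles past
    base_len, never going below 1."""
    batch = base_batch
    length = base_len
    while length < max_len and batch > 1:
        length *= 2
        batch //= 2
    return batch
```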

0.0.7post1

Release focusing on length adjustments. Much more dynamism and on-the-fly adaptation, both for query length and maximum document length!


- Remove the hardcoded maximum length: it is now inferred from your base model's maximum position encodings. This enables support for longer-context ColBERT models, such as [Jina ColBERT](https://huggingface.co/jinaai/jina-colbert-v1-en).
- Upstream changes to `colbert-ai` to allow any base model to be used, rather than pre-defined ones.
- Query length now adjusts dynamically, from 32 (hardcoded minimum) to your model's maximum context window for longer queries.
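The dynamic query-length behavior described above amounts to a clamp between the hardcoded minimum of 32 and the model's maximum context window. A minimal illustrative helper (not RAGatouille's code; `model_max` would be read from the base model's maximum position encodings):

```python
def query_length(n_tokens: int, model_max: int, floor: int = 32) -> int:
    """Clamp the query length between the hardcoded minimum of 32
    and the base model's maximum context window."""
    return min(max(n_tokens, floor), model_max)
```

Short queries are still padded up to 32 tokens as before, while longer queries grow with their actual length instead of being truncated at a fixed limit.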
