llm-rs

Latest version: v0.2.15

0.2.3

Added the ability to automatically convert any supported model from the Hugging Face Hub via the `AutoConverter`.

Models converted this way can be quantized or loaded via the `AutoQuantizer` or `AutoModel` without the need to specify the architecture.
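
A minimal sketch of that workflow; the import paths and call signatures (`AutoConverter.convert`, `AutoModel.load`) are assumptions based on this changelog, not verified API:

```python
from llm_rs.convert import AutoConverter  # import path is an assumption
from llm_rs.auto import AutoModel         # import path is an assumption

# Convert a Hugging Face Hub repo to a ggml file; the architecture is
# detected automatically, so no model class has to be picked by hand.
converted = AutoConverter.convert("mosaicml/mpt-7b", "./converted")

# Load the converted file, again without naming the architecture.
model = AutoModel.load(converted)
```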

0.2.1

The ability to quantize models is now available for every architecture via `quantize`.
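
For example, a hedged sketch of quantizing a converted f16 model down to 4-bit weights; only the `quantize` entry point comes from this changelog, the arguments are assumptions:

```python
from llm_rs import Llama  # every architecture exposes `quantize`

# Shrink an f16 ggml model into a 4-bit quantized copy (hypothetical
# argument layout; only the `quantize` name is documented here).
Llama.quantize("llama-7b-f16.bin", "llama-7b-q4_0.bin")
```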

0.2.0

- Added support for [Mosaic ML](https://huggingface.co/mosaicml)'s MPT models.
- Added support for [LoRA](https://arxiv.org/abs/2106.09685) adapters for all architectures (see the sketch below).
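
A minimal sketch of loading a model together with a LoRA adapter; the `lora_paths` keyword and the shape of the generation result are assumptions based on this changelog, not verified API:

```python
from llm_rs import Mpt  # MPT support landed in this release

# Load a base model and apply a LoRA adapter on top of it; the
# `lora_paths` keyword name is assumed, not confirmed by this changelog.
model = Mpt("mpt-7b-q4_0.bin", lora_paths=["my-adapter.bin"])

# The `.text` field on the generation result is likewise assumed.
print(model.generate("Hello").text)
```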

⚠️Caution⚠️
Due to changes in the ggml format, old quantized models are no longer supported!

0.1.1

Added the `tokenize` and `decode` functions to each model, enabling access to the internal tokenizer.
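
For illustration, a round trip through the new functions; the return type of `tokenize` (assumed here to be a list of token ids) is not specified by this changelog:

```python
from llm_rs import Llama

model = Llama("llama-7b-q4_0.bin")

# Encode a prompt with the model's internal tokenizer, then decode it
# back to text; the token representation is an assumption.
tokens = model.tokenize("Hello, world!")
print(tokens)
print(model.decode(tokens))
```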

Token generation now releases the GIL, so other background threads can run at the same time.
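
A small sketch of what that enables, assuming a `generate` method and a `.text` result field that this changelog does not document:

```python
import threading
from llm_rs import Llama

model = Llama("llama-7b-q4_0.bin")

# Generation releases the GIL, so this worker thread does not block the
# main thread while tokens are produced.
worker = threading.Thread(
    target=lambda: print(model.generate("Once upon a time").text)
)
worker.start()
print("the main thread stays responsive during generation")
worker.join()
```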

0.1.0

Since `llama-rs` was renamed to `llm` and now supports multiple model architectures, this wrapper was also expanded to support the new trait system and library structure.

Currently supported architectures:
- Llama
- GPT-2
- GPT-J
- GPT-NeoX
- Bloom

The loader was also reworked and now supports the mmap-able `ggjt` format. To support this, the `SessionConfig` was expanded with the `prefer_mmap` field.
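
A hedged sketch of opting into memory-mapped loading; `prefer_mmap` is named in this changelog, while the `session_config` keyword and the remaining defaults are assumptions:

```python
from llm_rs import Llama, SessionConfig

# Enable memory-mapped loading for ggjt files; only `prefer_mmap` is
# documented here, the `session_config` keyword name is assumed.
config = SessionConfig(prefer_mmap=True)
model = Llama("llama-7b-q4_0.bin", session_config=config)
```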

0.0.2
