Highlights
- UQFF
- FLUX model
- Llama 3.2 Vision model
MSRV
The MSRV of this release is 1.79.0.
What's Changed
* Enable automatic determination of normal loader type by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/742
* Add the `ForwardInputsResult` API by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/745
* Implement Mixture of Quantized Experts (MoQE) by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/747
* Bump quinn-proto from 0.11.6 to 0.11.8 by dependabot in https://github.com/EricLBuehler/mistral.rs/pull/748
* Fix f64-f32 type mismatch for Metal/Accelerate by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/752
* Nicer error when PagedAttention input metadata is misconfigured by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/753
* Update deps, support CUDA 12.6 by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/755
* Patch bug when not using PagedAttention by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/759
* Fix `MistralRs` Drop impl in tokio runtime by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/762
* Use nicer Candle Error APIs by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/767
* Support setting seed by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/766
* Fix Metal build error with seed by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/771
* Fix and add checks for no kv cache by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/776
* UQFF: The uniquely powerful quantized file format. by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/770
* Add `Scheduler::running_len` by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/780
* Deduplicate RoPE caches by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/787
* Easier and simpler Rust-side API by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/785
* Add some examples for AnyMoE by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/788
* Rust API for sampling by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/790
* Our first Diffusion model: FLUX by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/758
* Fix build bugs with Metal, NSUInteger by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/792
* Support weight tying in Llama 3.2 GGUF models by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/801
* Implement the Llama 3.2 vision models by EricLBuehler in https://github.com/EricLBuehler/mistral.rs/pull/796
**Full Changelog**: https://github.com/EricLBuehler/mistral.rs/compare/v0.3.0...v0.3.1