MLX

Latest version: v0.21.0

0.21.0

Highlights
* Support for 3- and 6-bit quantization: [benchmarks](https://github.com/ml-explore/mlx/pull/1613)
* Much faster memory-efficient attention for head dimensions 64 and 80: [benchmarks](https://github.com/ml-explore/mlx/pull/1610)
* Much faster sdpa inference kernel for longer sequences: [benchmarks](https://github.com/ml-explore/mlx/pull/1597)
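
To make the new bit widths concrete, here is a minimal pure-Python sketch of group-wise affine quantization, assuming a min/max scale-and-bias scheme. It is an illustration only, not MLX's kernel; `quantize_group` and `dequantize_group` are hypothetical helper names. In MLX itself the entry point is `mx.quantize`, whose `bits` argument presumably accepts 3 and 6 as of this release.

```python
def quantize_group(values, bits):
    """Affine-quantize one group of floats to unsigned `bits`-bit codes."""
    levels = (1 << bits) - 1               # 7 for 3-bit, 63 for 6-bit
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo                # lo serves as the bias


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate floats from integer codes."""
    return [c * scale + bias for c in codes]


group = [-1.0, -0.5, 0.0, 0.25, 0.5, 1.0, 1.5, 2.5]
for bits in (3, 6):
    codes, scale, bias = quantize_group(group, bits)
    approx = dequantize_group(codes, scale, bias)
    max_err = max(abs(a - b) for a, b in zip(group, approx))
    print(f"{bits}-bit: codes={codes} max_err={max_err:.4f}")
```

The reconstruction error of each value is bounded by half the scale, so moving from 3 to 6 bits shrinks the worst-case error by roughly a factor of nine for the same group.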

Core
* `contiguous` op (C++ only) + primitive
* BFS width limit to reduce memory consumption during `eval`
* Fast CPU quantization
* Faster indexing math in several kernels:
  * unary, binary, ternary, copy, compiled, reduce
* Improve dispatch threads for a few kernels:
  * conv, gemm splitk, custom kernels
* More buffer donation with no-ops to reduce memory use
* Use `CMAKE_OSX_DEPLOYMENT_TARGET` to pick Metal version
* Dispatch Metal bf16 type at runtime when using the JIT

NN
* `nn.AvgPool3d` and `nn.MaxPool3d`
* Support `groups` in `nn.Conv2d`
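
As a reminder of what `groups` means, the following sketch maps each output channel to the input channels it reads under the standard grouped-convolution definition. `group_input_channels` is a hypothetical helper written for illustration; it is not part of MLX.

```python
def group_input_channels(in_channels, out_channels, groups):
    """Map each output channel to the input channels it reads."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    in_per_group = in_channels // groups
    out_per_group = out_channels // groups
    mapping = {}
    for o in range(out_channels):
        g = o // out_per_group             # group this output channel belongs to
        mapping[o] = list(range(g * in_per_group, (g + 1) * in_per_group))
    return mapping


print(group_input_channels(4, 4, groups=2))
```

With `groups=2`, each output channel sees only half of the input channels, so the weight tensor holds half as many parameters as the ungrouped layer.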

Bug fixes
* Fix per-example mask + docs in sdpa
* Fix FFT synchronization bug (use dispatch method everywhere)
* Throw for invalid `*fft{2,n}` cases
* Fix OOB access in qmv
* Fix donation in sdpa to reduce memory use
* Allocate safetensors header on the heap to avoid stack overflow
* Fix sibling memory leak
* Fix `view` segfault for scalar inputs
* Fix concatenate vmap

0.20.0

Highlights
- Even faster GEMMs
  - Peaking at 23.89 TFLOPS on M2 Ultra [benchmarks](https://github.com/ml-explore/mlx/pull/1518)
- BFS graph optimizations
  - Over 120 tokens/s with Mistral 7B!
- Fast batched QMV/QVM for KV quantized attention [benchmarks](https://github.com/ml-explore/mlx/pull/1564)

Core

- New Features
  - `mx.linalg.eigh` and `mx.linalg.eigvalsh`
  - `mx.nn.init.sparse`
  - 64bit type support for `mx.cumprod`, `mx.cumsum`
- Performance
  - Faster long column reductions
  - Wired buffer support for large models
  - Better Winograd dispatch condition for convs
  - Faster scatter/gather
  - Faster `mx.random.uniform` and `mx.random.bernoulli`
  - Better threadgroup sizes for large arrays
- Misc
  - Added Python 3.13 to CI
  - C++20 compatibility

Bugfixes
- Fix command encoder synchronization
- Fix `mx.vmap` with gather and constant outputs
- Fix fused sdpa with differing key and value strides
- Support `mx.array.__format__` with spec
- Fix multi output array leak
- Fix RMSNorm weight mismatch error
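
For context on the `__format__` item, Python passes the text after the colon in an f-string (the format spec) to the object's `__format__` method. The sketch below shows that protocol with a hypothetical `ScalarBox` class; it is a stand-in for illustration, not MLX's implementation.

```python
class ScalarBox:
    """Hypothetical stand-in for a scalar array type."""

    def __init__(self, value):
        self.value = value

    def __format__(self, spec):
        # Delegate the format spec (e.g. ".2f") to the wrapped Python scalar.
        return format(self.value, spec)


x = ScalarBox(3.14159)
print(f"{x:.2f}")
print(f"{x:>10.3f}")
```

A type that ignores or rejects `spec` cannot be used with specs such as `.2f`, which is presumably the gap this fix closes for `mx.array`.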

0.19.3

🚀

0.19.2

🚀🚀

0.19.1

🚀

0.19.0

Highlights
* Speed improvements
  * Up to 6x faster CPU indexing [benchmarks](https://github.com/ml-explore/mlx/pull/1450)
  * Faster Metal compiled kernels for strided inputs [benchmarks](https://github.com/ml-explore/mlx/pull/1486)
  * Faster generation with fused-attention kernel [benchmarks](https://github.com/ml-explore/mlx/pull/1497)
* Gradient for grouped convolutions
* Due to Python 3.8's end-of-life we no longer test with it on CI

Core
* New features
  * Gradient for grouped convolutions
  * `mx.roll`
  * `mx.random.permutation`
  * `mx.real` and `mx.imag`
* Performance
  * Up to 6x faster CPU indexing [benchmarks](https://github.com/ml-explore/mlx/pull/1450)
  * Faster CPU sort [benchmarks](https://github.com/ml-explore/mlx/pull/1453)
  * Faster Metal compiled kernels for strided inputs [benchmarks](https://github.com/ml-explore/mlx/pull/1486)
  * Faster generation with fused-attention kernel [benchmarks](https://github.com/ml-explore/mlx/pull/1497)
  * Bulk eval in safetensors to avoid unnecessary serialization of work
* Misc
  * Bump to nanobind 2.2
  * Move testing to Python 3.9 due to 3.8's end-of-life
  * Make the GPU device more thread safe
  * Fix the submodule stubs for better IDE support
  * CI-generated docs that will never be stale
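
For readers unfamiliar with the roll operation added above, here is a pure-Python sketch of a circular shift on a 1-D list, assuming `mx.roll` follows the NumPy `roll` convention (positive shifts move elements toward higher indices). The `roll` function below is a hypothetical reference written for illustration, not MLX code.

```python
def roll(values, shift):
    """Circularly shift a 1-D list; positive shift moves items to higher indices."""
    n = len(values)
    if n == 0:
        return []
    k = shift % n                          # normalize negative and oversized shifts
    if k == 0:
        return list(values)
    return values[-k:] + values[:-k]


data = [0, 1, 2, 3, 4]
print(roll(data, 2))
print(roll(data, -1))
```

Elements that fall off one end reappear at the other, so the length and contents are preserved.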

NN
* Add support for grouped 1D convolutions to the nn API
* Add some missing type annotations

Bugfixes
* Fix and speed up row-reduce with few rows
* Fix normalization primitive segfault with unexpected inputs
* Fix complex power on the GPU
* Fix freeing deep unevaluated graphs [details](https://github.com/ml-explore/mlx/pull/1462)
* Fix race with `array::is_available`
* Consistently handle softmax with all `-inf` inputs
* Fix streams in affine quantize
* Fix CPU compile preamble for some Linux machines
* Stream safety in CPU compilation
* Fix CPU compile segfault at program shutdown
