Bitsandbytes-windows

Latest version: v0.37.5

Page 2 of 3

0.35.0

CUDA 11.8 support and bug fixes

Features:
- CUDA 11.8 support added and binaries added to the PyPI release.

Bug fixes:
- fixed a bug where overly long directory names would crash CUDA SETUP (#35, thank you tomaarsen)
- fixed a bug where CPU installations on Colab would run into an error (#34, thank you tomaarsen)
- fixed an issue where the default CUDA version with fast-DreamBooth was not supported (#52)

0.34.0

Bug fixes and memory efficient backprop

Features:
- Linear8bitLt layer now supports `memory_efficient_backward=True` which enables backprop of gradients through frozen weights.
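The effect of `memory_efficient_backward=True` can be illustrated with a toy sketch: the frozen weights live only as int8 values plus per-row scales, and the backward pass dequantizes them on the fly to compute the input gradient, so no full-precision weight copy needs to be kept around. This is a pure-Python analogy with hypothetical helper names, not the bitsandbytes/CUDA implementation:

```python
# Hypothetical sketch (not the bitsandbytes API): gradients flow *through* a
# frozen 8-bit layer to earlier trainable layers. The frozen weights stay
# int8; backward rebuilds them row by row instead of storing an fp copy.

def quantize_absmax(row):
    """Symmetric absmax quantization of one weight row to int8."""
    scale = max(abs(v) for v in row) / 127 or 1.0
    return [round(v / scale) for v in row], scale

def dequantize(qrow, scale):
    return [q * scale for q in qrow]

# frozen layer y = x @ W^T: weights stored only as int8 + per-row scales
W = [[0.5, -1.0], [2.0, 0.25]]
qW = [quantize_absmax(r) for r in W]

def backward_input_grad(grad_out):
    # dL/dx = grad_out @ W; W is dequantized on the fly, so no
    # full-precision weight tensor lives in memory between passes.
    Wd = [dequantize(q, s) for q, s in qW]
    cols = len(Wd[0])
    return [sum(grad_out[i] * Wd[i][j] for i in range(len(Wd)))
            for j in range(cols)]
```

With `grad_out = [1.0, 1.0]` the input gradient comes out close to `[2.5, -0.75]`, the exact value up to int8 rounding error.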

Bug fixes:
- fixed an issue where too many threads were created in blockwise quantization on the CPU for large tensors

0.33.0

Various bug fixes

Features:
- CPU quantization now supports a variable `blocksize` to enhance quantization speed or precision.
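The speed/precision trade-off behind a variable blocksize can be sketched with a hypothetical pure-Python absmax blockwise quantizer (the function names are ours, not the library's API): smaller blocks store more scales but quantize each block more precisely, larger blocks are cheaper but coarser.

```python
# Illustrative sketch of blockwise absmax int8 quantization with a variable
# blocksize; assumption: one scale per block, symmetric rounding to [-127, 127].

def quantize_blockwise(values, blocksize=4096):
    qvals, scales = [], []
    for start in range(0, len(values), blocksize):
        block = values[start:start + blocksize]
        scale = max(abs(v) for v in block) / 127 or 1.0
        scales.append(scale)
        qvals.extend(round(v / scale) for v in block)
    return qvals, scales

def dequantize_blockwise(qvals, scales, blocksize=4096):
    return [q * scales[i // blocksize] for i, q in enumerate(qvals)]

data = [0.1, -0.2, 3.0, -0.05, 0.7, 0.01, -1.5, 0.3]
q, s = quantize_blockwise(data, blocksize=4)       # 2 blocks -> 2 scales
restored = dequantize_blockwise(q, s, blocksize=4)
max_err = max(abs(a - b) for a, b in zip(data, restored))
```

Here the large outlier 3.0 only inflates the scale of its own 4-element block, so the second block keeps a tighter scale and a smaller round-trip error.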

Bug fixes:
- fixed an issue in CPU quantization where tensors with more than 2^31 elements would fail (commit 19a7adca7a6c9bf7061a384d7e9d9b13676a1a88)
- fixed a bug where CPU binaries would fail if no GPU was detected (commit eab4d8232d558f2e6bd7f7cc3d00e2e6e94f4e80)
- fixed an issue where CPU binaries caused additional stdout messages (commit 92a3363096e10ad6a5c4e944af898bd1186d806a)
- fixed a broken import of bnb.utils (commit 2e630b55f51d454f3bd723dffda68a07ef93190c)

We thank mryab, mbrukman, chessgecko, and dbaranchuk for pull requests with bug fixes and new features.

0.32.0

8-bit Inference Performance Enhancements

We added performance enhancements for small models. This makes small models about 2x faster for LLM.int8() inference.

Features:
- Int32 dequantization now supports fused biases.
- Linear8bitLt now uses a fused bias implementation.
- Changed `.data.storage().data_ptr()` to `.data.data_ptr()` to enhance inference performance.
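Why a fused bias helps can be seen in a pure-Python analogy (ours, not the CUDA kernel): without fusion, the int32 accumulators are dequantized in one pass over the output and the bias is added in a second pass; fused, both happen in a single pass, halving the memory traffic over the output tensor.

```python
# Sketch of int32 dequantization with and without a fused bias.
# Assumption: a single scale for the whole accumulator, for simplicity.

def dequant_then_bias(acc, scale, bias):
    out = [a * scale for a in acc]             # pass 1: dequantize
    return [o + b for o, b in zip(out, bias)]  # pass 2: add bias

def dequant_fused_bias(acc, scale, bias):
    # single pass: one read of acc, one write of the output
    return [a * scale + b for a, b in zip(acc, bias)]

acc = [12700, -254, 508]   # int32 accumulators from an int8 matmul
scale = 0.01
bias = [0.5, -0.5, 0.0]
```

Both variants produce the same values; the fused version just does it in one sweep, which is what matters for small models where kernel launch and memory traffic dominate.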

Bug fixes:
- Now throws an error if LLM.int8() is used on a GPU that is not supported.
- Improved error messaging if CUDA SETUP fails.

0.31.0

8-bit Inference and Packaging Update

Features:
- Added direct outlier extraction. This enables outlier extraction without fp16 weights and without performance degradation.
- Added automatic CUDA SETUP procedure and packaging all binaries into a single bitsandbytes package.
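The idea behind outlier extraction can be sketched in the spirit of LLM.int8()'s mixed-precision decomposition: feature columns whose magnitude exceeds a threshold are pulled out and kept in higher precision, while the remaining columns are safe to quantize to int8. A hypothetical pure-Python illustration (the threshold value and function name are assumptions, not the real kernels):

```python
# Sketch: identify outlier feature columns for mixed-precision decomposition.
# Assumption: a fixed magnitude threshold of 6.0, as an illustrative default.

def split_outlier_columns(matrix, threshold=6.0):
    ncols = len(matrix[0])
    outlier_cols = [j for j in range(ncols)
                    if any(abs(row[j]) > threshold for row in matrix)]
    regular_cols = [j for j in range(ncols) if j not in outlier_cols]
    return outlier_cols, regular_cols

X = [[0.1, 8.5, -0.3],
     [0.2, -7.1, 0.4]]
outliers, regular = split_outlier_columns(X)  # column 1 holds the outliers
```

The outlier columns would be multiplied in fp16 and the regular columns in int8, with the two partial results summed afterwards.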

0.30.0

8-bit Inference Update

Features:
- Added 8-bit matrix multiplication from cuBLAS and cuBLASLt, as well as multiple GEMM kernels (GEMM, GEMMEx, GEMMLt)
- Added 8-bit Linear layers with 8-bit Params that perform memory-efficient inference, with an option for 8-bit mixed-precision matrix decomposition for inference without performance degradation
- Added quantization methods for "fake" quantization as well as optimized kernels for vector-wise quantization and equalization, plus optimized cuBLASLt transformations
- CPU only build now available (Thank you, mryab)
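Vector-wise quantization, mentioned in the features above, can be sketched as follows: each row of the left matrix and each column of the right matrix gets its own int8 scale, the multiplication runs in integers, and the int32 accumulators are rescaled by the product of the two scales. A pure-Python sketch with hypothetical names, not the library's kernels:

```python
# Sketch of vector-wise int8 quantization around a matmul: per-row scales for
# A, per-column scales for B, integer accumulation, then dequantization.

def rowwise_q(M):
    scales = [max(abs(v) for v in row) / 127 or 1.0 for row in M]
    q = [[round(v / s) for v in row] for row, s in zip(M, scales)]
    return q, scales

def int8_matmul_dequant(A, B):
    qA, sA = rowwise_q(A)
    Bt = list(map(list, zip(*B)))   # quantizing B^T row-wise = B column-wise
    qBt, sB = rowwise_q(Bt)
    out = []
    for i, arow in enumerate(qA):
        # int32-style accumulation, then rescale by the two vector scales
        out.append([sum(a * b for a, b in zip(arow, brow)) * sA[i] * sB[j]
                    for j, brow in enumerate(qBt)])
    return out

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[1.0, 0.0], [0.0, 1.0]]
C = int8_matmul_dequant(A, B)   # close to A, since B is the identity
```

Per-vector scales keep one large entry in a row from washing out the precision of every other row, which is the point of vector-wise over tensor-wise quantization.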

Deprecated:
- Pre-compiled release for CUDA 9.2, 10.0, 10.2 no longer available


© 2025 Safety CLI Cybersecurity Inc. All Rights Reserved.