Bitsandbytes

Latest version: v0.44.1


Page 3 of 6

0.39.0

Features:

- 4-bit matrix multiplication for Float4 and NormalFloat4 data types.
- Added 4-bit quantization routines
- Added double quantization routines for 4-bit quantization
- Paged optimizers for Adam and Lion.
- bfloat16 gradient / weight support for Adam and Lion with 8 or 32-bit states.
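The 4-bit and double quantization entries above can be illustrated with a minimal pure-Python sketch. This is not the bitsandbytes implementation: the uniform 16-level `CODEBOOK`, the function names, and the 8-bit encoding of the per-block scales are all assumptions made for illustration (the real Float4 and NormalFloat4 codebooks use different values).

```python
# Illustrative sketch only; NOT the bitsandbytes kernels.
# Assumption: a uniform 16-level codebook in [-1, 1]. Real FP4/NF4 codebooks differ.
CODEBOOK = [-1.0 + 2.0 * i / 15 for i in range(16)]


def quantize_blockwise_4bit(values, blocksize=64):
    """Return (codes, absmax): one 4-bit code per value, one scale per block."""
    codes, absmax = [], []
    for start in range(0, len(values), blocksize):
        block = values[start:start + blocksize]
        scale = max(abs(v) for v in block) or 1.0  # avoid dividing by zero
        absmax.append(scale)
        for v in block:
            normalized = v / scale  # now in [-1, 1]
            # Index of the nearest codebook entry; always fits in 4 bits (0..15).
            codes.append(min(range(16), key=lambda i: abs(CODEBOOK[i] - normalized)))
    return codes, absmax


def dequantize_blockwise_4bit(codes, absmax, blocksize=64):
    """Map each code back to its codebook value and rescale by the block scale."""
    return [CODEBOOK[c] * absmax[i // blocksize] for i, c in enumerate(codes)]


def quantize_scales_8bit(absmax):
    """Second quantization level: store the per-block scales as 8-bit codes."""
    top = max(absmax)
    return [round(s / top * 255) for s in absmax], top


def dequantize_scales_8bit(scale_codes, top):
    """Recover approximate per-block scales from their 8-bit codes."""
    return [c / 255 * top for c in scale_codes]
```

Quantizing the scales a second time trades a small amount of extra reconstruction error for lower storage overhead per block.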

Bug fixes:

- Fixed a bug where 8-bit models consumed twice as much memory as expected after serialization

Deprecated:

- Kepler binaries (GTX 700s and Tesla K40/K80) are no longer provided via pip and need to be compiled from source. Kepler support might be fully removed in the future.

0.38.1

Features:

- Added Int8 SwitchBack layers
- Added Fake FP8 layers for research purposes (available under `bnb.research.nn. ...`)

0.38.0

8-bit Lion, Load/Store 8-bit Models directly from/to HF Hub

Features:

- Support for 32 and 8-bit Lion has been added. Thank you lucidrains
- Support for serialization of Linear8bitLt layers (LLM.int8()). This allows storing and loading 8-bit weights directly from the HuggingFace Hub. Thank you myrab
- New bug report feature: `python -m bitsandbytes` now gives extensive details for debugging CUDA setup failures.

Bug fixes:

- Fixed a bug where some bitsandbytes methods failed in a model-parallel setup on multiple GPUs. Thank you tonylins
- Fixed a bug where cudart.so libraries could not be found in newer PyTorch releases.

Improvements:

- Improved the CUDA Setup procedure by doing a more extensive search for CUDA libraries

Deprecated:

- Devices with compute capability 3.0 (GTX 700s, K10) and 3.2 (Tegra K1, Jetson TK1) are now deprecated and support will be removed in 0.39.0.
- Support for CUDA 10.0 and 10.2 will be removed in bitsandbytes 0.39.0

0.37.0

Int8 Matmul + backward support for all GPUs

Features:

- Int8 MatmulLt now supports backward through inversion of the ColTuring/ColAmpere format. Slow, but memory efficient. Big thanks to borzunov
- Int8 is now supported on all GPUs. On devices with compute capability < 7.5, the Int8 weights are cast to 16/32-bit for the matrix multiplication. Contributed by borzunov
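The fallback described above can be sketched roughly as follows. This is a pure-Python illustration, not the library's code path: the function names and the row-wise absmax scheme are assumptions, and only the 7.5 compute capability threshold comes from the release notes.

```python
# Illustrative sketch only; function names are hypothetical.
def supports_native_int8(compute_capability):
    """compute_capability is a (major, minor) tuple, e.g. (7, 5)."""
    return tuple(compute_capability) >= (7, 5)


def quantize_rowwise_int8(matrix):
    """Absmax-quantize each row to integers in [-127, 127]."""
    q_rows, scales = [], []
    for row in matrix:
        scale = max(abs(v) for v in row) or 1.0  # avoid dividing by zero
        q_rows.append([round(v / scale * 127) for v in row])
        scales.append(scale)
    return q_rows, scales


def matmul_with_cast(x, q_weight, scales):
    """Fallback path: cast the int8 weights back to float, then multiply.

    x holds one input vector per row; q_weight and scales hold one row per
    output feature, so the result is x @ W^T.
    """
    weight = [[q * s / 127 for q in row] for row, s in zip(q_weight, scales)]
    return [[sum(a * b for a, b in zip(x_row, w_row)) for w_row in weight]
            for x_row in x]
```

On a device where `supports_native_int8` returns `False`, a caller would take the `matmul_with_cast` route: the weights still occupy 8 bits in memory, but the arithmetic runs in floating point.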

Improvements:

- Improved logging for the CUDA detection mechanism.

0.36.0

Improvements, Ada/Hopper support, fake k-bit quantization.

Features:

- CUDA 11.8 and 12.0 support added
- support for Ada and Hopper GPUs added (compute capability 8.9 and 9.0)
- support for fake k-bit block-wise quantization for Int, Float, quantile quantization, and dynamic exponent data types added
- Added CUDA instruction generator to fix some installations.
- Added additional block sizes for quantization {64, 128, 256, 512, 1024}
- Added SRAM Quantile algorithm to quickly estimate fewer than 256 quantiles
- Added option to suppress the bitsandbytes welcome message (Cyberes)
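To give a feel for what the additional block sizes trade off, the sketch below computes the storage overhead of keeping one scale constant per block. It assumes a single 32-bit scale per block, which is an illustrative assumption rather than a statement about the library's storage format; `scale_overhead_bits` is a hypothetical helper.

```python
# Illustrative arithmetic only; assumes one scale constant per block.
BLOCK_SIZES = (64, 128, 256, 512, 1024)


def scale_overhead_bits(blocksize, scale_bits=32):
    """Extra bits per quantized value spent on the per-block scale."""
    return scale_bits / blocksize


for blocksize in BLOCK_SIZES:
    print(blocksize, scale_overhead_bits(blocksize))
```

Smaller blocks track local value ranges more closely at the cost of more scale constants; larger blocks reduce that overhead but share one scale across more values.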

Regression:

- Compute capability 3.0 removed: the GTX 600 and 700 series are no longer supported (except GTX 780 and GTX 780 Ti)

Bug fixes:

- fixed a bug where overly long directory names would crash the CUDA setup (#35, tomaarsen)
- fixed a bug where CPU installations on Colab would run into an error (#34, tomaarsen)
- fixed an issue where the default CUDA version with fast-DreamBooth was not supported (#52)
- fixed a bug where the CUDA setup failed due to a wrong function call.
- fixed a bug in the CUDA Setup which led to an incomprehensible error if no GPU was detected.
- fixed a bug where the CUDA setup failed when the CUDA runtime was found but not the CUDA library.
- fixed a bug where not finding the CUDA runtime led to an incomprehensible error.
- fixed a bug where, with CUDA missing, the default was an error instead of loading the CPU library
- fixed a bug where the CC version of the GPU was not detected appropriately (BlackHC)
- fixed a bug in CPU quantization which led to errors when the input buffer exceeded 2^31 elements

Improvements:

- multiple improvements in formatting, removal of unused imports, and slight performance improvements (tomaarsen)
- StableEmbedding layer now has device and dtype parameters to make it 1:1 replaceable with regular Embedding layers (lostmsu)
- runtime performance of block-wise quantization slightly improved
- added an error message for the case where multiple libcudart.so libraries are installed and bitsandbytes picks the wrong one

0.35.4

Bug fixes:

- Fixed a bug where the CUDA setup failed when the CUDA runtime was found but not the CUDA library.
- Fixed a bug where not finding the CUDA runtime led to an incomprehensible error.
