### Fixed
- Updated triton dependency [facebookresearch/xformers#418]
- Strip lineinfo from binaries, reducing the binary size [facebookresearch/xformers#549]
- Added support for pip wheels [facebookresearch/xformers#588, facebookresearch/xformers#573, facebookresearch/xformers#534, facebookresearch/xformers#523, ...] big thanks to [AbdBarho](https://github.com/AbdBarho)!
- Fixed compatibility with Python 3.7 [facebookresearch/xformers#541] - thanks to [susumuota](https://github.com/susumuota)
- fMHA: Fixed strides for QKV gradients for CUTLASS attention [facebookresearch/xformers#535]
- fMHA: Stricter input validation to avoid CUDA errors for unsupported inputs [facebookresearch/xformers#592]
- fMHA/Flash-Attention: Updated to https://github.com/HazyResearch/flash-attention/commit/a1f49a2b92b6fa022379bbebafed9d7f5e96a675 with multiple changes from [TriDao](https://github.com/tridao) that make the operator up to 20% faster
- fMHA/Flash-Attention: Fixed backward pass wrapper, where non-contiguous gradients could give the wrong result [facebookresearch/xformers#548]
- fMHA: Separated each operator into forward and backward operators. It's now possible to use any combination of forward and backward operators (for instance a Triton forward with a Flash-Attention backward), as shown in the sketch below [facebookresearch/xformers#560]
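
A minimal sketch of mixing operators through the `op` argument of `memory_efficient_attention`, pairing a Triton forward with a Flash-Attention backward. The operator classes (`fmha.triton.FwOp`, `fmha.flash.BwOp`), shapes, and dtypes are illustrative assumptions; whether a given pair is usable depends on the installed build, the GPU, and the inputs.

```python
# Illustrative sketch (assumes a CUDA GPU supported by both kernels, e.g. A100,
# with triton and flash-attn available in this xformers build).
import torch
import xformers.ops as xops
from xformers.ops import fmha

q, k, v = (
    torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16, requires_grad=True)
    for _ in range(3)
)

# op=(forward_op, backward_op): here a Triton forward paired with a Flash-Attention backward.
out = xops.memory_efficient_attention(q, k, v, op=(fmha.triton.FwOp, fmha.flash.BwOp))
out.sum().backward()
```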
### Added
- fMHA: Added Triton operator for the forward pass from [Flash-Attention](https://github.com/HazyResearch/flash-attention/blob/main/flash_attn/flash_attn_triton.py) authored by [TriDao](https://github.com/tridao); it is automatically used on A100 when compatible
- fMHA: Added [`xformers.ops.memory_efficient_attention_forward`](https://facebookresearch.github.io/xformers/components/ops.html#xformers.ops.memory_efficient_attention_forward), [`xformers.ops.memory_efficient_attention_forward_requires_grad`](https://facebookresearch.github.io/xformers/components/ops.html#xformers.ops.memory_efficient_attention_forward_requires_grad), [`xformers.ops.memory_efficient_attention_backward`](https://facebookresearch.github.io/xformers/components/ops.html#xformers.ops.memory_efficient_attention_backward) for power-users who write custom autograd functions [facebookresearch/xformers#560] (see the usage sketch after this list)
- fMHA: Support for custom scaling for the CUTLASS-based kernel [facebookresearch/xformers#530] - contribution from [comaniac](https://github.com/comaniac)
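
A minimal sketch of the forward-only entry point listed above, plus a call with an explicit softmax scale. The shapes and dtypes are illustrative, and passing the scale through a `scale` keyword is an assumption about how the custom-scaling support mentioned above is exposed.

```python
# Illustrative sketch (assumes a CUDA GPU and fp16 inputs supported by the kernels).
import math
import torch
import xformers.ops as xops

q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)

# Forward-only call: no autograd graph is recorded, e.g. for pure inference or
# inside a custom autograd function that handles the backward itself.
out = xops.memory_efficient_attention_forward(q, k, v)

# Same call with an explicit softmax scale (here the default 1/sqrt(head_dim), spelled out).
out_scaled = xops.memory_efficient_attention_forward(
    q, k, v, scale=1.0 / math.sqrt(q.shape[-1])
)
```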